docker-compose-log-rotation-disk-fill

The midnight pager that always wins

Let me tell you about the villain every operations team eventually meets. A service that's been humming for weeks suddenly refuses to restart. Monitoring screams about a disk at 100 percent. The culprit turns out to be a single forgotten log file living deep inside the Docker data directory.

The container that crashed wasn't the cause. It was the victim of a much quieter offender: the default json-file log driver has no rotation policy until you tell it otherwise, and docker compose doesn't opt you into one for free.

This is an intro-level walkthrough of why disk-fill from container logs keeps happening in compose-based stacks, where the bytes actually live, and how to put a rotation policy in place that survives a docker compose down && docker compose up -d. I'm not going to build a working repository here. I'll sketch the failure mode, point at the official references that explain the knobs, and leave you ready to add a few lines to your compose.yaml the next time you touch it.

Where do the bytes live, exactly

Which file on your host actually grows when a containerized service writes a log line? It's not where most operators guess, and it's not owned by your application. When you run docker compose up, each service spawns a container whose stdout and stderr are captured by the Docker daemon and written to a JSON-lines file on the host. By default, on Linux, that file lives under /var/lib/docker/containers/<container-id>/<container-id>-json.log. Each line is a JSON object containing the log stream name, the timestamp, and the raw line your application emitted.
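
To make that format concrete, here is a minimal sketch that decodes one line the way a log tailer might. The sample line and the parse_log_line helper are made up for illustration; the three keys (log, stream, time) are the ones the json-file driver writes.

```python
import json

# A made-up sample line in the shape the json-file driver writes.
SAMPLE_LINE = '{"log":"GET /healthz 200\\n","stream":"stdout","time":"2024-05-01T12:00:00.123456789Z"}'


def parse_log_line(line):
    """Decode one json-file record into its stream, timestamp, and message."""
    record = json.loads(line)
    return {
        "stream": record["stream"],              # "stdout" or "stderr"
        "time": record["time"],                  # timestamp string added by the daemon
        "message": record["log"].rstrip("\n"),   # raw application output
    }


if __name__ == "__main__":
    print(parse_log_line(SAMPLE_LINE))
```

Every application line carries this JSON envelope, so the file on disk is always somewhat larger than the raw output your service produced.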

There are two properties of this file that operators routinely forget. First, the file is owned by the Docker daemon, not by your service, so a container restart doesn't truncate it. The file persists across restarts and only goes away when the container itself is removed. Second, the file grows monotonically. Until you configure a log driver option that says otherwise, the daemon never rotates this file, never compresses it, and never asks the kernel to discard old contents. A noisy debug log from a single service in a single container can put a 50 GB file on your host within a week.
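
If you suspect a host is already in this state, a short script can show which containers hold the biggest log files. This is a sketch rather than a polished tool: the largest_container_logs helper is hypothetical, and it assumes the default Linux data root mentioned above, which normally needs root to read.

```python
from pathlib import Path

# Default Docker data root on Linux; adjust if your daemon uses another one.
DOCKER_CONTAINERS = Path("/var/lib/docker/containers")


def largest_container_logs(root=DOCKER_CONTAINERS, top=5):
    """Return up to `top` (size_in_bytes, path) pairs, largest first."""
    sizes = [
        (path.stat().st_size, str(path))
        for path in root.glob("*/*-json.log*")  # active file plus any rotated copies
        if path.is_file()
    ]
    return sorted(sizes, reverse=True)[:top]


if __name__ == "__main__":
    for size, path in largest_container_logs():
        print(f"{size / 1024 ** 2:10.1f} MiB  {path}")
```

The directory name in each path is the container ID, which you can match against docker ps -a to find the offending service.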

Compose stacks make the problem worse in two subtle ways. Compose typically runs multiple services on the same host, so the noisy neighbour pattern is amplified — one chatty service can fill the disk and take every other service in the project down with it. Compose also encourages long-lived containers via the restart: unless-stopped policy, which means the JSON log files accumulate over the entire lifetime of the deployment rather than getting reset on routine deploys.

Why the default surprises people

Docker's default log driver has rotation knobs and leaves them off, a quirk plainly noted in Docker's json-file logging driver page. The two relevant keys are max-size, which caps an individual log file, and max-file, which caps how many rotated files the driver keeps. If you don't set either, you get a single unbounded file per container.

The reason the defaults are off is partly historical. json-file was Docker's original logging driver, and at the time the implicit contract was that log shipping would be a separate concern handled by a host-level agent that tailed the file. In practice, most teams that adopted Docker via docker compose never deployed such an agent, and the rotation gap became operations debt that surfaces only when a disk fills.

A second source of surprise is more subtle. The daemon-wide configuration in /etc/docker/daemon.json and the per-service configuration in compose.yaml look almost identical and use the same option names, but they apply at different scopes and they don't merge cleanly. If you set rotation in the daemon config and a service in compose.yaml declares its own logging block, the service-level settings take precedence for that container. In particular, when the service names a different driver from the daemon default, the daemon's log-opts do not carry over at all. Daemon defaults also apply only to containers created after the daemon picks up the new config. Operators routinely change the daemon config, restart the daemon, fail to check per-service overrides or recreate existing containers, and conclude that rotation is broken when it is in fact being overridden or was never applied.

The minimal compose snippet that fixes one service

A friend once added six lines of YAML to her compose file and watched a runaway log file stop growing within the hour. Drop this block into the service definition in your compose.yaml:

services:
  api:
    image: ghcr.io/example/api:1.2.3
    restart: unless-stopped
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"

The semantics are precise. max-size: "10m" tells the json-file driver to rotate the active log file once it reaches roughly ten megabytes. max-file: "3" tells the driver to keep at most three files in total: the active file plus two rotated archives. The driver names the rotated files with a numeric suffix, so from the third rotation onward the oldest file is deleted each time. Worst-case disk usage for each container of this service is therefore about thirty megabytes: three files of up to ten megabytes each, give or take a small overshoot before a rotation triggers.

The logging, driver, and options keys are defined in the Compose specification under the logging section of the services reference; the option names themselves and their accepted units come from the json-file driver documentation. The size suffix accepts k, m, and g, for kilobytes, megabytes, and gigabytes, as elsewhere in the Docker tooling. Quoting the values is recommended: driver options are passed to the driver as strings, so writing max-file: "3" avoids any ambiguity about a value that YAML would otherwise read as a number.
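
The arithmetic behind that ceiling can be sketched in a few lines. The parse_size and log_ceiling_bytes helpers are hypothetical, and the sketch assumes the k, m, and g suffixes are binary multiples (powers of 1024), which is how Docker's size parsing is understood to behave; treat the result as an estimate.

```python
# Assumption: k, m, and g are treated as powers of 1024.
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(value):
    """Convert a max-size string such as "10m" into bytes."""
    value = value.strip().lower()
    if value and value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)  # a bare number is taken as bytes


def log_ceiling_bytes(max_size, max_file):
    """Approximate upper bound on log disk usage for one container."""
    return parse_size(max_size) * int(max_file)


if __name__ == "__main__":
    ceiling = log_ceiling_bytes("10m", "3")
    print(f"{ceiling} bytes (~{ceiling // 1024 ** 2} MiB)")
```

Multiply the result by the number of containers on the host to get a budget for the whole stack.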

Caveats that bite even after you have the snippet

The snippet above is necessary but not sufficient. There are at least four follow-on details that catch operators who think they're done after editing the compose file.

The first is that logging options are fixed when a container is created. Existing containers keep whatever driver options they were created with. To pick up the new policy, Compose has to recreate the containers: docker compose up -d normally does this when it detects that a service's configuration has changed, and docker compose up -d --force-recreate for the affected service makes it explicit. A docker compose restart is not enough. A restart cycles the process inside the existing container; only a recreate applies the updated logging block.

The second caveat is that the rotation only governs the active container's log file. If you have a pile of old container directories left behind by previous deploys, the new policy doesn't retroactively prune them. You'll still need a one-time docker system prune or a more surgical docker container prune to reclaim disk space from containers that have already exited. Until those exited containers are pruned, their log files persist exactly as large as they were when the container died.

The third caveat concerns drivers other than json-file. Among the built-in drivers, the max-size and max-file options belong to the json-file driver and the newer local driver. If you set a logging.driver of journald, syslog, fluentd, gelf, awslogs, or another built-in driver, those options do not rotate anything: the daemon typically rejects options it doesn't recognise for the chosen driver, so watch for errors when the container is created, and retention becomes a property of the downstream log system. This is a common foot-gun when teams add a logging stack like Loki or a cloud aggregator midway through a deployment's life and forget that the local rotation policy has stopped applying.

The fourth caveat is that the local driver, which Docker recommends in newer documentation for hosts that don't need the JSON file format, has its own defaults that differ from the json-file driver's. The local driver enables compression and rotation by default, with a maximum of 20 megabytes per file and 5 files retained, or roughly 100 megabytes per container before compression. If you switch drivers and leave out the options block, that default becomes your ceiling, which may be looser or tighter than the one you intended. The behavior is documented in the Docker logging drivers reference and is worth reading carefully before you flip a service from json-file to local.
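
One way to reason about what a service would end up with under the local driver is to overlay its options on the defaults quoted above. This is a simplified model, not the driver's actual code; effective_local_options is a hypothetical helper, and the compress entry reflects the statement that compression is on by default.

```python
# Defaults for the local driver as described above: 20 MB per file,
# 5 files kept, compression enabled.
LOCAL_DEFAULTS = {"max-size": "20m", "max-file": "5", "compress": "true"}


def effective_local_options(service_options):
    """Overlay a service's logging.options on the local driver's defaults."""
    # Explicit options win; anything left unset falls back to the default.
    return {**LOCAL_DEFAULTS, **service_options}


if __name__ == "__main__":
    # No options block at all: the driver defaults apply.
    print(effective_local_options({}))
    # Explicit limits replace the matching defaults.
    print(effective_local_options({"max-size": "10m", "max-file": "3"}))
```

Writing the options out explicitly in compose.yaml avoids relying on this overlay in the first place.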

What a daemon-level default looks like

For a fleet of hosts running many compose projects, setting the policy in each compose.yaml is repetitive and fragile. The alternative is to set a fleet-wide default in /etc/docker/daemon.json so that any container created on the host inherits the rotation policy unless it explicitly overrides it. A daemon config that pins rotation looks like a few keys at the top level of the JSON document: a log-driver key naming the driver, and a log-opts map containing the same options that would appear under logging.options in compose. After editing the file you restart the Docker daemon and the new defaults take effect for any container created after the restart.
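
As an illustration of that shape, the sketch below renders a daemon config with the two keys described above. The limits are example values, and render_daemon_json is a hypothetical helper that only prints the document; it deliberately does not write to /etc/docker, so you can review the output and merge it into any existing daemon.json by hand.

```python
import json

# Example values only; choose limits that fit your hosts.
DAEMON_LOG_DEFAULTS = {
    "log-driver": "json-file",
    "log-opts": {
        # daemon.json expects log-opts values as strings, even numeric ones.
        "max-size": "10m",
        "max-file": "3",
    },
}


def render_daemon_json(config):
    """Serialise the settings in the layout used by /etc/docker/daemon.json."""
    return json.dumps(config, indent=2) + "\n"


if __name__ == "__main__":
    print(render_daemon_json(DAEMON_LOG_DEFAULTS), end="")
```

Generating the file from one definition like this is one way to keep a fleet consistent, though a configuration management tool would serve the same purpose.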

The trade-off is that daemon-level defaults are easy to drift away from. A new engineer adds a logging: block to a single compose service to enable a different driver, the override silently disables the rotation policy for that service, and the host slowly accumulates an unbounded log file again. Most teams settle on a mixed approach: daemon-level defaults that are tight enough to keep the disk safe even when individual services misbehave, plus a convention that every compose file declares its logging block explicitly so that nothing is implicit.

Confirming the policy is actually in effect

The fastest way to verify that a running container has the rotation policy you expect is to inspect it directly. The docker inspect command, pointed at a container ID, will print a JSON document that includes a HostConfig.LogConfig block showing the driver name and the options that were actually applied at create time. If the block shows your max-size and max-file, the container is bounded. If the block shows an empty options map, the container was created before your change took effect and you need to recreate it. The docker inspect output is the authoritative source — what's in the compose.yaml on disk is just a wish until you act on it.
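
That check is easy to script. The sketch below runs docker inspect, pulls out HostConfig.LogConfig (whose fields are Type for the driver and Config for the options), and reports whether a json-file container has a size cap. The helper names are hypothetical, and the script assumes the docker CLI is on the PATH.

```python
import json
import subprocess
import sys


def log_config_from_inspect(inspect_output):
    """Extract (driver, options) from `docker inspect <container>` JSON output."""
    containers = json.loads(inspect_output)  # docker inspect prints a JSON array
    log_config = containers[0]["HostConfig"]["LogConfig"]
    return log_config["Type"], log_config.get("Config") or {}


def json_file_is_bounded(driver, options):
    """True when a json-file container has a max-size cap applied."""
    return driver == "json-file" and options.get("max-size", "-1") != "-1"


def inspect_container(container):
    """Run docker inspect for one container and return its raw JSON output."""
    result = subprocess.run(
        ["docker", "inspect", container],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


if __name__ == "__main__":
    driver, options = log_config_from_inspect(inspect_container(sys.argv[1]))
    status = "bounded" if json_file_is_bounded(driver, options) else "check manually"
    print(driver, options, status)
```

Run it with a container ID or name as the only argument. Containers using other drivers are reported as "check manually" because this sketch only knows the json-file rules.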

Once the policy is in place, you can watch the on-disk files to see the rotation working. List the files under /var/lib/docker/containers/<id>/ and you should see the active log file plus up to max-file - 1 rotated copies with a .1, .2, and so on suffix. If you only ever see the active file, either the application hasn't produced enough output to trigger a rotation or the policy isn't being honored. Generate a burst of output deliberately, then list the directory again to confirm.
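
The directory check can be sketched as a small helper that lists the active log file and its rotated copies and compares the count with max-file. The function names are hypothetical, and the optional .gz suffix in the pattern is an assumption to cover rotated files when compression is enabled.

```python
import re
from pathlib import Path

# Matches "<id>-json.log" and rotated copies such as "<id>-json.log.1".
# The optional ".gz" is an assumption for setups with compression enabled.
LOG_NAME = re.compile(r"-json\.log(\.\d+)?(\.gz)?$")


def rotation_set(container_dir):
    """Return the names of the active and rotated log files, sorted."""
    return sorted(
        p.name for p in Path(container_dir).iterdir() if LOG_NAME.search(p.name)
    )


def within_policy(container_dir, max_file):
    """True when the number of log files does not exceed max-file."""
    return len(rotation_set(container_dir)) <= max_file
```

Call rotation_set on a container's directory before and after the burst of output; the list should grow until it reaches max-file entries and then stay at that length.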

Beyond rotation: thinking about the log pipeline

Log rotation is a containment measure, not a logging strategy. Once you have a rotation policy in place that prevents a single chatty service from filling the host disk, the next question is whether you want those logs to live only on the host or whether you want them shipped somewhere durable. Rotation by definition discards old data, and if the developer trying to debug a production incident wants to see what happened yesterday, the rotated and deleted file is gone.

The healthier pattern for any production deployment is to combine local rotation with a forwarding agent. The forwarder reads from the JSON files or attaches directly to the container's stdout, pushes the lines into a central log store, and lets you keep weeks of history in a system that can handle the volume. Once the forwarder is in place, you can tighten the local max-size and max-file significantly, because the local file is only a buffer for the forwarder, not the system of record.

The exact forwarder choice is a separate decision: Promtail with Loki, Fluent Bit with any backend, Vector, or a cloud-native sidecar all work. The decision tree for picking one is too long for an intro article. What matters at this stage is the recognition that rotation alone is a stopgap and that you should plan for the next step before the next outage forces you to take it under pressure.

What to do this afternoon

If you came here because a host just filled up or because you noticed an unbounded log file on a development laptop, the immediate action is to add the logging block to every service in every compose file you control, recreate the containers, prune the exited ones, and verify the policy with docker inspect. That sequence takes less than thirty minutes for most projects and removes an entire class of pages from your on-call rotation. Save the deeper questions about a forwarder, a central store, and a daemon-level default for the next week — but get the host out of danger first.

The official references to keep open while you do this work are the Docker documentation pages mentioned above: the json-file logging driver page, the logging drivers reference, and the logging section of the Compose specification. They're short, accurate, and updated alongside the engine itself, so they're the right source of truth when an option behaves differently from what an older blog post claimed.