
Auto-Renew Let's Encrypt Certs with certbot.timer and Nginx

Wire up certbot.timer for Let's Encrypt auto-renewal on Nginx — systemd unit, post-renew reload hook, and --expand for adding SANs without downtime.


Most "renew your cert" guides stop at certbot renew and assume cron picks up the slack. On a single-server polyglot stack — Nginx in front, a few backend services behind — that's the wrong abstraction. Cron has no concept of randomized delays, no structured logs, no failure isolation. systemd's certbot.timer does, and it's installed by default on Debian/Ubuntu when you apt install certbot. The work isn't writing a renewal script. It's understanding what the timer already does, plugging in a reload hook so Nginx picks up the new cert without a restart, and using --expand when you add a subdomain to a SAN cert.

What certbot.timer actually does

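When you install certbot from the Debian/Ubuntu package, you get certbot.timer and certbot.service for free. The snap ships an equivalent pair under a different name, snap.certbot.renew.timer and snap.certbot.renew.service, so substitute those names in the commands below if you installed that way. Check it: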

systemctl list-timers certbot.timer
systemctl cat certbot.timer

The timer fires twice a day with RandomizedDelaySec=43200 (12 hours), which means each run is jittered across that window. That's deliberate — Let's Encrypt doesn't want every server in the world hitting acme-v02.api.letsencrypt.org at midnight UTC. The service it triggers runs certbot -q renew, which iterates every cert in /etc/letsencrypt/renewal/ and renews any with ≤30 days left. Anything outside that window is a no-op.
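
To see which side of that 30-day line a given cert sits on without invoking certbot, openssl's -checkend flag gives a quick approximation. This is a sketch, not certbot's own check: expires_within_days is a made-up helper name, and the path in the usage comment assumes a cert lineage called example.com.

```shell
#!/bin/sh
# Sketch: report whether a PEM certificate expires within N days.
# expires_within_days is a hypothetical helper, not part of certbot.
expires_within_days() {
    cert_file=$1
    days=${2:-30}
    seconds=$((days * 86400))

    if [ ! -r "$cert_file" ]; then
        echo "unreadable"
        return 2
    fi

    # openssl x509 -checkend exits 0 if the cert is still valid
    # $seconds from now, and non-zero if it will have expired by then.
    if openssl x509 -checkend "$seconds" -noout -in "$cert_file" >/dev/null 2>&1; then
        echo "not due"
    else
        echo "due"
    fi
}

# Usage (path is an assumption):
#   expires_within_days /etc/letsencrypt/live/example.com/cert.pem 30
```

A cert that prints "due" here is one the next timer run should pick up; one that prints "not due" is the no-op case.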

This is why the cron-based approach is redundant on modern boxes. If you've ever copied 0 0,12 * * * certbot renew from a 2016 blog post into your crontab, delete it. You'll get duplicate runs that can collide with the timer's run over certbot's lock file, and a fixed schedule has none of the jitter that keeps the timer polite to the upstream API.
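
Finding those leftover entries is a short search. The sketch below assumes nothing beyond awk; find_certbot_cron is a hypothetical helper name, and the file list in the usage comment is only the usual set of places to look.

```shell
#!/bin/sh
# Sketch: print active (non-comment) lines that mention certbot in the
# given cron files. find_certbot_cron is a hypothetical helper name.
find_certbot_cron() {
    for f in "$@"; do
        # Skip paths that do not exist or cannot be read.
        [ -r "$f" ] || continue
        # Ignore comment lines, then report file, line number, and text.
        awk '!/^[ \t]*#/ && /certbot/ { print FILENAME ":" FNR ": " $0 }' "$f"
    done
}

# Usage:
#   find_certbot_cron /etc/crontab /etc/cron.d/*
#   crontab -l 2>/dev/null | grep certbot
```

On Debian and Ubuntu the package itself installs /etc/cron.d/certbot. As far as I can tell, that entry tests for /run/systemd/system and does nothing when systemd is running, so it is the hand-added lines that need deleting.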

Verify the timer is enabled

sudo systemctl enable --now certbot.timer
systemctl status certbot.timer

The status output should show Active: active (waiting) and a Trigger: line pointing at the next scheduled run. If enable printed nothing, the timer was already enabled and you're done with this step.

To dry-run the renewal logic without hitting the real ACME API:

sudo certbot renew --dry-run

This stages a renewal against Let's Encrypt's staging endpoint, which has much higher rate limits than production. Use it after every config change. A successful dry-run is the only reliable signal that the next real renewal won't fail at 3am.

The reload hook problem

Renewing a cert writes new files under /etc/letsencrypt/archive/<domain>/ and repoints the symlinks in /etc/letsencrypt/live/<domain>/. Nginx, however, reads the cert and key into memory when it starts or reloads and never looks at the files again. New cert on disk, old cert in memory: your visitors keep getting the old cert until Nginx reloads.

certbot solves this with --deploy-hook, which runs ONLY when a renewal actually succeeds (not on every timer fire). You can register it globally so every renewal benefits:

sudo mkdir -p /etc/letsencrypt/renewal-hooks/deploy
sudo tee /etc/letsencrypt/renewal-hooks/deploy/nginx-reload.sh > /dev/null <<'EOF'
#!/bin/sh
nginx -t && systemctl reload nginx
EOF
sudo chmod +x /etc/letsencrypt/renewal-hooks/deploy/nginx-reload.sh

The nginx -t guard matters. If a config file is broken — typo'd directive, missing include — the reload skips and Nginx keeps running on the old cert. Without the guard, systemctl reload nginx on a broken config is a no-op too, but you lose the explicit error in the journal. With it, you'll see the failure in journalctl -u certbot.service and know to fix the config before the next renewal cycle.

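Prefer --deploy-hook over --post-hook. Per the certbot documentation, a post-hook runs after any renewal attempt, whether it succeeded or failed, while a deploy-hook runs once for each cert that was actually issued and receives RENEWED_LINEAGE and RENEWED_DOMAINS in its environment. Neither fires on the no-op runs where nothing is due. On a server with 8 certs each renewing about every 60 days, that's roughly 48 deploy-hook runs a year against 730 timer fires, and none of them reloads Nginx after a failed attempt.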

Adding a subdomain — use --expand, not a new cert

A common mistake is treating each subdomain as its own cert. If you already have a cert covering example.com and www.example.com, and you spin up api.example.com, the temptation is certbot --nginx -d api.example.com. That works, but you now have two certs to renew, two private keys, two reload triggers.

--expand is better:

sudo certbot --nginx \
  -d example.com \
  -d www.example.com \
  -d api.example.com \
  --expand

--expand tells certbot "this domain list is the new SAN list for the existing cert" and reissues a single cert covering all three. The renewal config in /etc/letsencrypt/renewal/example.com.conf updates in place. One cert, one timer, one reload.

The trade-off: the SAN list is visible to anyone who inspects your TLS handshake. openssl s_client -connect example.com:443 </dev/null | openssl x509 -noout -text reveals every domain in the SAN list, which is fine for public services and bad if you want one subdomain to stay unadvertised. For that case, issue a separate cert, keeping in mind that every publicly trusted cert is also recorded in Certificate Transparency logs, so a separate cert keeps the name off your main site's handshake but doesn't make it secret. For most single-server stacks, --expand is the right default.

Verify the expanded cert took effect:

sudo certbot certificates

Look for the cert covering all three domains, and confirm Expiry Date: is roughly 90 days out.
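
For a scriptable version of that check, the SAN list can be read straight from the cert file. This is a minimal sketch: list_sans and cert_covers are hypothetical helper names, the -ext flag needs OpenSSL 1.1.1 or newer, and the path in the usage comment assumes the lineage is named example.com.

```shell
#!/bin/sh
# Sketch: read the DNS names out of a certificate's SAN extension.
# list_sans and cert_covers are hypothetical helpers.

# Print each DNS name in the SAN list on its own line.
list_sans() {
    # -ext subjectAltName prints a header line followed by
    # "DNS:a, DNS:b, ..."; split on commas and keep the DNS entries.
    openssl x509 -noout -ext subjectAltName -in "$1" |
        tr ',' '\n' |
        sed -n 's/^[[:space:]]*DNS:\([^[:space:]]*\).*$/\1/p'
}

# Succeed only if the exact name appears in the SAN list.
cert_covers() {
    list_sans "$1" | grep -qxF "$2"
}

# Usage (path is an assumption):
#   cert_covers /etc/letsencrypt/live/example.com/cert.pem api.example.com \
#       && echo "api.example.com is covered"
```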

Logging and failure detection

certbot writes detailed logs to /var/log/letsencrypt/letsencrypt.log. The journal entry from certbot.service is a one-liner; the real diagnostic is the file. When a renewal fails, the log explains why. The usual causes are Nginx serving the wrong vhost for the ACME challenge, or a rate limit being hit (50 certs per registered domain per week, per the Let's Encrypt rate-limits doc).

For unattended monitoring, log every successful renewal so that a missing entry becomes the failure signal:

sudo tee /etc/letsencrypt/renewal-hooks/deploy/notify-success.sh > /dev/null <<'EOF'
#!/bin/sh
logger -t certbot "renewed cert for $RENEWED_DOMAINS"
EOF
sudo chmod +x /etc/letsencrypt/renewal-hooks/deploy/notify-success.sh

$RENEWED_DOMAINS is a space-separated list of the domains on the cert that was just renewed; the hook runs once per renewed cert. Pipe it to a Slack webhook, a Telegram bot, or just logger so it shows up in journalctl -t certbot. Failure detection is then "no logger entry for more than 60 days = something broke": set a calendar reminder, or alert in Prometheus on a cert expiry metric (a small script can write the expiry timestamp to a .prom file for the node_exporter textfile collector to pick up).
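
One way to produce that expiry metric is a few lines of shell around openssl. Treat this as a sketch under stated assumptions: cert_expiry_metric and the metric name tls_cert_not_after_seconds are invented for illustration, date -d is a GNU extension, and the output directory in the usage comment depends on how node_exporter's textfile collector is configured on your box.

```shell
#!/bin/sh
# Sketch: emit a Prometheus gauge holding a certificate's expiry time
# as a Unix timestamp. The function and metric names are hypothetical.
cert_expiry_metric() {
    name=$1
    cert_file=$2

    # -enddate prints a line such as "notAfter=Jun  1 12:00:00 2025 GMT".
    not_after=$(openssl x509 -enddate -noout -in "$cert_file" | cut -d= -f2)
    # An empty value means openssl could not read the cert; bail out
    # rather than let date interpret an empty string.
    [ -n "$not_after" ] || return 1

    # GNU date converts the notAfter string to seconds since the epoch.
    epoch=$(date -u -d "$not_after" +%s) || return 1

    printf 'tls_cert_not_after_seconds{cert="%s"} %s\n' "$name" "$epoch"
}

# Usage (both paths are assumptions; write to a temp file and rename so
# the collector never reads a half-written file):
#   out=/var/lib/node_exporter/textfile/certs.prom
#   cert_expiry_metric example.com /etc/letsencrypt/live/example.com/cert.pem \
#       > "$out.tmp" && mv "$out.tmp" "$out"
```

An alert on that gauge dropping below roughly 20 days from now catches a stalled renewal well before visitors see an expired cert.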

When the timer doesn't fire

On a fresh VM, occasionally certbot.timer is masked or the unit file got mangled by a partial install. The fix:

sudo systemctl unmask certbot.timer
sudo systemctl daemon-reload
sudo systemctl enable --now certbot.timer

If the cert is already inside the 30-day window when you set this up, the next timer fire renews it. Don't run certbot renew --force-renewal unless you're testing: it bypasses the 30-day check, and each forced reissue counts against Let's Encrypt's limit on duplicate certificates (5 per week for the same set of names, per the rate-limits doc).

A dry-run + enabled timer + deploy-hook is the minimum-viable auto-renew setup for a single-server Nginx box. Once that's in place, you can ignore certs until you add a subdomain (use --expand) or rebuild the box (back up /etc/letsencrypt/ and restore it).
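
The backup half of that advice only works if the archive keeps the symlinks in live/ intact, since they point into archive/. A minimal sketch with tar, which stores symlinks as symlinks by default; backup_letsencrypt and restore_letsencrypt are hypothetical helper names, and the archive path in the usage comment is an assumption.

```shell
#!/bin/sh
# Sketch: archive and restore a certbot config tree with tar.
# Both function names are hypothetical helpers.

# backup_letsencrypt SRC_DIR ARCHIVE
backup_letsencrypt() {
    src=$1
    archive=$2
    # -C switches to the parent directory so the archive holds relative
    # paths; symlinks are stored as symlinks, not as copies of targets.
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")"
}

# restore_letsencrypt ARCHIVE DEST_PARENT
restore_letsencrypt() {
    archive=$1
    dest_parent=$2
    # -p restores the recorded file modes, so private keys keep their
    # restrictive permissions.
    tar -xzpf "$archive" -C "$dest_parent"
}

# Usage (run as root; the archive path is an assumption):
#   backup_letsencrypt /etc/letsencrypt /root/letsencrypt-backup.tar.gz
#   restore_letsencrypt /root/letsencrypt-backup.tar.gz /etc
```

After a restore, sudo certbot renew --dry-run is a cheap way to confirm the renewal configs still line up with the new box.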

References:

  • https://eff-certbot.readthedocs.io/en/stable/using.html
  • https://letsencrypt.org/docs/rate-limits/
  • https://github.com/certbot/certbot