Restarting Services Automatically on Certificate Rotation

This post originated from a colleague's question about how to make services restart automatically after automated certificate rotation and replacement.

For starters, my automated certificate management workflow looks like this. For each expiring certificate, a new ACME DNS-01 challenge is initiated and automatically configured. This ensures that my Cert Manager server always has a current certificate for all covered services.

%%{init: { 'theme':'base', 'themeVariables': { 'primaryColor': '#BB2528', 'primaryTextColor': '#000', 'primaryBorderColor': '#7C0000', 'lineColor': '#F8B229', 'secondaryColor': '#006100', 'tertiaryColor': '#000', 'fontFamily': 'Raleway, Verdana, Arial', 'fontSize': '30px' } } }%%
sequenceDiagram
    participant cm as Cert Manager
    participant dns as DNS Distribution Master
    participant le as Let's Encrypt
    participant server as DB / Mail / Web Server
    alt Certificate Request Cycle for expiring certificates
        cm->>le: Initiate dns-01 challenge
        le->>cm: Challenge data
        cm->>dns: Update records for dns-01
        cm->>cm: Verify DNS changes propagated
        cm->>le: Proceed with request
        le->>cm: Provide updated cert
    end
    alt Certificate Publication
        cm->>server: Push updated certificates
        server->>server: Adjust permissions as required
        server->>server: Restart affected services
    end

Newly issued certificates are pushed automatically to each server where they are required. When a certificate arrives, its permissions are adjusted automatically so that every service that needs it can read the cryptographic material.

After all this is done, services are automatically restarted. This is the somewhat tricky part.

An easy restart trigger

Most services require a full restart when their certificates are replaced, so that the crypto libraries can re-initialize themselves with the updated key material. This is true for most dæmons and, in particular, for dovecot, nginx, postfix, postgresql and sendmail, which are the ones I use the most.

Fortunately, all of those services now provide a PID file for tracking purposes. Conveniently, the timestamp of this file corresponds to the moment the service started running.

All we need to do is compare the timestamp of this PID file with the timestamp of the corresponding certificate file. Whenever the certificate file is newer, the service needs to be restarted.
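As a minimal sketch, the timestamp comparison could be written as a small shell helper. The function name `cert_is_newer` and the example paths are hypothetical, chosen only for illustration. The `-nt` test is available in bash, dash and most other shells, but shells disagree on what it returns when one file is missing, so the sketch checks for the PID file explicitly.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when the certificate file is newer than
# the PID file. A missing PID file is treated as "service not running", so
# no restart is requested in that case.
cert_is_newer() {
    cert=$1
    pidfile=$2
    [ -e "$pidfile" ] || return 1
    [ "$cert" -nt "$pidfile" ]
}

# Example use (both paths are assumptions, adjust to your system):
# cert_is_newer /etc/dovecot/tls/fullchain.pem /run/dovecot/master.pid \
#     && systemctl restart dovecot
```

Keeping the comparison in one place makes it easy to try against scratch files before wiring it to a real `systemctl restart`.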

Now, keeping track of which certificate is associated with which service can be cumbersome. To make things easier, every service except PostgreSQL (a special case) gets its certificates through a subdirectory of its main configuration directory, e.g. /etc/nginx/tls, /etc/dovecot/tls and so on. This keeps the configuration straightforward and the relationship between certificates and services easy to see.

Restarting nginx

In the nginx case, many virtual servers, each with its own TLS certificate, depend on a single service. For this, I use a handy find incantation from my crontab to restart nginx when needed:

# Restart nginx whenever newer certificates are found.

*/15 * * * * root find -L /etc/nginx/tls -name fullchain.pem -newer /run/nginx.pid  -exec systemctl restart nginx \; -quit

So, every 15 minutes find examines the fullchain.pem certificate bundles used anywhere in my nginx configuration, comparing their timestamps with that of the nginx PID file. If any certificate file is newer than the PID file, the find -exec stanza takes care of restarting nginx, and -quit stops the search after the first successful restart so that nginx is restarted only once.

This arrangement is quite convenient, as my nginx restarts are very quick. The 15-minute check interval also helps when I want to deploy a new website quickly: normally, the site is fully up and running with a fresh certificate by the time I finish setting up other things.

Restarting PostgreSQL

On a similar note, I want to restart my database clusters whenever a new certificate is deployed. The automation typically runs at times of little traffic, and in general PostgreSQL is pretty smart about its restarts, so this process tends to work well in my environment.

In this case, the crontab scripting looks inside the postgresql.conf file of each instance to pick up the specific certificate bundle being used. Other than that, the logic is much the same as in the nginx case. Below is the code straight from my crontab file:

# Try and restart all postgres instances whose certificates were updated, automatically

11 22 * * * root find /etc/postgresql/ -name postgresql.conf -printf '\%P:' -exec egrep '^ssl_cert_file' '{}' \; | while read s; do pginst=$(echo $s | cut -f1-2 -d/ | tr / -); pidfile="/run/postgresql/${pginst}.pid"; cert=$(echo $s | sed 's/^.*= *//' | tr -d "'\"" ); [ "${cert}" -nt "${pidfile}" ] && echo restarting postgresql@${pginst} && systemctl restart "postgresql@${pginst}"; done
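For readability, the same logic can be sketched as a multi-line shell function. This is an illustrative rewrite, not the crontab entry itself. The function name `stale_pg_instances` is hypothetical, and the sketch assumes Debian-style paths (`/etc/postgresql/<version>/<cluster>/postgresql.conf`), settings written as `key = value`, and an absolute path in `ssl_cert_file`. It only prints the instances that would need a restart, so it can be checked before being connected to systemctl.

```shell
#!/bin/sh
# Hypothetical helper: print the instance name (e.g. "14-main") of every
# PostgreSQL cluster whose ssl_cert_file is newer than its PID file.
stale_pg_instances() {
    conf_root=${1:-/etc/postgresql}
    run_dir=${2:-/run/postgresql}
    conf_root=${conf_root%/}          # tolerate a trailing slash
    find "$conf_root" -name postgresql.conf | while IFS= read -r conf; do
        # Relative path looks like "<version>/<cluster>/postgresql.conf".
        rel=${conf#"$conf_root"/}
        pginst=$(printf '%s\n' "$rel" | cut -d/ -f1-2 | tr / -)
        pidfile="$run_dir/$pginst.pid"
        # Last active ssl_cert_file line: drop the key, any trailing comment,
        # trailing spaces and quotes.
        cert=$(grep -E '^ssl_cert_file' "$conf" | tail -n 1 |
            sed -e 's/^[^=]*= *//' -e 's/ *#.*$//' -e 's/ *$//' | tr -d "'\"")
        if [ -n "$cert" ] && [ -e "$pidfile" ] && [ "$cert" -nt "$pidfile" ]; then
            printf '%s\n' "$pginst"
        fi
    done
}

# Example use:
# stale_pg_instances | while IFS= read -r inst; do
#     systemctl restart "postgresql@$inst"
# done
```

Reading each configuration file separately means a cluster without an `ssl_cert_file` line is simply skipped and cannot affect the instance name computed for the next one.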

Restarting other services

For other services I tend to use a variation of the nginx use case, simply changing the systemctl service name to match.
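One way to express that variation is a small wrapper around the same find check, sketched below. The function name `restart_if_cert_newer` is hypothetical, and the PID file locations in the examples are assumptions that differ between distributions. The restart command is passed as arguments, so it can be swapped for a harmless command while testing. Like the nginx entry, it relies on `-quit`, which GNU and BSD find provide.

```shell
#!/bin/sh
# Hypothetical wrapper: run the given command once if any fullchain.pem under
# tls_dir is newer than pidfile.
restart_if_cert_newer() {
    tls_dir=$1
    pidfile=$2
    shift 2
    # No PID file means the service is not running; find -newer would also
    # fail on a missing reference file, so stop here.
    [ -e "$pidfile" ] || return 0
    find -L "$tls_dir" -name fullchain.pem -newer "$pidfile" -exec "$@" \; -quit
}

# Example invocations (PID file paths are assumptions, check your system):
# restart_if_cert_newer /etc/dovecot/tls /run/dovecot/master.pid systemctl restart dovecot
# restart_if_cert_newer /etc/postfix/tls /var/spool/postfix/pid/master.pid systemctl restart postfix
```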

This scheme has worked wonders over months of unattended operation, so I hope you can use it in your own environment.