I have a cron job that renews an SSL certificate from Let's Encrypt, and then restarts the smtpd daemon so that the new certificate is picked up. This all works fine--as proven by both the presence of a new, valid cert on disk, and smtpd successfully restarting--but cron never sends an email with the output of the job. What gives?

The facts

  • System: OpenBSD
  • Mail software: OpenSMTPD
  • Cron job command (roughly): le_renew_certs.sh && /etc/rc.d/smtpd restart
  • Cron log at the time of the job: Feb 2 00:17:11 puffy cron[37026]: (root) MAIL (mailed 563 bytes of output but got status 0x0001)

The investigation

For brevity, I'm going to omit the investigation I did into whether the le_renew_certs.sh script was the source of the issue. You'll have to take me at my word that it wasn't. Also, the strange error message from cron suggests the issue is somehow related to cron's interaction with the mail software.

man cron? No help.

Googling? No help. Wasn't even sure how to begin searching.

Clues in /var/log/maillog? Nope.

What's left? How about ktrace(1)?

ktrace enables kernel trace logging for the specified processes. [...] The kernel operations traced are system calls, namei translations, signal processing and I/O. --ktrace(1) man page

Perfect. The equivalent of using a network sniffer on unix processes.

Before enabling tracing, I modified crontab(5) to start the job in the next minute instead of at its regular time.

In order to trace everything related to this cron job, I would need to enable tracing not just of the cron process but also child processes (-i). The full command I used is:

ktrace -f /tmp/cron.ktrace -ip `pgrep cron`

After the cron job ran (and the error was logged to the cron log), I disabled all tracing with this command:

ktrace -C

The dump file that ktrace(1) outputs is not human-readable; kdump(1) is used to decode and display it. I used this command to view the trace file:

kdump -TRf /tmp/cron.ktrace | less -N

The /tmp/cron.ktrace file was almost 4MB in size. I knew the rough order of operations for this simple cron job, so I started wading through the trace file, keeping an eye open for any deviations from what I expected. The highlight events I saw in the file were:

  • Cron starts processing the job.
  • Cron forks /usr/sbin/sendmail (/usr/sbin/sendmail -fCronDaemon -odi -oem -oi -t); the sendmail command blocks, waiting for input.
  • In the case of OpenSMTPD, sendmail is a wrapper for smtpctl. When called as sendmail, smtpctl opens a unix socket (/var/run/smtpd.sock) which it uses to talk to the running smtpd process.
  • The cron job commands are run; certs are renewed and smtpd is restarted. Both of these commands output status messages as they run.
  • smtpd starts up, forks its child processes, and continues to initialize.
  • The cron job finishes.
  • Cron writes the output from the job to the sendmail process and closes the pipe to sendmail (which is really smtpctl).
  • smtpctl attempts to write the cron output to the unix socket as a new email and barfs with write error: Broken pipe

The root cause

What's not present in the trace file is what happens when the currently-running smtpd processes are killed. Among a bunch of other things, the /var/run/smtpd.sock file is removed. This means what's happening is:

  1. smtpctl opens /var/run/smtpd.sock and blocks, waiting on input
  2. smtpd is killed and deletes /var/run/smtpd.sock
  3. smtpd is started and creates a new /var/run/smtpd.sock
  4. smtpctl tries to write to the old /var/run/smtpd.sock
  5. 🤮

Looking through the cron(8) code, I see that it actually forks a couple of times when running a job: the grandchild runs the commands for the job, and the child process forks sendmail and writes the job's output there. The child forks sendmail (i.e., smtpctl) as soon as it reads some output from the grandchild (in other words, as soon as the job issues some output).

The result

In the end, I don't see a way to have cron email the output of a job that restarts smtpd unless, perhaps, the restart of smtpd is done silently and prior to any other output being emitted by the job.

A pattern for a cron job command which does work is this:

le_renew_certs.sh && /etc/rc.d/smtpd restart | mail -s "Restarting smtpd" joel

This directs all output from the job to the mail(1) command and sidesteps cron's handling of job output.