Expect scripts, [exec], and SIGCHLD

Today I had to debug an expect script that would not run in one specific environment, and the solution was frustrating enough that I had to write it down.

In short, expect’s [exec] command only works correctly if SIGCHLD is not being ignored, because it has to wait for the child process it starts. If the signal is ignored, the script dies, although it does give a very nice explanation of why. For instance:

lappy 0 /home/jetmore/swap > cat ./run-expect
trap SIG_IGN SIGCHLD
puts "[exec date]: Expect: This is stdout"
lappy 0 /home/jetmore/swap > ./run-expect
Wed Mar 28 13:16:44 CDT 2012
error waiting for process to exit: child process lost (is SIGCHLD ignored or trapped?)
    while executing
"exec date"
    invoked from within
"puts "[exec date]: Expect: This is stdout""
    (file "./run-expect" line 3)
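
The error message hints at what is going on underneath. Here is a minimal Python sketch of the mechanism as I understand it, not of expect's actual internals: when SIGCHLD is ignored, the kernel discards a finished child on its own, so a later attempt to wait for it fails. The function name wait_for_child is made up for this illustration, and the behavior shown assumes Linux or another POSIX system.

```python
import os
import signal


def wait_for_child(ignore_sigchld):
    """Fork a child that exits at once, then try to collect its exit status."""
    disposition = signal.SIG_IGN if ignore_sigchld else signal.SIG_DFL
    previous = signal.signal(signal.SIGCHLD, disposition)
    try:
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child: exit immediately, skipping cleanup handlers
        try:
            os.waitpid(pid, 0)
            return "reaped"
        except ChildProcessError:
            # ECHILD: the kernel already discarded the child because
            # SIGCHLD was ignored, so there is no status left to collect.
            return "lost"
    finally:
        # Only the parent gets here; put the old disposition back.
        signal.signal(signal.SIGCHLD, previous)


if __name__ == "__main__":
    print("SIGCHLD default:", wait_for_child(False))
    print("SIGCHLD ignored:", wait_for_child(True))
```

The "lost" case corresponds to the "child process lost" wording in the error above.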

This is great if you’re actually seeing the script’s STDERR. As it happened, I hit this problem in an environment where the tool executing the expect script was, coincidentally, also throwing away STDERR. Any attempt I made to capture STDERR or STDOUT made the script start working. A heisenbug. I finally saw the error above by attaching to the running expect script with strace -e write=2 -p $PID
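
On Linux there is also a way to check for this without needing STDERR at all: the SigIgn line in /proc/<pid>/status is a hexadecimal bitmask of the signals a process is ignoring. Below is a small Python sketch of reading it. The helper names mask_has_signal and sigchld_ignored are made up for illustration, and the whole thing assumes a Linux /proc filesystem.

```python
import os
import signal


def mask_has_signal(mask_hex, signum):
    """Return True if the bit for signum is set in a /proc signal mask."""
    # Signal N is represented by bit N-1 of the mask.
    return bool(int(mask_hex, 16) & (1 << (signum - 1)))


def sigchld_ignored(pid):
    """Check the SigIgn line of /proc/<pid>/status (Linux only)."""
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            if line.startswith("SigIgn:"):
                return mask_has_signal(line.split()[1], signal.SIGCHLD)
    raise RuntimeError("no SigIgn line found")


if __name__ == "__main__":
    print("ignoring SIGCHLD:", sigchld_ignored(os.getpid()))
```

Pointing sigchld_ignored at the PID of the stuck script, or of the tool that launches it, should show whether the ignored disposition is being inherited.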

After confirming that the calling tool was indeed setting SIGCHLD to IGNORE and that manually ignoring SIGCHLD broke the script (as in the example above), I found that the following script (which sets SIGCHLD back to the system’s default behavior) would survive even if the caller had IGNOREd SIGCHLD:

trap SIG_DFL SIGCHLD
puts "[exec date]: Expect: This is stdout"

So, the hopefully-googleable moral of the story is that expect’s exec command requires that SIGCHLD not be set to ignore, and this should be checked if your expect script is working in one environment but not another (i.e., working from the command line but failing when called by another program).
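
If you happen to control the calling program rather than the expect script, the same reset can be done from that side. Here is a minimal Python sketch, assuming a Python caller. The names reset_sigchld, run_with_default_sigchld and REPORT are made up for this example, and the child is just a Python one-liner standing in for the expect script.

```python
import signal
import subprocess
import sys


def reset_sigchld():
    # Runs in the child between fork() and exec(): restore the default
    # disposition so the launched program can wait for its own children.
    signal.signal(signal.SIGCHLD, signal.SIG_DFL)


def run_with_default_sigchld(argv):
    """Run argv with SIGCHLD reset to the default; return its stdout."""
    result = subprocess.run(argv, preexec_fn=reset_sigchld,
                            stdout=subprocess.PIPE, text=True)
    return result.stdout


# Hypothetical stand-in for the expect script: prints True if it
# inherited an ignored SIGCHLD, False otherwise.
REPORT = [sys.executable, "-c",
          "import signal; "
          "print(signal.getsignal(signal.SIGCHLD) == signal.SIG_IGN)"]

if __name__ == "__main__":
    signal.signal(signal.SIGCHLD, signal.SIG_IGN)  # mimic the misbehaving caller
    print("plain launch inherits ignore:",
          subprocess.run(REPORT, stdout=subprocess.PIPE, text=True).stdout.strip())
    print("after reset inherits ignore:", run_with_default_sigchld(REPORT).strip())
```

The point of the sketch is that an ignored SIGCHLD survives fork and exec, so the launched program sees it unless somebody puts the default back.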
