I was recently writing a small script that would launch the spectacular mitmdump while simultaneously performing application testing using watir-webdriver. To make this work, I needed a dead simple way to launch mitmdump as a concurrent background process that could easily be closed by sending a SIGINT once the script was done performing its tests. I didn't need access to stdin, stdout, or stderr and I didn't need to know if mitmdump ran successfully.
Sounds like a simple enough task, right? Ruby has no shortage of ways to execute external programs. The best known include Kernel#exec, Kernel#system, IO.popen, Open3.popen3, and the simple backtick. But how to choose?
IO.popen & Open3.popen3
With IO.popen and Open3.popen3, the external program runs as a subprocess whose standard streams are connected to pipes in your script. IO.popen returns a single IO object tied to the subprocess's standard input and output. Open3.popen3 has the added benefit of giving you a separate stream for standard error; however, you can still capture stderr with IO.popen by redirecting the stderr stream to the stdout stream with 2>&1. While there are all sorts of benefits to using these methods, I simply didn't need the level of control they provide. Instead, I just needed something fast and functional.
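For reference, here is a minimal sketch of both calls. It uses the current Ruby interpreter (RbConfig.ruby) as a stand-in child process so that it runs without mitmdump installed; the child script and the variable names are illustrative assumptions, not part of my original script.

```ruby
require "open3"
require "rbconfig"

# Stand-in child program: writes one line to stdout and one to stderr.
child = [RbConfig.ruby, "-e", '$stdout.puts "out"; $stderr.puts "err"']

# IO.popen: one IO object connected to the child's stdout. The spawn
# option below merges the child's stderr into that same stream, which is
# the option-based equivalent of a shell 2>&1 redirect.
popen_out = IO.popen(child, err: [:child, :out], &:read)

# Open3.popen3: separate pipes for stdin, stdout, and stderr, plus a
# thread whose value is the child's Process::Status.
stdout_text, stderr_text, status =
  Open3.popen3(*child) do |stdin, stdout, stderr, wait_thr|
    stdin.close
    [stdout.read, stderr.read, wait_thr.value]
  end

puts popen_out
puts "stdout: #{stdout_text.strip}, stderr: #{stderr_text.strip}, ok: #{status.success?}"
```

Reading stdout to the end before touching stderr is fine for a few bytes of output, but a chatty child could fill the stderr pipe and stall, which is part of why these methods felt like more machinery than I needed.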
Backticks
Backticks are arguably the simplest and best known way to execute an external program or shell script. A backtick expression returns the program's stdout as a string once the subprocess has completed. If needed, you can obtain the exit status code and process id of the program using the $? variable. It is worth noting that the call blocks and you do not have real-time access to a program run in this way. Output is only returned when the process finishes. This made backticks a non-starter because I couldn't run mitmdump concurrently with my watir-webdriver test scripts.
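A small sketch of that blocking behaviour, again using the Ruby interpreter as a stand-in for the external program (the child script and the timing code are assumptions added for illustration):

```ruby
require "rbconfig"
require "shellwords"

# Backtick commands may be passed to a shell, so escape the interpreter path.
ruby = Shellwords.escape(RbConfig.ruby)

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
# The script stops here until the child has finished.
output  = `#{ruby} -e "sleep 0.2; puts 42"`
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

# $? holds the Process::Status of the most recently finished child.
status = $?

puts "output: #{output.strip}"
puts "exit status: #{status.exitstatus}, pid: #{status.pid}"
puts "blocked for about #{elapsed.round(2)}s"
```

By the time the pid is available from $?, the process has already exited, so there is nothing left to send a signal to.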
Kernel#exec & Kernel#system
Kernel#exec replaces the current process with the given external program or shell command. This means that any Ruby code after the call to Kernel#exec will not be executed. This is a very simple and non-interactive invocation that does not hand you the program's stdin, stdout, stderr, or a process id. Kernel#system is similar, but has a few small differences. Unlike Kernel#exec, Kernel#system runs the command in a subprocess and does not replace the current process; it waits for the command to finish and then returns control to your script. Kernel#system also allows you to obtain the exit status code using the $? variable.
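The difference is easy to see in a short sketch. The child scripts and the deliberately bogus command name below are assumptions for illustration; the exec call is run inside a separate Ruby process so that it does not replace the script doing the demonstrating.

```ruby
require "rbconfig"

# system: runs the command in a child, waits, and reports success.
ok        = system(RbConfig.ruby, "-e", "exit 3")  # false on a non-zero exit
exit_code = $?.exitstatus

# system returns nil when the command cannot be executed at all.
missing = system("no-such-command-for-this-demo")

# exec: the calling process is replaced, so the trailing puts never runs.
script      = 'exec(RbConfig.ruby, "-e", "puts :replaced"); puts "never printed"'
exec_output = IO.popen([RbConfig.ruby, "-rrbconfig", "-e", script], &:read)

puts "system returned #{ok.inspect} with exit status #{exit_code}"
puts "missing command returned #{missing.inspect}"
puts "exec child printed: #{exec_output.strip}"
```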
For my purposes, Kernel#exec and Kernel#system seemed like the simplest choices, but both have one obvious drawback: it gets a bit tricky sending a SIGINT to a process when you don't know the process id. So, that led to the final problem:
How do you obtain the pid for a process executed using Kernel#exec or Kernel#system?
The solution is relatively simple, but not something I've seen widely documented online. The trick is to use Process.fork to start a subprocess that then executes the external program or shell script using Kernel#system or Kernel#exec. I opted to use Kernel#exec because it simply takes over the forked subprocess, whereas Kernel#system would launch an additional subprocess with a pid of its own. When given a block, fork returns the subprocess's pid to the parent, so you can assign that pid to a variable and then stop the external program or shell script by sending it a SIGINT using Process.kill. Sample code below:
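# Path where mitmdump writes the captured traffic (example value, use your own)
stream_file = "mitmdump_stream.out"
# Launches mitmdump in a forked subprocess; fork returns the child's pid
pid = fork do
  # Separate arguments bypass the shell, so pid belongs to mitmdump itself
  exec("mitmdump", "-w", stream_file)
end
# Additional Ruby code goes here
# Send SIGINT to stop the mitmdump subprocess
Process.kill("INT", pid)
# Reap the child so it does not linger as a zombie
Process.wait(pid)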
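If you want to try the pattern without mitmdump installed, here is a rough sketch that swaps in the standard sleep command as the long-running child. The stand-in command, the half-second pause, and the variable names are assumptions for illustration, and it needs a platform where fork is available (so not Windows).

```ruby
# Stand-in for mitmdump: a child that would otherwise run for 30 seconds.
pid = fork do
  exec("sleep", "30")
end

sleep 0.5                 # placeholder for the real test run
Process.kill("INT", pid)  # ask the child to stop

# Block until the child has exited and collect its status.
reaped_pid, status = Process.wait2(pid)

puts "child #{reaped_pid} stopped (signaled: #{status.signaled?.inspect})"
```

Process.spawn also returns a pid directly and may be worth a look as an alternative to pairing fork with exec by hand.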