Description
The n2 project is a reimplementation of the ninja build system. As such, it launches many subprocesses. For every subprocess, it spawns a thread that runs std::process::Command::spawn().
On macOS, ninja -j250 runs fine, while n2 -j250 runs out of file descriptors (n2 bug report: evmar/n2#14). It looks like this is due to an FD leak in Rust's standard library.
Command::spawn() in the Rust stdlib unconditionally calls anon_pipe.
anon_pipe on Linux calls pipe2 to set CLOEXEC on the pipe atomically (rust/library/std/src/sys/unix/pipe.rs, line 18 in 5217347).
But macOS has no pipe2, so there the stdlib instead calls pipe() followed by set_cloexec (rust/library/std/src/sys/unix/pipe.rs, line 35 in 5217347).
This means there's a window where the pipe is created but CLOEXEC isn't set on the pipe's FDs yet. If a different thread forks in that window, the child inherits the pipe's FDs without CLOEXEC, so they survive the exec and get leaked into that child process.
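To make the window concrete, here is a minimal sketch of the two-step sequence. It is not the stdlib's code: the helper names (has_cloexec, pipe_then_cloexec) are made up for illustration, the libc functions are declared by hand to avoid an external crate, and the constant values are the ones Linux and macOS share. It only shows that the FDs exist without CLOEXEC between the two calls; it does not reproduce the race itself.

```rust
use std::io;
use std::os::raw::c_int;

// Hand-written libc declarations so the sketch needs no external crate.
// (`unsafe extern` needs Rust 1.82+; older toolchains spell it `extern "C"`.)
unsafe extern "C" {
    fn pipe(fds: *mut c_int) -> c_int;
    fn fcntl(fd: c_int, cmd: c_int, ...) -> c_int;
    fn close(fd: c_int) -> c_int;
}

// Assumed values; these match both Linux and macOS headers.
const F_GETFD: c_int = 1;
const F_SETFD: c_int = 2;
const FD_CLOEXEC: c_int = 1;

/// Reports whether FD_CLOEXEC is currently set on `fd`.
fn has_cloexec(fd: c_int) -> io::Result<bool> {
    let flags = unsafe { fcntl(fd, F_GETFD) };
    if flags == -1 {
        return Err(io::Error::last_os_error());
    }
    Ok(flags & FD_CLOEXEC != 0)
}

/// Sets FD_CLOEXEC on `fd` with a separate fcntl call.
fn set_cloexec(fd: c_int) -> io::Result<()> {
    let flags = unsafe { fcntl(fd, F_GETFD) };
    if flags == -1 {
        return Err(io::Error::last_os_error());
    }
    if unsafe { fcntl(fd, F_SETFD, flags | FD_CLOEXEC) } == -1 {
        return Err(io::Error::last_os_error());
    }
    Ok(())
}

/// Creates a pipe the non-atomic way and returns
/// (any FD had CLOEXEC before set_cloexec, both FDs have it afterwards).
fn pipe_then_cloexec() -> io::Result<(bool, bool)> {
    let mut fds: [c_int; 2] = [0; 2];
    if unsafe { pipe(fds.as_mut_ptr()) } == -1 {
        return Err(io::Error::last_os_error());
    }
    // Window: a fork on another thread right here would hand both FDs
    // to the child, and they would stay open across its exec.
    let before = has_cloexec(fds[0])? || has_cloexec(fds[1])?;
    set_cloexec(fds[0])?;
    set_cloexec(fds[1])?;
    let after = has_cloexec(fds[0])? && has_cloexec(fds[1])?;
    unsafe {
        close(fds[0]);
        close(fds[1]);
    }
    Ok((before, after))
}

fn main() {
    let (before, after) = pipe_then_cloexec().expect("pipe setup failed");
    println!("CLOEXEC set before set_cloexec: {before}, after: {after}");
}
```

With pipe2 and O_CLOEXEC there is no point at which the FDs are observable without the flag, which is why the Linux path is not affected.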
The FD leak went away when putting std::process::Command::spawn() behind a mutex, so it does seem like this race is in fact the cause.
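For reference, the mutex experiment can be sketched roughly as follows. This is an illustration rather than n2's actual change: SPAWN_LOCK, spawn_serialized and run_jobs are hypothetical names, and the `true` command is just a stand-in for a real build step. The lock only covers spawns that go through the wrapper, so it would not help against forks made elsewhere in the process.

```rust
use std::io;
use std::process::{Child, Command};
use std::sync::Mutex;
use std::thread;

// Hypothetical process-wide lock: only one thread at a time may be inside
// Command::spawn(), so no fork can land in another spawn's pipe window.
static SPAWN_LOCK: Mutex<()> = Mutex::new(());

/// Spawns `cmd` while holding the global lock.
fn spawn_serialized(cmd: &mut Command) -> io::Result<Child> {
    // A poisoned lock still provides mutual exclusion, so recover the guard.
    let _guard = SPAWN_LOCK.lock().unwrap_or_else(|e| e.into_inner());
    cmd.spawn()
}

/// Starts `n` threads that each spawn one subprocess and wait for it.
/// Returns how many subprocesses exited successfully.
fn run_jobs(n: usize) -> usize {
    let handles: Vec<_> = (0..n)
        .map(|_| {
            thread::spawn(|| -> io::Result<bool> {
                // The lock is released as soon as spawn returns, so waiting
                // on the children still happens in parallel.
                let mut child = spawn_serialized(&mut Command::new("true"))?;
                Ok(child.wait()?.success())
            })
        })
        .collect();

    handles
        .into_iter()
        .map(|h| h.join())
        .filter(|r| matches!(r, Ok(Ok(true))))
        .count()
}

fn main() {
    println!("{} of 8 jobs succeeded", run_jobs(8));
}
```

Serializing spawns this way costs some parallelism at high -j values, so it is better seen as a way to confirm the diagnosis than as a fix.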
On the n2 issue, @bnoordhuis remarks "Just throwing it out there: libstd uses posix_spawn() under the right conditions (instead of fork + execve) and macOS has a POSIX_SPAWN_CLOEXEC_DEFAULT attribute that does what its name suggests. Teaching libstd about it probably isn't too hard." This might be a possible avenue for a fix on macOS, but it's possible to imagine a program that depends on some FDs staying open, and I don't know if there's a way to make POSIX_SPAWN_CLOEXEC_DEFAULT apply only to the two FDs returned by pipe().