libnative does not handle fds with O_NONBLOCK #13336

Closed
@lilyball

Description

@lilyball
Contributor

libnative does not detect when a read/write fails due to O_NONBLOCK being set on the fd. It assumes that none of its files ever have that flag set, because it never sets the flag on them itself. Unfortunately, that assumption doesn't always hold. FIFOs and character device files (e.g. terminals) share the O_NONBLOCK flag among all processes that hold the same open file description (i.e. the underlying kernel object that backs the fd).
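The sharing described above can be observed inside a single process. The following is a minimal C sketch, assuming a POSIX system; the function name `nonblock_shared_after_dup` is hypothetical and not part of libnative. It sets O_NONBLOCK through one descriptor and checks whether the flag shows up on a dup()'d descriptor that refers to the same open file description. Descriptors inherited by another process refer to the same description in the same way.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper for illustration only.
 * Returns 1 if O_NONBLOCK set through one descriptor is visible through a
 * dup()'d descriptor, 0 if it is not, and -1 on error. */
int nonblock_shared_after_dup(void) {
    int fds[2];
    if (pipe(fds) == -1) return -1;

    int result = -1;
    int alias = dup(fds[1]); /* new fd number, same open file description */
    if (alias != -1) {
        int flags = fcntl(fds[1], F_GETFL);
        if (flags != -1 && fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) != -1) {
            /* Read the status flags back through the other descriptor. */
            int seen = fcntl(alias, F_GETFL);
            if (seen != -1) result = (seen & O_NONBLOCK) != 0;
        }
        close(alias);
    }
    close(fds[0]);
    close(fds[1]);
    return result;
}
```

The flag was never set through `alias`, so a program that only tracks what it did to its own descriptors cannot know the flag is there.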

Using a tiny C program that uses fcntl() to set O_NONBLOCK on its stdout, a FIFO, and a Rust program that writes 32k of output to stdout, I can reproduce this issue 100% of the time. The invocation looks like

> (./mknblock; ./rust_program) > fifo

and on the reading side I just do

> (sleep 1; cat) < fifo

This causes the Rust program to return a "Resource temporarily unavailable" error from stdout.write() after writing 8k of output. Removing the call to ./mknblock restores the expected behavior, where the Rust program blocks until the reading side has started consuming input. Furthermore, switching the Rust program over to libgreen also causes it to block, even with ./mknblock.
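The same failure mode can be reproduced without a FIFO or a second program. The sketch below is an illustration under assumptions, not the reproduction used in this report: it fills an anonymous pipe that has O_NONBLOCK set and that nobody reads from, and reports how many bytes were accepted before write() failed. The function name `bytes_until_eagain` is hypothetical, and the byte count depends on the platform's pipe capacity, so it need not be 8k.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper for illustration only.
 * Writes 1 KiB chunks into a non-blocking pipe with no reader and returns
 * the number of bytes accepted before write() failed with EAGAIN or
 * EWOULDBLOCK. Returns -1 on any other error. */
long bytes_until_eagain(void) {
    int fds[2];
    if (pipe(fds) == -1) return -1;

    int flags = fcntl(fds[1], F_GETFL);
    if (flags == -1 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    char buf[1024];
    memset(buf, 'x', sizeof buf);

    long total = 0;
    long result = -1;
    for (;;) {
        ssize_t n = write(fds[1], buf, sizeof buf);
        if (n >= 0) { total += n; continue; }
        if (errno == EINTR) continue;
        /* This is the error a caller that assumes blocking I/O never expects. */
        if (errno == EAGAIN || errno == EWOULDBLOCK) result = total;
        break;
    }
    close(fds[0]);
    close(fds[1]);
    return result;
}
```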


The C program looks like this:

#include <fcntl.h>
#include <stdio.h>

int main() {
    if (fcntl(1, F_SETFL, O_NONBLOCK) == -1) {
        perror("fcntl");
        return 1;
    }
    return 0;
}

The Rust program is a bit longer, mostly because it prints out information about stdout before it begins writing. It looks like this:

extern crate green;
extern crate rustuv;

use std::io;
use std::io::IoResult;
use std::libc;
use std::os;
use std::mem;

static O_NONBLOCK: libc::c_int = 0x0004;
static O_APPEND: libc::c_int = 0x0008;
static O_ASYNC: libc::c_int = 0x0040;

static F_GETFL: libc::c_int = 3;

unsafe fn print_flags(fd: libc::c_int) -> IoResult<()> {
    let mut stat: libc::stat = mem::uninit();
    if libc::fstat(fd, &mut stat) < 0 {
        try!(writeln!(&mut io::stderr(), "fstat: {}", os::last_os_error()));
        libc::exit(1);
    }

    try!(writeln!(&mut io::stderr(), "stdout: dev={}, ino={}", stat.st_dev, stat.st_ino));

    let flags = libc::fcntl(fd, F_GETFL);
    if flags == -1 {
        try!(writeln!(&mut io::stderr(), "fcntl: {}", os::last_os_error()));
        libc::exit(1);
    }

    let mut v = Vec::new();
    if flags & O_NONBLOCK != 0 {
        v.push("nonblock");
    }
    if flags & O_APPEND != 0 {
        v.push("append");
    }
    if flags & O_ASYNC != 0 {
        v.push("async");
    }

    try!(writeln!(&mut io::stderr(), "flags: {}", v.connect(", ")));
    Ok(())
}

fn run() -> IoResult<()> {
    unsafe { try!(print_flags(1)); }

    let mut out = io::stdio::stdout_raw();
    for i in range(0u, 32) {
        try!(writeln!(&mut io::stderr(), "Writing chunk {}...", i));
        let mut buf = ['x' as u8, ..1024];
        buf[1023] = '\n' as u8;
        match out.write(buf) {
            Ok(()) => (),
            Err(e) => {
                try!(writeln!(&mut io::stderr(), "Error writing chunk {}", i));
                return Err(e);
            }
        }
    }
    Ok(())
}

fn main() {
    match run() {
        Err(e) => {
            (writeln!(&mut io::stderr(), "Error: {}", e)).unwrap();
            os::set_exit_status(1);
        }
        Ok(()) => ()
    }
}

unsafe fn arg_is_dash_g(arg: *u8) -> bool {
    *arg == '-' as u8 &&
        *arg.offset(1) == 'g' as u8 &&
        *arg.offset(2) == 0
}

#[start]
fn start(argc: int, argv: **u8) -> int {
    if argc > 1 && unsafe { arg_is_dash_g(*argv.offset(1)) } {
        green::start(argc, argv, rustuv::event_loop, main)
    } else {
        native::start(argc, argv, main)
    }
}

Activity

lilyball

lilyball commented on Apr 5, 2014

@lilyball
Contributor, Author

The two solutions I can think of for this that seem reasonable are:

  1. When a read/write returns EAGAIN or EWOULDBLOCK, fall back to a call to select().
  2. When a read/write returns EAGAIN or EWOULDBLOCK, use fcntl() to turn off O_NONBLOCK and try again.
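The second option can be sketched in C as follows. This is a minimal illustration assuming a POSIX system, not the code from any actual libnative patch; the helper name `write_all_blocking` is hypothetical. On EAGAIN or EWOULDBLOCK it clears O_NONBLOCK with fcntl() and retries the write, which then blocks as the caller expects.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper for illustration only.
 * Writes all of buf to fd. If the descriptor turns out to be non-blocking,
 * O_NONBLOCK is cleared and the write is retried.
 * Returns 0 on success and -1 on error (errno is left set). */
int write_all_blocking(int fd, const char *buf, size_t len) {
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n >= 0) {
            buf += n;
            len -= (size_t)n;
            continue;
        }
        if (errno == EINTR) continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            /* Clear only O_NONBLOCK and keep the other status flags. */
            int flags = fcntl(fd, F_GETFL);
            if (flags == -1) return -1;
            if (fcntl(fd, F_SETFL, flags & ~O_NONBLOCK) == -1) return -1;
            continue; /* the retried write() now blocks until there is room */
        }
        return -1;
    }
    return 0;
}
```

Because the flag lives on the shared open file description, clearing it here also changes it for every other process that holds the same description.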
huonw

huonw commented on Apr 5, 2014

@huonw
Member
added and removed on Apr 5, 2014
alexcrichton

alexcrichton commented on Apr 5, 2014

@alexcrichton
Member

Surely we can't be the first project/language that has run into this. I would be curious what other languages/runtimes do in the face of this error.

I don't think that libuv handles this test case specifically; I think it just falls out of the general architecture of libuv.

lilyball

lilyball commented on Apr 5, 2014

@lilyball
Contributor, Author

@alexcrichton I suspect other projects/languages either don't have an opinion on O_NONBLOCK and expect the clients of the I/O APIs to deal with it, or they break. I'm curious if any explicitly address this problem.

I have a tentative fix over at kballard/rust/libnative_io_nonblock, although I suspect that hardcoding the values for F_GETFL/F_SETFL/O_NONBLOCK isn't portable and I'd like to find out how to get libc to contain these values.

alexcrichton

alexcrichton commented on Apr 5, 2014

@alexcrichton
Member

Hm, regardless of the values of the flags, I'm not sure that's the best solution. It's not guaranteed that every file descriptor passed to libnative wants blocking reads/writes. In theory you could pass one in and then have a select/epoll loop running somewhere else.

So far this problem seems isolated to only the stdout/stderr file descriptors, so I think those are the only ones that should be modified.

lilyball

lilyball commented on Apr 5, 2014

@lilyball
Contributor, Author

@alexcrichton I don't see how you can restrict it to only stdout/stderr. Those can be dup'd or redirected, so you'd need some way, inside inner_read() or inner_write(), to unambiguously detect that this fd is actually stdout or stderr and modify it appropriately. Modifying it at any other time leaves you open to the fd changing again (e.g. by another process setting the tty device back to O_NONBLOCK). And I don't think you can unambiguously detect this anyway (certainly checking if fd == 1 isn't sufficient, because it could have been dup'd).
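To illustrate why comparing descriptor numbers is not enough, here is a small C sketch, assuming a POSIX system; the helper name `same_file` is hypothetical. It compares the device and inode reported by fstat(), the same two fields the Rust test program prints. A dup()'d descriptor has a different number but reports the same file. Even this check only says that two descriptors name the same file, not that they share an open file description, so it does not give an unambiguous answer either.

```c
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper for illustration only.
 * Returns 1 if both descriptors refer to the same file (same device and
 * inode), 0 if they do not, and -1 if fstat() fails. */
int same_file(int a, int b) {
    struct stat sa, sb;
    if (fstat(a, &sa) == -1 || fstat(b, &sb) == -1) return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```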

The alternative implementation I considered was to fall back to select() instead of turning off O_NONBLOCK, but I thought this was a better solution because in most cases it's only going to be triggered once (assuming the problem exists in the first place), but the select() approach would end up falling back to select() quite often.

If, in Rust code, you want to deal with non-blocking I/O, I think you need to roll your own on top of libc. At least, until such time as someone comes up with a good proposal for non-blocking I/O in libnative/libgreen. But for the moment, clients of libnative (and libgreen) expect their read() and write() calls to be blocking.

Participants: @lilyball, @steveklabnik, @alexcrichton, @huonw, @thestinger
