io::Stdout should use block buffering when appropriate #60673

Open
@BurntSushi

Description

@BurntSushi
Member

I feel like a pretty common pitfall for beginning Rust programmers is to try writing a program that uses println! to print a lot of lines, compare its performance to a similar program written in Python, and be (rightly) baffled at the fact that Python is substantially faster. This occurred most recently here: https://www.reddit.com/r/rust/comments/bl7j7j/hey_rustaceans_got_an_easy_question_ask_here/emx3bhm/

This happens because io::Stdout unconditionally uses line buffering, regardless of whether it's being used interactively (e.g., printing to a console or a tty) or whether it's printing to a file. So if you print a lot of lines, you end up calling the write syscall for every line, which is quite expensive. In contrast, Python uses line buffering when printing interactively, and standard block buffering otherwise. You can see more details on this here and here.
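To make the cost concrete, here is a minimal sketch that counts how often the underlying writer is called under each strategy. `CountingSink` and the helper functions are hypothetical names invented for this illustration; the sink stands in for the real stdout handle, and each call to its `write` method plays the role of one write syscall. The exact counts depend on std's internal buffer size, so none are quoted here.

```rust
use std::io::{self, BufWriter, LineWriter, Write};

/// Hypothetical sink (not part of std) that counts calls to `write`.
/// Each call stands in for one `write` syscall on a real stdout handle.
struct CountingSink {
    writes: usize,
}

impl Write for CountingSink {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.writes += 1;
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Writes `lines` short lines to `out`, then flushes it.
fn emit_lines<W: Write>(out: &mut W, lines: usize) -> io::Result<()> {
    for i in 0..lines {
        writeln!(out, "line {}", i)?;
    }
    out.flush()
}

/// Number of sink writes when every newline forces a flush.
fn writes_line_buffered(lines: usize) -> io::Result<usize> {
    let mut out = LineWriter::new(CountingSink { writes: 0 });
    emit_lines(&mut out, lines)?;
    Ok(out.get_ref().writes)
}

/// Number of sink writes when data is only pushed out once the buffer
/// fills up or is flushed explicitly.
fn writes_block_buffered(lines: usize) -> io::Result<usize> {
    let mut out = BufWriter::new(CountingSink { writes: 0 });
    emit_lines(&mut out, lines)?;
    Ok(out.get_ref().writes)
}

fn main() -> io::Result<()> {
    let n = 1000;
    println!("line-buffered:  {} writes", writes_line_buffered(n)?);
    println!("block-buffered: {} writes", writes_block_buffered(n)?);
    Ok(())
}
```

With line buffering the sink is hit at least once per line, while block buffering batches many lines into each write.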

In my opinion, Rust should adopt the same policy as Python. Indeed, there is even a FIXME item for this in the code:

rust/src/libstd/io/stdio.rs

Lines 401 to 404 in ef01f29

// FIXME: this should be LineWriter or BufWriter depending on the state of
// stdout (tty or not). Note that if this is not line buffered it
// should also flush-on-panic or some form of flush-on-abort.
inner: Arc<ReentrantMutex<RefCell<LineWriter<Maybe<StdoutRaw>>>>>,

I think this would potentially solve a fairly large stumbling block that folks run into. The CLI working group even calls it out as a performance footgun. And also here too. Additionally, ripgrep rolls its own handling for this.

I can't think of too many appreciable downsides to doing this. It is a change in behavior. For example, if you wrote a Rust program today that printed to io::Stdout, and the user redirected the output to a file, then the user could (for example) tail that output and see it updated as each line was printed. If we made io::Stdout use block buffering when printing to a file like this, then that behavior would change. (This is the reasoning for flags like --line-buffered on grep.)
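A program can approximate the proposed policy today in user code. The sketch below assumes Rust 1.70 or later for `std::io::IsTerminal`; `pick_writer` and the `--line-buffered` argument are hypothetical names chosen for this example, not part of std and not taken from ripgrep. It picks line buffering for a tty, block buffering otherwise, and lets the user force line buffering in the spirit of grep's flag.

```rust
use std::io::{self, BufWriter, IsTerminal, LineWriter, Write};

/// Hypothetical helper (not a std API): wraps `inner` in a line-buffered
/// writer when the output is interactive or the caller insists on it,
/// and in a block-buffered writer otherwise.
fn pick_writer<'a, W: Write + 'a>(
    inner: W,
    is_tty: bool,
    force_line_buffered: bool,
) -> Box<dyn Write + 'a> {
    if is_tty || force_line_buffered {
        Box::new(LineWriter::new(inner))
    } else {
        Box::new(BufWriter::new(inner))
    }
}

fn main() -> io::Result<()> {
    let stdout = io::stdout();
    // `is_terminal` comes from the `IsTerminal` trait (Rust 1.70+).
    let is_tty = stdout.is_terminal();
    // Hypothetical flag, playing the role of grep's --line-buffered.
    let force_line_buffered = std::env::args().any(|a| a == "--line-buffered");

    let mut out = pick_writer(stdout.lock(), is_tty, force_line_buffered);
    for i in 0..5 {
        writeln!(out, "line {}", i)?;
    }
    // Flush explicitly so that a write error is reported here instead of
    // being silently dropped when the writer goes out of scope.
    out.flush()
}
```

Note that the extra buffer sits on top of the line buffering that io::Stdout already does internally; it helps because the inner layer then receives large chunks instead of single lines.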

cc @rust-lang/libs

Activity

added T-libs-api (Relevant to the library API team, which will review and decide on the PR/issue.) on May 9, 2019
BurntSushi

BurntSushi commented on May 9, 2019

@BurntSushi
Member, Author

cc @killercup @kbknapp as other folks that might have opinions here.

sfackler

sfackler commented on May 9, 2019

@sfackler
Member

If we're worried about regressing people that are depending on it being line buffered, we could minimally have methods on Stdout/Stderr to switch it between line and block buffering.
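As a rough illustration of what switching between line and block buffering could look like, here is a user-space sketch. No such methods exist on Stdout/Stderr at the time of this comment; `BufferMode`, `SwitchableWriter`, and `set_mode` are hypothetical names made up for this example, and the line mode is simplified compared with std's `LineWriter`.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical buffering modes; not part of std.
#[derive(Clone, Copy, Debug, PartialEq)]
enum BufferMode {
    Line,
    Block,
}

/// Hypothetical writer whose buffering mode can be changed at run time.
struct SwitchableWriter<W: Write> {
    inner: BufWriter<W>,
    mode: BufferMode,
}

impl<W: Write> SwitchableWriter<W> {
    fn new(inner: W, mode: BufferMode) -> Self {
        SwitchableWriter { inner: BufWriter::new(inner), mode }
    }

    /// Flushes pending data first, so that switching modes never leaves
    /// output from the old mode stuck in the buffer.
    fn set_mode(&mut self, mode: BufferMode) -> io::Result<()> {
        self.inner.flush()?;
        self.mode = mode;
        Ok(())
    }

    /// Read access to the wrapped writer, mainly useful for inspection.
    fn get_ref(&self) -> &W {
        self.inner.get_ref()
    }
}

impl<W: Write> Write for SwitchableWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = self.inner.write(buf)?;
        // Simplification: in line mode, flush everything as soon as the
        // accepted bytes contain a newline. std's LineWriter is more
        // careful: it keeps the partial line after the last newline
        // buffered and handles flush errors without losing track of
        // which bytes were accepted.
        if self.mode == BufferMode::Line && buf[..n].contains(&b'\n') {
            self.inner.flush()?;
        }
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

fn main() -> io::Result<()> {
    let mut out = SwitchableWriter::new(io::stdout(), BufferMode::Block);
    writeln!(out, "buffered until flush or mode switch")?;
    out.set_mode(BufferMode::Line)?;
    writeln!(out, "flushed at the newline")?;
    out.flush()
}
```

The design choice here is to keep a single `BufWriter` and vary only the flush policy, which makes switching cheap and avoids moving the inner writer between wrappers.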

alexcrichton

alexcrichton commented on May 9, 2019

@alexcrichton
Member

FWIW I personally continue to feel that we can do this at any time (change libstd's buffering strategy on non-TTY stdout/stderr streams) and I agree with @sfackler that if breakage arises we can work around it with methods and such.

kbknapp

kbknapp commented on May 9, 2019

@kbknapp

I would be very much in favor of at least the minimal route of giving Stdout/Stderr the option to switch between line and block buffering.

Another slightly less minimalist approach is to use block buffering by default on print!("..") and to prominently display the characteristics of both macros in the docs. The downside is that changing println!("..") calls to print!("..\n") is a multi-cursor movement. A different approach is to add a Pythonesque opt-in version of println!, for example println!(buf=true, "..") (ignoring the exact syntax, which is purely illustrative), which I believe could be done in a backwards-compatible way and isn't a multi-cursor movement.

In general I'd like to switch wholesale, as it's one of the very common footguns I see.

Lonami

Lonami commented on May 11, 2019

@Lonami
Contributor

If we're worried about regressing people that are depending on it being line buffered […]

I wouldn't worry about this unless the documentation explicitly states the current behaviour (e.g. always line-buffered). If it's not documented, it's like relying on implementation details (which are subject to change).

BurntSushi

BurntSushi commented on May 12, 2019

@BurntSushi
Member, Author

I don't think we specify the behavior. But even if we don't, and we want to make this change (it sounds like folks agree we should), we should go into it while being considerate of behavioral changes to existing code. The letter of the law is important, but so is the spirit.

canadaduane

canadaduane commented on Jul 20, 2019

@canadaduane

Just to document further agreement with @BurntSushi that this is a common pitfall--here I am, a new user, doing it today, and asking around for help :)

https://users.rust-lang.org/t/why-is-this-rust-loop-3x-slower-when-writing-to-disk/30489



Labels: A-io (Area: `std::io`, `std::fs`, `std::net` and `std::path`), C-enhancement (Category: An issue proposing an enhancement or a PR with one.), T-libs-api (Relevant to the library API team, which will review and decide on the PR/issue.)
