Better support for byte ordered reads and writes? #578

@yoshuawuyts
Contributor

Something that came up today was the question of how to read and write bytes with a certain endianness. @goto-bus-stop replied in chat with the following:

let mut bytes = [0; 4];
input.read_exact(&mut bytes)?;
let num = u32::from_le_bytes(bytes);

However they also pointed out that using byteorder one could do:

let num = input.read_u32::<LE>()?;

Which seems quite nice. A port of this functionality exists for Tokio in the form of tokio-byteorder, with support for futures::io::{AsyncRead, AsyncWrite} currently in the works.
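
For reference, the manual approach can be wrapped in a small helper today. This is a minimal sketch against the synchronous std::io::Read trait; the name read_u32_le is a hypothetical helper, not something std or byteorder provides under that name:

```rust
use std::io::{self, Cursor, Read};

/// Reads exactly four bytes and interprets them as a little-endian u32.
/// `read_u32_le` is a hypothetical helper name used for illustration only.
fn read_u32_le<R: Read>(input: &mut R) -> io::Result<u32> {
    let mut bytes = [0u8; 4];
    // `read_exact` returns an error if fewer than four bytes remain.
    input.read_exact(&mut bytes)?;
    Ok(u32::from_le_bytes(bytes))
}

fn main() -> io::Result<()> {
    // Least significant byte first: 0x78 0x56 0x34 0x12 decodes to 0x12345678.
    let mut input = Cursor::new(vec![0x78, 0x56, 0x34, 0x12]);
    assert_eq!(read_u32_le(&mut input)?, 0x1234_5678);
    Ok(())
}
```

This works, but every integer width and byte order needs its own copy of the buffer-and-convert dance, which is the ergonomics gap discussed here.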

Design questions

People can already read and write bytes with a certain endianness without any issues; the hard parts are taken care of. However, it doesn't quite feel ergonomic yet. So what I'm wondering is whether we could improve the status quo by providing support for this out of the box.

Writing bytes is fortunately already a one-liner:

use async_std::io::{self, prelude::*};

#[async_std::main]
async fn main() -> io::Result<()> {
    let mut stdout = io::stdout();
    stdout.write_all(&12_u16.to_le_bytes()).await?;
    Ok(())
}
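
To make the byte order visible, here is a small synchronous sketch using std::io::Write with a Vec<u8> as the sink. The helper name write_u16_both is made up for this example; only to_le_bytes, to_be_bytes, and write_all are real std APIs:

```rust
use std::io::{self, Write};

/// Writes `n` twice: first as little-endian bytes, then as big-endian bytes.
/// `write_u16_both` is a hypothetical helper for illustration only.
fn write_u16_both<W: Write>(out: &mut W, n: u16) -> io::Result<()> {
    out.write_all(&n.to_le_bytes())?;
    out.write_all(&n.to_be_bytes())?;
    Ok(())
}

fn main() -> io::Result<()> {
    let mut buf = Vec::new();
    write_u16_both(&mut buf, 12)?;
    // Little-endian puts the low byte first, big-endian puts it last.
    assert_eq!(buf, [12, 0, 0, 12]);
    Ok(())
}
```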

Byteorder inspired

Reading bytes isn't a one-liner yet, though. We could probably do better here, and I see a few options. The first is to follow byteorder's lead and add 16 methods on the Read trait, two Endianness enums, and a NativeEndian type alias:

// Proposed API: `BigEndian` and `read_u16` do not exist in async_std::io today.
use async_std::io::{self, Cursor, prelude::*, BigEndian};

#[async_std::main]
async fn main() -> io::Result<()> {
    let mut reader = Cursor::new(vec![2, 5, 3, 0]);
    assert_eq!(517, reader.read_u16::<BigEndian>().await?);
    assert_eq!(768, reader.read_u16::<BigEndian>().await?);
    Ok(())
}
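
A rough synchronous sketch of how this shape could be wired up, assuming an extension trait over std::io::Read. The names ByteOrder, BigEndian, LittleEndian, and ReadBytesExt mirror byteorder's naming but are defined locally here; this is not byteorder's actual implementation, and only u16 is covered:

```rust
use std::io::{self, Cursor, Read};

/// Hypothetical marker trait that selects a byte order at the type level.
trait ByteOrder {
    fn u16_from(bytes: [u8; 2]) -> u16;
}

/// Uninhabited marker types: they only ever appear as type parameters.
enum BigEndian {}
enum LittleEndian {}

impl ByteOrder for BigEndian {
    fn u16_from(bytes: [u8; 2]) -> u16 {
        u16::from_be_bytes(bytes)
    }
}

impl ByteOrder for LittleEndian {
    fn u16_from(bytes: [u8; 2]) -> u16 {
        u16::from_le_bytes(bytes)
    }
}

/// Hypothetical extension trait; a full version would have one method per number type.
trait ReadBytesExt: Read {
    fn read_u16<B: ByteOrder>(&mut self) -> io::Result<u16> {
        let mut bytes = [0u8; 2];
        self.read_exact(&mut bytes)?;
        Ok(B::u16_from(bytes))
    }
}

// Blanket impl so every reader gets the method.
impl<R: Read + ?Sized> ReadBytesExt for R {}

fn main() -> io::Result<()> {
    let mut reader = Cursor::new(vec![2, 5, 3, 0]);
    assert_eq!(517, reader.read_u16::<BigEndian>()?);
    assert_eq!(768, reader.read_u16::<BigEndian>()?);
    Ok(())
}
```

The byte order travels as a type parameter, which keeps the method count at one per number type but requires the turbofish at every call site.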

std inspired

Another option seems to be to add 48 new methods on the Read trait (3 endiannesses * 16 number types), and try to follow std's naming conventions more closely:

// Proposed API: `read_u16_be` does not exist in async_std::io today.
use async_std::io::{self, Cursor, prelude::*};

#[async_std::main]
async fn main() -> io::Result<()> {
    let mut reader = Cursor::new(vec![2, 5, 3, 0]);
    assert_eq!(517, reader.read_u16_be().await?);
    assert_eq!(768, reader.read_u16_be().await?);
    Ok(())
}
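
Writing 48 methods by hand would be tedious, so an implementation would likely generate them. A minimal synchronous sketch, assuming a macro over std::io::Read; the macro name read_methods, the trait name ReadNumExt, and the method names are all hypothetical, and only two number types are listed:

```rust
use std::io::{self, Cursor, Read};

// Generates one big-endian, one little-endian, and one native-endian method per type.
macro_rules! read_methods {
    ($($ty:ident: $be:ident, $le:ident, $ne:ident;)*) => {
        trait ReadNumExt: Read {
            $(
                fn $be(&mut self) -> io::Result<$ty> {
                    let mut bytes = [0u8; std::mem::size_of::<$ty>()];
                    self.read_exact(&mut bytes)?;
                    Ok(<$ty>::from_be_bytes(bytes))
                }

                fn $le(&mut self) -> io::Result<$ty> {
                    let mut bytes = [0u8; std::mem::size_of::<$ty>()];
                    self.read_exact(&mut bytes)?;
                    Ok(<$ty>::from_le_bytes(bytes))
                }

                fn $ne(&mut self) -> io::Result<$ty> {
                    let mut bytes = [0u8; std::mem::size_of::<$ty>()];
                    self.read_exact(&mut bytes)?;
                    Ok(<$ty>::from_ne_bytes(bytes))
                }
            )*
        }

        // Blanket impl so every reader gets the generated methods.
        impl<R: Read + ?Sized> ReadNumExt for R {}
    };
}

read_methods! {
    u16: read_u16_be, read_u16_le, read_u16_ne;
    u32: read_u32_be, read_u32_le, read_u32_ne;
}

fn main() -> io::Result<()> {
    let mut reader = Cursor::new(vec![2, 5, 3, 0]);
    assert_eq!(517, reader.read_u16_be()?);
    assert_eq!(768, reader.read_u16_be()?);
    Ok(())
}
```

Call sites need no type parameters, at the cost of a much larger method list in the docs.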

Using traits

The third option, and I have no idea if this works (we should test this), is to add two new methods on the Read trait, plus a trait that we implement for all number types so the methods can be generic over them and know how to decode them:

// Proposed API: `read_be_bytes` does not exist in async_std::io today.
use async_std::io::{self, Cursor, prelude::*};

#[async_std::main]
async fn main() -> io::Result<()> {
    let mut reader = Cursor::new(vec![2, 5, 3, 0]);
    assert_eq!(517_u16, reader.read_be_bytes().await?);
    assert_eq!(768_u16, reader.read_be_bytes().await?);
    Ok(())
}

This last approach is somewhat iffy because the new trait would show up in the function signature, which means we'd have to expose it (but wouldn't want people to implement it). We could make it a sealed trait, but I'm not a fan of doing that.

It seems like https://internals.rust-lang.org/t/pre-rfc-safe-transmute/11347 might be proposing a trait that could potentially cover this, but I'm unsure about the exact implications and relation to this. Maybe we should bring it up?

If we could find a way to make this work, this would definitely be my preferred option, as it's easy to add a counterpart to Write as well (creating symmetry, and an even smaller one-liner). But that's a big if, because there seem to be quite a few hurdles.
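
To get a feel for whether the trait-based option is workable, here is a minimal synchronous sketch over std::io::{Read, Write} using the sealed-trait variant mentioned above. Everything here is an assumption for illustration: the trait names Num, ReadBytesExt, and WriteBytesExt, the sealed module, and the restriction to big-endian and a handful of integer types. An async version would need to return futures instead:

```rust
use std::io::{self, Cursor, Read, Write};

mod sealed {
    // Not nameable outside this crate, so `Num` cannot be implemented downstream.
    pub trait Sealed {}
}

/// Hypothetical trait implemented for the number types we can encode and decode.
pub trait Num: sealed::Sealed + Sized {
    fn read_be<R: Read + ?Sized>(reader: &mut R) -> io::Result<Self>;
    fn write_be<W: Write + ?Sized>(self, writer: &mut W) -> io::Result<()>;
}

macro_rules! impl_num {
    ($($ty:ty),*) => {$(
        impl sealed::Sealed for $ty {}

        impl Num for $ty {
            fn read_be<R: Read + ?Sized>(reader: &mut R) -> io::Result<Self> {
                let mut bytes = [0u8; std::mem::size_of::<$ty>()];
                reader.read_exact(&mut bytes)?;
                Ok(<$ty>::from_be_bytes(bytes))
            }

            fn write_be<W: Write + ?Sized>(self, writer: &mut W) -> io::Result<()> {
                writer.write_all(&self.to_be_bytes())
            }
        }
    )*};
}

impl_num!(u8, u16, u32, u64, i8, i16, i32, i64);

/// One generic read method instead of one method per number type.
pub trait ReadBytesExt: Read {
    fn read_be_bytes<T: Num>(&mut self) -> io::Result<T> {
        T::read_be(self)
    }
}
impl<R: Read + ?Sized> ReadBytesExt for R {}

/// The symmetric counterpart on the write side.
pub trait WriteBytesExt: Write {
    fn write_be_bytes<T: Num>(&mut self, num: T) -> io::Result<()> {
        num.write_be(self)
    }
}
impl<W: Write + ?Sized> WriteBytesExt for W {}

fn main() -> io::Result<()> {
    let mut reader = Cursor::new(vec![2, 5, 3, 0]);
    // The target type drives which decoder runs.
    let first: u16 = reader.read_be_bytes()?;
    let second: u16 = reader.read_be_bytes()?;
    assert_eq!((first, second), (517, 768));

    let mut buf = Vec::new();
    buf.write_be_bytes(12_u16)?;
    assert_eq!(buf, [0, 12]);
    Ok(())
}
```

In this sketch the Num trait does appear in the public method signatures, which is exactly the exposure concern raised above; sealing it only prevents outside implementations.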

Conclusion

I've talked about the current state of reading and writing bytes from async_std::io::{Read, Write}, and explored possible directions to improve this.

This is not something we need to solve immediately, but if we can figure it out, it will make writing certain programs easier for sure. Thanks!

Activity

tekjar commented on Nov 23, 2019

jonhoo/tokio-byteorder#2

EDIT: Oops, sorry. Didn't notice that you've already pointed this out.

yoshuawuyts commented on Dec 7, 2019

From https://internals.rust-lang.org/t/pre-rfc-v2-safe-transmute/11431:

Transmute deals with in-memory data in-place, and thus does not have any provisions to perform translations between native endianness and non-native endianness.

So the traits from the transmute proposal won't work for us.

sdroege commented on Dec 7, 2019

The latest tokio (0.2.3) has support for reading/writing integers. They chose read_u32, which is always network byte order (aka big endian), and there are no little-endian variants.

yoshuawuyts commented on Dec 8, 2019

Implemented the trait-based design for std's Read / Write types in https://docs.rs/omnom:

use std::io::{Cursor, Seek, SeekFrom};
use omnom::prelude::*;

let mut buf = Cursor::new(vec![0; 15]);

// Write this u16 as little-endian bytes.
let num = 12_u16;
buf.write_le_bytes(num).unwrap();

buf.seek(SeekFrom::Start(0)).unwrap();

// Read a u16 from little-endian bytes.
let num: u16 = buf.read_le_bytes().unwrap();
assert_eq!(num, 12);

This feels like the right choice; very similar to std. Should be trivially portable to async-std as well.

yoshuawuyts commented on Feb 17, 2020

Oh also for the record: I wrote about this topic in long-form a while ago: https://blog.yoshuawuyts.com/byte-ordered-stream-parsing/

halvors commented on Dec 20, 2022

Is there any progress on this? Is there anything I can use that has similar support to the byteorder crate?

Labels: api design (Open design questions)
Participants: @sdroege, @halvors, @yoshuawuyts, @tekjar

Issue #578 · async-rs/async-std