Skip to content

Many is_ascii_ methods are slower than they have to be #65127

Closed
@ghost

Description

I noticed this is_ascii check:

self.is_ascii() && (*self as u8).is_ascii_uppercase()
and wondered why it's there.
Then someone told me it's there because of the as u8. If you pass a non-ASCII char, the as u8 would corrupt that char. That's why the is_ascii check is there before it. But then I wondered: why isn't this just cast to a bigger data type that includes all Unicode code points? This method (and others) are slower than they have to be.
Also see this discussion in Discord with more valuable information: https://discordapp.com/channels/442252698964721669/443150878111694848/629982972052897792.

Activity

transferred this issue fromrust-lang/rfcson Oct 5, 2019
SimonSapin

SimonSapin commented on Oct 5, 2019

@SimonSapin
Contributor

(I can’t seem to find the relevant part in the middle of this discord stream…)

I’m not sure what you mean by “corrupt”. some_char as u8 is well defined: it takes the numerical value of the code point (same as some_char as u32), then truncates to keep the lower 8 bits. 'A' as u8 equals 0x41, but so does '🁁' as u8. We want to reject the latter since U+1F041 🁁 is not an ASCII upper-case letter. If char::is_ascii_uppercase is implemented in terms of u8::is_ascii_uppercase, the ASCII range check is necessary. (It could be a u8 range check to, for example with u8::try_from(u32).)

But maybe char::is_ascii_uppercase doesn’t need to be implemented in terms of u8::is_ascii_uppercase. Maybe it could duplicate the logic instead.

Once upon a time u8::is_ascii_uppercase was implemented with a lookup table of 256 entries. A flat lookup table for all of Unicode would be way too big. But these days it’s based on matching a u8 range pattern. Doing the same with a char range pattern could also work.

That said, would it actually be faster? Maybe the optimizer already does its job well. If you want to work on this, consider adding some benchmark to show that code duplication is worth it.

ghost

ghost commented on Oct 5, 2019

@ghost

Oh it seems the Discord link is a bit buggy. Try searching for Why is there this self.is_ascii() check? in the search bar on the Discord server. Then you should get to my message with the discussion below it. One guy also posted this comparison of the instructions generated: https://godbolt.org/z/RBADFT.

BurntSushi

BurntSushi commented on Oct 5, 2019

@BurntSushi
Member

I think the minimal standard here should be a benchmark before we change the implementation. Can you provide that?

ghost

ghost commented on Oct 5, 2019

@ghost

I'm not that experienced with Rust yet, sorry. I just wanted to point out this possible performance regression.
Shouldn't the instruction amount difference as shown on Godbolt suffice to know that it's slower?

cmp     edi, 128
jae     .LBB0_1
add     dil, -65
mov     al, 1
cmp     dil, 25
ja      .LBB0_1
ret

vs.

add     edi, -65
cmp     edi, 26
setb    al
ret
anirudhb

anirudhb commented on Oct 6, 2019

@anirudhb
Contributor

Looking at the assembly, it seems that the problem is the is_ascii check.

These three functions produce identical assembly:

pub fn foo(c: char) -> bool {
    match c as u8 {
        65..=90 => true,
        _ => false
    }
}

pub fn bar(c: char) -> bool {
    match c {
        'A'..='Z' => true,
        _ => false,
    }
}

pub fn baz(c: char) -> bool {
    (c as u8).is_ascii_uppercase()
}

So, it seems that the is_ascii check is unnecessary.

I could make a PR for this.

anirudhb

anirudhb commented on Oct 6, 2019

@anirudhb
Contributor

Here are some benchmarks:

#![feature(test)]

extern crate test;
use test::Bencher;

#[inline(always)]
fn custom_ascii_uppercase_without_ascii_check(c: char) -> bool {
    match c {
        'A'..='Z' => true,
        _ => false,
    }
}

#[inline(always)]
fn custom_ascii_uppercase_with_ascii_check_builtin_is_ascii(c: char) -> bool {
    c.is_ascii() && match c {
        'A'..='Z' => true,
        _ => false,
    }
}

#[inline(always)]
fn custom_ascii_uppercase_with_ascii_check_custom_is_ascii_convert_to_u8(c: char) -> bool {
    c as u8 <= 0x7fu8 && match c {
        'A'..='Z' => true,
        _ => false,
    }
}

#[inline(always)]
fn custom_ascii_uppercase_with_ascii_check_custom_is_ascii_convert_to_u32(c: char) -> bool {
    c as u32 <= 0x7fu32 && match c {
        'A'..='Z' => true,
        _ => false,
    }
}

#[bench]
fn bench_custom_ascii_uppercase_without_ascii_check(b: &mut Bencher) {
    let x = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    b.iter(|| {
        for c in x.chars() {
            test::black_box(custom_ascii_uppercase_without_ascii_check(c));
        }
    });
}

#[bench]
fn bench_custom_ascii_uppercase_with_ascii_check_builtin_is_ascii(b: &mut Bencher) {
    let x = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    b.iter(|| {
        for c in x.chars() {
            test::black_box(custom_ascii_uppercase_with_ascii_check_builtin_is_ascii(c));
        }
    });
}

#[bench]
fn bench_custom_ascii_uppercase_with_ascii_check_custom_is_ascii_convert_to_u8(b: &mut Bencher) {
    let x = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    b.iter(|| {
        for c in x.chars() {
            test::black_box(custom_ascii_uppercase_with_ascii_check_custom_is_ascii_convert_to_u8(c));
        }
    });
}

#[bench]
fn bench_custom_ascii_uppercase_with_ascii_check_custom_is_ascii_convert_to_u32(b: &mut Bencher) {
    let x = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    b.iter(|| {
        for c in x.chars() {
            test::black_box(custom_ascii_uppercase_with_ascii_check_custom_is_ascii_convert_to_u32(c));
        }
    });
}

#[bench]
fn bench_builtin_ascii_uppercase(b: &mut Bencher) {
    let x = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    b.iter(|| {
        for c in x.chars() {
            test::black_box(c.is_ascii_uppercase());
        }
    });
}
$ cargo bench
    Finished release [optimized] target(s) in 0.00s
     Running target/release/deps/_14178-f02baafefc742e50

running 5 tests
test bench_builtin_ascii_uppercase                                                ... bench:          79 ns/iter (+/- 11)
test bench_custom_ascii_uppercase_with_ascii_check_builtin_is_ascii               ... bench:          76 ns/iter (+/- 13)
test bench_custom_ascii_uppercase_with_ascii_check_custom_is_ascii_convert_to_u32 ... bench:          78 ns/iter (+/- 8)
test bench_custom_ascii_uppercase_with_ascii_check_custom_is_ascii_convert_to_u8  ... bench:          80 ns/iter (+/- 24)
test bench_custom_ascii_uppercase_without_ascii_check                             ... bench:          58 ns/iter (+/- 9)

test result: ok. 0 passed; 0 failed; 0 ignored; 5 measured; 0 filtered out

So it seems that removing the is_ascii check has a measurable performance difference.

SimonSapin

SimonSapin commented on Oct 7, 2019

@SimonSapin
Contributor

custom_ascii_uppercase_with_ascii_check_custom_is_ascii_convert_to_u8 is incorrect. The conversion should happen not for the ASCII check but for the match, and the pattern use byte literals like b'A'. (Though I expect this won’t affect results much.)

added
C-enhancementCategory: An issue proposing an enhancement or a PR with one.
I-slowIssue: Problems and improvements with respect to performance of generated code.
T-libs-apiRelevant to the library API team, which will review and decide on the PR/issue.
on Oct 9, 2019
added a commit that references this issue on Feb 12, 2020

Rollup merge of rust-lang#67585 - ranma42:fix/char-is-ascii-codegen, …

79ebf53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    C-enhancementCategory: An issue proposing an enhancement or a PR with one.I-slowIssue: Problems and improvements with respect to performance of generated code.T-libs-apiRelevant to the library API team, which will review and decide on the PR/issue.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      Participants

      @SimonSapin@BurntSushi@Alexendoo@anirudhb

      Issue actions

        Many is_ascii_ methods are slower than they have to be · Issue #65127 · rust-lang/rust