Closed
Description
Hello,
Due to the optimization introduced with #97046, the to_lowercase conversion of some str values containing the Σ char is incorrect.
Simple example:
fn main() {
    println!("{}", "aΣ".to_lowercase());
    println!("{}", "abcdefghijklmnopΣ".to_lowercase());
}
Output:
aς
abcdefghijklmnopσ
The first conversion is correct, while the second is not: it should be abcdefghijklmnopς.
More generally, this happens when 'Σ' is preceded by exactly 2 * USIZE_SIZE * K chars, where USIZE_SIZE is the size of usize in bytes and K is a positive integer:
use std::mem;

const USIZE_SIZE_BY_2: usize = 2 * mem::size_of::<usize>();
const K: usize = 1;

fn main() {
    let bug_string = "a".repeat(USIZE_SIZE_BY_2 * K) + "Σ";
    println!("{}", bug_string.to_lowercase());
}
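To check the claim about prefix lengths on a given toolchain, a small sweep like the following may help. It is only a sketch: the helper `bad_prefix_lengths` and the constant `CHUNK` are names made up for this example, and the sweep only covers prefixes made of the ASCII letter 'a'.

```rust
use std::mem;

/// Chunk width from the report: two machine words (hypothetical name).
const CHUNK: usize = 2 * mem::size_of::<usize>();

/// Returns every prefix length n in 1..=max_len for which
/// "a" repeated n times followed by 'Σ' does NOT lowercase to
/// the same prefix followed by the final sigma 'ς'.
fn bad_prefix_lengths(max_len: usize) -> Vec<usize> {
    (1..=max_len)
        .filter(|&n| {
            let input = "a".repeat(n) + "Σ";
            let expected = "a".repeat(n) + "ς";
            input.to_lowercase() != expected
        })
        .collect()
}

fn main() {
    let bad = bad_prefix_lengths(4 * CHUNK);
    if bad.is_empty() {
        println!("no mismatches found up to {} chars", 4 * CHUNK);
    } else {
        println!("mismatching prefix lengths: {:?} (chunk = {})", bad, CHUNK);
    }
}
```

If the report is accurate, an affected toolchain should list only multiples of `CHUNK`, and a fixed toolchain should list nothing.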
Activity
Marcondiro commented on May 4, 2024
@rustbot claim
jhorstmann commented on May 5, 2024
@Marcondiro I tried optimizing to_lowercase in #123778. In that PR the whole ASCII prefix is converted by an optimized loop, which probably makes the issue easier to trigger and maybe also easier to fix. Let me know if you'd like me to look into this.
Fix handling of upper-case sigma (rust-lang#124714)
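The remark about converting the ASCII prefix in a separate loop hints at one way such a bug can arise: if the final-sigma check only sees the part of the string left over after the fast path, a Σ at the very start of that remainder appears to have no preceding letter. The toy model below illustrates that idea only. It is an assumption about the mechanism, not the standard library's code; `toy_lowercase`, `is_final_sigma`, and `CHUNK` are hypothetical names, and the final-sigma test is a simplification that ignores Unicode's case-ignorable characters.

```rust
use std::mem;

/// Chunk width of the hypothetical fast path: two machine words.
const CHUNK: usize = 2 * mem::size_of::<usize>();

/// Simplified final-sigma test: the Σ at byte offset `i` of `context`
/// is final when a letter precedes it and no letter follows it.
fn is_final_sigma(context: &str, i: usize) -> bool {
    let before = context[..i].chars().next_back();
    let after = context[i + 'Σ'.len_utf8()..].chars().next();
    before.map_or(false, char::is_alphabetic) && !after.map_or(false, char::is_alphabetic)
}

/// Toy lowercase: whole pure-ASCII chunks take a fast path, the rest is
/// handled char by char. `keep_context` selects whether the sigma check
/// sees the full input or only the remainder after the fast path.
fn toy_lowercase(s: &str, keep_context: bool) -> String {
    // Fast path: advance over leading chunks that are entirely ASCII.
    let mut split = 0;
    while split + CHUNK <= s.len() && s.as_bytes()[split..split + CHUNK].is_ascii() {
        split += CHUNK;
    }
    let mut out = s[..split].to_ascii_lowercase();

    // Slow path: per-char conversion with special handling for Σ.
    let rest = &s[split..];
    for (i, c) in rest.char_indices() {
        if c == 'Σ' {
            let is_final = if keep_context {
                is_final_sigma(s, split + i) // context includes the fast-path prefix
            } else {
                is_final_sigma(rest, i) // context lost: the prefix is invisible
            };
            out.push(if is_final { 'ς' } else { 'σ' });
        } else {
            out.extend(c.to_lowercase());
        }
    }
    out
}

fn main() {
    let s = "a".repeat(CHUNK) + "Σ";
    println!("{}", toy_lowercase(&s, false)); // ends in σ: context was lost
    println!("{}", toy_lowercase(&s, true)); // ends in ς: context was kept
}
```

In this model the wrong result appears only when the Σ sits exactly on a chunk boundary, which matches the pattern described in the report.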
fix rust-lang#124714 str.to_lowercase sigma handling