Closed
Description
enum NonAscii {
    Abcd,
    Éfgh,
}

use NonAscii::*;

fn f(x: NonAscii) -> bool {
    match x {
        Éfgh => true,
        _ => false,
    }
}

fn main() {
    dbg!(f(Abcd));
}
Rustfmt on Playground (1.4.37-nightly (2021-06-12 24bdc6d)) fails to format the above code:
thread 'main' panicked at 'byte index 109 is not a char boundary; it is inside 'É' (bytes 108..110) of `enum NonAscii {
    Abcd,
    Éfgh,
}

use NonAscii::*;

fn f(x: NonAscii) -> bool {
    match x {
        Éfgh => true,
        _ => false,
    }
}

fn main() {
    dbg!(f(Abcd));
}
`', src/tools/rustfmt/src/visitor.rs:47:15
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
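The byte positions in the panic message can be checked outside rustfmt. The sketch below rebuilds the reported source, assuming 4-space indentation and blank lines between items (an assumption, though it is the layout that puts the second 'É' at bytes 108..110 as the message states), and asks the standard library whether the following byte is a char boundary. `SRC` and `last_e_acute` are hypothetical names used only for this illustration.

```rust
// The reported source, assuming 4-space indentation and blank lines between
// items (an assumption about the original layout).
const SRC: &str = concat!(
    "enum NonAscii {\n",
    "    Abcd,\n",
    "    \u{C9}fgh,\n", // "Éfgh"
    "}\n",
    "\n",
    "use NonAscii::*;\n",
    "\n",
    "fn f(x: NonAscii) -> bool {\n",
    "    match x {\n",
    "        \u{C9}fgh => true,\n", // the match arm pattern "Éfgh"
    "        _ => false,\n",
    "    }\n",
    "}\n",
);

/// Byte offset of the last 'É' in `src` (hypothetical helper, not rustfmt code).
fn last_e_acute(src: &str) -> Option<usize> {
    src.rfind('\u{C9}')
}

fn main() {
    let lo = last_e_acute(SRC).expect("source contains 'É'");
    // 'É' (U+00C9) takes two bytes in UTF-8, so `lo + 1` points into its middle.
    println!("'É' starts at byte {} and is {} bytes long", lo, '\u{C9}'.len_utf8());
    println!(
        "is byte {} a char boundary? {}",
        lo + 1,
        SRC.is_char_boundary(lo + 1)
    );
}
```

Under that layout, slicing the source at the start of the pattern plus one byte is exactly the operation that a `str` index rejects with this panic.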
Activity
NichtsHsu commented on Jun 18, 2021
@crlf0710
calebcartwright commented on Jun 21, 2021
Haven't had a chance to check, but I assume there are no issues with the pattern span, and that the most likely source is where we're trying to figure out whether the pattern has a leading pipe and the corresponding byte position:
rustfmt/src/matches.rs
Lines 162 to 174 in 6495024
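The referenced lines are not reproduced here, so the following is only an illustration of the suspected failure mode, not the rustfmt code itself. It assumes the check looks at a one-byte slice starting at the pattern's low position; `one_byte_at` is a hypothetical helper. Indexing with `&src[lo..lo + 1]` panics when `lo + 1` is not a char boundary, while `str::get` returns `None` for the same range.

```rust
/// Returns the one-byte slice starting at `lo`, or `None` when the range does
/// not fall on char boundaries (hypothetical helper, not part of rustfmt).
fn one_byte_at(src: &str, lo: usize) -> Option<&str> {
    src.get(lo..lo + 1)
}

fn main() {
    // A pattern that starts with an ASCII pipe: one byte is one character.
    let piped = "| Abcd";
    assert_eq!(one_byte_at(piped, 0), Some("|"));

    // A pattern that starts with 'É' (U+00C9, two bytes in UTF-8):
    // `&arm[0..1]` would panic here, whereas `get` reports the bad range.
    let arm = "\u{C9}fgh => true,";
    assert!(!arm.is_char_boundary(1));
    assert_eq!(one_byte_at(arm, 0), None);

    println!("ok");
}
```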
crlf0710 commented on Jun 21, 2021
Basically the mk_sp_lo_plus_one function is incorrect, since UTF-8 is a multibyte encoding format.
calebcartwright commented on Jun 22, 2021
I'm not sure about that. That function is correctly generating the expected/desired span, but given this case we should avoid using that span-based approach here to see if the pattern starts with a pipe. If generating span lo/hi's with bytepos offsets is incorrect then we have much bigger problems 😄
I realize this is likely an annoying issue for folks, so I will take care of fixing this one myself.
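One way to test for a leading pipe without building a one-byte span is to inspect the pattern's text directly. This is a minimal sketch under the assumption that the pattern snippet is available as a `&str`; it is not the fix that was actually applied, and `starts_with_pipe` is a hypothetical name.

```rust
/// Reports whether a pattern snippet begins with `|`, ignoring leading
/// whitespace (hypothetical helper, not part of rustfmt).
fn starts_with_pipe(pattern_snippet: &str) -> bool {
    // `starts_with` compares whole characters, so a multibyte first
    // character such as 'É' is never split.
    pattern_snippet.trim_start().starts_with('|')
}

fn main() {
    assert!(starts_with_pipe("| Abcd | \u{C9}fgh"));
    assert!(!starts_with_pipe("\u{C9}fgh")); // "Éfgh"
    assert!(!starts_with_pipe("Abcd | \u{C9}fgh"));
    println!("ok");
}
```

Because the comparison never slices by byte offset, it behaves the same for ASCII and non-ASCII identifiers.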