Description
Currently we have the uncommon_codepoints lint, which lints on anything which is Identifier_Status=Restricted.
It may be worth improving the diagnostics there by splitting them into multiple specialized diagnostics. In the long run, some of these might be worth promoting to separate lints so that they can be allowed individually.
The diagnostics I can think of are:
- One that calls out confusables with operators and syntax. We already have this for parse errors but not for lints post-parse. Unicode does not provide this data directly, but we can construct it from Unicode data easily.
- One that talks about Technical in general
- One that talks about Exclusion in general (this is "scripts that are dead")
- One that talks about Limited_Use in general (this is "scripts that are alive but not in widespread digital use yet")
- One that talks about Not_NFKC in general.
The first one can be implemented by taking the set of Rust syntax characters, expanding that to their confusables set, and then winnowing it down to the set of characters that is allowed in an identifier. This could belong in a separate check in the unicode-security crate.
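As a rough illustration of that pipeline (syntax characters, then their confusables, then only those usable in identifiers), here is a minimal sketch. Everything in it is an assumption made for illustration: `SAMPLE_CONFUSABLES` is a tiny hand-picked table rather than real Unicode confusables data, `roughly_identifier_char` is a crude stand-in for the actual identifier rules, and none of these names come from the unicode-security crate.

```rust
use std::collections::BTreeSet;

/// Hypothetical, hand-picked sample of (syntax char, lookalikes) pairs.
/// A real check would derive this table from Unicode's confusables data.
const SAMPLE_CONFUSABLES: &[(char, &[char])] = &[
    ('!', &['\u{01C3}', '\u{FF01}']), // Latin letter retroflex click, fullwidth '!'
    (':', &['\u{A4FD}', '\u{2236}']), // Lisu letter tone mya jeu, ratio sign
    (';', &['\u{037E}']),             // Greek question mark
];

/// Crude approximation of "allowed in an identifier" (assumption: the real
/// check would use the XID_Start / XID_Continue properties instead).
fn roughly_identifier_char(c: char) -> bool {
    c == '_' || c.is_alphanumeric()
}

/// Expand the given syntax characters to their sample confusables, then keep
/// only the lookalikes that could actually appear inside an identifier.
fn syntax_confusables_in_idents(syntax: &[char]) -> BTreeSet<char> {
    SAMPLE_CONFUSABLES
        .iter()
        .filter(|(target, _)| syntax.contains(target))
        .flat_map(|(_, lookalikes)| lookalikes.iter().copied())
        .filter(|&c| roughly_identifier_char(c))
        .collect()
}
```

With this sample table, punctuation-like lookalikes such as the fullwidth exclamation mark are dropped by the last step, because they cannot occur in an identifier and the lexer already rejects them.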
The others can be implemented by checking the identifier_type() of characters in the ident.
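A minimal sketch of that per-character check might look like the following. The `RestrictedKind` enum and the `classify` callback are hypothetical stand-ins: in the actual lint the classification would come from the identifier_type() lookup, whose exact types are not reproduced here.

```rust
use std::collections::BTreeMap;

/// Hypothetical mirror of the categories listed above; not the crate's type.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum RestrictedKind {
    Technical,
    Exclusion,
    LimitedUse,
    NotNfkc,
}

/// Group the restricted characters of `ident` by category so that each
/// category can get its own diagnostic. `classify` stands in for the real
/// identifier-type lookup and returns `None` for unrestricted characters.
fn group_restricted_chars(
    ident: &str,
    classify: impl Fn(char) -> Option<RestrictedKind>,
) -> BTreeMap<RestrictedKind, Vec<char>> {
    let mut groups: BTreeMap<RestrictedKind, Vec<char>> = BTreeMap::new();
    for c in ident.chars() {
        if let Some(kind) = classify(c) {
            let chars = groups.entry(kind).or_default();
            // Report each character once per identifier.
            if !chars.contains(&c) {
                chars.push(c);
            }
        }
    }
    groups
}
```

Grouping first, rather than emitting one message per character, keeps the output to at most one diagnostic per category for each identifier.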
I might be able to mentor this, and I can provide diagnostic text for these when needed.
Activity
∇x gives "unknown start of token" compiler error #120142
HTGAzureX1212 commented on Jan 22, 2024
Hello, I'm interested in taking a go at this. Could anyone mentor me on this?
HTGAzureX1212 commented on Jan 22, 2024
@rustbot claim
Manishearth commented on Jan 22, 2024
I'm too busy in the coming weeks to fully mentor but I can answer questions. Please make a thread in the diagnostics channel on rust-lang.zulipchat.org and ask questions there, ccing me ("Manish Goregaokar").
Manishearth commented on Jan 22, 2024
I would start by implementing the checks for 2-5 using the existing APIs.
The relevant code is here: rust/compiler/rustc_lint/src/non_ascii_idents.rs (line 193 in 021861a).
You'll want that to emit a different lint message based on context.
The lint messages are pulled from a diagnostics type, https://github.com/rust-lang/rust/blob/master/compiler/rustc_lint/src/lints.rs#L1111, which links to rust/compiler/rustc_lint/messages.ftl (line 243 in 021861a).
I think the first change to make would actually be to make this diagnostic type contain a vector of characters, which it prints out as a list. Once we have that done, we should add more versions of it that have different messages, for Technical, Exclusion, etc.
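To make the first step concrete, here is a small stand-alone sketch of a diagnostic that carries a vector of characters and prints them as a list. It is only an illustration: the struct below is a hypothetical plain-Rust stand-in, the message wording is a placeholder, and the real type in lints.rs is wired to messages.ftl rather than to a `Display` impl.

```rust
use std::fmt;

/// Hypothetical stand-in for the lint's diagnostic type, holding every
/// offending character found in one identifier.
struct UncommonCodepoints {
    codepoints: Vec<char>,
}

impl fmt::Display for UncommonCodepoints {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Render the characters as a quoted, comma-separated list.
        let list = self
            .codepoints
            .iter()
            .map(|c| format!("'{c}'"))
            .collect::<Vec<_>>()
            .join(", ");
        // Pick singular or plural depending on how many characters there are.
        let noun = if self.codepoints.len() == 1 {
            "codepoint"
        } else {
            "codepoints"
        };
        write!(f, "identifier contains uncommon Unicode {noun}: {list}")
    }
}
```

Once the list form exists, the Technical, Exclusion, and other variants would differ only in their message text, so they could share the same shape.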
Rollup merge of rust-lang#120259 - HTGAzureX1212:HTGAzureX1212/split-…