Suggested spelling correction seems off-base #72553

Closed
@dtolnay

Description

@dtolnay
Member

In https://www.reddit.com/r/rust/comments/gpw2ra/how_is_the_rust_compiler_able_to_tell_the_visible/ I noticed this surprising spelling suggestion:

#![feature(non_ascii_idents)]

fn main() {
    let _ = 读文;
}
error[E0425]: cannot find value `读文` in this scope
   --> src/main.rs:45:13
    |
45  |     let _ = 读文;
    |             ^^^^ help: a tuple variant with a similar name exists: `Ok`

To me 读文 and Ok don't seem like they would be similar enough to meet the threshold for showing such a suggestion. Can we calibrate this better for short idents?

For comparison, even kO doesn't assume you mean Ok.

error[E0425]: cannot find value `kO` in this scope
  --> src/main.rs:45:13
   |
45 |     let _ = kO;
   |             ^^ not found in this scope

rustc 1.45.0-nightly (8970e8b 2020-05-23)

Mentioning @estebank who worked on suggestions most recently in #65421.

Activity

Labels added on May 24, 2020:
A-diagnostics (Area: Messages for errors, warnings, and lints)
T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
C-bug (Category: This is a bug.)
Enselic commented on Nov 16, 2023

@Enselic
Member

I debugged it a bit. Ok is selected as the candidate because its edit distance to 读文 is just 2.

The code to fix is likely in or around fn find_best_match_for_name_impl. I tried some quick tricks but got ICEs and failing tests and gave up. One thing I didn't try that could perhaps work is to consider edit distance between characters of different alphabets/logogram sets as infinite, rather than 1, which is the case right now.
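To make the "edit distance is just 2" observation concrete, here is a minimal sketch of a plain Levenshtein distance computed over `char`s. It is an illustration only: the `levenshtein` function below is a hypothetical stand-in, not the compiler's own implementation, and it assumes every insertion, deletion, and substitution costs 1.

```rust
/// Plain Levenshtein distance over Unicode scalar values (`char`s).
/// Hypothetical helper for illustration; not rustc's actual code.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();

    // prev[j] = distance between the first i chars of `a` and the first j chars of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();

    for (i, &ca) in a.iter().enumerate() {
        let mut cur = Vec::with_capacity(b.len() + 1);
        cur.push(i + 1); // turning i + 1 chars into an empty string takes i + 1 deletions
        for (j, &cb) in b.iter().enumerate() {
            let substitute = prev[j] + usize::from(ca != cb);
            let delete = prev[j + 1] + 1;
            let insert = cur[j] + 1;
            cur.push(substitute.min(delete).min(insert));
        }
        prev = cur;
    }
    prev[b.len()]
}

fn main() {
    // Two substitutions turn either name into `Ok`.
    println!("读文 -> Ok: {}", levenshtein("读文", "Ok")); // prints 2
    println!("kO -> Ok: {}", levenshtein("kO", "Ok")); // prints 2
}
```

Under this cost model both 读文 and kO are at distance 2 from Ok, so the distance alone does not explain why only one of them triggers the suggestion.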

dtolnay commented on Nov 16, 2023

@dtolnay
Member, Author

Thank you for investigating!

Independent of what we do with different logogram sets (your suggestion sounds plausible), an edit distance of 2 for a string of length 2 should not meet the threshold for showing a suggestion, even within a single logogram set.

Enselic commented on Nov 17, 2023

@Enselic
Member

I think an edit distance of 1 for a string of length 1 is reasonable (a lot of existing UI tests rely on it), but an edit distance of 2 for a string of length 2 is indeed probably not reasonable.

As you point out, with ASCII characters (kO) the edit distance to Ok is still 2, but Ok is not suggested. So maybe there is a simpler fix to be made for 读文 than looking at which alphabets/logogram sets the characters belong to.
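One way the two cases could diverge, sketched here purely as an assumption: if the allowed distance is derived from the length of the name, it matters whether that length is counted in bytes or in characters. The `max_allowed_distance` rule below (a third of the length, with a floor of 1) is hypothetical and not taken from this thread; only the byte and character counts are facts about Rust strings.

```rust
/// Hypothetical threshold: allow edits up to a third of the name's length,
/// but always allow at least one edit. Not taken from rustc.
fn max_allowed_distance(len: usize) -> usize {
    len.max(3) / 3
}

fn main() {
    for name in ["读文", "kO"] {
        let bytes = name.len(); // length in UTF-8 bytes
        let chars = name.chars().count(); // length in Unicode scalar values
        println!(
            "{name}: {bytes} bytes allows distance {}, {chars} chars allows distance {}",
            max_allowed_distance(bytes),
            max_allowed_distance(chars),
        );
    }
}
```

With this rule, 读文 is 6 bytes long, so a byte-based threshold would permit a distance of 2, while kO is 2 bytes and permits only 1. Counting characters instead gives both names a limit of 1, which would rule out Ok for both without any reasoning about scripts.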

added a commit that references this issue on Nov 27, 2023

Rollup merge of rust-lang#118381 - Enselic:edit-dist-len, r=WaffleLapkin

73070b1
added a commit that references this issue on Nov 28, 2023

Rollup merge of rust-lang#118381 - Enselic:edit-dist-len, r=WaffleLapkin

344459e