Lexer accidentally(?) does not use is_ascii_whitespace for literal whitespace in string continuations #136600

Description

@hkBst (Member)

#108403 proposed to fix this, but in this comment it was claimed that the current behavior was documented in the reference. As far as I can see that claim was incorrect: that page only describes the whitespace escapes \r, \t, and \n, whereas the fix was about literal whitespace in string continuations. https://doc.rust-lang.org/reference/expressions/literal-expr.html#string-continuation-escapes does now describe this behavior, but that text was only added later, in Jan 2024. Indeed, this PR shows that the reference documented skipping all whitespace until Jun 13, 2022.

Current behavior has this ui test. It seems that this behavior was at some point implemented the way it is now, then claimed to be canon, and then documented as canon. Anyway, I'm not sure why only almost all ASCII whitespace is skipped rather than all Unicode whitespace, but it seems important to pick an existing whitespace set instead of relying on an old, bad manual implementation of is_ascii_whitespace...
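To make the "almost all ASCII whitespace" point concrete, here is a minimal sketch. It assumes the skipped set is the one the linked reference section lists (tab, line feed, carriage return, space). The helper names `is_continuation_whitespace`, `ascii_disagreements`, and `continued` are hypothetical and are not taken from the lexer source.

```rust
/// Hypothetical helper: the whitespace set that, as I read the reference,
/// is skipped after a `\` + newline string continuation.
/// This is an assumption for illustration, not the lexer's actual code.
fn is_continuation_whitespace(c: char) -> bool {
    matches!(c, '\t' | '\n' | '\r' | ' ')
}

/// Collect every ASCII character on which the assumed continuation set
/// and the standard library's `char::is_ascii_whitespace` disagree.
fn ascii_disagreements() -> Vec<char> {
    (0u8..=127)
        .map(char::from)
        .filter(|&c| is_continuation_whitespace(c) != c.is_ascii_whitespace())
        .collect()
}

/// An ordinary string continuation: the backslash, the newline, and the
/// leading spaces on the next line are all dropped from the value.
fn continued() -> &'static str {
    "foo\
     bar"
}
```

Under that assumption, `ascii_disagreements()` returns only form feed (U+000C), which `is_ascii_whitespace` accepts but the continuation set does not, and `continued()` evaluates to `"foobar"`. Unicode whitespace outside ASCII (what `char::is_whitespace` covers) is not considered in this sketch.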

Perhaps we can see a crater run at least...

Activity

added needs-triage (This issue may need triage. Remove it if it has been sufficiently triaged.) on Feb 5, 2025
added A-parser (Area: The lexing & parsing of Rust source code to an AST), T-lang (Relevant to the language team), T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.) on Feb 6, 2025
@jieyouxu (Member) commented on Feb 6, 2025

cc @petrochenkov as I think you're familiar with this specific quirk

@petrochenkov (Contributor) commented on Feb 6, 2025

I'm aware of it, but I don't have an opinion on whether the behavior here should be changed (I also haven't done the archeology to find out why it is the way it is).

removed needs-triage (This issue may need triage. Remove it if it has been sufficiently triaged.) on Feb 6, 2025
Metadata

Assignees: No one assigned

Labels: A-parser (Area: The lexing & parsing of Rust source code to an AST), C-bug (Category: This is a bug.), T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.), T-lang (Relevant to the language team)

Participants: @petrochenkov, @saethlin, @hkBst, @jieyouxu, @rustbot

Issue #136600 · rust-lang/rust