Skip to content

Lexer: check in advance_token to avoid regard spare ## as GardedStrPrefix #141028

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

xizheyin
Copy link
Contributor

@xizheyin xizheyin commented May 15, 2025

Fixes #140618

I performed additional checks to avoid treating the two ## after r#‘xxx’#### as GardedStrPrefix.

GardedStrPrefix is introduced in 321a5db.

r? compiler

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels May 15, 2025
@xizheyin
Copy link
Contributor Author

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 15, 2025
@rustbot
Copy link
Collaborator

rustbot commented May 15, 2025

Reminder, once the PR becomes ready for a review, use @rustbot ready.

@rust-log-analyzer

This comment has been minimized.

@xizheyin xizheyin force-pushed the issue-140618 branch 2 times, most recently from 1ab58fc to 3fe5df4 Compare May 15, 2025 13:02
@xizheyin xizheyin changed the title Lexer: check ### to avoid regard ## as GardedStrPrefix Lexer: check in advance_token to avoid regard spare ## as GardedStrPrefix May 15, 2025
@xizheyin
Copy link
Contributor Author

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 15, 2025
@xizheyin
Copy link
Contributor Author

r? compiler

@rustbot rustbot assigned wesleywiser and unassigned BoxyUwU May 31, 2025
@wesleywiser
Copy link
Member

r? compiler

@rustbot rustbot assigned jdonszelmann and unassigned wesleywiser Jun 6, 2025
@jdonszelmann
Copy link
Contributor

heya, you now unconditionally look at the previous character every step of tokenization. I'm admittedly somewhat new to review rotation, and don't have a good intuition to how much lexing matters in perf. My expectation is 0 but I'm interested in the outcome of a perf run because of that. Will be reviewing more, but initial pass looks reasonable @bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 6, 2025
@bors
Copy link
Collaborator

bors commented Jun 6, 2025

⌛ Trying commit 3fe5df4 with merge 606582d...

bors added a commit that referenced this pull request Jun 6, 2025
Lexer: check in `advance_token` to avoid regard spare `##` as `GardedStrPrefix`

Fixes #140618

I performed additional checks to avoid treating the two `##` after `r#‘xxx’####` as `GardedStrPrefix`.

`GardedStrPrefix` is introduced in <321a5db>.

r? compiler
@bors
Copy link
Collaborator

bors commented Jun 7, 2025

☀️ Try build successful - checks-actions
Build commit: 606582d (606582da13e1b4b4ec1c1e80afd6ce9cb2755789)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (606582d): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results (primary 5.4%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
5.4% [5.4%, 5.4%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 5.4% [5.4%, 5.4%] 1

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

Results (secondary 0.0%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
0.0% [0.0%, 0.0%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) - - 0

Bootstrap: 752.053s -> 750.495s (-0.21%)
Artifact size: 372.47 MiB -> 372.53 MiB (0.02%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jun 7, 2025
@xizheyin
Copy link
Contributor Author

xizheyin commented Jun 7, 2025

This was based on a very old base, so I rebased it to the new master branch.

I don't think this will have much of an impact on performance, however I'm not quite sure why max-rss is increasing by 5%, is it compared to the latest master branch? Maybe we should rerun @rust-timer queue. But I have no sufficient permissions...

@rust-timer

This comment was marked as off-topic.

1 similar comment
@rust-timer

This comment was marked as off-topic.

@xizheyin
Copy link
Contributor Author

xizheyin commented Jun 7, 2025

I'm curious as to how these permissions were obtained. Because a couple of previous reviewers said r=me after resolving something, yet I don't have permissions to do that either.

@jdonszelmann
Copy link
Contributor

jdonszelmann commented Jun 7, 2025

both reviews and perf runs can only be done by team members. Sometimes a reviewer says r=me which effectively means any team member can merge it, the review has been done. For you it means, that's the last problem with your PR, after that someone (maybe them) will merge. Sometimes a reviewer can delegate these powers to non team members, but you can't do this yourself.

@@ -371,6 +371,7 @@ pub fn is_ident(string: &str) -> bool {
impl Cursor<'_> {
/// Parses a token from the input string.
pub fn advance_token(&mut self) -> Token {
let pre_char = self.prev();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.prev() is only for debug builds (read its documentation)

@jdonszelmann
Copy link
Contributor

In general, it feels to me like these checks should be part of raw_or_double_quoted_string. i.e., once a string has been parsed, and we see more hashtags, error on that

@xizheyin
Copy link
Contributor Author

xizheyin commented Jun 7, 2025

it will parse the last # of #"foo"### as GuardedStrPrefix, which is error.

I think this recognition should be corrected in advance_token. But I also don't know why prev() is only used in debug, do we need to remove that debug attribute or find another way?

@jdonszelmann
Copy link
Contributor

jdonszelmann commented Jun 7, 2025

No it's there for a reason, if you look prev is only used in debug assertions. We don't really want to look behind, so you likely want to find another way to do this

@xizheyin
Copy link
Contributor Author

xizheyin commented Jun 7, 2025

Okay, I'll re-investigate when I get some free time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Raw strings w/ too many terminating hashes are not exclusively handled
8 participants