Macro matchers only match when they feel like it #27832

Closed
Description

@jonas-schievink
Contributor

...or at least that's how much I currently understand, since macros are really counterintuitive sometimes.

macro_rules! m {
    ( $i:ident ) => ();
    ( $t:tt $j:tt ) => ();
}

fn main() {
    m!(c);
    m!(t 9);  // why does this work, but not the next case?

    m!(0 9);
    // ^ error: expected ident, found 0
}

I don't think this is supposed to happen. Even if it is, the exact rules used for macro matching should definitely be documented somewhere (as far as I know, there isn't even an RFC for them).

Activity

Label added on Aug 14, 2015: A-macros (Area: All kinds of macros (custom derive, macro_rules!, proc macros, ..))
durka (Contributor) commented on Aug 14, 2015

It seems like the ident parser in particular is very greedy, or something. For example, changing it from ident to expr eliminates the error (as does switching the order of the rules, but I assume this is reduced from something where you can't do that).

However, fixing this (if it can be fixed) is probably a breaking change :(
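
A minimal sketch of the reordering workaround mentioned above, assuming the arms of the real macro can be swapped. The macro name `m` and its matchers come from the report; the string results are an addition made here only so the arms can be told apart.

```rust
// Same two arms as in the report, but with the `tt tt` arm first.
// A `tt` fragment accepts any single token tree, so trying it first
// never triggers a hard parse error; if the input has only one token,
// the arm simply fails to match and the `ident` arm is tried next.
macro_rules! m {
    ( $t:tt $j:tt ) => { "two token trees" };
    ( $i:ident ) => { "one identifier" };
}

fn main() {
    assert_eq!(m!(c), "one identifier");
    assert_eq!(m!(t 9), "two token trees");
    assert_eq!(m!(0 9), "two token trees"); // the invocation that failed in the report
    println!("all three invocations matched");
}
```

This only sidesteps the problem; it does not say anything about how the matcher itself should behave.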

jonas-schievink (Contributor, Author) commented on Aug 15, 2015

Found the culprit: https://github.com/rust-lang/rust/blob/master/src/libsyntax/ext/tt/macro_parser.rs#L514-L522

// this could be handled like a token, since it is one

Do you really think this would break something? I actually don't think it would, since a fix for this should only accept more macro invocations.

jonas-schievink (Contributor, Author) commented on Aug 15, 2015

On second thought... I think that's not the direct cause, since all other fragments behave roughly the same.

Oh well, back to the drawing board

jonas-schievink (Contributor, Author) commented on Aug 15, 2015

Okay, so as it turns out, all NT (nonterminal) parsers introduce this bug (except tt, which you can't really test). In this particular case, changing from ident to expr only worked because idents and integral literals are both valid expressions. For example, this doesn't work:

macro_rules! m {
    ( $b:expr ) => ();
    ( $t:tt $u:tt ) => ();
}

fn main() {
    m!(3);      // works trivially
    m!(1 2);    // works, since `1` is a valid expression
    m!(_ 1);    // doesn't work, since `_` is not an expression (but a valid TT, of course)
}

Now that this confusion is out of the way, I think I see what happens: when parsing _ as an expression, libsyntax panics on the syntax error, which aborts compilation. The macro docs state that the parser fully commits to parsing such a nonterminal, so the failure of the third invocation is expected behaviour.

However, when parsing 1, it works. But since the 2 token wasn't eaten, the second arm is tried (this is the bug!). It matches, of course, and the macro is accepted.

So, fixing this would indeed be a breaking change, since the intended behaviour (at least as far as I know) is to reject the invocation, but this isn't happening.
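
To make the arm selection visible, here is a small sketch of the same macro with each arm returning a label. The name `which_arm` and the labels are illustrative assumptions, not part of the original report. The `_ 1` case stays commented out because it is the invocation reported above as a hard error.

```rust
// Each arm returns a label so the caller can see which one matched.
macro_rules! which_arm {
    ( $b:expr ) => { "expr" };
    ( $t:tt $u:tt ) => { "tt tt" };
}

fn main() {
    // A single literal is a complete expression: first arm.
    assert_eq!(which_arm!(3), "expr");

    // `1` parses as an expression, but `2` is left over, so the first
    // arm does not match and the second arm is tried.
    assert_eq!(which_arm!(1 2), "tt tt");

    // Reported in this issue as aborting compilation instead of falling
    // through to the second arm:
    // which_arm!(_ 1);

    println!("ok");
}
```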

DanielKeep (Contributor) commented on Aug 17, 2015

@jonas-schievink I disagree that attempting the second arm is a bug. Ideally, macro_rules! should attempt each rule until it finds one that matches. From my perspective, the bug is that macro_rules! has no way to recover from failed parse attempts.

Ideally, your last example should go something like this (using ^ to represent cursor positions):

  • ( 3 )
    • (^3 ) ; (^$b:expr ) → $b = 3
    • ( 3^) ; ( $b:expr^) → matched
  • ( 1 2 )
    • (^1 2 ) ; (^$b:expr ) → $b = 1
    • ( 1^2 ) ; ( $b:expr^) → input too long; next rule
    • (^1 2 ) ; (^$t:tt $u:tt ) → $t = 1
    • ( 1^2 ) ; ( $t:tt^$u:tt ) → $u = 2
    • ( 1 2^) ; ( $t:tt $u:tt^) → matched
  • ( _ 1 )
    • (^_ 1 ) ; (^$b:expr ) → syntax error; next rule
    • (^_ 1 ) ; (^$t:tt $u:tt ) → $t = _
    • ( _^1 ) ; ( $t:tt^$u:tt ) → $u = 2
    • ( _ 1^) ; ( $t:tt $u:tt^) → matched
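
The trace above can be sketched as a toy model in which a failed fragment parse abandons only the current rule. Everything here is hypothetical and much simpler than libsyntax: `Frag`, `parse_frag`, `try_rule` and `first_matching_rule` are made-up names, tokens are plain strings, and an "expression" is assumed to be a single identifier or integer token.

```rust
/// Hypothetical fragment kinds for this toy model (not libsyntax types).
#[derive(Clone, Copy, Debug)]
enum Frag {
    Expr,
    Tt,
}

/// Toy fragment parser that consumes exactly one token.
/// Assumption: an expression is one identifier or one integer literal,
/// and `_` is not an expression.
fn parse_frag(frag: Frag, tok: &str) -> Result<(), String> {
    match frag {
        Frag::Tt => Ok(()),
        Frag::Expr => {
            let is_int = !tok.is_empty() && tok.chars().all(|c| c.is_ascii_digit());
            let is_ident = tok != "_"
                && tok.chars().next().map_or(false, |c| c.is_alphabetic() || c == '_')
                && tok.chars().all(|c| c.is_alphanumeric() || c == '_');
            if is_int || is_ident {
                Ok(())
            } else {
                Err(format!("`{}` is not an expression", tok))
            }
        }
    }
}

/// Match one rule against the whole input, moving a cursor token by token.
/// Any failure is returned as an `Err` instead of aborting.
fn try_rule(rule: &[Frag], input: &[&str]) -> Result<(), String> {
    let mut cursor = 0;
    for &frag in rule {
        let tok = input
            .get(cursor)
            .copied()
            .ok_or_else(|| String::from("unexpected end of input"))?;
        parse_frag(frag, tok)?;
        cursor += 1;
    }
    if cursor == input.len() {
        Ok(())
    } else {
        Err(String::from("input too long"))
    }
}

/// Try the rules in order and return the index of the first that matches.
fn first_matching_rule(rules: &[Vec<Frag>], input: &[&str]) -> Option<usize> {
    rules.iter().position(|rule| try_rule(rule, input).is_ok())
}

fn main() {
    // Rule 0 is `( $b:expr )`, rule 1 is `( $t:tt $u:tt )`.
    let rules = vec![vec![Frag::Expr], vec![Frag::Tt, Frag::Tt]];
    let inputs: [&[&str]; 3] = [&["3"], &["1", "2"], &["_", "1"]];
    for input in inputs.iter() {
        println!("{:?} -> rule index {:?}", input, first_matching_rule(&rules, input));
    }
}
```

The point of the sketch is only the control flow: because `parse_frag` reports failure as a value, the caller can move on to the next rule, which is what the trace asks for.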
jonas-schievink (Contributor, Author) commented on Aug 17, 2015

> Ideally, macro_rules! should attempt each rule until it finds one that matches.

That would indeed be useful! But I think this comment implies that that isn't the intended behaviour:

//! This is an Earley-like parser, without support for in-grammar nonterminals,
//! only by calling out to the main rust parser for named nonterminals (which it
//! commits to fully when it hits one in a grammar). This means that there are no

(Stating that the macro parser fully commits to NTs implies to me that it doesn't backtrack to try other arms.)

DanielKeep (Contributor) commented on Aug 18, 2015

@jonas-schievink I believe that's referring to how it parses within a rule. Earley parsers can deal with ambiguities by tracking multiple potential parse forests (if I remember correctly; my understanding is a little vague). What it's saying is that it has to commit to parsing a non-terminal (i.e. higher-level productions like expressions) because the parser doesn't have any way to back out of a partial parse. So when it encounters one, it has to parse it, come hell or high water.

Having the macro system not check successive rules once a rule starts matching would be apocalyptic: it would kill damn near every useful, non-trivial macro. We're talking mass hysteria, cats and dogs living together.

jonas-schievink (Contributor, Author) commented on Aug 18, 2015

> I believe that's referring to how it parses within a rule.

Fair enough. In that case the bug is just that the macro expander will panic when it can't parse an NT, so it can't backtrack.

I also managed to dig up #3232, which was closed as "not a bug", but this definitely feels like one.
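
Given that the parser commits to a nonterminal once it starts parsing one, a common way to live with this is to put arms that start with cheap fragments and literal tokens before an `expr` arm. The sketch below assumes that approach; the macro name `classify` and its labels are hypothetical. As far as I know, swapping the two arms would make `classify!(x +)` a hard error, because the `expr` parser would consume `x +` and then fail at the end of input.

```rust
// The `ident +` arm comes first. If it does not match, the failure is an
// ordinary "this arm does not apply", and the `expr` arm is tried next.
macro_rules! classify {
    ( $i:ident + ) => { "ident +" };
    ( $e:expr ) => { "expr" };
}

fn main() {
    // Matches the first arm exactly.
    assert_eq!(classify!(x +), "ident +");

    // First arm fails on the leftover `1`; second arm parses `x + 1`.
    assert_eq!(classify!(x + 1), "expr");

    // First arm fails at `*` (it expects `+`); second arm parses `x * 3`.
    assert_eq!(classify!(x * 3), "expr");

    println!("ok");
}
```

The expressions are never expanded, so `x` does not need to exist as a variable here.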

self-assigned this on Jan 12, 2017
dylanede (Contributor) commented on Jan 29, 2017

@jonas-schievink Sorry to dig this up again, but isn't an ident a terminal, so your comments about NTs aren't applicable to this bug?

jonas-schievink (Contributor, Author) commented on Jan 29, 2017

@dylanede Correct! That's why I mentioned this comment:

// this could be handled like a token, since it is one

(AFAIK, token == terminal)



Participants: @jdm, @durka, @DanielKeep, @jeberger, @jonas-schievink
