...or at least that's how much I currently understand, since macros are really counterintuitive sometimes.
macro_rules! m {
    ( $i:ident ) => ();
    ( $t:tt $j:tt ) => ();
}
fn main() {
    m!(c);
    m!(t 9); // why does this work, but not the next case?
    m!(0 9);
    // ^ error: expected ident, found 0
}
I don't think this is supposed to happen. Even if it is, the exact rules used for macro matching should definitely be documented somewhere (I think there's not even an RFC).
durka commented on Aug 14, 2015
It seems like the `ident` parser in particular is very greedy, or something. For example, changing it from `ident` to `expr` eliminates the error (as does switching the order of the rules, but I assume this is reduced from something where you can't do that).

However, fixing this (if it can be fixed) is probably a breaking change :(
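The rule-reordering workaround mentioned above can be sketched as follows. This is only an illustration: the arm bodies returning string literals are an addition (the original arms expand to `()`), there purely so that it is visible which rule matched.

```rust
// Sketch of the "switch the order of the rules" workaround.
// The string results are illustrative; the original arms expand to `()`.
macro_rules! m {
    // Try the two-token-tree arm first, so `0 9` never reaches the `ident` arm.
    ( $t:tt $j:tt ) => { "two token trees" };
    ( $i:ident ) => { "one identifier" };
}

fn main() {
    // A single token cannot satisfy the first arm, so the `ident` arm is used.
    assert_eq!(m!(c), "one identifier");
    assert_eq!(m!(t 9), "two token trees");
    // The invocation that failed with the original arm order.
    assert_eq!(m!(0 9), "two token trees");
    println!("all three invocations matched");
}
```

As noted in the comment, this only helps when the real macro allows its arms to be reordered.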
jonas-schievink commented on Aug 15, 2015
Found the culprit: https://github.com/rust-lang/rust/blob/master/src/libsyntax/ext/tt/macro_parser.rs#L514-L522
// this could be handled like a token, since it is one
Do you really think this would break something? I actually don't think it would, since a fix for this should only accept more macro invocations.
jonas-schievink commented on Aug 15, 2015
On second thought... I think that's not the direct cause, since all other fragments behave roughly the same.
Oh well, back to the drawing board
jonas-schievink commented on Aug 15, 2015
Okay, so as it turns out, all NT parsers introduce this bug (except `tt`, which you can't really test). In this particular case, changing from `ident` to `expr` only worked because idents and integral literals are both valid expressions. For example, this doesn't work:

Now that this confusion is out of the way, I think I see what happens: When parsing `_` as an expression, libsyntax panics because of a syntax error, which aborts compilation (the macro docs state that the parser fully commits to parsing such a nonterminal, so the fact that the third invocation doesn't work is expected behaviour).

However, when parsing `1`, it works. But since the `2` token wasn't eaten, the second arm is tried (this is the bug!). It matches, of course, and the macro is accepted.

So, fixing this would indeed be a breaking change, since the intended behaviour is (at least as far as I know) to reject the invocation, but this isn't happening.
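The snippet this comment refers to is not preserved in the text. As a rough reconstruction, assuming it used the two arms `( $b:expr )` and `( $t:tt $u:tt )` and the invocations `3`, `1 2` and `_ 1` that appear in the trace in the following comment, it may have looked something like this. The string results are an addition so the matching arm is observable, and the third invocation is left commented out because it is the one reported here as failing.

```rust
// Assumed reconstruction; arm patterns and inputs are taken from the trace
// in the next comment, the string bodies are illustrative additions.
macro_rules! m {
    ( $b:expr ) => { "one expression" };
    ( $t:tt $u:tt ) => { "two token trees" };
}

fn main() {
    // `3` is a complete expression, so the first arm matches.
    assert_eq!(m!(3), "one expression");
    // `1` parses as an expression but `2` is left over, so the first arm
    // does not match and the second arm is tried.
    assert_eq!(m!(1 2), "two token trees");
    // m!(_ 1); // reported in this comment as aborting compilation
    println!("ok");
}
```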
DanielKeep commented on Aug 17, 2015
@jonas-schievink I disagree that attempting the second arm is a bug. Ideally, `macro_rules!` should attempt each rule until it finds one that matches. From my perspective, the bug is that `macro_rules!` has no way to recover from failed parse attempts.

Ideally, your last example should go something like this (using `^` to represent cursor positions):

( 3 )
    (^3 ) ; (^$b:expr ) → $b = 3
    ( 3^) ; ( $b:expr^) → matched

( 1 2 )
    (^1 2 ) ; (^$b:expr ) → $b = 1
    ( 1^2 ) ; ( $b:expr^) → input too long; next rule
    (^1 2 ) ; (^$t:tt $u:tt ) → $t = 1
    ( 1^2 ) ; ( $t:tt^$u:tt ) → $u = 2
    ( 1 2^) ; ( $t:tt $u:tt^) → matched

( _ 1 )
    (^_ 1 ) ; (^$b:expr ) → syntax error; next rule
    (^_ 1 ) ; (^$t:tt $u:tt ) → $t = _
    ( _^1 ) ; ( $t:tt^$u:tt ) → $u = 1
    ( _ 1^) ; ( $t:tt $u:tt^) → matched
jonas-schievink commented on Aug 17, 2015
That would indeed be useful! But I think this comment implies that that isn't the intended behaviour:
rust/src/libsyntax/ext/tt/macro_parser.rs
Lines 11 to 13 in c115c51
DanielKeep commented on Aug 18, 2015
@jonas-schievink I believe that's referring to how it parses within a rule. Earley parsers can deal with ambiguities by tracking multiple potential parse forests (if I remember correctly; my understanding is a little vague). What it's saying is that it has to commit to parsing a non-terminal (i.e. higher-level productions like expressions) because the parser doesn't have any way to back out of a partial parse. So when it encounters one, it has to parse it, come hell or high water.
Having the macro system not check successive rules once a rule starts matching would be apocalyptic: it would kill damn near every useful, non-trivial macro. We're talking mass hysteria, cats and dogs living together.
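To make the point about successive rules concrete, here is a small sketch where the input starts to match an earlier rule and only a later rule accepts it. The `describe!` macro is hypothetical, not something from this thread, and it deliberately uses only `tt` fragments and literal tokens so that no higher-level nonterminal parser is involved.

```rust
// Hypothetical macro: every rule begins with `$a:tt`, so each input
// "starts matching" the first rule before a later rule is selected.
macro_rules! describe {
    ( $a:tt , $b:tt ) => { "pair" };
    ( $a:tt ; $b:tt ) => { "sequence" };
    ( $a:tt ) => { "single" };
}

fn main() {
    assert_eq!(describe!(1 , 2), "pair");
    // The first rule binds `$a = 1`, then fails on `;`; the second rule matches.
    assert_eq!(describe!(1 ; 2), "sequence");
    // The first two rules fail for lack of input; the third matches.
    assert_eq!(describe!(1), "single");
    println!("ok");
}
```

If matching stopped at the first rule that began to match, only the `pair` form would ever be accepted.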
jonas-schievink commented on Aug 18, 2015
Fair enough. In that case the bug is just that the macro expander will panic when it can't parse an NT, so it can't backtrack.
I also managed to dig up #3232, which was closed as "not a bug", but this definitely feels like one.
dylanede commented on Jan 29, 2017
@jonas-schievink Sorry to dig this up again, but isn't an `ident` a terminal, so your comments about NTs aren't applicable to this bug?
jonas-schievink commented on Jan 29, 2017
@dylanede Correct! That's why I mentioned this comment:
(AFAIK, token == terminal)