Description
This is an issue that needs to be resolved before stabilization of "Macros 1.2".
Procedural macros that we are going to stabilize currently have two flavors: proc_macro and proc_macro_attribute.
proc_macro macros have signature fn(TokenStream) -> TokenStream and can be invoked with "bang" forms like this:
my::proc::macro!( TOKEN_STREAM )
my::proc::macro![ TOKEN_STREAM ]
my::proc::macro! { TOKEN_STREAM }
Only the TOKEN_STREAM part is passed to the macro as TokenStream; the delimiters (brackets) are NOT passed.
Why this is bad:
- The macro doesn't know what delimiters it was invoked with.
It was a part of the Macros 2.0 promise to give macros control over delimiters in their invocations, so e.g. vec-like macros could require square brackets like vec![1, 2, 3] and reject other brackets.
We should not prevent this kind of control from being implemented in the future.
Why this is good:
- Brackets are mostly not a part of the "useful payload" for the macro; they are there so macro invocations can be parsed unambiguously in the many contexts in which they can appear: expressions, types, blocks, modules, etc.
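To make the "delimiters are not passed" point concrete, here is a minimal sketch using a declarative macro_rules! macro as a stand-in, since a real proc_macro needs its own crate. The macro name count_tts and the three helper functions are hypothetical; the only thing the sketch shows is that all three bang forms hand the macro the same tokens.

```rust
// Hypothetical stand-in macro: counts the top-level token trees it receives.
// Like a `proc_macro`, it only sees what is between the delimiters.
macro_rules! count_tts {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count_tts!($($tail)*) };
}

fn via_parens() -> usize {
    let n = count_tts!(a, b, c);
    n
}

fn via_brackets() -> usize {
    let n = count_tts![a, b, c];
    n
}

fn via_braces() -> usize {
    let n = count_tts! { a, b, c };
    n
}

fn main() {
    // `a , b , c` is five token trees regardless of the delimiter used.
    println!("{} {} {}", via_parens(), via_brackets(), via_braces()); // prints "5 5 5"
}
```

The macro body has no way to tell which of the three helpers invoked it, which is exactly the limitation (and the convenience) described above.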
proc_macro_attribute macros have signature fn(TokenStream, TokenStream) -> TokenStream and can be invoked with "attribute" forms like this:
#[my::proc::macro TOKEN_STREAM] TARGET
#![my::proc::macro TOKEN_STREAM] TARGET
TARGET is a trait/impl/foreign item, or a statement, and it's passed to the macro as the second TokenStream argument, but we are not interested in it right now.
The TOKEN_STREAM part is passed to the macro as the first TokenStream argument; nothing is ignored.
Why this is bad:
- It's not clear where the path ends and where the token stream starts.
Something like #[a::b :: + -] seems to match the grammar, but is rejected right now because paths are always parsed greedily, so :: is interpreted as a path separator rather than a part of the token stream.
Annoying questions arise with generic arguments in paths like #[a<>::b::c<u8>]. Technically this is a syntactically valid path, and c having type arguments is rather a semantic error and the empty <> after the module a is not an error at all, but right now this attribute is interpreted as #[a /* <- PATH | TOKEN_STREAM -> */ <>::b::c<u8>].
Ideally we'd like to avoid these questions completely and have an unambiguous delimiter.
- It's not clear where the token stream ends.
With plain #[attr TOKEN_STREAM] it's pretty clear: the stream ends before the ] (in this sense the situation is simpler than with bang macros), but things start breaking when other macros appear.
macro m($meta1: meta, $meta2: meta) { ... }
// No way to determine where the first attribute starts and the second attribute ends
m!( a::b::c x , y , z , d::e::f u , v , w )
So with this attribute syntax we can't support the meta matcher anymore!
- It's not consistent with proc_macro macros. m!(a, b, c) does not include parentheses into the token stream, but #[m(a, b, c)] does.
- I'm not actually sure people intend to stabilize this attribute syntax suddenly expanded from traditional forms (#[attr], #[attr(list)], #[attr = literal]) to being nearly unlimited (i.e. something like #[a::b::c e f + c ,,, ;_:] being legal) right now.
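The first point in the list above (greedy path parsing) can be illustrated with a toy model. This is only a sketch over pre-split string tokens, not how rustc actually parses attributes; split_attr and is_ident are hypothetical helpers, and generic arguments are deliberately left out.

```rust
/// Toy identifier check: a letter or `_`, then letters, digits or `_`.
fn is_ident(tok: &str) -> bool {
    let mut chars = tok.chars();
    match chars.next() {
        Some(c) if c.is_alphabetic() || c == '_' => {
            chars.all(|c| c.is_alphanumeric() || c == '_')
        }
        _ => false,
    }
}

/// Splits attribute tokens into (path segments, remaining token stream),
/// parsing the path greedily: every `::` is assumed to continue the path.
fn split_attr<'a>(tokens: &[&'a str]) -> Result<(Vec<&'a str>, Vec<&'a str>), String> {
    let first = match tokens.first() {
        Some(&tok) if is_ident(tok) => tok,
        _ => return Err("expected a path".to_string()),
    };
    let mut path = vec![first];
    let mut i = 1;
    while i < tokens.len() && tokens[i] == "::" {
        match tokens.get(i + 1) {
            Some(&seg) if is_ident(seg) => {
                path.push(seg);
                i += 2;
            }
            // A `::` that is not followed by an identifier is an error here,
            // even though it could have been the start of the token stream.
            _ => return Err(format!("expected identifier after `::` at token {}", i + 1)),
        }
    }
    Ok((path, tokens[i..].to_vec()))
}

fn main() {
    // Models #[a::b + -]: the path ends at the first non-`::` token.
    println!("{:?}", split_attr(&["a", "::", "b", "+", "-"]));
    // Models #[a::b :: + -]: rejected because `::` is consumed as a separator.
    println!("{:?}", split_attr(&["a", "::", "b", "::", "+", "-"]));
}
```

With an explicit delimiter after the path, the loop would never need to guess whether a `::` belongs to the path or to the arguments.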
Proposed solution:
- Stabilize proc_macro as is for "Macros 1.2".
- In the future extend the set of proc_macro plugin interfaces with one more signature fn(TokenStream, Delimiter) -> TokenStream that allows controlling delimiters used in macro invocations.
- In the future possibly support bang macro invocations without delimiters for symmetry with attributes and because they may be legitimately useful (let x = MACRO_CONST!;, see https://internals.rust-lang.org/t/idea-elide-parens-brackets-on-unparametrized-macros/6527) (the Delimiter argument is Delimiter::None in this case).
- Restrict attribute syntax accepted by proc_macro_attribute for "Macros 1.2" to
// Symmetric with bang macro invocations
#[my::proc::macro(TOKEN_STREAM)]
#[my::proc::macro[TOKEN_STREAM]]
#[my::proc::macro { TOKEN_STREAM }]
// Additionally
#[my::proc::macro]
#[my::proc::macro = TOKEN_TREE]
Or, more radically, do not stabilize the = syntax for procedural macros 1.2.
This is not a fundamental restriction - arbitrary token streams still can be placed inside the brackets (#[a::b::c(e f + c ,,, ;_:)]).
- The token stream passed to the macro DOES NOT include the delimiters.
- In the future extend the set of proc_macro_attribute plugin interfaces with one more signature fn(TokenStream, TokenStream, Delimiter) -> TokenStream that allows controlling delimiters used in macro invocations (the delimiter is Delimiter::None for both #[attr] and #[attr = tt] forms, but they are still discernible by the token stream being empty or not).
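As a rough illustration of the restricted attribute forms proposed above, the sketch below classifies whatever follows the path into a delimiter kind plus the inner token stream with the delimiters stripped. It works on plain strings rather than real token streams, the local Delimiter enum merely mirrors the shape of the proposed argument, split_args is a hypothetical helper, and neither nesting nor the "exactly one token tree after =" rule is validated.

```rust
/// Local stand-in for the delimiter argument discussed in the proposal.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Delimiter {
    Parenthesis,
    Bracket,
    Brace,
    None,
}

/// Takes the text after the attribute path and returns the delimiter kind
/// together with the token stream that would be passed to the macro.
fn split_args(args: &str) -> Result<(Delimiter, &str), String> {
    let args = args.trim();
    if args.is_empty() {
        // #[attr]
        return Ok((Delimiter::None, ""));
    }
    if let Some(rest) = args.strip_prefix('=') {
        // #[attr = tt] (assumption: `rest` is a single token tree, unchecked)
        return Ok((Delimiter::None, rest.trim()));
    }
    let pairs = [
        ('(', ')', Delimiter::Parenthesis),
        ('[', ']', Delimiter::Bracket),
        ('{', '}', Delimiter::Brace),
    ];
    for &(open, close, delim) in pairs.iter() {
        if args.starts_with(open) && args.ends_with(close) {
            // Both delimiters are one byte long, so byte slicing is safe.
            return Ok((delim, args[1..args.len() - 1].trim()));
        }
    }
    Err(format!(
        "expected `(..)`, `[..]`, `{{..}}`, `= tt` or nothing, found `{}`",
        args
    ))
}

fn main() {
    for args in ["(a, b, c)", "[x]", "{ y }", "", "()", "= \"lit\"", "e f + c"].iter() {
        println!("{:?} -> {:?}", args, split_args(args));
    }
}
```

Note that in this model #[attr] and #[attr()] both yield an empty stream and differ only in the reported delimiter, while #[attr] and #[attr = tt] share Delimiter::None and differ only in whether the stream is empty.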
Activity
petrochenkov commented on Apr 18, 2018
cc @alexcrichton @dtolnay @withoutboats @eddyb
cc #49629
petrochenkov commented on Apr 18, 2018
Alternatives:
- proc_macro and proc_macro_attribute to include Delimiter before stabilization.
- proc_macro before stabilization.
abonander commented on Apr 18, 2018
I don't know if this deserves its own issue, but I think there should be a way for a proc_macro_attribute to ask what kind of AST node (crate, item, statement, or expression) its input represents.
abonander commented on Apr 18, 2018
Actually #47786 is probably a better place to suggest that one.
alexcrichton commented on Apr 18, 2018
Thanks for bringing these issues up @petrochenkov! Your proposed solutions sound pretty good to me, but I wanted to clarify a point or two as well.
For #[proc_macro] I'd be fine either requiring Delimiter today or adding a second signature down the road. I think I'd slightly prefer to have both options in the long run as most authors probably won't mind too much about what Delimiter is used, so I'd probably err on the side of leaving it as-is and possibly adding support for a new signature later on.
For #[proc_macro_attribute] I think it's a great idea to limit the syntax you can possibly work with today. The whitelisted syntaxes you proposed above sound good to me, and do you also think we should limit paths to just one element? (aka disallow #[foo::bar]).
I wanted to clarify, though, are you thinking the delimiter is dropped from the token stream going into #[proc_macro_attribute] as well? If we do that I think we would be required to stabilize and only support a signature that takes a Delimiter (to differentiate #[foo] and #[foo()]). I agree though that in these worlds removing the #[foo = bar] custom attribute is probably the best, and I don't think it'd be too hard to come up with alternate syntaxes for users today doing things like #[foo(baz = bar)].
abonander commented on Apr 18, 2018
@alexcrichton Absolute paths in attributes allow them to work at the crate root where they otherwise won't resolve due to scoping rules (#41430, attributes resolve in the parent module but the crate root has no parent). So unless we want to change the inner attribute form to resolve in the current module instead of the parent, absolute paths are the only way to call attributes at the crate root.
alexcrichton commented on Apr 18, 2018
@abonander ah true yeah, but the first pass of stabilization of Macros 1.2 won't stabilize attributes on modules (or crates), only bare items like functions, structs, impls, traits, etc.
abonander commented on Apr 18, 2018
@alexcrichton we're not currently feature gating attribute invocations on modules or at the crate root so that needs to be its own issue. It would be a bit more complex as we'd have to wait until the attribute resolves to a #[proc_macro_attribute] before emitting a feature gate error.
alexcrichton commented on Apr 18, 2018
Oh sure yeah when I say only allow one element that's just for now, we'd still, I'd imagine, allow absolute paths and more-than-one-element paths behind a feature gate.
abonander commented on Apr 18, 2018
Absolute paths in attributes are already feature gated, actually. Would #[feature(proc_macro)] just imply that feature gate like it does now with use_extern_macros?
alexcrichton commented on Apr 18, 2018
Perhaps yeah, I might be more of a fan of finer-grained feature gates after the next round of stabilization, but either way is fine.
petrochenkov commented on Apr 19, 2018
@alexcrichton
Yeah, I'm not sure which is better either, and I tend to leave things as is for now and introduce a separate signature later.
Yes (#35896 (comment)), but that falls more under the "macro modularisation" issue, so I didn't mention it again.
(If by limiting you mean not stabilizing multi-segment paths rather than "unimplementing" them).
Yes.
Differentiating between #[foo] and #[foo()] is equivalent to differentiating between foo!() and foo![], so I think we can certainly live without it and it's not required to introduce the signature with Delimiter immediately.
But if this differentiation is deemed sufficiently important, then we should implement/stabilize the Delimiter signature sooner rather than later for both proc_macro and proc_macro_attribute.
petrochenkov commented on Apr 20, 2018
One more alternative is to keep the delimiter in CURRENT_SESS and extract it from there on demand like we do, for example, with Span::call_site.
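A very rough sketch of that last alternative, assuming the delimiter is stashed in per-thread session state for the duration of one expansion and read back on demand. Everything here is hypothetical and only models the idea: the thread-local CURRENT_DELIMITER, with_invocation, current_delimiter and vec_like are not part of the proc_macro API, and the local Delimiter enum is a stand-in.

```rust
use std::cell::Cell;

/// Local stand-in for the delimiter kinds of an invocation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Delimiter {
    Parenthesis,
    Bracket,
    Brace,
    None,
}

thread_local! {
    // Hypothetical session state holding the delimiter of the current invocation.
    static CURRENT_DELIMITER: Cell<Delimiter> = Cell::new(Delimiter::None);
}

/// Runs one macro expansion with `delim` recorded in the session state.
/// The previous value is restored afterwards (not panic-safe in this sketch).
fn with_invocation<R>(delim: Delimiter, expand: impl FnOnce() -> R) -> R {
    let previous = CURRENT_DELIMITER.with(|d| d.replace(delim));
    let result = expand();
    CURRENT_DELIMITER.with(|d| d.set(previous));
    result
}

/// What a macro author would call on demand, in the spirit of Span::call_site.
fn current_delimiter() -> Delimiter {
    CURRENT_DELIMITER.with(|d| d.get())
}

/// A vec-like macro body that keeps the one-argument signature but still
/// insists on square brackets.
fn vec_like(input: &str) -> Result<String, String> {
    match current_delimiter() {
        Delimiter::Bracket => Ok(format!("vec![{}]", input)),
        other => Err(format!("expected square brackets, found {:?}", other)),
    }
}

fn main() {
    println!("{:?}", with_invocation(Delimiter::Bracket, || vec_like("1, 2, 3")));
    println!("{:?}", with_invocation(Delimiter::Parenthesis, || vec_like("1, 2, 3")));
}
```

The appeal of this shape is that the existing fn(TokenStream) -> TokenStream signature stays untouched and only macros that care about delimiters pay for the lookup.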