Implement & use metadata on arguments in LLVM #50156

Closed
@hanna-kruppe

Description

@hanna-kruppe
Contributor

We use metadata attached to LLVM instructions to tell LLVM passes about the range of possible integers that a value can take on. This allows additional optimizations. For example, when loading from a &NonZeroU32, the LLVM IR looks like this:

  %2 = load i32, i32* %1, align 4, !range !33

where !33 refers to a module-level entry like this (usually located near the end of the module source code):

!33 = !{i32 1, i32 0}

This tells LLVM that it's UB if the load returns anything that is not in the range ">= 1, < 0" (with wrap-around, i.e. it includes all values except 0). This allows x.get() != 0 (where x: &NonZeroU32) to be optimized to true, for example.
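The wrap-around reading of the range can be made concrete with a small membership test. This is only a sketch of the semantics described above; the function name `in_wrapping_range` is made up for illustration and is not part of LLVM or rustc. Treating `lo == hi` as the full set is an assumption of this sketch.

```rust
/// Membership test for a half-open interval `[lo, hi)` that is allowed to
/// wrap around, following the reading of `!range` described above.
/// Hypothetical helper for illustration only.
pub fn in_wrapping_range(lo: u32, hi: u32, v: u32) -> bool {
    if lo < hi {
        // Ordinary interval: lo <= v < hi.
        lo <= v && v < hi
    } else {
        // Wrapped interval: everything from `lo` upward, plus everything
        // below `hi`. With lo == hi this accepts every value, which is an
        // assumption of this sketch.
        v >= lo || v < hi
    }
}
```

With `lo = 1` and `hi = 0`, as in `!{i32 1, i32 0}`, the test accepts every `u32` except 0.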

Unfortunately LLVM does not have a way to attach this information to function arguments, which leads to missed optimizations (e.g. #49572, #49420 (comment)). Adding this capability has been discussed before on llvm-dev and received positively, but was never actually implemented. If we want to use this in Rust, we're likely going to have to implement (& upstream) it ourselves.
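To illustrate the difference, here is a minimal pair of Rust functions; the function names are made up for this example. Both always return `true` at the language level. In the first, the value comes from a load that can carry `!range`; in the second, it arrives directly as an argument, so there is no load to annotate. Whether the second comparison is actually folded depends on what range information the backend has for the argument.

```rust
use std::num::NonZeroU32;

/// The value is loaded through a reference, so the load instruction can
/// carry `!range` metadata.
pub fn nonzero_by_ref(x: &NonZeroU32) -> bool {
    x.get() != 0
}

/// The value arrives as a plain integer argument; there is no load
/// instruction for `!range` metadata to be attached to.
pub fn nonzero_by_value(x: NonZeroU32) -> bool {
    x.get() != 0
}
```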

This issue tracks:

  • Implementing metadata-on-arguments in LLVM (mentoring instructions)
  • Upstreaming / backporting that LLVM patch
  • Emitting range metadata on arguments from rustc

cc @nox @eddyb

Activity

nox

nox commented on Apr 22, 2018

@nox
Contributor

Nice! Thanks for filing this and being willing to mentor its implementation.

hanna-kruppe

hanna-kruppe commented on Apr 22, 2018

@hanna-kruppe
ContributorAuthor

cc #50157 which is much easier but strongly related and complementary.

hanna-kruppe

hanna-kruppe commented on Apr 22, 2018

@hanna-kruppe
ContributorAuthor

Mentoring instructions for the LLVM changes:

Currently metadata can be attached to instructions, and attributes can be attached to both function arguments and parameters, as well as the return value. For metadata-on-arguments we'll draw inspiration from the APIs for both. The implementation can be simpler than both of those, though you may still want to look through it. For that, and in general for browsing LLVM code, I recommend https://code.woboq.org/llvm/llvm/ which offers go-to-definition, go-to-declaration, find-all-uses, etc.

We don't need metadata on return values (you can already attach metadata to the call instruction, and this indeed already works for range metadata today). I also don't see a need for metadata on the values passed to a call (e.g. there also isn't range metadata on stores). So we restrict ourselves to the formal parameters of the function definition -- confusingly called Argument in the LLVM code base. I think we should add an optional collection of metadata to the Function object (one per parameter), and both Function and Argument would gain APIs for adding, querying, and removing metadata from individual parameters.

The APIs would be similar to what Instruction offers for metadata, though pared down -- we don't need anything related to debug metadata, for example. However, like with attributes, there should probably be APIs on both Function (taking an ArgNo integer to identify the argument) and on Argument (delegating to Function and passing in its own ArgNo). Take a look at how it's done for attributes.

Metadata on instructions is implemented in the superclass Instruction and it's fairly complicated to avoid wasting storage for the majority of instructions that don't have metadata. The actual metadata is stored out-of-line in the LLVMContext and Instruction only stores one bit indicating whether there's any metadata. We don't need that, we can just have an optional pointer to an array with one MDAttachmentMap per parameter, mirroring the Argument *Arguments; member. See the Instruction methods for examples of how to use the MDAttachmentMap.
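The shape of that design can be sketched as a toy model. This is Rust, not LLVM's C++, and every name in it (`AttachmentMap`, `set_param_metadata`, `MD_RANGE`, and so on) is a hypothetical stand-in rather than an existing LLVM API. It only shows the lazily allocated per-parameter array and the Function/Argument delegation; the toy Argument is read-only for simplicity.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a metadata kind ID.
pub type KindId = u32;
/// Hypothetical stand-in for a metadata node: just a (lo, hi) pair here.
pub type MdNode = (u32, u32);
/// Toy analogue of MDAttachmentMap: kind -> node.
pub type AttachmentMap = HashMap<KindId, MdNode>;

/// Arbitrary ID chosen for this sketch.
pub const MD_RANGE: KindId = 4;

pub struct Function {
    num_params: usize,
    /// `None` until some parameter gets metadata, so functions without
    /// parameter metadata only pay for one optional pointer.
    param_md: Option<Box<[AttachmentMap]>>,
}

impl Function {
    pub fn new(num_params: usize) -> Self {
        Function { num_params, param_md: None }
    }

    /// Attach metadata to parameter `arg_no`, allocating the array on first use.
    pub fn set_param_metadata(&mut self, arg_no: usize, kind: KindId, node: MdNode) {
        assert!(arg_no < self.num_params, "parameter index out of range");
        let n = self.num_params;
        let maps = self
            .param_md
            .get_or_insert_with(|| vec![AttachmentMap::new(); n].into_boxed_slice());
        maps[arg_no].insert(kind, node);
    }

    pub fn get_param_metadata(&self, arg_no: usize, kind: KindId) -> Option<MdNode> {
        self.param_md.as_ref()?.get(arg_no)?.get(&kind).copied()
    }

    pub fn remove_param_metadata(&mut self, arg_no: usize, kind: KindId) -> Option<MdNode> {
        self.param_md.as_mut()?.get_mut(arg_no)?.remove(&kind)
    }

    /// True once the per-parameter array has been allocated.
    pub fn has_param_metadata_storage(&self) -> bool {
        self.param_md.is_some()
    }

    /// Toy accessor producing an `Argument` view of parameter `arg_no`.
    pub fn arg(&self, arg_no: usize) -> Argument<'_> {
        assert!(arg_no < self.num_params, "parameter index out of range");
        Argument { parent: self, arg_no }
    }
}

/// Toy `Argument`: borrows its parent and remembers its own index.
pub struct Argument<'f> {
    parent: &'f Function,
    arg_no: usize,
}

impl<'f> Argument<'f> {
    /// Delegates to the parent `Function`, passing in this argument's index.
    pub fn get_metadata(&self, kind: KindId) -> Option<MdNode> {
        self.parent.get_param_metadata(self.arg_no, kind)
    }
}
```

For example, after `f.set_param_metadata(1, MD_RANGE, (1, 0))`, the call `f.arg(1).get_metadata(MD_RANGE)` yields the same pair, while a function that never receives parameter metadata keeps `param_md` as `None`.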

Once that is done, passes need to be updated to query the new metadata when analyzing Arguments. I haven't looked at this part in detail yet, but going through the existing uses of LLVMContext::MD_range will almost certainly suffice. But anyone who wants to dive into this should ping me so I can help or take over.
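As a rough idea of what "querying the new metadata" could mean on the pass side, here is a toy fold of an `arg != c` comparison driven only by an optional wrap-around range. It is a sketch in Rust, not LLVM code; `WrapRange` and `fold_ne_const` are hypothetical names, and real passes would go through LLVM's own range and known-bits machinery instead.

```rust
/// Hypothetical wrap-around range `[lo, hi)`, as carried by range metadata.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct WrapRange {
    pub lo: u32,
    pub hi: u32,
}

impl WrapRange {
    pub fn contains(self, v: u32) -> bool {
        if self.lo < self.hi {
            self.lo <= v && v < self.hi
        } else {
            // Wrapped interval: values from `lo` upward or below `hi`.
            v >= self.lo || v < self.hi
        }
    }
}

/// Try to decide `arg != c` from the argument's range metadata alone.
/// Returns `Some(true)` when `c` lies outside the range, and `None` when
/// the metadata is absent or does not settle the question.
pub fn fold_ne_const(range: Option<WrapRange>, c: u32) -> Option<bool> {
    match range {
        Some(r) if !r.contains(c) => Some(true),
        _ => None,
    }
}
```

With the range `{1, 0}` from the issue description, comparing against 0 folds to `true`, while comparing against any other constant stays undecided.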

added labels on Apr 24, 2018:
  • A-LLVM (Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.)
  • C-enhancement (Category: An issue proposing an enhancement or a PR with one.)
  • WG-llvm (Working group: LLVM backend code generation)
nox

nox commented on May 6, 2018

@nox
Contributor

@rkruppe Should the Function class have a new MDAttachmentMap * field and handle memory allocation of the array itself, or should that go in a separate type?

hanna-kruppe

hanna-kruppe commented on May 6, 2018

@hanna-kruppe
ContributorAuthor

I think in analogy with Function::Arguments it's fine to open-code that in Function.

eddyb

eddyb commented on May 6, 2018

@eddyb
Member

Would this change mean that attributes like nonnull would get replaced with metadata?
Actually, don't we have to use !nonnull for pointers and !range for integers?
So nonnull would become !nonnull (how? this should be backwards-compatible).

hanna-kruppe

hanna-kruppe commented on May 6, 2018

@hanna-kruppe
ContributorAuthor

Yes, I believe this would obsolete some attributes on formal parameters. For argument values at specific call sites you'd still use attributes, but I don't think callsite annotations are useful for nonnull specifically. So yeah, nonnull probably becomes obsolete with this change.

hanna-kruppe

hanna-kruppe commented on May 6, 2018

@hanna-kruppe
ContributorAuthor

Reviewing the list of attributes in the Language Reference, metadata-on-parameters could also obsolete dereferenceable(<n>) and dereferenceable_or_null(<n>) if corresponding metadata is added.

Other optimization hints that could conceivably become metadata are noalias, nocapture and returned, but these are very function-parameter-specific so turning them into metadata that can be applied to other values may not make sense. There's !noalias metadata but it's not obvious to me whether the noalias attribute can be exactly expressed with that metadata.

added labels on Mar 18, 2020:
  • T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
