target-cpu=native should also check all the target-features #54688

@GabrielMajeri

Description

@GabrielMajeri
Contributor

Because LLVM doesn't know what target-cpu=native is, we currently convert this to an actual CPU family name, relying on the llvm::sys::getHostCPUName function. So if the current CPU's family is Haswell, we convert target-cpu=native to target-cpu=haswell.

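As @EFanZh noticed here and here, some Intel Pentiums belong to the Haswell microarchitecture but lack AVX, so code generated for target-cpu=haswell can contain AVX instructions that these CPUs cannot execute.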


Here are the flags Clang generates for -march=native:

$ clang -E - -march=native -###
-target-cpu skylake -target-feature +sse2 -target-feature +cx16 -target-feature +sahf
...
-target-feature -avx512ifma -target-feature -avx512dq

Not only does it set the right target-cpu for optimization purposes, it also explicitly enables every feature the host supports and disables every feature it lacks. This ensures that, whatever the microarchitecture, only available features are used.


We should probably use the llvm::sys::getHostCPUFeatures to manually get a list of supported target features. This is what Clang does.
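
A minimal sketch of the idea, in Rust: given the host's feature map, emit an explicit + or - entry for every feature, as the Clang output above does. The `feature_string` helper and the sample map are hypothetical; the map only stands in for whatever llvm::sys::getHostCPUFeatures reports and is not a real rustc or LLVM interface.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: turn a host feature map (name -> available?)
/// into a `+feat,-feat` list suitable for `-C target-feature=...`.
fn feature_string(features: &BTreeMap<&str, bool>) -> String {
    features
        .iter()
        .map(|(name, enabled)| format!("{}{}", if *enabled { '+' } else { '-' }, name))
        .collect::<Vec<_>>()
        .join(",")
}

fn main() {
    // Made-up sample: a Haswell-family CPU that lacks AVX.
    let mut host = BTreeMap::new();
    host.insert("sse2", true);
    host.insert("cx16", true);
    host.insert("avx", false);

    // BTreeMap iterates in key order, so the output is deterministic:
    // -C target-cpu=haswell -C target-feature=-avx,+cx16,+sse2
    println!(
        "-C target-cpu=haswell -C target-feature={}",
        feature_string(&host)
    );
}
```

Passing the explicit list alongside the CPU name keeps the scheduling model of the detected family while overriding its default feature set.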

I think this issue will supersede both #38218 and #48464. If we can fix this, we will never generate code with unavailable target features.

Reproducing these errors is hard unless you have a machine with a Haswell Intel Pentium.
It might be possible to reproduce this in QEMU with -cpu Haswell,-avx.

Activity

added the A-LLVM label (Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.) on Dec 9, 2018
gnzlbg commented on Apr 23, 2019

@gnzlbg
Contributor

> We should probably use the llvm::sys::getHostCPUFeatures to manually get a list of supported target features. This is what Clang does.

Alternatively, we could also use the is_{arch}_feature_detected! macros to do this. I'll work on this.
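
As a rough illustration of the macro-based route on x86, the sketch below probes a handful of features with `is_x86_feature_detected!` and prints them as an explicit feature list. The `detected_features` helper and the choice of four features are assumptions made for the example; a real implementation would need to cover every feature and every architecture.

```rust
/// Hypothetical helper: probe a few x86 features on the machine running this code.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn detected_features() -> Vec<(&'static str, bool)> {
    vec![
        ("sse2", is_x86_feature_detected!("sse2")),
        ("avx", is_x86_feature_detected!("avx")),
        ("avx2", is_x86_feature_detected!("avx2")),
        ("aes", is_x86_feature_detected!("aes")),
    ]
}

/// On other architectures this sketch reports nothing.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn detected_features() -> Vec<(&'static str, bool)> {
    Vec::new()
}

fn main() {
    // Render each probe result as +name or -name.
    let flags: Vec<String> = detected_features()
        .iter()
        .map(|(name, on)| format!("{}{}", if *on { '+' } else { '-' }, name))
        .collect();
    println!("-C target-feature={}", flags.join(","));
}
```

The output depends on the machine that runs it, which is the point: the list reflects what the host CPU reports rather than what its family name implies.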

added a commit that references this issue on Aug 21, 2019
gendx commented on Jan 3, 2020

@gendx
Contributor

To add another example, target-cpu=native doesn't detect the aes feature on broadwell (see #67836).
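
One way to spot this kind of mismatch from inside a program is to compare what the compiler enabled at build time with what the CPU reports at run time. The sketch below does that for aes on x86; the `aes_status` helper is hypothetical and only meant as a diagnostic aid.

```rust
/// Hypothetical helper: returns (enabled at compile time, detected on this CPU)
/// for the x86 `aes` feature. Both are false on non-x86 targets.
fn aes_status() -> (bool, bool) {
    let compiled_in = cfg!(all(
        any(target_arch = "x86", target_arch = "x86_64"),
        target_feature = "aes"
    ));

    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    let on_host = is_x86_feature_detected!("aes");
    #[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
    let on_host = false;

    (compiled_in, on_host)
}

fn main() {
    let (compiled_in, on_host) = aes_status();
    println!("aes enabled at compile time: {compiled_in}");
    println!("aes detected on this CPU:    {on_host}");
    if on_host && !compiled_in {
        println!("the CPU supports aes, but this build does not use it");
    }
}
```

Building this with -C target-cpu=native on an affected machine would be expected to print the final line, assuming the behavior described in this issue.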

added the C-enhancement (Category: An issue proposing an enhancement or a PR with one.) and T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.) labels on Jan 3, 2020
as-com commented on Jan 6, 2021

@as-com
Contributor

I submitted a PR for a fix: #80749

Participants: @nikic, @gnzlbg, @jonas-schievink, @GabrielMajeri, @as-com

Issue #54688 · rust-lang/rust