Steps for implementing an intrinsic:
- Select an intrinsic below
- Review `coresimd/arm/neon.rs` and `coresimd/aarch64/neon.rs`
- Consult ARM official documentation about your intrinsic
- Consult godbolt for how the intrinsic should be codegen'd, using clang as an example. Use the links below and replace the name of the intrinsic in the code with your intrinsic. Note that if ARM is an error then your intrinsic may be AArch64-only
- If the codegen is the same on ARM/AArch64, place the intrinsic in `coresimd/arm/neon.rs`. If it's different, place it in both with appropriate `#[cfg]` in `coresimd/arm/neon.rs`. If it's only AArch64, place it in `coresimd/aarch64/neon.rs`
- Write a test for your intrinsic at the bottom of the file as well
- Test! Probably use `rustup run nightly sh ci/run-docker.sh aarch64-unknown-linux-gnu`.
- When ready, send a PR!
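As a rough illustration of the "write a test" step, the sketch below checks an intrinsic against a scalar reference. It is not the code used in this repository (the in-tree tests use the crate's own test macros); `vadd_s8_reference` and `vadd_s8_hardware` are hypothetical helper names, while `vadd_s8`, `vld1_s8` and `vst1_s8` are the intrinsics exposed in `core::arch::aarch64`. The hardware path is only compiled on AArch64.

```rust
/// Scalar model of `vadd_s8`: lane-wise wrapping addition of two
/// 8-lane `i8` vectors. (Hypothetical helper, not part of the repository.)
pub fn vadd_s8_reference(a: [i8; 8], b: [i8; 8]) -> [i8; 8] {
    let mut out = [0i8; 8];
    for i in 0..8 {
        out[i] = a[i].wrapping_add(b[i]);
    }
    out
}

/// Runs the real intrinsic on AArch64. Returns `None` if NEON is not
/// detected at runtime. (Hypothetical helper.)
#[cfg(target_arch = "aarch64")]
pub fn vadd_s8_hardware(a: [i8; 8], b: [i8; 8]) -> Option<[i8; 8]> {
    use core::arch::aarch64::{vadd_s8, vld1_s8, vst1_s8};

    if !std::arch::is_aarch64_feature_detected!("neon") {
        return None;
    }
    let mut out = [0i8; 8];
    // SAFETY: NEON was detected above, and both arrays hold exactly
    // the 8 lanes that `vld1_s8` reads and `vst1_s8` writes.
    unsafe {
        let sum = vadd_s8(vld1_s8(a.as_ptr()), vld1_s8(b.as_ptr()));
        vst1_s8(out.as_mut_ptr(), sum);
    }
    Some(out)
}
```

On an AArch64 machine, a test could assert that `vadd_s8_hardware(a, b)` equals `Some(vadd_s8_reference(a, b))` for a few inputs, including ones that overflow a lane.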
gnzlbg commented on Oct 24, 2017
`stdsimd` too #460
oconnor663 commented on Nov 15, 2018
Is there a blocker for these, or is it just finding time to do it? I'd like to help, but I'd need a more experienced compiler/SIMD person to point me in the right direction.
gnzlbg commented on Nov 15, 2018
I can mentor. Start by taking a look at some of the intrinsics in the `coresimd/aarch64/neon.rs` module :)
oconnor663 commented on Nov 16, 2018
Is there some upstream source that these all get copied from, or are they actually written by hand?
gnzlbg commented on Nov 16, 2018
I am not sure I understand the question. The `neon` modules in this repository are written by hand, although @Amanieu has expressed interest in generating some parts of them automatically.
oconnor663 commented on Nov 16, 2018
gnzlbg commented on Nov 16, 2018
Ah, I see, that would be the ARM NEON spec: https://developer.arm.com/technologies/neon/intrinsics
alexcrichton commented on Dec 20, 2018
Now might be a great time to help make some more progress on this! We've got tons of intrinsics already implemented (thanks @gnzlbg!), and I've just implemented automatic verification of all added intrinsics, so we know if they're added they've got the correct signature at least!
I've updated the OP of this issue with more detailed instructions about how to bind NEON intrinsics. Hopefully it's not too bad any more!
We'll probably want to reorganize modules so they're a bit smaller and more manageable over time, but for now if anyone's interested in adding more intrinsics and needs some help, let me know!
25 remaining items
SparrowLii commented on Oct 21, 2021
@CryZe They can be found in the master branch now:
https://github.com/rust-lang/stdarch/blob/master/crates/core_arch/src/aarch64/neon/generated.rs#L8519-L8539
https://github.com/rust-lang/stdarch/blob/master/crates/core_arch/src/aarch64/neon/generated.rs#L8545-L8565
Sorry, I marked them before #1230 was merged. This was to prevent others from submitting duplicate PRs.
CryZe commented on Oct 21, 2021
Welp, I'll mark them again then. Somehow the GitHub Pull Request UI doesn't show them as diffs at all: https://i.imgur.com/BsHR5in.gif
SparrowLii commented on Oct 21, 2021
GitHub's comparison tool will always have problems when changing a large amount of code XD
SparrowLii commented on Oct 21, 2021
As in #1230, all instructions have been implemented except the following and those that use 16-bit floating point:
- The following instructions are only available on aarch64 for now, because the corresponding `target_feature` cannot be found in the available features of arm: `vcadd_rot`, `vcmla`, `vdot`
- The feature `i8mm` is not valid: `vmmla`, `vusmmla`: https://rust.godbolt.org/z/8GbKW5ef4
- LLVM ERROR (can be reproduced in godbolt): `vsm4e`: https://rust.godbolt.org/z/xhT1xvGTP
- LLVM ERROR (normal in godbolt, but `LLVM ERROR: Cannot select: intrinsic` is raised at runtime): `vsudot`, `vusdot`: https://rust.godbolt.org/z/aMnEvab3n and `vqshlu`: https://rust.godbolt.org/z/hvGhrhdMT
- Not implemented in LLVM and cannot be implemented manually: `vmull_p64` (for arm), `vsm3`, `vrax1q_u64`, `vxarq_u64`, `vrnd32`, `vrnd64`, `vsha512`
Amanieu commented on Oct 21, 2021
- On LLVM's ARM backend, `vcadd_rot` and `vcmla` are under the `v8.3a` feature. `vdot` is under the `dotprod` feature. I got this information from `llvm-project/llvm/lib/Target/ARM/ARMInstrNEON.td`.
- Already discussed in rust-lang/rust#90079.
- Use `llvm.aarch64.crypto.sm4ekey` instead of `llvm.aarch64.sve.sm4ekey`.
- You need to make your test function `pub` in godbolt, otherwise it will be optimized away as unreachable by rustc before LLVM. `vsudot`/`vusdot` require the `i8mm` target feature. `vqshlu` seems to work fine in godbolt after adding the `pub`.
- These all seem to exist in LLVM at least for AArch64. For ARM we can just leave these out for now.
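To illustrate the point about `pub`, here is a minimal sketch of a godbolt-style probe, assuming an AArch64 target with optimizations enabled. `probe_vqshlu` and `vqshlu_n_s8_reference` are hypothetical names chosen for this example; `vqshlu_n_s8` is the intrinsic from `core::arch::aarch64`. Because the probe is `pub`, rustc has to keep it, so its assembly shows up in the output. A private, uncalled function would be removed before LLVM ever sees it. The scalar model alongside it describes the expected lane results and compiles on any target.

```rust
/// Godbolt-style probe (hypothetical name). It must be `pub`: a private,
/// unused function is dropped by rustc and produces no assembly.
#[cfg(target_arch = "aarch64")]
#[target_feature(enable = "neon")]
pub unsafe fn probe_vqshlu(
    a: core::arch::aarch64::int8x8_t,
) -> core::arch::aarch64::uint8x8_t {
    // Shift each signed lane left by 2 and saturate to the unsigned range.
    core::arch::aarch64::vqshlu_n_s8::<2>(a)
}

/// Scalar model of the same operation (hypothetical helper): negative
/// lanes saturate to 0, lanes that overflow saturate to `u8::MAX`.
pub fn vqshlu_n_s8_reference(a: [i8; 8], n: u32) -> [u8; 8] {
    assert!(n < 8, "shift amount must be in 0..=7 for 8-bit lanes");
    let mut out = [0u8; 8];
    for i in 0..8 {
        out[i] = if a[i] < 0 {
            0
        } else {
            // Widen first so the shifted value cannot wrap.
            let wide = (a[i] as u32) << n;
            wide.min(u8::MAX as u32) as u8
        };
    }
    out
}
```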
SparrowLii commented on Oct 25, 2021
Hope someone can help implement the remaining instructions.
SparrowLii commented on Nov 9, 2021
@Amanieu The `v8.5a` feature cannot be detected at runtime, so we can't use `#[simd_test(enable = "neon,v8.5a")]`. So how do we add tests for instructions that use `v8.5a`, like `vrnd32x` and `vrnd64x`?
hkratz commented on Nov 9, 2021
@SparrowLii Shouldn't that work with the `frintts` feature?
SparrowLii commented on Nov 9, 2021
Looks useful: https://rust.godbolt.org/z/894W8cndG
Amanieu commented on Nov 9, 2021
LLVM only supports `frintts` on AArch64, so it's fine to not support this intrinsic on ARM.
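For readers unfamiliar with runtime-gated tests, the sketch below shows the general idea under some assumptions: a test body only runs when the CPU reports the feature, and is skipped otherwise. This is not the implementation of `simd_test`. `frintts_available` and `run_if_supported` are hypothetical helpers, and the only real API used is `std::arch::is_aarch64_feature_detected!` with the `"frintts"` feature name.

```rust
/// Reports whether the `frintts` feature is available at runtime.
/// (Hypothetical helper; always `false` on non-AArch64 targets.)
#[cfg(target_arch = "aarch64")]
pub fn frintts_available() -> bool {
    std::arch::is_aarch64_feature_detected!("frintts")
}

#[cfg(not(target_arch = "aarch64"))]
pub fn frintts_available() -> bool {
    false
}

/// Runs `test` only when `supported` is true. Returns whether it ran,
/// so callers can tell a skip apart from a pass. (Hypothetical helper.)
pub fn run_if_supported<F: FnOnce()>(supported: bool, test: F) -> bool {
    if supported {
        test();
        true
    } else {
        false
    }
}
```

A test for a `frintts`-dependent intrinsic could then be wrapped as `run_if_supported(frintts_available(), || { /* assertions */ })`, which skips silently on hardware without the feature.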