Closed
Description
Bug report
Bug description:
blake2module.c
may include _hacl/Hacl_Hash_Blake2b_Simd256.h
and/or _hacl/Hacl_Hash_Blake2s_Simd128.h
, and thus needs to compile with LIBHACL_SIMD128_FLAGS
and/or LIBHACL_SIMD256_FLAGS
, or you get an error like this if the compiler doesn't enable those SIMD features by default:
In file included from ./Modules/blake2module.c:139:
In file included from ./Modules/_hacl/Hacl_Hash_Blake2b_Simd256.h:42:
In file included from ./Modules/_hacl/libintvector.h:28:
/Library/Developer/CommandLineTools/usr/bin/../lib/clang/7.0.2/include/smmintrin.h:28:2: error: "SSE4.1 instruction set not enabled"
I would create a PR but unfortunately I can't figure out where this needs to be added.
CPython versions tested on:
3.14
Operating systems tested on:
macOS
Linked PRs
- gh-130213: Check availability of Intel SIMD types #130332
- GH-130213: clang-cl: blake2module.c needs to "see" the AVX intrinsics during compiling #130447
- gh-130213: POC let blake2module.c see far less internals #130483
- gh-130213: update hacl_star_rev to 322f6d58290e0ed7f4ecb84fcce12917aa0f594b #130960
Metadata
Metadata
Assignees
Labels
Projects
Milestone
Relationships
Development
No branches or pull requests
Activity
picnixz commentedon Feb 17, 2025
Aren't we setting them in
configure
? In addition we have:So, the headers shouldn't even be included if
HACL_*
are not definedpicnixz commentedon Feb 17, 2025
In addition,
-mavx2
should be sufficient to enable SSE 4.1, at least on gcc. I don't know whether this is the case on clang because I don't know how to check if-mavx2
implies-msse4.1
:picnixz commentedon Feb 17, 2025
Oh, I think I see what happens:
We might be missing some flag here indeed. OTOH, we have
in Setup.stdlib, but maybe we're missing a
-D_BSD_SOURCE -D_DEFAULT_SOURCE
here as SHA and MD5 modules are compiled with this.picnixz commentedon Feb 17, 2025
I won't have time to investigate more so I leave it to @msprotz
msprotz commentedon Feb 17, 2025
@jmroot I'm slightly confused by your report. It looks like merely including smmintrin.h triggers an error if you do not pass -msse4.1, as opposed to actually using definitions from smmintrin.h
@jmroot I dimly recall something like that happening on old clang versions. Can you check with a very recent clang please?
@picnixz blake2module.c should not be compiled with any -m flags, otherwise, the compiler might start introducing, say, SSE4.1 instructions in a section of code that is not guarded behind the run-time check for e.g. has_simd256(), which will generate illegal instruction errors if Python is executed on a machine that does not have these instructions.
The idea is that one should be able to include various *mmintrin.h headers without issues, but must guard their usage behind suitable runtime checks.
jmroot commentedon Feb 17, 2025
Yes, that seems to be the case.
It works fine with current clang of course, because
-msse4.1
is on by default.jmroot commentedon Feb 17, 2025
The trigger for the error in
smmintrin.h
is#ifndef __SSE4_1__
, so guarding the include with#ifdef __SSE4_1__
should be a partial solution.msprotz commentedon Feb 17, 2025
On my system (which definitely does not have avx512):
the header inclusion works just fine. But I try to use the header without the right -m flags:
Then I do get an actual error. So it appears on my system, you're allowed to include headers for intrinsics that are not currently enabled via suitable -m flags, but you do get an error if you try to use such intrinsics.
On your system, it appears that the behavior of the headers is much stricter.
What you suggest would work, but I would like to make sure I 100% understand why it's needed before I add it.
Thanks!
jmroot commentedon Feb 17, 2025
Slightly hard to tell given that with current clang from Xcode 16.2
But it's certainly consistent with the evidence.
I don't know what the policy is. The only requirement I'm aware of is C11. Disabling SIMD for older clang would also be an acceptable though less preferable resolution.
jmroot commentedon Feb 18, 2025
The other part of the problem is that one of the types defined in the intrinsics headers is always used (
__m256i
,__m128i
). Even when the includes are not guarded inlibintvector.h
,immintrin.h
will internally not includeavxintrin.h
if__AVX__
is not defined. Simply adding the typedef results in a link-time error:LLVM ERROR: Do not know how to split this operator's operand!
Disabling SIMD may well be the only easy fix here. What do you think would be the best way to do that? Perhaps a configure check for whether
smmintrin.h
can be included without-msse4.1
?msprotz commentedon Feb 18, 2025
This an omission on my end and on #130157 I mention that in the absence of intrinsics headers these types ought to be defined as void*.
I'm open to guarding the include for compatibility with old versions of clang. @picnixz thoughts?
msprotz commentedon Feb 19, 2025
@jmroot I re-read your message above. Just to recap (correct me if I missed something):
typedef __m128i LibIntVector_vec128
for instanceFixes include:
__m128i *
(and not__m128
), tweak libintvector.h to use a macro instead of a typedef, and locally define __m128 asvoid
so that all usages of the vector behind a pointer translate tovoid *
in the context of blake2module.c -- this seems like a terrible idea that might break in undebuggable ways#include <libintvector.h>
in their public headers, relying on forward struct definitions to hide the usage of vector types from the perspective of client filesOption 3. will take considerable time and effort, and option 2 is a big hack. I lean towards your proposed solution (option 1).
@jmroot do you know which version of clang it is that those headers started behaving "the right way"?
25 remaining items