mirror of
https://github.com/rust-lang/rust.git
synced 2026-06-03 01:16:14 +03:00
3779b8e32e
This improves the codegen for vector `select`, `gather`, `scatter` and boolean reduction intrinsics and fixes rust-lang/portable-simd#316. The current behavior of most mask operations during llvm codegen is to truncate the mask vector to <N x i1>, telling llvm to use the least significat bit. The exception is the `simd_bitmask` intrinsics, which already used the most signifiant bit. Since sse/avx instructions are defined to use the most significant bit, truncating means that llvm has to insert a left shift to move the bit into the most significant position, before the mask can actually be used. Similarly on aarch64, mask operations like blend work bit by bit, repeating the least significant bit across the whole lane involves shifting it into the sign position and then comparing against zero. By shifting before truncating to <N x i1>, we tell llvm that we only consider the most significant bit, removing the need for additional shift instructions in the assembly.
The files here use the LLVM FileCheck framework, documented at https://llvm.org/docs/CommandGuide/FileCheck.html.
One extension worth noting is the use of revisions as custom prefixes for FileCheck. If your codegen test has different behavior based on the chosen target or different compiler flags that you want to exercise, you can use a revisions annotation, like so:
// revisions: aaa bbb
// [bbb] compile-flags: --flags-for-bbb
After specifying those variations, you can write different expected, or
explicitly unexpected output by using <prefix>-SAME: and <prefix>-NOT:,
like so:
// CHECK: expected code
// aaa-SAME: emitted-only-for-aaa
// aaa-NOT: emitted-only-for-bbb
// bbb-NOT: emitted-only-for-aaa
// bbb-SAME: emitted-only-for-bbb