Commit Graph

320 Commits

Author SHA1 Message Date
Stuart Cook 8b036c7b72 Rollup merge of #152486 - fneddy:s390x_simplify_backchain, r=dingxiangfei2009
remove redundant backchain attribute in codegen

llvm will look at both
1. the values of `"target-features"` and
2. the function string attributes.

this patch removes the redundant function string attribute because it is not needed at all. rustc sets the `+backchain` attribute through `target_features_attr(...)`
https://github.com/rust-lang/rust/blob/d34f1f931489618efffc4007e6b6bdb9e10f6467/compiler/rustc_codegen_llvm/src/attributes.rs#L590
https://github.com/rust-lang/rust/blob/d34f1f931489618efffc4007e6b6bdb9e10f6467/compiler/rustc_codegen_llvm/src/attributes.rs#L326-L337
2026-02-13 15:19:13 +11:00
Eddy (Eduard) Stefes c6f57becfc remove redundant backchain attribute in codegen
llvm will look at both
1. the values of "target-features" and
2. the function string attributes.

this removes the redundant function string attribute because it is not needed at all.
rustc sets the `+backchain` attribute through `target_features_attr(...)`
2026-02-12 09:58:25 +01:00
Jacob Pratt b1b6533077 Rollup merge of #142680 - beetrees:sparc64-float-struct-abi, r=tgross35
Fix passing/returning structs with the 64-bit SPARC ABI

Fixes the 64-bit SPARC part of rust-lang/rust#115609 by replacing the current implementation with a new implementation modelled on the RISC-V calling convention code ([SPARC ABI reference](https://sparc.org/wp-content/uploads/2014/01/SCD.2.4.1.pdf.gz)).

Pinging `sparcv9-sun-solaris` target maintainers: @psumbera @kulikjak
Fixes rust-lang/rust#115336
Fixes rust-lang/rust#115399
Fixes rust-lang/rust#122620
Fixes https://github.com/rust-lang/rust/issues/147883
r? @workingjubilee
2026-02-12 00:41:05 -05:00
Folkert de Vries c9b5c934ca Fix passing/returning structs with the 64-bit SPARC ABI
Co-authored-by: beetrees <b@beetr.ee>
2026-02-10 12:39:45 +01:00
bors 1c316d3461 Auto merge of #152361 - JonathanBrouwer:rollup-Qkwz1vN, r=JonathanBrouwer
Rollup of 5 pull requests

Successful merges:

 - rust-lang/rust#151869 (add test for codegen of SIMD vector from array repeat)
 - rust-lang/rust#152077 (bootstrap: always propagate `CARGO_TARGET_{host}_LINKER`)
 - rust-lang/rust#126100 (Reword the caveats on `array::map`)
 - rust-lang/rust#152275 (Stop having two different alignment constants)
 - rust-lang/rust#152325 (Remove more adhoc groups that correspond to teams)
2026-02-08 21:42:19 +00:00
Jonathan Brouwer b566ac2c47 Rollup merge of #151869 - folkertdev:simd-array-repeat, r=Mark-Simulacrum
add test for codegen of SIMD vector from array repeat

fixes https://github.com/rust-lang/rust/issues/97804

It appears that this issue was fixed silently in LLVM 19. The original codegen was terrible, but starting at LLVM 19 `opt` is able to generate good code.

https://llvm.godbolt.org/z/5vq8scP6q

cc @programmerjake
2026-02-08 21:06:28 +01:00
Jonathan Brouwer 16c7ee5c05 Rollup merge of #151640 - ZuseZ4:cleanup-datatransfer, r=nnethercote
Cleanup offload datatransfer

There are 3 steps to run code on a GPU: Copy data from the host to the device, launch the kernel, and move it back.
At the moment, we have a single variable describing the memory handling to do in each step, but that makes it hard for LLVM's opt pass to understand what's going on. We therefore split it into three variables, each only including the bits relevant for the corresponding stage.

cc @jdoerfert @kevinsala

r? compiler
2026-02-08 19:15:26 +01:00
Manuel Drehwald 6de0591c0b Split ol mapper into more specific to/kernel/from mapper and move init_all_rtls into global ctor 2026-02-07 17:34:39 -08:00
Jonathan Brouwer 27d6b3c9b7 Rollup merge of #151576 - tgross35:stabilize-cold-path, r=jhpratt
Stabilize `core::hint::cold_path`

`cold_path` has been around unstably for a while and is a rather useful tool to have. It does what it is supposed to and there are no known remaining issues, so stabilize it here (including const).

Newly stable API:

```rust
// in core::hint
pub const fn cold_path();
```

I have opted to exclude `likely` and `unlikely` for now since they have had some concerns about ease of use that `cold_path` doesn't suffer from. `cold_path` is also significantly more flexible; in addition to working with boolean `if` conditions, it can be used in `match` arms, `if let`, closures, and other control flow blocks. `likely` and `unlikely` are also possible to implement in user code via `cold_path`, if desired.

Closes: https://github.com/rust-lang/rust/issues/136873 (tracking issue)

---

There has been some design and implementation work for making `#[cold]` function in more places, such as `if` arms, `match` arms, and closure bodies. Considering a stable `cold_path` will cover all of these usecases, it does not seem worth pursuing a more powerful `#[cold]` as an alternative way to do the same thing. If the lang team agrees, then:

Closes: https://github.com/rust-lang/rust/issues/26179
Closes: https://github.com/rust-lang/rust/pull/120193
2026-02-07 13:06:35 +01:00
Jonathan Brouwer f163864627 Rollup merge of #152128 - zmodem:matches-logical-or-141497, r=nikic
Adopt matches-logical-or-141497.rs to LLVM HEAD

After http://github.com/llvm/llvm-project/pull/178977, the and + icmp are folded to trunc.
2026-02-05 08:32:57 +01:00
Jonathan Brouwer b66ead827c Rollup merge of #152020 - Sa4dUs:offload-remove-dummy-loads, r=ZuseZ4
Remove dummy loads on offload codegen

The current logic generates two dummy loads to prevent some globals from being optimized away. This blocks memtransfer loop hoisting optimizations, so it's time to remove them.

r? @ZuseZ4
2026-02-05 08:32:45 +01:00
Hans Wennborg 23e5b2499f Adopt matches-logical-or-141497.rs to LLVM HEAD
After http://github.com/llvm/llvm-project/pull/178977, the and + icmp
are folded to trunc.
2026-02-04 19:20:10 +01:00
bors db3e99bbab Auto merge of #150605 - RalfJung:fallback-intrinsic-skip, r=mati865
skip codegen for intrinsics with big fallback bodies if backend does not need them

This hopefully fixes the perf regression from https://github.com/rust-lang/rust/pull/148478. I only added the intrinsics with big fallback bodies to the list; it doesn't seem worth the effort of going through the entire list.

Fixes https://github.com/rust-lang/rust/issues/149945
Cc @scottmcm @bjorn3
2026-02-04 17:12:58 +00:00
Marcelo Domínguez 212c8c3811 Remove dummy loads 2026-02-04 15:26:56 +01:00
Jonathan Brouwer 9fd5712bf5 Rollup merge of #151526 - ZuseZ4:fix-autodiff-codegen-tests, r=oli-obk
Fix autodiff codegen tests

Preparing autodiff for release on nightly. Since we haven't been running these tests in CI, they regressed over the last months. These changes fixes this and hopefully make the tests more robust for the future.

r? compiler
2026-02-04 14:39:19 +01:00
Folkert de Vries ef7a7809c7 add test for simd from array repeat codegen 2026-02-03 22:44:44 +01:00
Jacob Pratt e2c5b89d2a Rollup merge of #151958 - chahar-ritik:add-slp-vectorization-test, r=jieyouxu
Add codegen test for SLP vectorization

close: rust-lang/rust#142519

This PR adds a codegen regression test for rust-lang/rust#142519. A regression in LLVM to fail to auto-vectorize, leading to significant performance loss.

The SLP vectorizer correctly groups the 4-byte operations into <4 x i8> vectors.

The loop state is maintained in SIMD registers (phi <4 x i8>).

The test remains robust across architectures (AArch64 vs x86_64) by allowing flexible store types (i32 or <4 x i8>).
2026-02-02 23:12:05 -05:00
ltdk 28feae0c87 Move bigint helper tracking issues 2026-02-02 18:45:26 -05:00
Ritik Chahar 8476e893e7 Update min-llvm-version: 22
Co-authored-by: Nikita Popov <github@npopov.com>
2026-02-02 16:47:09 +05:30
ritik chahar 6176945223 fix: remove space for tidy and only for x86_64 2026-02-02 16:05:08 +05:30
ritik chahar 0830a5a928 fix: add min-llvm-version 2026-02-02 15:44:50 +05:30
ritik chahar 95ac5673ce Fix SLP vectorization test CHECK patterns 2026-02-02 15:38:26 +05:30
ritik chahar c64f9a0fc4 Add backlink to issue 2026-02-02 07:38:14 +05:30
ritik chahar 1c396d24dd Restrict test to x86_64 per reviewer feedback 2026-02-01 22:14:13 +05:30
ritik chahar 0a60bd653d fix: remove trailing newline for tidy 2026-02-01 22:09:05 +05:30
ritik chahar 2292d53b7b Add codegen test for SLP vectorization 2026-02-01 21:41:43 +05:30
Nikita Popov acb5ee2f84 Disable append-elements.rs test with debug assertions
The IR is a bit different (in particular wrt naming) if
debug-assertions-std is enabled. Peculiarly, the issue goes away
if overflow-check-std is also enabled, which is why CI did not
catch this.
2026-01-30 13:01:22 +01:00
Stuart Cook 3d102a7812 Rollup merge of #150893 - ZuseZ4:move-un-register-lib, r=oli-obk
offload: move (un)register lib into global_ctors

Right now we initialize the openmp/offload runtime before every single offload call, and tear it down directly afterwards.
What we should rather do is initialize it once in the binary startup code, and tear it down at the end of the binary execution. Here I implement these changes.

Together, our generated IR has a lot less usage of globals, which in turn simplifies the refactoring in https://github.com/rust-lang/rust/pull/150683, where I introduce a new variant of our offload intrinsic.

r? oli-obk
2026-01-28 19:03:51 +11:00
Manuel Drehwald 35ce8ab120 adjust testcase for new logic 2026-01-27 10:43:21 -08:00
Stuart Cook 1c892e829c Rollup merge of #147436 - okaneco:eq_ignore_ascii_autovec, r=scottmcm
slice/ascii: Optimize `eq_ignore_ascii_case` with auto-vectorization

- Refactor the current functionality into a helper function
- Use `as_chunks` to encourage auto-vectorization in the optimized chunk processing function
- Add a codegen test checking for vectorization and no panicking
- Add benches for `eq_ignore_ascii_case`

---

The optimized function is initially only enabled for x86_64 which has `sse2` as part of its baseline, but none of the code is platform specific. Other platforms with SIMD instructions may also benefit from this implementation.

Performance improvements only manifest for slices of 16 bytes or longer, so the optimized path is gated behind a length check for greater than or equal to 16.

Benchmarks - Cases below 16 bytes are unaffected, cases above all show sizeable improvements.
```
before:
    str::eq_ignore_ascii_case::bench_large_str_eq         4942.30ns/iter +/- 48.20
    str::eq_ignore_ascii_case::bench_medium_str_eq         632.01ns/iter +/- 16.87
    str::eq_ignore_ascii_case::bench_str_17_bytes_eq        16.28ns/iter  +/- 0.45
    str::eq_ignore_ascii_case::bench_str_31_bytes_eq        35.23ns/iter  +/- 2.28
    str::eq_ignore_ascii_case::bench_str_of_8_bytes_eq       7.56ns/iter  +/- 0.22
    str::eq_ignore_ascii_case::bench_str_under_8_bytes_eq    2.64ns/iter  +/- 0.06
after:
    str::eq_ignore_ascii_case::bench_large_str_eq         611.63ns/iter +/- 28.29
    str::eq_ignore_ascii_case::bench_medium_str_eq         77.10ns/iter +/- 19.76
    str::eq_ignore_ascii_case::bench_str_17_bytes_eq        3.49ns/iter  +/- 0.39
    str::eq_ignore_ascii_case::bench_str_31_bytes_eq        3.50ns/iter  +/- 0.27
    str::eq_ignore_ascii_case::bench_str_of_8_bytes_eq      7.27ns/iter  +/- 0.09
    str::eq_ignore_ascii_case::bench_str_under_8_bytes_eq   2.60ns/iter  +/- 0.05
```
2026-01-27 17:36:35 +11:00
Trevor Gross 2e365985be Stabilize core::hint::cold_path
`cold_path` has been around unstably for a while and is a rather useful
tool to have. It does what it is supposed to and there are no known
remaining issues, so stabilize it here (including const).

Newly stable API:

    // in core::hint
    pub const fn cold_path();

I have opted to exclude `likely` and `unlikely` for now since they have
had some concerns about ease of use that `cold_path` doesn't suffer
from. `cold_path` is also significantly more flexible; in addition to
working with boolean `if` conditions, it can be used in `match` arms,
`if let`, closures, and other control flow blocks. `likely` and
`unlikely` are also possible to implement in user code via `cold_path`,
if desired.
2026-01-26 13:43:06 -06:00
Jonathan Pallant 6ecb3f33f0 Adds two new Tier 3 targets - aarch64v8r-unknown-none and aarch64v8r-unknown-none-softfloat.
The existing `aarch64-unknown-none` target assumes Armv8.0-A as a baseline. However, Arm recently released the Arm Cortex-R82 processor which is the first to implement the Armv8-R AArch64 mode architecture. This architecture is similar to Armv8-A AArch64, however it has a different set of mandatory features, and is based off of Armv8.4. It is largely unrelated to the existing Armv8-R architecture target (`armv8r-none-eabihf`), which only operates in AArch32 mode.

The second `aarch64v8r-unknown-none-softfloat` target allows for possible Armv8-R AArch64 CPUs with no FPU, or for use-cases where FPU register stacking is not desired. As with the existing `aarch64-unknown-none` target we have coupled FPU support and Neon support together - there is no 'has FPU but does not have NEON' target proposed even though the architecture technically allows for it.

This PR was developed by Ferrous Systems on behalf of Arm. Arm is the owner of these changes.
2026-01-26 12:43:52 +00:00
bors 873d4682c7 Auto merge of #151337 - the8472:bail-before-memcpy2, r=Mark-Simulacrum
optimize `vec.extend(slice.to_vec())`, take 2

Redoing https://github.com/rust-lang/rust/pull/130998
It was reverted in https://github.com/rust-lang/rust/pull/151150 due to flakiness. I have traced this to layout randomization perturbing the test (the failure reproduces locally with layout randomization), which is now excluded.
2026-01-25 19:45:35 +00:00
Matthias Krüger 0de96f455d Rollup merge of #151405 - heiher:fix-cli, r=Mark-Simulacrum
LoongArch: Fix call-llvm-intrinsics test
2026-01-25 16:27:23 +01:00
Matthias Krüger f6a8326a99 Rollup merge of #151404 - heiher:fix-dae, r=Mark-Simulacrum
LoongArch: Fix direct-access-external-data test

On LoongArch targets, `-Cdirect-access-external-data` defaults to `no`. Since copy relocations are not supported, `dso_local` is not emitted under `-Crelocation-model=static`, unlike on other targets.
2026-01-25 16:27:22 +01:00
Matthias Krüger 9dffb21112 Rollup merge of #150065 - is57primenumber:add-slice-cse-test, r=Mark-Simulacrum
add CSE optimization tests for iterating over slice

This PR is regression test for issue rust-lang/rust#119573.
This PR introduces a new regression test to verify a critical optimization known as Common Subexpression Elimination (CSE) is correctly applied during various slice iteration patterns.
2026-01-25 07:42:59 +01:00
Matthias Krüger b651be2191 Rollup merge of #145393 - clubby789:issue-138497, r=Mark-Simulacrum
Add codegen test for removing trailing zeroes from `NonZero`

Closes rust-lang/rust#138497
2026-01-25 07:42:56 +01:00
bors 75963ce795 Auto merge of #151065 - nagisa:add-preserve-none-abi, r=petrochenkov
abi: add a rust-preserve-none calling convention

This is the conceptual opposite of the rust-cold calling convention and is particularly useful in combination with the new `explicit_tail_calls` feature.

For relatively tight loops implemented with tail calling (`become`) each of the function with the regular calling convention is still responsible for restoring the initial value of the preserved registers. So it is not unusual to end up with a situation where each step in the tail call loop is spilling and reloading registers, along the lines of:

    foo:
        push r12
        ; do things
        pop r12
        jmp next_step

This adds up quickly, especially when most of the clobberable registers are already used to pass arguments or other uses.

I was thinking of making the name of this ABI a little less LLVM-derived and more like a conceptual inverse of `rust-cold`, but could not come with a great name (`rust-cold` is itself not a great name: cold in what context? from which perspective? is it supposed to mean that the function is rarely called?)
2026-01-25 02:49:32 +00:00
Matthias Krüger 3a69035338 Rollup merge of #151346 - folkertdev:simd-splat, r=workingjubilee
add `simd_splat` intrinsic

Add `simd_splat` which lowers to the LLVM canonical splat sequence.

```llvm
insertelement <N x elem> poison, elem %x, i32 0
shufflevector <N x elem> v0, <N x elem> poison, <N x i32> zeroinitializer
```

Right now we try to fake it using one of

```rust
fn splat(x: u32) -> u32x8 {
    u32x8::from_array([x; 8])
}
```

or (in `stdarch`)

```rust
fn splat(value: $elem_type) -> $name {
    #[derive(Copy, Clone)]
    #[repr(simd)]
    struct JustOne([$elem_type; 1]);
    let one = JustOne([value]);
    // SAFETY: 0 is always in-bounds because we're shuffling
    // a simd type with exactly one element.
    unsafe { simd_shuffle!(one, one, [0; $len]) }
}
```

Both of these can confuse the LLVM optimizer, producing sub-par code. Some examples:

- https://github.com/rust-lang/rust/issues/60637
- https://github.com/rust-lang/rust/issues/137407
- https://github.com/rust-lang/rust/issues/122623
- https://github.com/rust-lang/rust/issues/97804

---

As far as I can tell there is no way to provide a fallback implementation for this intrinsic, because there is no `const` way of evaluating the number of elements (there might be issues beyond that, too). So, I added implementations for all 4 backends.

Both GCC and const-eval appear to have some issues with simd vectors containing pointers. I have a workaround for GCC, but haven't yet been able to make const-eval work. See the comments below.

Currently this just adds the intrinsic, it does not actually use it anywhere yet.
2026-01-24 21:04:15 +01:00
Simonas Kazlauskas 6db94dbc25 abi: add a rust-preserve-none calling convention
This is the conceptual opposite of the rust-cold calling convention and
is particularly useful in combination with the new `explicit_tail_calls`
feature.

For relatively tight loops implemented with tail calling (`become`) each
of the function with the regular calling convention is still responsible
for restoring the initial value of the preserved registers. So it is not
unusual to end up with a situation where each step in the tail call loop
is spilling and reloading registers, along the lines of:

    foo:
        push r12
        ; do things
        pop r12
        jmp next_step

This adds up quickly, especially when most of the clobberable registers
are already used to pass arguments or other uses.

I was thinking of making the name of this ABI a little less LLVM-derived
and more like a conceptual inverse of `rust-cold`, but could not come
with a great name (`rust-cold` is itself not a great name: cold in what
context? from which perspective? is it supposed to mean that the
function is rarely called?)
2026-01-24 19:23:17 +02:00
Folkert de Vries 71f34429ac const-eval: do not call immediate_const_vector on vector of pointers 2026-01-24 10:40:47 +01:00
Manuel Drehwald b6d567c12c Shorten the autodiff batching test, to make it more reliable 2026-01-23 23:18:52 -08:00
Jonathan Brouwer 13f0399a57 Rollup merge of #151259 - bonega:fix-is-ascii-avx512, r=folkertdev
Fix is_ascii performance regression on AVX-512 CPUs when compiling with -C target-cpu=native

## Summary

This PR fixes a severe performance regression in `slice::is_ascii` on AVX-512 CPUs when compiling with `-C target-cpu=native`.

On affected systems, the current implementation achieves only ~3 GB/s for large inputs, compared to ~60–70 GB/s previously (≈20–24× regression). This PR restores the original performance characteristics.

This change is intended as a **temporary workaround** for upstream LLVM poor codegen. Once the underlying LLVM issue is fixed and Rust is able to consume that fix, this workaround should be reverted.

  ## Problem

  When `is_ascii` is compiled with AVX-512 enabled, LLVM's auto-vectorization generates ~31 `kshiftrd` instructions to extract mask bits one-by-one, instead of using the efficient `pmovmskb`
  instruction. This causes a **~22x performance regression**.

  Because `is_ascii` is marked `#[inline]`, it gets inlined and recompiled with the user's target settings, affecting anyone using `-C target-cpu=native` on AVX-512 CPUs.

## Root cause (upstream)

The underlying issue appears to be an LLVM vectorizer/backend bug affecting certain AVX-512 patterns.

An upstream issue has been filed by @folkertdev  to track the root cause: llvm/llvm-project#176906

Until this is resolved in LLVM and picked up by rustc, this PR avoids triggering the problematic codegen pattern.

  ## Solution

  Replace the counting loop with explicit SSE2 intrinsics (`_mm_movemask_epi8`) that force `pmovmskb` codegen regardless of CPU features.

  ## Godbolt Links (Rust 1.92)

  | Pattern | Target | Link | Result |
  |---------|--------|------|--------|
  | Counting loop (old) | Default SSE2 | https://godbolt.org/z/sE86xz4fY | `pmovmskb` |
  | Counting loop (old) | AVX-512 (znver4) | https://godbolt.org/z/b3jvMhGd3 | 31x `kshiftrd` (broken) |
  | SSE2 intrinsics (fix) | Default SSE2 | https://godbolt.org/z/hMeGfeaPv | `pmovmskb` |
  | SSE2 intrinsics (fix) | AVX-512 (znver4) | https://godbolt.org/z/Tdvdqjohn | `vpmovmskb` (fixed) |

  ## Benchmark Results

  **CPU:** AMD Ryzen 5 7500F (Zen 4 with AVX-512)

  ### Default Target (SSE2) — Mixed

  | Size | Before | After | Change |
  |------|--------|-------|--------|
  | 4 B | 1.8 GB/s | 2.0 GB/s | **+11%** |
  | 8 B | 3.2 GB/s | 5.8 GB/s | **+81%** |
  | 16 B | 5.3 GB/s | 8.5 GB/s | **+60%** |
  | 32 B | 17.7 GB/s | 15.8 GB/s | -11% |
  | 64 B | 28.6 GB/s | 25.1 GB/s | -12% |
  | 256 B | 51.5 GB/s | 48.6 GB/s | ~same |
  | 1 KB | 64.9 GB/s | 60.7 GB/s | ~same |
  | 4 KB+ | ~68-70 GB/s | ~68-72 GB/s | ~same |

  ### Native Target (AVX-512) — Up to 24x Faster

  | Size | Before | After | Speedup |
  |------|--------|-------|---------|
  | 4 B | 1.2 GB/s | 2.0 GB/s | **1.7x** |
  | 8 B | 1.6 GB/s | 5.0 GB/s | **3.3x** |
  | 16 B | ~7 GB/s | ~7 GB/s | ~same |
  | 32 B | 2.9 GB/s | 14.2 GB/s | **4.9x** |
  | 64 B | 2.9 GB/s | 23.2 GB/s | **8x** |
  | 256 B | 2.9 GB/s | 47.2 GB/s | **16x** |
  | 1 KB | 2.8 GB/s | 60.0 GB/s | **21x** |
  | 4 KB+ | 2.9 GB/s | ~68-70 GB/s | **23-24x** |

  ### Summary

  - **SSE2 (default):** Small inputs (4-16 B) 11-81% faster; 32-64 B ~11% slower; large inputs unchanged
  - **AVX-512 (native):** 21-24x faster for inputs ≥1 KB, peak ~70 GB/s (was ~3 GB/s)

  Note: this is the pure ascii path, but the story is similar for the others.
  See linked bench project.

  ## Test Plan

  - [x] Assembly test (`slice-is-ascii-avx512.rs`) verifies no `kshiftrd` with AVX-512
  - [x] Existing codegen test updated to `loongarch64`-only (auto-vectorization still used there)
  - [x] Fuzz testing confirms old/new implementations produce identical results (~53M iterations)
  - [x] Benchmarks confirm performance improvement
  - [x] Tidy checks pass

  ## Reproduction / Test Projects

  Standalone validation tools: https://github.com/bonega/is-ascii-fix-validation

  - `bench/` - Criterion benchmarks for SSE2 vs AVX-512 comparison
  - `fuzz/` - Compares old/new implementations with libfuzzer

  ## Related Issues
  - issue opened by @folkertdev llvm/llvm-project#176906
  - Regression introduced in https://github.com/rust-lang/rust/pull/130733
2026-01-24 08:18:05 +01:00
Manuel Drehwald 7bcc8a7053 update abi handling test 2026-01-23 21:54:04 -08:00
Manuel Drehwald d7877615b4 Update test after new mangling scheme, make test more robust 2026-01-23 20:37:58 -08:00
Manuel Drehwald c0c6e2166d make generic test invariant of function order 2026-01-23 19:58:29 -08:00
bors 9283d592de Auto merge of #151389 - scottmcm:vec-repeat, r=joboet
Use `repeat_packed` when calculating layouts in `RawVec`

Seeing whether this helps the icounts seen in https://github.com/rust-lang/rust/pull/148769#issuecomment-3769921666
2026-01-23 07:24:11 +00:00
Andreas Liljeqvist c609cce8cf Merge is_ascii codegen tests using revisions
Combine the x86_64 and loongarch64 is_ascii tests into a single file
using compiletest revisions. Both now test assembly output:

- X86_64: Verifies no broken kshiftrd/kshiftrq instructions (AVX-512 fix)
- LA64: Verifies vmskltz.b instruction is used (auto-vectorization)
2026-01-22 22:18:00 +01:00
Matthew Maurer b639b0a4d8 llvm: Tolerate dead_on_return attribute changes
The attribute now has a size parameter and sorts differently:
* Explicitly omit size parameter during construction on 23+
* Tolerate alternate sorting in tests

https://github.com/llvm/llvm-project/pull/171712
2026-01-21 23:39:03 +00:00
Scott McMurray c3f309e32b Use repeat_packed when calculating layouts in RawVec 2026-01-21 01:11:12 -08:00