Commit Graph

261 Commits

Author SHA1 Message Date
Matthew Maurer b639b0a4d8 llvm: Tolerate dead_on_return attribute changes
The attribute now has a size parameter and sorts differently:
* Explicitly omit size parameter during construction on 23+
* Tolerate alternate sorting in tests

https://github.com/llvm/llvm-project/pull/171712
2026-01-21 23:39:03 +00:00
Jacob Pratt 43d2006c25 Rollup merge of #150436 - va-list-copy, r=workingjubilee,RalfJung
`c_variadic`: impl `va_copy` and `va_end` as Rust intrinsics

tracking issue: https://github.com/rust-lang/rust/issues/44930

Implement `va_copy` as (the rust equivalent of) `memcpy`, which is the behavior of all current LLVM targets. By providing our own implementation, we can guarantee its behavior. These guarantees are important for implementing c-variadics in e.g. const-eval.

Discussed in [#t-compiler/const-eval > c-variadics in const-eval](https://rust-lang.zulipchat.com/#narrow/channel/146212-t-compiler.2Fconst-eval/topic/c-variadics.20in.20const-eval/with/565509704).

I've also updated the comment for `Drop` a bit. The background here is that the C standard requires that `va_end` is used in the same function (and really, in the same scope) as the corresponding `va_start` or `va_copy`. That is because historically `va_start` would start a scope, which `va_end` would then close. e.g.

https://softwarepreservation.computerhistory.org/c_plus_plus/cfront/release_3.0.3/source/incl-master/proto-headers/stdarg.sol

```c
#define         va_start(ap, parmN)     {\
        va_buf  _va;\
        _vastart(ap = (va_list)_va, (char *)&parmN + sizeof parmN)
#define         va_end(ap)      }
#define         va_arg(ap, mode)        *((mode *)_vaarg(ap, sizeof (mode)))
```

The C standard still has to consider such implementations, but for Rust they are irrelevant. Hence we can use `Clone` for `va_copy` and `Drop` for `va_end`.
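
As a rough illustration of that mapping, here is a minimal sketch in which `Clone` plays the role of `va_copy` and `Drop` plays the role of `va_end`. The `ArgCursor` type and `sum_twice` function are hypothetical stand-ins invented for this example; they do not reflect the real `VaList` layout or API.

```rust
/// Hypothetical model of a variadic argument cursor (not the real `VaList`).
pub struct ArgCursor<'a> {
    args: &'a [i64],
    next: usize,
}

impl<'a> ArgCursor<'a> {
    pub fn new(args: &'a [i64]) -> Self {
        ArgCursor { args, next: 0 }
    }

    /// Plays the role of `va_arg`: read the next argument and advance.
    pub fn arg(&mut self) -> Option<i64> {
        let value = self.args.get(self.next).copied();
        if value.is_some() {
            self.next += 1;
        }
        value
    }
}

/// Plays the role of `va_copy`: a plain field-wise copy of the cursor state.
impl<'a> Clone for ArgCursor<'a> {
    fn clone(&self) -> Self {
        ArgCursor { args: self.args, next: self.next }
    }
}

/// Plays the role of `va_end`: in this model there is nothing to release.
impl<'a> Drop for ArgCursor<'a> {
    fn drop(&mut self) {}
}

/// Consume one argument, copy the cursor, then drain both copies independently.
pub fn sum_twice(args: &[i64]) -> (i64, i64) {
    let mut first = ArgCursor::new(args);
    let head = first.arg().unwrap_or(0);
    let mut second = first.clone(); // resumes from the same position as `first`

    let mut a = head;
    while let Some(v) = first.arg() {
        a += v;
    }
    let mut b = head;
    while let Some(v) = second.arg() {
        b += v;
    }
    (a, b) // both cursors are dropped here, with no scoping rules to respect
}
```

Because the copy is an ordinary value, it can be dropped in any scope, which is exactly the freedom the historical brace-based `va_start`/`va_end` macros could not offer.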
2026-01-20 19:46:29 -05:00
Folkert de Vries dd9241d150 c_variadic: use Clone instead of LLVM va_copy 2026-01-20 18:38:50 +01:00
Nikita Popov 0be66603ac Avoid passing addrspacecast to lifetime intrinsics
Since LLVM 22 the alloca must be passed directly. Do this by
stripping the addrspacecast if it exists.
2026-01-20 14:47:04 +01:00
Stuart Cook 1262ff906b Rollup merge of #150288 - offload-bench-fix, r=ZuseZ4
Add scalar support for offload

This PR adds scalar support to the offload feature. The scalar management has two main parts:

On the host side, each scalar arg is cast to the `ix` type, zero-extended to `i64`, and passed to the kernel in that form.
On the device, each scalar arg (`i64` at that point) is truncated to `ix` and then cast to the original type.
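
The round trip described above can be sketched in plain Rust. This is only an illustration of the bit-level transformation, assuming `ix` means the integer type with the same width as the scalar; the function names are hypothetical and are not part of the offload implementation.

```rust
/// Host side for an `f32`: reinterpret as `u32` (the `ix` step),
/// then zero-extend to 64 bits.
pub fn pack_f32(value: f32) -> u64 {
    u64::from(value.to_bits())
}

/// Device side for an `f32`: truncate to 32 bits, then reinterpret as `f32`.
pub fn unpack_f32(raw: u64) -> f32 {
    f32::from_bits(raw as u32)
}

/// Host side for an `i16`: reinterpret as `u16`, then zero-extend.
/// Going through `u16` first keeps the upper bits zero even for negative values.
pub fn pack_i16(value: i16) -> u64 {
    u64::from(value as u16)
}

/// Device side for an `i16`: truncate to 16 bits, then reinterpret as `i16`.
pub fn unpack_i16(raw: u64) -> i16 {
    (raw as u16) as i16
}
```

Zero extension followed by truncation is lossless for any scalar of 64 bits or fewer, so the device always recovers the original bit pattern.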

r? @ZuseZ4
2026-01-20 18:00:08 +11:00
Marcelo Domínguez 307a4fcdf8 Add scalar support for both host and device 2026-01-19 22:28:42 +01:00
Jonathan Brouwer a56e2d3037 Rollup merge of #151071 - gen-openmp-metadata, r=nnethercote
Generate openmp metadata

LLVM has an openmp-opt pass, which is part of the default O3 pipeline.
The pass bails if we don't have a global called openmp, so let's generate it if people enable our experimental offload feature. openmp is a superset of the offload feature, so they share optimizations.
In follow-up PRs I'll start verifying that LLVM optimizes Rust the way we want it.

r? compiler
2026-01-19 08:31:31 +01:00
Manuel Drehwald 5c85d522d0 Generate global openmp metadata to trigger llvm openmp-opt pass 2026-01-16 14:57:32 -05:00
Jacob Pratt 6912c676cd Rollup merge of #150607 - dispatch-ptr-intrinsic, r=workingjubilee
Add amdgpu_dispatch_ptr intrinsic

There is an ongoing discussion in rust-lang/rust#150452 about using address spaces from the Rust language in some way.
As that discussion will likely not conclude soon, this PR adds one rustc_intrinsic with an addrspacecast to unblock getting basic information like launch and workgroup size and make it possible to implement something like `core::gpu`.

Add a rustc intrinsic `amdgpu_dispatch_ptr` to access the kernel dispatch packet on amdgpu.
The HSA kernel dispatch packet contains important information like the launch size and workgroup size.

The Rust intrinsic lowers to the `llvm.amdgcn.dispatch.ptr` LLVM intrinsic, which returns a `ptr addrspace(4)`, plus an addrspacecast to `addrspace(0)`, so it can be returned as a Rust reference.
The returned pointer/reference is valid for the whole program lifetime, and is therefore `'static`.
The return type of the intrinsic (`&'static ()`) does not mention the struct so that rustc does not need to know the exact struct type. An alternative would be to define the struct as lang item or add a generic argument to the function.
Is this ok or is there a better way (also, should it return a pointer instead of a reference)?

Short version:
```rust
#[cfg(target_arch = "amdgpu")]
pub fn amdgpu_dispatch_ptr() -> *const ();
```

Tracking issue: rust-lang/rust#135024
2026-01-15 19:35:46 -05:00
Jieyou Xu cd79ff2e2c Revert "avoid phi node for pointers flowing into Vec appends #130998"
This reverts PR <https://github.com/rust-lang/rust/pull/130998> because
the added test seems to be flaky / non-deterministic, and has been
failing in unrelated PRs during merge CI.
2026-01-15 09:37:16 +08:00
bors 86a49fd71f Auto merge of #130998 - the8472:bail-before-memcpy, r=nnethercote
avoid phi node for pointers flowing into Vec appends

Elide temporary allocations in patterns like `vec.append(slice.to_vec())`

related discussion: https://rust-lang.zulipchat.com/#narrow/stream/187780-t-compiler.2Fwg-llvm/topic/nocapture.20and.20allocation.20elimination
2026-01-14 16:36:26 +00:00
Jonathan Brouwer b431a5e685 Rollup merge of #151067 - ui_test_no_should_fail, r=lqd
Avoid should-fail in two ui tests and a codegen-llvm test

`should-fail` is only meant for testing the compiletest framework itself. It checks that the test runner itself panicked.

With this there are still a bunch of rustdoc-html tests that use it due to this test suite not supporting anything like `//@ doc-fail`.
2026-01-14 11:05:40 +01:00
bjorn3 15112eee67 Avoid should-fail in a codegen-llvm test 2026-01-13 15:21:20 +00:00
Hans Wennborg 6ca950136d Relax test expectation for @__llvm_profile_runtime_user
After https://github.com/llvm/llvm-project/pull/174174 it has profile
info marking it cold.
2026-01-12 11:03:07 +01:00
The 8472 468eb45b3f avoid phi node for pointers flowing into Vec appends 2026-01-12 02:54:30 +01:00
Stuart Cook 30585ebbd3 Rollup merge of #150494 - extern_linkage_dso_local, r=bjorn3
Fix dso_local for external statics with linkage

Tracking issue of the feature: rust-lang/rust#127488

DSO local attributes are not correctly applied to extern statics with `#[linkage = "foo"]` as we generate an internal global for such statics, and then we evaluate (and apply) DSO attributes on the internal one instead.

Fix this by applying DSO local attributes on the actually extern ones, too.
2026-01-11 14:27:55 +11:00
Flakebi 91d4e40e02 Add amdgpu_dispatch_ptr intrinsic
Add a rustc intrinsic `amdgpu_dispatch_ptr` to access the kernel
dispatch packet on amdgpu.
The HSA kernel dispatch packet contains important information like the
launch size and workgroup size.

The Rust intrinsic lowers to the `llvm.amdgcn.dispatch.ptr` LLVM
intrinsic, which returns a `ptr addrspace(4)`, plus an addrspacecast to
`addrspace(0)`, so it can be returned as a Rust reference.

The returned pointer/reference is valid for the whole program lifetime,
and is therefore `'static`.

The return type of the intrinsic (`*const ()`) does not mention the
struct so that rustc does not need to know the exact struct type.
An alternative would be to define the struct as lang item or add a
generic argument to the function.

Short version:
```rust
#[cfg(target_arch = "amdgpu")]
pub fn amdgpu_dispatch_ptr() -> *const ();
```
2026-01-09 10:41:37 +01:00
Matthias Krüger 1494755275 Rollup merge of #150426 - ZuseZ4:offload-register-lib, r=davidtwco
Update offload test and verify that tgt_(un)register_lib have the right type

Apparently, we weren't running offload tests when Enzyme wasn't built. Time to fix that.
Also adds a test mode which generates the host IR, but does not expect device IR/artifacts. This way, we don't have to handle artifacts and paths in our tests.
Also removes some outdated documentation.

cc `@Kevinsala`, `@Sa4dUs`

closes: https://github.com/rust-lang/rust/issues/150415

~~blocked on `needs-offload` infrastructure landing in https://github.com/rust-lang/rust/pull/150427~~
2026-01-04 21:14:05 +01:00
Manuel Drehwald fa584faca5 Update test and verify that tgt_(un)register_lib have the right type 2026-01-04 06:58:31 -08:00
bors f57b9e6f56 Auto merge of #150564 - rwardd:rwardd/option_or_codegen_tests, r=scottmcm
Added codegen tests for different forms of `Option::or`

Adds tests to check the output of the different ways of writing `Option::or`

Fixes rust-lang/rust#124533
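
For context, "different ways of writing `Option::or`" could look like the following sketch. The three functions are illustrative assumptions, not copies of the tests added in the PR; they are meant to behave identically, which is the property a codegen test would then check at the IR level.

```rust
/// The library method itself.
pub fn or_method(a: Option<u8>, b: Option<u8>) -> Option<u8> {
    a.or(b)
}

/// The same logic written as a `match`.
pub fn or_match(a: Option<u8>, b: Option<u8>) -> Option<u8> {
    match a {
        Some(x) => Some(x),
        None => b,
    }
}

/// The same logic written as an `if`/`else`.
pub fn or_if(a: Option<u8>, b: Option<u8>) -> Option<u8> {
    if a.is_some() { a } else { b }
}
```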
2026-01-03 22:47:35 +00:00
Ryan Ward a2fcb0de18 fix: add CHECK directives to ret comments and be more pervasive with directive contents 2026-01-03 12:50:38 +10:30
Ryan 3df06f5083 fix: use std::num::NonZero instead of extern crate and extend information in CHECK- directives
Co-authored-by: scottmcm <scottmcm@users.noreply.github.com>
2026-01-03 10:53:54 +10:30
bors 85c8ff69cb Auto merge of #150606 - JonathanBrouwer:rollup-lue4jqz, r=JonathanBrouwer
Rollup of 6 pull requests

Successful merges:

 - rust-lang/rust#150425 (mapping an error from cmd.spawn() in npm::install)
 - rust-lang/rust#150444 (Expose kernel launch options as offload intrinsic args)
 - rust-lang/rust#150495 (Correct hexagon "unwinder_private_data_size")
 - rust-lang/rust#150578 (Fix a typo in the docs of AsMut for rust-lang/rust#149609)
 - rust-lang/rust#150581 (mir_build: Separate match lowering for string-equality and scalar-equality)
 - rust-lang/rust#150594 (Fix typo in the docs of `CString::from_vec_with_nul`)

r? `@ghost`
`@rustbot` modify labels: rollup
2026-01-02 19:45:27 +00:00
bors 5497a36a7f Auto merge of #149658 - Enselic:non-zero-opt, r=Mark-Simulacrum
tests/codegen-llvm/some-non-zero-from-atomic-optimization.rs: New test

Closes rust-lang/rust#60044 which has one 👍 and one ❤️  vote and just **E-needs-test**.
2026-01-02 16:29:24 +00:00
Marcelo Domínguez 58e2610f71 Expose workgroup/thread dims as intrinsic args 2026-01-02 11:50:32 +01:00
Ryan Ward 66c4ead02d fix: added further CHECK-SAME labels and replaced all struct input tests with NonZero<u8> input 2026-01-02 12:54:17 +10:30
Ryan bf2078bfca fix: add CHECK-SAME labels to verify generated function type for u8 and [u8; 1] cases
Co-authored-by: scottmcm <scottmcm@users.noreply.github.com>
2026-01-02 12:01:55 +10:30
Ryan Ward 80acf74fb6 test: added codegen tests for permutations of Option::or 2026-01-01 22:28:32 +10:30
Martin Nordholts 55833a9a6d tests/codegen-llvm/some-non-zero-from-atomic-optimization.rs: New test 2025-12-31 15:22:07 +01:00
Jonathan Brouwer d898dccc21 Rollup merge of #150511 - Sa4dUs:offload-inline, r=ZuseZ4
Allow inline calls to offload intrinsic

Removes explicit insertion point handling and recovers the pointer at the end of the saved basic block.

r? `@ZuseZ4`

fixes: https://github.com/rust-lang/rust/issues/150413
2025-12-31 14:30:48 +01:00
Marcelo Domínguez 41a24c4b58 Add offload test for control flow handling 2025-12-31 13:11:28 +01:00
Gary Guo 5467a398c2 Fix dso_local for external statics with linkage
The current code applies `dso_local` to the internally generated symbols
instead of the actual external ones.
2025-12-29 19:26:34 +00:00
Gary Guo 1ff953d63e Fix and expand direct-access-external-data test
This test currently doesn't fulfill its purpose, as `external dso_local`
can still match `external {{.*}}`. Fix this by using CHECK-NOT directives.

Also, this test is expanded to all platforms where it makes sense, instead
of restricting to loongarch.
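
The matching problem can be modelled in a few lines of Rust. This is a deliberately simplified sketch: real FileCheck directives work on regions of output rather than single lines, and the helper names and sample IR line below are hypothetical.

```rust
/// Loose positive check, modelling a pattern like `external {{.*}}`:
/// "external " followed by anything at all.
pub fn matches_loose(line: &str) -> bool {
    line.contains("external ")
}

/// Negative check in the spirit of a CHECK-NOT directive:
/// the line must not mention `dso_local` anywhere.
pub fn lacks_dso_local(line: &str) -> bool {
    !line.contains("dso_local")
}

/// A declaration that the test is supposed to reject.
pub const UNWANTED: &str = "@VAR = external dso_local global i32";

/// A declaration that the test is supposed to accept.
pub const WANTED: &str = "@VAR = external global i32";
```

The loose pattern accepts both sample lines, so it cannot tell them apart; only the negative check distinguishes the unwanted declaration.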
2025-12-29 19:26:34 +00:00
Ben Kimock 315646a7a0 Remove the explicit branch hint from const_panic 2025-12-29 10:30:02 -05:00
bors 000ccd651d Auto merge of #148766 - cjgillot:mir-const-runtime-checks, r=RalfJung,saethlin
Replace Rvalue::NullaryOp by a variant in mir::Operand.

Based on https://github.com/rust-lang/rust/pull/148151

This PR fully removes the MIR `Rvalue::NullaryOp`. After rust-lang/rust#148151, it was only useful for runtime checks like `ub_checks`, `contract_checks` and `overflow_checks`.

These are "runtime" checks, boolean constants that may only be `true` in codegen. It depends on a rustc flag passed to codegen, so we need to represent those flags cross-crate.

This PR replaces those runtime checks by special variants in MIR `ConstValue`. This allows code that expects constants to manipulate those as such, even if we may not always be able to evaluate them to actual scalars.
2025-12-22 06:58:28 +00:00
Camille Gillot c67b99fa09 Reinstate bonus for unused UbChecks. 2025-12-21 00:58:00 +00:00
Matthias Krüger 508c382080 Rollup merge of #149788 - Sa4dUs:offload-cleanup, r=ZuseZ4
Move shared offload globals and define per-kernel globals once

This PR moves the shared LLVM global variables logic out of the `offload` intrinsic codegen and generates kernel-specific variables only on the first call of the intrinsic.

r? `@ZuseZ4`

tracking:
- https://github.com/rust-lang/rust/issues/131513
2025-12-19 23:38:57 +01:00
Marcelo Domínguez 04c2d2be13 Remove region_id unnamed attr 2025-12-19 13:27:14 +01:00
Marcelo Domínguez 3e4944d573 Split runtime global logic and cache kernel specific one 2025-12-19 13:27:13 +01:00
Ben Kimock 4ff2c5c9f5 Don't treat asserts as a call in cross-crate inlining 2025-12-18 19:12:09 -05:00
bors 95a27adcf9 Auto merge of #143924 - davidtwco:sve-infrastructure, r=workingjubilee
`rustc_scalable_vector(N)`

Supersedes rust-lang/rust#118917.

Initial experimental implementation of rust-lang/rfcs#3838. Introduces a `rustc_scalable_vector(N)` attribute that can be applied to types with a single `[$ty]` field (for `u{16,32,64}`, `i{16,32,64}`, `f{32,64}`, `bool`). `rustc_scalable_vector` types are lowered to scalable vectors in the codegen backend.

As with any unstable feature, there will necessarily be follow-ups as we experiment and find cases that we've not considered or still need some logic to handle, but this aims to be a decent baseline to start from.

See rust-lang/rust#145052 for request for a lang experiment.
2025-12-16 12:53:53 +00:00
David Wood a56b1b9283 codegen: implement repr(scalable)
Introduces `BackendRepr::ScalableVector` corresponding to scalable
vector types annotated with `repr(scalable)` which lowers to a scalable
vector type in LLVM.

Co-authored-by: Jamie Cunliffe <Jamie.Cunliffe@arm.com>
2025-12-16 11:00:12 +00:00
bors 61cc47e367 Auto merge of #149948 - WaffleLapkin:dereferenceablen't, r=RalfJung
Stop applying `dereferenceable(n)` to return types

It looks like the semantics of `dereferenceable(n)` on return types is "dereferenceable until the end of the program", which is not sound for how we were using it. See [dereferenceable on return type](https://rust-lang.zulipchat.com/#narrow/channel/136281-t-opsem/topic/LLVM.20dereferenceable.20on.20return.20type/with/563001493) zulip thread.

cc `@rust-lang/opsem` `@nikic`
2025-12-16 09:38:19 +00:00
Camille Gillot a02dc3487a Bless codegen. 2025-12-14 20:34:56 +00:00
Camille Gillot b6bb7f9645 Bless codegen test. 2025-12-14 17:25:53 +00:00
Camille Gillot 6319bee585 Introduce Operand::RuntimeChecks. 2025-12-14 17:25:53 +00:00
Camille Gillot 1a227bd47f Replace Rvalue::NullaryOp by a variant in mir::ConstValue. 2025-12-14 17:25:51 +00:00
Chris Denton 2f06db1cbe Rollup merge of #149773 - fneddy:fix_test_va_list_signext, r=Mark-Simulacrum
fix va_list test by adding a llvmir signext check

s390x has no way to pass 32-bit values directly, so `i32` parameters need an optional LLVM IR `signext` attribute.
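
To see why the extension kind matters when a 32-bit value travels in a wider register, consider this small sketch. It only models the arithmetic of the two widening choices; the function names are hypothetical and nothing here is specific to the s390x ABI.

```rust
/// Widen with sign extension, which is what `signext` requests.
pub fn sign_extend(value: i32) -> i64 {
    i64::from(value)
}

/// Widen with zero extension, for comparison.
pub fn zero_extend(value: i32) -> i64 {
    i64::from(value as u32)
}
```

For non-negative inputs both functions agree, but a negative input such as `-1` stays `-1` only under sign extension.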
2025-12-14 09:18:29 +00:00
Waffle Lapkin e72613030e stop applying dereferenceable(n) to return types 2025-12-13 13:43:02 +01:00
Matthias Krüger b3948e5f10 Rollup merge of #149679 - pmur:murp/ppc-inline-improvements, r=Amanieu
Restrict spe_acc to PowerPC SPE targets

Update the tests, add powerpc-*-gnuspe testing, and create a distinct clobber_abi list for PowerPC SPE targets.

Note, the SPE target does not have vector, vector-scalar, or floating-point specific registers.

r? `@Amanieu`
2025-12-10 07:54:20 +01:00