this enables packed-stack just as -mpacked-stack in clang and gcc.
packed-stack is needed on s390x for kernel development.
Co-authored-by: Ralf Jung <post@ralfj.de>
simd_fmin/fmax: make semantics and name consistent with scalar intrinsics
This is the SIMD version of https://github.com/rust-lang/rust/pull/153343: change the documented semantics of the SIMD float min/max intrinsics to that of the scalar intrinsics, and also make the name consistent. The overall semantic change this amounts to is that we restrict the non-determinism: the old semantics effectively mean "when one input is an SNaN, the result non-deterministically is a NaN or the other input"; the new semantics say that in this case the other input must be returned. For all other cases, old and new semantics are equivalent. This means all users of these intrinsics that were correct with the old semantics are still correct: the overall set of possible behaviors has become smaller, no new possible behaviors are being added.
In terms of providers of this API:
- Miri, GCC, and cranelift already implement the new semantics, so no changes are needed.
- LLVM is adjusted to use `minimumnum nsz` instead of `minnum`, thus giving us the new semantics.
In terms of consumers of this API:
- Portable SIMD almost certainly wants to match the scalar behavior, so this is strictly a bugfix here.
- Stdarch mostly stopped using the intrinsic, except on nvptx, where arguably the new semantics are closer to what we actually want than the old semantics (https://github.com/rust-lang/stdarch/issues/2056).
Q: Should there be an `f` in the intrinsic name to indicate that it is for floats? E.g., `simd_fminimum_number_nsz`?
Also see https://github.com/rust-lang/rust/issues/153395.
stabilizes `core::range::RangeFrom`
stabilizes `core::range::RangeFromIter`
add examples for `remainder` method on range iterators
`RangeFromIter::remainder` was not stabilized (see issue 154458)
fromrangeiter-overflow-checks: accept optional `signext` for argument
On some targets such as LoongArch64 and RISCV64, the ABI requires sign-extension for 32-bit integer arguments, so LLVM may emit the `signext` attribute for the `%range` parameter. The existing CHECK pattern required the argument to be exactly `i32 noundef %range`, causing the test to fail on those targets.
Allow an optional `signext` attribute in the CHECK pattern so the test passes consistently across architectures without affecting the intended codegen validation.
debuginfo: emit DW_TAG_call_site entries
Set `FlagAllCallsDescribed` on function definition DIEs so LLVM emits DW_TAG_call_site entries, letting debuggers and analysis tools track tail calls.
Add `-Zsanitize=kernel-hwaddress`
The Linux kernel has a config option called `CONFIG_KASAN_SW_TAGS` that enables `-fsanitize=kernel-hwaddress`. This is not supported by Rust.
One slightly awkward detail is that `#[sanitize(address = "off")]` applies to both `-Zsanitize=address` and `-Zsanitize=kernel-address`. Probably it was done this way because both are the same LLVM pass. I replicated this logic here for hwaddress, but it might be undesirable.
Note that `#[sanitize(kernel_hwaddress = "off")]` could be supported as an annotation on statics, but since it's also missing for `#[sanitize(hwaddress = "off")]`, I did not add it.
MCP: https://github.com/rust-lang/compiler-team/issues/975
Tracking issue: https://github.com/rust-lang/rust/issues/154171
cc @rcvalle @maurer @ojeda
On some targets such as LoongArch64 and RISCV64, the ABI requires
sign-extension for 32-bit integer arguments, so LLVM may emit the
`signext` attribute for the `%range` parameter. The existing CHECK
pattern required the argument to be exactly `i32 noundef %range`,
causing the test to fail on those targets.
Allow an optional `signext` attribute in the CHECK pattern so the test
passes consistently across architectures without affecting the intended
codegen validation.
refactor RangeFromIter overflow-checks impl
Crates with different overflow-checks settings accessing the same RangeFromIter resulted in incorrect values being yielded
Fixesrust-lang/rust#154124
r? @tgross35
[BPF] add target feature allows-misaligned-mem-access
This PR adds the allows-misaligned-mem-access target feature to the BPF target. The feature can enable misaligned memory access support in the LLVM backend, aligning Rust’s BPF target behavior with the corresponding LLVM update introduced in [llvm/llvm-project#167013](https://github.com/llvm/llvm-project/pull/167013) (included in LLVM 22).
tests: codegen-llvm: iter-repeat-n-trivial-drop: Allow non-zero lower bound to __rust_alloc size
LLVM emits a lower bound of 8 for the size parameter to `__rust_alloc` when targeting `x86_64-unknown-hermit`. Since that is also completely valid, relax the lower bound check.
I'm not really sure why LLVM is able to infer this - with the same setup targeting `x86_64-unknown-linux-gnu` I also see the lower bound of 0. Not that it's wrong, but I'd be curious to know which codegen options play into this.
num: Separate public API from internal implementations
Currently we have a single `core::num` module that contains both thin wrapper API and higher-complexity numeric routines. Restructure this by moving implementation details to a new `imp` module.
This results in a more clean separation of what is actually user-facing compared to items that have a stability attribute because they are public for testing.
The first commit does the actual change then the second moves a portion back.
Instead of defaulting to `None` it now defaults to `Align::ONE` i.e.
no alignment restriction. Codegen test changes are due to us now skipping
`align 1` annotations (they are useless; not skipping them makes all the
raw pointers gain an `align 1` annotation which doesn't seem any good)
Properly pass offload sizes to kernel args
This PRs prevents offload from creating an unnecessary alloca when all the arg sizes are static.
I'll implement the first dynamic-size data type in a follow up PR (slice support).
r? @ZuseZ4
fix autodiff parsing for non-trait impl
fixes: https://github.com/rust-lang/rust/issues/153322
@Sa4dUs Looks like we missed a case.
But also, going through the code, line 455 seems suspicious to me:
`Annotatable::AssocItem(d_fn, Impl { of_trait: false })`
Are we sure that this should always be an Impl, and never an impl of a trait?
r? @oli-obk
Update call-llvm-intrinsics test for Rust 1.94.0 IR
Rust 1.94 now passes constants directly to llvm.sqrt.f32 instead of
storing/loading via the stack.
- Updated the FileCheck pattern to match the new IR:
// CHECK: call float @llvm.sqrt.f32(float 4.000000e+00)
The test intent is unchanged: it still ensures the intrinsic is
emitted as a 'call' (not 'invoke').
- Removed unnecessary local variables and Drop usage to work in
`#![no_core]` mode with minicore.
- Added required crate attributes:
#![feature(no_core, lang_items)]
#![no_std]
#![no_core]
- Replaced `//@ only-riscv64` (host-based execution) with explicit
revisions for:
riscv32gc-unknown-linux-gnu
riscv64gc-unknown-linux-gnu
This ensures deterministic multi-target coverage in CI without
relying on the host architecture.
- Added `//@ needs-llvm-components: riscv` and
`//@ min-llvm-version: 21` for CI compatibility.
Fixesrust-lang/rust#153271
Rust 1.94 now passes constants directly to llvm.sqrt.f32 instead of
storing/loading via the stack.
- Updated the FileCheck pattern to match the new IR:
// CHECK: call float @llvm.sqrt.f32(float 4.000000e+00)
The test intent is unchanged: it still ensures the intrinsic is
emitted as a 'call' (not 'invoke').
- Removed unnecessary local variables and Drop usage to work in
`#![no_core]` mode with minicore.
- Added required crate attributes:
#![feature(no_core, lang_items)]
#![no_std]
#![no_core]
- Replaced `//@ only-riscv64` (host-based execution) with explicit
revisions for:
riscv32gc-unknown-linux-gnu
riscv64gc-unknown-linux-gnu
This ensures deterministic multi-target coverage in CI without
relying on the host architecture.
- Added `//@ needs-llvm-components: riscv` and
`//@ min-llvm-version: 21` for CI compatibility.
Signed-off-by: Deepesh Varatharajan <Deepesh.Varatharajan@windriver.com>
tests: codegen-llvm: vec-calloc: do not require the uwtable attribute
The `uwtable` attribute does not get emitted on targets that don't have unwinding tables, such as `x86_64-unknown-hermit`.
Updated slice tests to pass for big endian hosts for `multiple-option-or-permutations.rs`
It was discovered that the FileCheck tests when performing an `Option::or` operation on a slice was failing when tested on a big endian host.
The compiler explorer link is here outlining the codegen output differences - https://rust.godbolt.org/z/qdE7d3G4f
This MR relaxes the constraints for the `*slice_u8` variants of the test (by changing `CHECK-NEXT` to `CHECK-DAG`), whilst still maintaining the check for the necessary `or` logic.
Huge thanks to @Gelbpunkt for identifying this issue! It has been confirmed that this fix passes on a big endian target now as well.
Closesrust-lang/rust#151718
LLVM emits a lower bound of 8 for the size parameter to __rust_alloc
when targeting x86_64-unknown-hermit. Since that is also completely
valid, relax the lower bound check.
perf(codegen): Restore `noundef` On `PassMode::Cast` Args In Rust ABI
### Summary:
#### Problem:
Small aggregate arguments passed via `PassMode::Cast` in the Rust ABI (e.g. `[u32; 2]` cast to `i64`) are missing `noundef` in the emitted LLVM IR, even when the type contains no uninit bytes:
```rust
#[no_mangle]
pub fn f(v: [u32; 2]) -> u32 { v[0] }
```
```llvm
; expected: define i32 @f(i64 noundef %0)
; actual: define i32 @f(i64 %0) ← noundef missing
```
This blocks LLVM from applying optimizations that require value-defined semantics on function arguments.
#### Root Cause:
`adjust_for_rust_abi` calls `arg.cast_to(Reg::Integer)`, which internally creates a `CastTarget` with `ArgAttributes::new()` — always empty. Any validity attribute that was present before the cast is silently dropped.
This affects all `PassMode::Cast` arguments and return values in the Rust ABI: plain arrays, newtype wrappers, and any `BackendRepr::Memory` type small enough to fit in a register.
A prior attempt (rust-lang/rust#127210) used `Ty`/`repr` attributes to detect padding.
#### Solution:
After `adjust_for_rust_abi`, iterate all `PassMode::Cast` args and the return value. For each, call `layout_is_noundef` on the original layout; if it returns `true`, set `NoUndef` on the `CastTarget`'s `attrs`.
`layout_is_noundef` uses only the computed layout — `BackendRepr`, `FieldsShape`, `Variants`, `Scalar::is_uninit_valid()` — and never touches `Ty` or repr attributes. **Anything it cannot prove returns `false`.**
Covered cases:
- `Scalar` / `ScalarPair` (both halves initialized, fields contiguous)
- `FieldsShape::Array` (element type recursively uninit-free)
- `FieldsShape::Arbitrary` with `Variants::Single` (fields cover `0..size` with no gaps, each recursively uninit-free) — handles newtype wrappers, multi-field structs, single-variant enums, `repr(transparent)`, `repr(C)` wrappers
Conservatively excluded with FIXMEs:
- Multi-variant enums (per-variant padding analysis needed)
- Foreign-ABI casts (cast target may exceed layout size, needs a size guard)
### Changes:
- `compiler/rustc_ty_utils/src/abi.rs`: add restoration loop after `adjust_for_rust_abi`; add `layout_is_noundef` and `fields_cover_layout`.
- `tests/codegen-llvm/abi-noundef-cast.rs`: new FileCheck test covering arrays, newtype wrappers (`repr(Rust)`, `repr(transparent)`, `repr(C)`), multi-field structs, single-variant enums, return values, and negative cases (`MaybeUninit`, struct with trailing padding).
- `tests/codegen-llvm/debuginfo-dse.rs`: update one CHECK pattern — `Aggregate_4xi8` (`struct { i8, i8, i8, i8 }`) now correctly gets `noundef`.
Fixesrust-lang/rust#123183.
r? @RalfJung