debuginfo: emit DW_TAG_call_site entries
Set `FlagAllCallsDescribed` on function definition DIEs so LLVM emits DW_TAG_call_site entries, letting debuggers and analysis tools track tail calls.
Add `-Zsanitize=kernel-hwaddress`
The Linux kernel has a config option called `CONFIG_KASAN_SW_TAGS` that enables `-fsanitize=kernel-hwaddress`. This is not supported by Rust.
One slightly awkward detail is that `#[sanitize(address = "off")]` applies to both `-Zsanitize=address` and `-Zsanitize=kernel-address`. Probably it was done this way because both are the same LLVM pass. I replicated this logic here for hwaddress, but it might be undesirable.
Note that `#[sanitize(kernel_hwaddress = "off")]` could be supported as an annotation on statics, but since it's also missing for `#[sanitize(hwaddress = "off")]`, I did not add it.
MCP: https://github.com/rust-lang/compiler-team/issues/975
Tracking issue: https://github.com/rust-lang/rust/issues/154171
cc @rcvalle @maurer @ojeda
refactor RangeFromIter overflow-checks impl
Crates with different overflow-checks settings accessing the same `RangeFromIter` resulted in incorrect values being yielded.
Fixes rust-lang/rust#154124
r? @tgross35
[BPF] add target feature allows-misaligned-mem-access
This PR adds the allows-misaligned-mem-access target feature to the BPF target. The feature can enable misaligned memory access support in the LLVM backend, aligning Rust’s BPF target behavior with the corresponding LLVM update introduced in [llvm/llvm-project#167013](https://github.com/llvm/llvm-project/pull/167013) (included in LLVM 22).
tests: codegen-llvm: iter-repeat-n-trivial-drop: Allow non-zero lower bound to __rust_alloc size
LLVM emits a lower bound of 8 for the size parameter to `__rust_alloc` when targeting `x86_64-unknown-hermit`. Since that is also completely valid, relax the lower bound check.
I'm not really sure why LLVM is able to infer this - with the same setup targeting `x86_64-unknown-linux-gnu` I also see the lower bound of 0. Not that it's wrong, but I'd be curious to know which codegen options play into this.
num: Separate public API from internal implementations
Currently we have a single `core::num` module that contains both thin wrapper API and higher-complexity numeric routines. Restructure this by moving implementation details to a new `imp` module.
This results in a cleaner separation between what is actually user-facing and items that have a stability attribute only because they are public for testing.
The first commit does the actual change then the second moves a portion back.
Instead of defaulting to `None`, it now defaults to `Align::ONE`, i.e.
no alignment restriction. Codegen test changes are due to us now skipping
`align 1` annotations (they are useless; not skipping them makes all the
raw pointers gain an `align 1` annotation, which doesn't seem any good).
Properly pass offload sizes to kernel args
This PR prevents offload from creating an unnecessary alloca when all the arg sizes are static.
I'll implement the first dynamic-size data type (slice support) in a follow-up PR.
r? @ZuseZ4
fix autodiff parsing for non-trait impl
fixes: https://github.com/rust-lang/rust/issues/153322
@Sa4dUs Looks like we missed a case.
But also, going through the code, line 455 seems suspicious to me:
`Annotatable::AssocItem(d_fn, Impl { of_trait: false })`
Are we sure that this should always be an Impl, and never an impl of a trait?
r? @oli-obk
Update call-llvm-intrinsics test for Rust 1.94.0 IR
Rust 1.94 now passes constants directly to llvm.sqrt.f32 instead of
storing/loading via the stack.
- Updated the FileCheck pattern to match the new IR:
// CHECK: call float @llvm.sqrt.f32(float 4.000000e+00)
The test intent is unchanged: it still ensures the intrinsic is
emitted as a 'call' (not 'invoke').
- Removed unnecessary local variables and Drop usage to work in
`#![no_core]` mode with minicore.
- Added required crate attributes:
#![feature(no_core, lang_items)]
#![no_std]
#![no_core]
- Replaced `//@ only-riscv64` (host-based execution) with explicit
revisions for:
riscv32gc-unknown-linux-gnu
riscv64gc-unknown-linux-gnu
This ensures deterministic multi-target coverage in CI without
relying on the host architecture.
- Added `//@ needs-llvm-components: riscv` and
`//@ min-llvm-version: 21` for CI compatibility.
Fixes rust-lang/rust#153271
Signed-off-by: Deepesh Varatharajan <Deepesh.Varatharajan@windriver.com>
tests: codegen-llvm: vec-calloc: do not require the uwtable attribute
The `uwtable` attribute does not get emitted on targets that don't have unwinding tables, such as `x86_64-unknown-hermit`.
Updated slice tests to pass for big endian hosts for `multiple-option-or-permutations.rs`
The FileCheck tests that perform an `Option::or` operation on a slice were failing when run on a big endian host.
This Compiler Explorer link outlines the codegen output differences: https://rust.godbolt.org/z/qdE7d3G4f
This MR relaxes the constraints for the `*slice_u8` variants of the test (by changing `CHECK-NEXT` to `CHECK-DAG`), whilst still maintaining the check for the necessary `or` logic.
Huge thanks to @Gelbpunkt for identifying this issue! It has been confirmed that this fix passes on a big endian target now as well.
Closes rust-lang/rust#151718
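
For context, the kind of function such a test exercises might look like the sketch below. The function names are hypothetical and are not copied from `multiple-option-or-permutations.rs`; the sketch only illustrates two equivalent ways of writing `Option::or` over `&[u8]`, which should behave the same regardless of how the backend orders the pointer and length loads.

```rust
/// Hypothetical example: the library form of "first `Some` wins".
pub fn or_slice_u8<'a>(a: Option<&'a [u8]>, b: Option<&'a [u8]>) -> Option<&'a [u8]> {
    a.or(b)
}

/// Hypothetical example: the same logic spelled out with a `match`.
pub fn or_slice_u8_match<'a>(a: Option<&'a [u8]>, b: Option<&'a [u8]>) -> Option<&'a [u8]> {
    match (a, b) {
        // `a` is present, so `b` is ignored.
        (Some(x), _) => Some(x),
        // `a` is absent, so the result is whatever `b` holds.
        (None, other) => other,
    }
}
```

A slice reference is a pointer plus a length, so the emitted IR handles two values per argument; the order in which they appear is what `CHECK-DAG` leaves unconstrained.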
perf(codegen): Restore `noundef` On `PassMode::Cast` Args In Rust ABI
### Summary:
#### Problem:
Small aggregate arguments passed via `PassMode::Cast` in the Rust ABI (e.g. `[u32; 2]` cast to `i64`) are missing `noundef` in the emitted LLVM IR, even when the type contains no uninit bytes:
```rust
#[no_mangle]
pub fn f(v: [u32; 2]) -> u32 { v[0] }
```
```llvm
; expected: define i32 @f(i64 noundef %0)
; actual: define i32 @f(i64 %0) ← noundef missing
```
This blocks LLVM from applying optimizations that require value-defined semantics on function arguments.
#### Root Cause:
`adjust_for_rust_abi` calls `arg.cast_to(Reg::Integer)`, which internally creates a `CastTarget` with `ArgAttributes::new()` — always empty. Any validity attribute that was present before the cast is silently dropped.
This affects all `PassMode::Cast` arguments and return values in the Rust ABI: plain arrays, newtype wrappers, and any `BackendRepr::Memory` type small enough to fit in a register.
A prior attempt (rust-lang/rust#127210) used `Ty`/`repr` attributes to detect padding.
#### Solution:
After `adjust_for_rust_abi`, iterate all `PassMode::Cast` args and the return value. For each, call `layout_is_noundef` on the original layout; if it returns `true`, set `NoUndef` on the `CastTarget`'s `attrs`.
`layout_is_noundef` uses only the computed layout — `BackendRepr`, `FieldsShape`, `Variants`, `Scalar::is_uninit_valid()` — and never touches `Ty` or repr attributes. **Anything it cannot prove returns `false`.**
Covered cases:
- `Scalar` / `ScalarPair` (both halves initialized, fields contiguous)
- `FieldsShape::Array` (element type recursively uninit-free)
- `FieldsShape::Arbitrary` with `Variants::Single` (fields cover `0..size` with no gaps, each recursively uninit-free) — handles newtype wrappers, multi-field structs, single-variant enums, `repr(transparent)`, `repr(C)` wrappers
Conservatively excluded with FIXMEs:
- Multi-variant enums (per-variant padding analysis needed)
- Foreign-ABI casts (cast target may exceed layout size, needs a size guard)
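
The coverage rule above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the code in `rustc_ty_utils`: `ToyLayout` and `toy_is_noundef` are hypothetical names, sizes are in bytes, and array elements are assumed to sit back to back. Treating empty arrays and zero-sized fields as trivially fine is also an assumption of the sketch.

```rust
/// Hypothetical, simplified stand-in for a computed layout.
/// These are NOT rustc's layout types.
#[derive(Debug, Clone)]
pub enum ToyLayout {
    /// A scalar of `size` bytes; `may_be_uninit` plays the role of
    /// `Scalar::is_uninit_valid()`.
    Scalar { size: u64, may_be_uninit: bool },
    /// `count` elements of one element layout, laid out contiguously.
    Array { elem: Box<ToyLayout>, count: u64 },
    /// A single-variant aggregate: `(offset, field)` pairs plus its total size.
    Struct { size: u64, fields: Vec<(u64, ToyLayout)> },
    /// Anything the check does not understand (unions, multi-variant enums, ...).
    Opaque { size: u64 },
}

impl ToyLayout {
    /// Total size in bytes of this toy layout.
    pub fn size(&self) -> u64 {
        match self {
            ToyLayout::Scalar { size, .. } => *size,
            ToyLayout::Array { elem, count } => elem.size() * count,
            ToyLayout::Struct { size, .. } => *size,
            ToyLayout::Opaque { size } => *size,
        }
    }
}

/// Conservative check: returns `true` only when every byte is provably
/// initialized; anything unproven yields `false`.
pub fn toy_is_noundef(layout: &ToyLayout) -> bool {
    match layout {
        ToyLayout::Scalar { may_be_uninit, .. } => !may_be_uninit,
        // Assumption: an empty array has no bytes that could be undefined.
        ToyLayout::Array { elem, count } => *count == 0 || toy_is_noundef(elem),
        ToyLayout::Struct { size, fields } => {
            // Walk fields in offset order and require them to tile `0..size`.
            let mut sorted: Vec<&(u64, ToyLayout)> = fields.iter().collect();
            sorted.sort_by_key(|entry| entry.0);
            let mut covered = 0u64;
            for (offset, field) in sorted {
                if field.size() == 0 {
                    continue; // zero-sized fields contribute no bytes
                }
                if *offset != covered || !toy_is_noundef(field) {
                    return false; // gap, overlap, or possibly-uninit field
                }
                covered += field.size();
            }
            covered == *size // rejects trailing padding
        }
        ToyLayout::Opaque { .. } => false,
    }
}
```

In this model, an array of two 4-byte scalars passes, while a struct of size 8 with a 4-byte field followed by a 1-byte field fails because bytes `5..8` are padding.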
### Changes:
- `compiler/rustc_ty_utils/src/abi.rs`: add restoration loop after `adjust_for_rust_abi`; add `layout_is_noundef` and `fields_cover_layout`.
- `tests/codegen-llvm/abi-noundef-cast.rs`: new FileCheck test covering arrays, newtype wrappers (`repr(Rust)`, `repr(transparent)`, `repr(C)`), multi-field structs, single-variant enums, return values, and negative cases (`MaybeUninit`, struct with trailing padding).
- `tests/codegen-llvm/debuginfo-dse.rs`: update one CHECK pattern — `Aggregate_4xi8` (`struct { i8, i8, i8, i8 }`) now correctly gets `noundef`.
Fixes rust-lang/rust#123183.
r? @RalfJung
`adjust_for_rust_abi` was casting small aggregates to an integer register
without propagating `noundef`, causing a performance regression (#123183)
— LLVM could no longer assume the bits were fully defined.
Add `layout_is_noundef` (conservative) + `fields_are_noundef` helper,
then use `cast_to_with_attrs` to forward `NoUndef` when proven:
Scalar → `!is_uninit_valid()`
ScalarPair → both scalars valid + `s1.size + s2.size == layout.size`
(size equality rejects layouts with inter-scalar padding)
Array → recurse into element; empty arrays unconditionally noundef
Arbitrary → `Variants::Single` required; walk fields in offset order —
any gap, non-noundef field, or trailing pad returns false
Union / Primitive / Simd → false (conservative)
Bless `pass-indirectly-attr.stderr` and `debuginfo-dse.rs` for the new
attribute on Cast args. Add `tests/codegen-llvm/abi-noundef-cast.rs`
covering positive Cast cases (arrays, plain structs, single-variant enum)
and negative Cast cases (MaybeUninit, multi-variant enum, field/pair gap,
trailing padding).
Fixes #123183.
Co-authored-by: Ralf Jung <post@ralfj.de>
Fix: On wasm targets, call `panic_in_cleanup` if panic occurs in cleanup
Previously this was not correctly implemented. Each funclet may need its own terminate block, so this changes the `terminate_block` into a `terminate_blocks` `IndexVec` which can have a terminate_block for each funclet. We key on the first basic block of the funclet -- in particular, this is the start block for the old case of the top level terminate function.
I also fixed the `terminate` handler to not be invoked when a foreign exception is raised, mimicking the behavior from msvc. On wasm, in order to avoid generating a `catch_all` we need to call `llvm.wasm.get.exception` and `llvm.wasm.get.ehselector`.
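
The per-funclet caching described above can be sketched roughly as follows. This is an illustrative model only: `TerminateBlocks`, `BasicBlock`, and `BackendBlock` are hypothetical stand-ins (plain integers and a `Vec` instead of rustc's `IndexVec` and backend block handles), and block creation is simulated with a counter.

```rust
/// Hypothetical stand-in for a MIR basic-block index.
type BasicBlock = usize;
/// Hypothetical stand-in for a backend (LLVM) block handle.
type BackendBlock = usize;

/// Lazily created terminate blocks, one slot per MIR basic block,
/// keyed by the first basic block of the funclet that needs one.
pub struct TerminateBlocks {
    slots: Vec<Option<BackendBlock>>,
    next_backend_block: BackendBlock,
}

impl TerminateBlocks {
    pub fn new(num_basic_blocks: usize) -> Self {
        Self { slots: vec![None; num_basic_blocks], next_backend_block: 0 }
    }

    /// Returns the terminate block for the funclet starting at `funclet_start`,
    /// creating it on first use. For the top-level function the key would be
    /// the start block.
    pub fn get_or_create(&mut self, funclet_start: BasicBlock) -> BackendBlock {
        if let Some(existing) = self.slots[funclet_start] {
            return existing;
        }
        // Simulate emitting a fresh block in the backend.
        let created = self.next_backend_block;
        self.next_backend_block += 1;
        self.slots[funclet_start] = Some(created);
        created
    }
}
```

With a single shared slot, two funclets would end up branching to the same terminate block; keying by funclet start gives each funclet its own block while still reusing it across multiple unwind edges inside that funclet.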
Rather than using a catchswitch/catchpad pair, I used a cleanuppad. The reason for the
pair is to avoid catching foreign exceptions on MSVC. On wasm, it seems that the
catchswitch/catchpad pair is optimized back into a single cleanuppad and a catch_all
instruction is emitted which will catch foreign exceptions. Because the new logic is
only used on wasm, it seemed better to take the simpler approach seeing as they do the
same thing.
extend unpin noalias tests to cover mutable references
https://github.com/rust-lang/rust/pull/152946 made a change to the logic for this attribute that the test should have flagged as problematic -- but the test only checked `Box`, not `&mut`, and those have independent code paths. So extend the test to also cover `&mut`.
@b-naber would be nice if you could confirm that the added tests do fail with your PR.
Simplify internals of `{Rc,Arc}::default`
This commit simplifies the internal implementation of `Default` for these two pointer types so that it keeps the same performance characteristics as before (a side effect of changes in rust-lang/rust#131460) while avoiding use of internal private APIs of Rc/Arc. To preserve the same codegen as before, some non-generic functions needed to be tagged as `#[inline]` as well, but otherwise the same IR is produced before and after this change.
The motivation for this commit: I was studying the state of initialization of `Arc` and `Rc` and figured it would be nicer to reduce the use of internal APIs and use public stable APIs where possible, even in the implementation itself.
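
As a point of reference, the observable behaviour of `{Rc,Arc}::default` can be written with public stable APIs alone. The sketch below is not the implementation from this commit (which is not shown here); `default_via_public_api` is a hypothetical helper that only demonstrates the equivalence a caller can rely on.

```rust
use std::rc::Rc;
use std::sync::Arc;

/// Hypothetical helper: builds default-initialized `Rc<T>` and `Arc<T>`
/// using only the public constructors.
pub fn default_via_public_api<T: Default>() -> (Rc<T>, Arc<T>) {
    (Rc::new(T::default()), Arc::new(T::default()))
}
```

For any `T: Default + PartialEq`, the values produced this way compare equal to `Rc::<T>::default()` and `Arc::<T>::default()`; whether the generated IR matches is a separate question that the commit addresses with `#[inline]`.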