Commit Graph

3257 Commits

Author SHA1 Message Date
Jacob Pratt f065c9dab1 Rollup merge of #152132 - folkertdev:carryless-mul, r=Mark-Simulacrum
implement `carryless_mul`

tracking issue: https://github.com/rust-lang/rust/issues/152080
ACP: https://github.com/rust-lang/libs-team/issues/738

This defers to LLVM's `llvm.clmul` when available, and otherwise falls back to a method from the `polyval` crate ([link](https://github.com/RustCrypto/universal-hashes/blob/master/polyval/src/field_element/soft/soft64.rs)).

Some things are missing, which I think we can defer:

- the ACP has some discussion about additional methods, but I'm not sure exactly what is wanted or how to implement it efficiently
- the SIMD intrinsic is not yet `const` (I think I ran into a bootstrapping issue). That is fine for now, I think in `stdarch` we can't really use this intrinsic at the moment, we'd only want the scalar version to replace some riscv intrinsics.
- the SIMD intrinsic is not implemented for the gcc and cranelift backends. That should be reasonably straightforward once we have a const eval implementation though.
2026-02-14 23:17:31 -05:00
Folkert de Vries b935f379b4 implement carryless_mul 2026-02-14 21:23:30 +01:00
Jonathan Brouwer 018a5efcf7 Rename inline_fluent! to msg! 2026-02-14 13:47:52 +01:00
Jacob Pratt 61dee57441 Rollup merge of #152594 - folkertdev:va-list-wasm64, r=alexcrichton
c-variadic: implement `va_arg` for `wasm64`

tracking issue: https://github.com/rust-lang/rust/issues/44930

A LLVM maintainer of the wasm backend confirmed that the behavior on wasm64 is intented: the slot size is 4 on both wasm32 and wasm64. https://github.com/llvm/llvm-project/pull/173580#issuecomment-3900118960

r? workingjubilee
cc @alexcrichton as target maintainer
2026-02-13 22:26:34 -05:00
Jacob Pratt 202f102914 Rollup merge of #152573 - usamoi:escape-2, r=bjorn3
move `escape_symbol_name` to `cg_ssa`

followup of rust-lang/rust#151955

r? @bjorn3
2026-02-13 22:26:33 -05:00
Jacob Pratt 4571317d32 Rollup merge of #151998 - zmodem:naked_visibility.squash, r=saethlin,bjorn3
Set hidden visibility on naked functions in compiler-builtins

88b46460fa made builtin functions hidden, but it doesn't apply to naked functions, which are generated through a different code path.

This was discovered in rust-lang/rust#151486 where aarch64 outline atomics showed up in shared objects, overriding the symbols from compiler-rt.
2026-02-13 22:26:31 -05:00
Folkert de Vries cc9e8c9716 c-variadic: implement va_arg for wasm64 2026-02-14 00:51:10 +01:00
Jonathan Brouwer fd598ebac8 Rollup merge of #152440 - him2him2:fix-typos-compiler, r=Kivooeo,Kobzol
Fix typos and grammar in compiler and build documentation

Fix contextual typos and grammar issues in compiler and build docs that automated spellcheckers miss:

- `compiler/rustc_codegen_llvm/src/debuginfo/doc.md`: fix 5 issues (adaption → adaptation, "allow to" → "allow V-ing" x3, later → latter + plural)
- `compiler/rustc_thread_pool/src/sleep/README.md`: fix 2 issues (idle → active per context, then → than)
- `src/bootstrap/README.md`: fix 2 issues (rust → Rust, add missing article "the")
- `src/ci/docker/README.md`: fix 2 issues ("come form" → "come from", grammar)

No functional changes.
2026-02-13 13:35:02 +01:00
usamoi 4796ff1bf5 move escape_symbol_name to cg_ssa 2026-02-13 20:31:18 +08:00
Stuart Cook 8b036c7b72 Rollup merge of #152486 - fneddy:s390x_simplify_backchain, r=dingxiangfei2009
remove redundant backchain attribute in codegen

llvm will look at both
1. the values of `"target-features"` and
2. the function string attributes.

this patch removes the redundant function string attribute because it is not needed at all. rustc sets the `+backchain` attribute through `target_features_attr(...)`
https://github.com/rust-lang/rust/blob/d34f1f931489618efffc4007e6b6bdb9e10f6467/compiler/rustc_codegen_llvm/src/attributes.rs#L590
https://github.com/rust-lang/rust/blob/d34f1f931489618efffc4007e6b6bdb9e10f6467/compiler/rustc_codegen_llvm/src/attributes.rs#L326-L337
2026-02-13 15:19:13 +11:00
Stuart Cook 1378e9efeb Rollup merge of #152528 - bjorn3:lto_refactors11, r=petrochenkov
Support serializing CodegenContext

Follow up to https://github.com/rust-lang/rust/pull/149209
Part of https://github.com/rust-lang/compiler-team/issues/908
2026-02-13 15:19:10 +11:00
bors bb8b30a5fc Auto merge of #148537 - oli-obk:push-yxuttqrqqyvu, r=dianqk
Start using pattern types in libcore (NonZero and friends)

part of rust-lang/rust#136006 

This PR only changes the internal representation of `NonZero`, `NonMax`, ... and other integral range types in libcore. This subsequently affects other types made up of it, but nothing really changes except that the field of `NonZero` is now accessible safely in contrast to the `rustc_layout_scalar_range_start` attribute, which has all kinds of obscure rules on how to properly access its field.
2026-02-12 13:23:22 +00:00
bjorn3 506ed6dcb7 Remove SelfProfilerRef from CodegenContext
It can't be serialized to a file.
2026-02-12 12:44:14 +00:00
Eddy (Eduard) Stefes c6f57becfc remove redundant backchain attribute in codegen
llvm will look at both
1. the values of "target-features" and
2. the function string attributes.

this removes the redundant function string attribute because it is not needed at all.
rustc sets the `+backchain` attribute through `target_features_attr(...)`
2026-02-12 09:58:25 +01:00
bors 7ad4e69ad5 Auto merge of #152517 - jhpratt:rollup-fGRcId6, r=jhpratt
Rollup of 17 pull requests

Successful merges:

 - rust-lang/rust#142415 (Add note when inherent impl for a alias type defined outside of the crate)
 - rust-lang/rust#142680 (Fix passing/returning structs with the 64-bit SPARC ABI)
 - rust-lang/rust#150768 (Don't compute FnAbi for LLVM intrinsics in backends)
 - rust-lang/rust#151152 (Add FCW for derive helper attributes that will conflict with built-in attributes)
 - rust-lang/rust#151814 (layout: handle rigid aliases without params)
 - rust-lang/rust#151863 (Borrowck: simplify diagnostics for placeholders)
 - rust-lang/rust#152159 (Add note for `?Sized` params in int-ptr casts diag)
 - rust-lang/rust#152434 (Clarify names of `QueryVTable` functions for "executing" a query)
 - rust-lang/rust#152478 (Remove tm_factory field from CodegenContext)
 - rust-lang/rust#152498 (Partially revert "resolve: Update `NameBindingData::vis` in place")
 - rust-lang/rust#152316 (fix: add continue)
 - rust-lang/rust#152394 (Correctly check if a macro call is actually a macro call in rustdoc highlighter)
 - rust-lang/rust#152425 (Port #![test_runner] to the attribute parser)
 - rust-lang/rust#152481 (Use cg_ssa's produce_final_output_artifacts in cg_clif)
 - rust-lang/rust#152485 (fix issue#152482)
 - rust-lang/rust#152495 (Clean up some subdiagnostics)
 - rust-lang/rust#152502 (Implement `BinaryHeap::from_raw_vec`)
2026-02-12 06:57:59 +00:00
Jacob Pratt c41f9343e9 Rollup merge of #152478 - bjorn3:lto_refactors10, r=wesleywiser
Remove tm_factory field from CodegenContext

This is necessary to support serializing the `CodegenContext` to a `.rlink` file in the future for moving LTO to the `-Zlink-only` step.

Follow up to https://github.com/rust-lang/rust/pull/149209
Part of https://github.com/rust-lang/compiler-team/issues/908
2026-02-12 00:41:08 -05:00
Jacob Pratt 0041f221ea Rollup merge of #150768 - bjorn3:llvm_intrinsic_no_fn_abi, r=wesleywiser
Don't compute FnAbi for LLVM intrinsics in backends

~~This removes support for `extern "unadjusted"` for anything other than LLVM intrinsics. It only makes sense in the context of calling LLVM intrinsics anyway as it exposes the way the LLVM backend internally represents types. Perhaps it should be renamed to `extern "llvm-intrinsic"`?~~

Follow up to https://github.com/rust-lang/rust/pull/148533
2026-02-12 00:41:06 -05:00
Lukas Bergdoll 2f3b952349 Stabilize assert_matches 2026-02-11 14:13:44 +01:00
bjorn3 f49223c443 Remove tm_factory field from CodegenContext
This is necessary to support serializing the CodegenContext to a .rlink
file in the future for moving LTO to the -Zlink-only step.
2026-02-11 12:18:04 +00:00
bjorn3 2d07e81a5c Move target machine factory error reporting into codegen backend 2026-02-11 10:53:38 +00:00
bjorn3 70587ce07c Remove a couple of unused errors 2026-02-11 10:39:48 +00:00
ron 11e4873b96 Fix typos and grammar in compiler and build documentation
- compiler/rustc_codegen_llvm/src/debuginfo/doc.md: fix 5 issues
  (adaption → adaptation, "allow to" → "allow V-ing" x3, later → latter + plural)
- compiler/rustc_thread_pool/src/sleep/README.md: fix 2 issues
  (idle → active per context, then → than)
- src/bootstrap/README.md: fix 2 issues (rust → Rust, add missing article "the")
- src/ci/docker/README.md: fix 2 issues ("come form" → "come from", grammar)
2026-02-10 10:22:05 -05:00
Oli Scherer 85e8282fab Start using pattern types in libcore 2026-02-10 11:19:24 +00:00
Jonathan Brouwer 16c7ee5c05 Rollup merge of #151640 - ZuseZ4:cleanup-datatransfer, r=nnethercote
Cleanup offload datatransfer

There are 3 steps to run code on a GPU: Copy data from the host to the device, launch the kernel, and move it back.
At the moment, we have a single variable describing the memory handling to do in each step, but that makes it hard for LLVM's opt pass to understand what's going on. We therefore split it into three variables, each only including the bits relevant for the corresponding stage.

cc @jdoerfert @kevinsala

r? compiler
2026-02-08 19:15:26 +01:00
Manuel Drehwald 6de0591c0b Split ol mapper into more specific to/kernel/from mapper and move init_all_rtls into global ctor 2026-02-07 17:34:39 -08:00
Jonathan Brouwer edd43c9e1f Fix existing messages in the diag structs 2026-02-07 09:11:34 +01:00
许杰友 Jieyou Xu (Joe) 4a1e87c598 Rollup merge of #151955 - usamoi:escape, r=davidtwco
escape symbol names in global asm

copied from https://github.com/llvm/llvm-project/blob/main/llvm/lib/MC/MCSymbol.cpp

closes https://github.com/rust-lang/rust/issues/151950
2026-02-06 10:25:44 +08:00
Jonathan Brouwer 82b5849618 Rollup merge of #150831 - folkertdev:more-va-arg-2, r=workingjubilee
c-variadic: make `va_arg` match on `Arch` exhaustive

tracking issue: https://github.com/rust-lang/rust/issues/44930

Continuing from https://github.com/rust-lang/rust/pull/150094, the more annoying cases remain. These are mostly very niche targets without Clang `va_arg` implementations, and so it might just be easier to defer to LLVM instead of us getting the ABI subtly wrong. That does mean we cannot stabilize c-variadic on those targets I think.

Alternatively we could ask target maintainers to contribute an implementation. I'd honestly prefer they make that change to LVM though (likely by just using `CodeGen::emitVoidPtrVAArg`) that we can mirror.

r? @workingjubilee
2026-02-05 12:16:56 +01:00
Jonathan Brouwer b66ead827c Rollup merge of #152020 - Sa4dUs:offload-remove-dummy-loads, r=ZuseZ4
Remove dummy loads on offload codegen

The current logic generates two dummy loads to prevent some globals from being optimized away. This blocks memtransfer loop hoisting optimizations, so it's time to remove them.

r? @ZuseZ4
2026-02-05 08:32:45 +01:00
Folkert de Vries d2b5ba2ff7 c-variadic: make va_arg match on Arch exhaustive 2026-02-05 00:41:10 +01:00
Folkert de Vries aa0ce237b4 c-variadic: minor cleanups of va_arg 2026-02-05 00:38:08 +01:00
bors db3e99bbab Auto merge of #150605 - RalfJung:fallback-intrinsic-skip, r=mati865
skip codegen for intrinsics with big fallback bodies if backend does not need them

This hopefully fixes the perf regression from https://github.com/rust-lang/rust/pull/148478. I only added the intrinsics with big fallback bodies to the list; it doesn't seem worth the effort of going through the entire list.

Fixes https://github.com/rust-lang/rust/issues/149945
Cc @scottmcm @bjorn3
2026-02-04 17:12:58 +00:00
Marcelo Domínguez 212c8c3811 Remove dummy loads 2026-02-04 15:26:56 +01:00
bjorn3 d2a0557afb Convert to inline diagnostics in all codegen backends 2026-02-04 13:12:49 +00:00
Hans Wennborg baca864856 Set hidden visibility for compiler-builtins statics too. 2026-02-04 13:54:32 +01:00
usamoi b5d9b7f683 escape symbol names in global asm 2026-02-02 15:10:36 +08:00
Stuart Cook 3d102a7812 Rollup merge of #150893 - ZuseZ4:move-un-register-lib, r=oli-obk
offload: move (un)register lib into global_ctors

Right now we initialize the openmp/offload runtime before every single offload call, and tear it down directly afterwards.
What we should rather do is initialize it once in the binary startup code, and tear it down at the end of the binary execution. Here I implement these changes.

Together, our generated IR has a lot less usage of globals, which in turn simplifies the refactoring in https://github.com/rust-lang/rust/pull/150683, where I introduce a new variant of our offload intrinsic.

r? oli-obk
2026-01-28 19:03:51 +11:00
Manuel Drehwald 1f11bf6649 Leave note to drop tgt_init_all_rtls in the future 2026-01-27 10:43:22 -08:00
Manuel Drehwald 7eae36f017 Add an early return if handling multiple offload calls 2026-01-27 10:43:03 -08:00
bors 873d4682c7 Auto merge of #151337 - the8472:bail-before-memcpy2, r=Mark-Simulacrum
optimize `vec.extend(slice.to_vec())`, take 2

Redoing https://github.com/rust-lang/rust/pull/130998
It was reverted in https://github.com/rust-lang/rust/pull/151150 due to flakiness. I have traced this to layout randomization perturbing the test (the failure reproduces locally with layout randomization), which is now excluded.
2026-01-25 19:45:35 +00:00
bors 75963ce795 Auto merge of #151065 - nagisa:add-preserve-none-abi, r=petrochenkov
abi: add a rust-preserve-none calling convention

This is the conceptual opposite of the rust-cold calling convention and is particularly useful in combination with the new `explicit_tail_calls` feature.

For relatively tight loops implemented with tail calling (`become`) each of the function with the regular calling convention is still responsible for restoring the initial value of the preserved registers. So it is not unusual to end up with a situation where each step in the tail call loop is spilling and reloading registers, along the lines of:

    foo:
        push r12
        ; do things
        pop r12
        jmp next_step

This adds up quickly, especially when most of the clobberable registers are already used to pass arguments or other uses.

I was thinking of making the name of this ABI a little less LLVM-derived and more like a conceptual inverse of `rust-cold`, but could not come with a great name (`rust-cold` is itself not a great name: cold in what context? from which perspective? is it supposed to mean that the function is rarely called?)
2026-01-25 02:49:32 +00:00
Matthias Krüger 3a69035338 Rollup merge of #151346 - folkertdev:simd-splat, r=workingjubilee
add `simd_splat` intrinsic

Add `simd_splat` which lowers to the LLVM canonical splat sequence.

```llvm
insertelement <N x elem> poison, elem %x, i32 0
shufflevector <N x elem> v0, <N x elem> poison, <N x i32> zeroinitializer
```

Right now we try to fake it using one of

```rust
fn splat(x: u32) -> u32x8 {
    u32x8::from_array([x; 8])
}
```

or (in `stdarch`)

```rust
fn splat(value: $elem_type) -> $name {
    #[derive(Copy, Clone)]
    #[repr(simd)]
    struct JustOne([$elem_type; 1]);
    let one = JustOne([value]);
    // SAFETY: 0 is always in-bounds because we're shuffling
    // a simd type with exactly one element.
    unsafe { simd_shuffle!(one, one, [0; $len]) }
}
```

Both of these can confuse the LLVM optimizer, producing sub-par code. Some examples:

- https://github.com/rust-lang/rust/issues/60637
- https://github.com/rust-lang/rust/issues/137407
- https://github.com/rust-lang/rust/issues/122623
- https://github.com/rust-lang/rust/issues/97804

---

As far as I can tell there is no way to provide a fallback implementation for this intrinsic, because there is no `const` way of evaluating the number of elements (there might be issues beyond that, too). So, I added implementations for all 4 backends.

Both GCC and const-eval appear to have some issues with simd vectors containing pointers. I have a workaround for GCC, but haven't yet been able to make const-eval work. See the comments below.

Currently this just adds the intrinsic, it does not actually use it anywhere yet.
2026-01-24 21:04:15 +01:00
Simonas Kazlauskas 6db94dbc25 abi: add a rust-preserve-none calling convention
This is the conceptual opposite of the rust-cold calling convention and
is particularly useful in combination with the new `explicit_tail_calls`
feature.

For relatively tight loops implemented with tail calling (`become`) each
of the function with the regular calling convention is still responsible
for restoring the initial value of the preserved registers. So it is not
unusual to end up with a situation where each step in the tail call loop
is spilling and reloading registers, along the lines of:

    foo:
        push r12
        ; do things
        pop r12
        jmp next_step

This adds up quickly, especially when most of the clobberable registers
are already used to pass arguments or other uses.

I was thinking of making the name of this ABI a little less LLVM-derived
and more like a conceptual inverse of `rust-cold`, but could not come
with a great name (`rust-cold` is itself not a great name: cold in what
context? from which perspective? is it supposed to mean that the
function is rarely called?)
2026-01-24 19:23:17 +02:00
Jonathan Brouwer 48b9a6c298 Rollup merge of #151527 - tgross35:f16-fixme-cleanup, r=folkertdev
Clean up or resolve cfg-related instances of `FIXME(f16_f128)`

* Replace target-specific config that has a `FIXME` with `cfg(target_has_reliable_f*)`
* Take care of trivial intrinsic-related FIXMEs
* Split `FIXME(f16_f128)` into `FIXME(f16)`, `FIXME(f128)`, or `FIXME(f16,f128)` to more clearly identify what they block

The individual commit messages have more details.
2026-01-23 11:07:57 +01:00
Jonathan Brouwer dec8d6ebcf Rollup merge of #150780 - fzakaria:fzakaria/section-threshold, r=jackh726
Add -Z large-data-threshold

This flag allows specifying the threshold size for placing static data in large data sections when using the medium code model on x86-64.

When using -Ccode-model=medium, data smaller than this threshold uses RIP-relative addressing (32-bit offsets), while larger data uses absolute 64-bit addressing. This allows the compiler to generate more efficient code for smaller data while still supporting data larger than 2GB.

This mirrors the -mlarge-data-threshold flag available in GCC and Clang. The default threshold is 65536 bytes (64KB) if not specified, matching LLVM's default behavior.
2026-01-23 11:07:55 +01:00
Trevor Gross 490b307740 cleanup: Start splitting FIXME(f16_f128) into f16, f128, or f16,f128
Make it easier to identify which FIXMEs are blocking stabilization of
which type.
2026-01-22 23:41:57 -06:00
Jonathan Brouwer 704eaef9d4 Rollup merge of #151465 - RalfJung:fn-call-vars, r=mati865
codegen: clarify some variable names around function calls

I looked at rust-lang/rust#145932 to try to understand how it works, and quickly got lost in the variable names -- what refers to the caller, what to the callee? So here's my attempt at making those more clear. Hopefully the new names are correct.^^

Cc @JamieCunliffe
2026-01-22 13:35:42 +01:00
Jacob Pratt 512cc8d785 Rollup merge of #151442 - clubby789:crate-type-port, r=JonathanBrouwer
Port `#![crate_type]` to the attribute parser

Tracking issue: https://github.com/rust-lang/rust/issues/131229

~~Note that the actual parsing that is used in the compiler session is unchanged, as it must happen very early on; this just ports the validation logic.~~

Also added `// tidy-alphabetical-start` to `check_attr.rs` to make it a bit less conflict-prone
2026-01-22 00:37:43 -05:00
Jamie Hill-Daniel 66b78b700b Port crate_type to attribute parser 2026-01-22 02:34:28 +00:00
León Orell Valerian Liehr 558a59258e Support debuginfo for assoc const bindings 2026-01-21 18:52:08 +01:00