Commit Graph

506 Commits

Author SHA1 Message Date
Ralf Jung c7220f423b rename min/maxnum intrinsics to min/maximum_number and fix their LLVM lowering 2026-03-15 14:53:00 +01:00
bjorn3 474a7168ab Remove explicit EmitThinLTOSummary argument
In favor of passing a NULL ThinLTOSummaryBufferRef. And improve type
improve type safety on the Rust side.
2026-02-21 11:47:45 +00:00
bjorn3 a086b3617e Remove ModuleBuffer ThinBuffer duplication 2026-02-21 11:47:45 +00:00
bjorn3 a5372d1dba Replace LLVMRustThinLTOBuffer with separate LLVMRustBuffers for bitcode and summary 2026-02-21 11:47:45 +00:00
bjorn3 8b2c10ff82 Replace LLVMRustModuleBuffer with generic LLVMRustBuffer 2026-02-21 11:47:45 +00:00
bjorn3 6366a698e3 Remove -Zemit-thin-lto flag
As far as I can tell it was introduced to allow fat LTO with
-Clinker-plugin-lto. Later a change was made to automatically disable
ThinLTO summary generation when -Clinker-plugin-lto -Clto=fat is used,
so we can safely remove it.
2026-02-20 12:19:41 +00:00
Jana Dönszelmann 006588a3ae codegen-llvm shim function for function aliases for macos 2026-02-17 14:22:52 +01:00
bors 75963ce795 Auto merge of #151065 - nagisa:add-preserve-none-abi, r=petrochenkov
abi: add a rust-preserve-none calling convention

This is the conceptual opposite of the rust-cold calling convention and is particularly useful in combination with the new `explicit_tail_calls` feature.

For relatively tight loops implemented with tail calling (`become`) each of the function with the regular calling convention is still responsible for restoring the initial value of the preserved registers. So it is not unusual to end up with a situation where each step in the tail call loop is spilling and reloading registers, along the lines of:

    foo:
        push r12
        ; do things
        pop r12
        jmp next_step

This adds up quickly, especially when most of the clobberable registers are already used to pass arguments or other uses.

I was thinking of making the name of this ABI a little less LLVM-derived and more like a conceptual inverse of `rust-cold`, but could not come with a great name (`rust-cold` is itself not a great name: cold in what context? from which perspective? is it supposed to mean that the function is rarely called?)
2026-01-25 02:49:32 +00:00
Simonas Kazlauskas 6db94dbc25 abi: add a rust-preserve-none calling convention
This is the conceptual opposite of the rust-cold calling convention and
is particularly useful in combination with the new `explicit_tail_calls`
feature.

For relatively tight loops implemented with tail calling (`become`) each
of the function with the regular calling convention is still responsible
for restoring the initial value of the preserved registers. So it is not
unusual to end up with a situation where each step in the tail call loop
is spilling and reloading registers, along the lines of:

    foo:
        push r12
        ; do things
        pop r12
        jmp next_step

This adds up quickly, especially when most of the clobberable registers
are already used to pass arguments or other uses.

I was thinking of making the name of this ABI a little less LLVM-derived
and more like a conceptual inverse of `rust-cold`, but could not come
with a great name (`rust-cold` is itself not a great name: cold in what
context? from which perspective? is it supposed to mean that the
function is rarely called?)
2026-01-24 19:23:17 +02:00
Jonathan Brouwer dec8d6ebcf Rollup merge of #150780 - fzakaria:fzakaria/section-threshold, r=jackh726
Add -Z large-data-threshold

This flag allows specifying the threshold size for placing static data in large data sections when using the medium code model on x86-64.

When using -Ccode-model=medium, data smaller than this threshold uses RIP-relative addressing (32-bit offsets), while larger data uses absolute 64-bit addressing. This allows the compiler to generate more efficient code for smaller data while still supporting data larger than 2GB.

This mirrors the -mlarge-data-threshold flag available in GCC and Clang. The default threshold is 65536 bytes (64KB) if not specified, matching LLVM's default behavior.
2026-01-23 11:07:55 +01:00
Nikita Popov 0be66603ac Avoid passing addrspacecast to lifetime intrinsics
Since LLVM 22 the alloca must be passed directly. Do this by
stripping the addrspacecast if it exists.
2026-01-20 14:47:04 +01:00
Stuart Cook 1262ff906b Rollup merge of #150288 - offload-bench-fix, r=ZuseZ4
Add scalar support for offload

This PR adds scalar support to the offload feature. The scalar management has two main parts:

On the host side, each scalar arg is casted to `ix` type, zero extended to `i64` and passed to the kernel like that.
On the device, the each scalar arg (`i64` at that point), is truncated to `ix` and then casted to the original type.

r? @ZuseZ4
2026-01-20 18:00:08 +11:00
Marcelo Domínguez 307a4fcdf8 Add scalar support for both host and device 2026-01-19 22:28:42 +01:00
Farid Zakaria 93f2e80f4a Add -Z large-data-threshold
This flag allows specifying the threshold size for placing static data
in large data sections when using the medium code model on x86-64.

When using -Ccode-model=medium, data smaller than this threshold uses
RIP-relative addressing (32-bit offsets), while larger data uses
absolute 64-bit addressing. This allows the compiler to generate more
efficient code for smaller data while still supporting data larger than
2GB.

This mirrors the -mlarge-data-threshold flag available in GCC and Clang.
The default threshold is 65536 bytes (64KB) if not specified, matching
LLVM's default behavior.
2026-01-07 11:57:48 -08:00
sgasho 14ac6a1d3a Modified to output error messages appropriate to the situation 2026-01-07 00:33:57 +09:00
Jonathan Brouwer d898dccc21 Rollup merge of #150511 - Sa4dUs:offload-inline, r=ZuseZ4
Allow inline calls to offload intrinsic

Removes explicit insertion point handling and recovers the pointer at the end of the saved basic block.

r? `@ZuseZ4`

fixes: https://github.com/rust-lang/rust/issues/150413
2025-12-31 14:30:48 +01:00
Marcelo Domínguez 9d8b4cc70d Restore builder at the end of saved bb 2025-12-31 13:10:29 +01:00
dianqk fe075ad212 Removes the serde dependency in rustc_codegen_llvm 2025-12-28 15:52:20 +08:00
Manuel Drehwald dfef2e96fe Remove the need to call clang for std::offload usages 2025-12-23 05:20:07 -08:00
bors 5c53093374 Auto merge of #150133 - ZuseZ4:enzyme-frontend-nightly, r=jieyouxu
remove llvm_enzyme and enzyme fallbacks from most places

Using dlopen to get symbols has the nice benefit that rustc itself doesn't depend on libenzyme symbols anymore. We can therefore delete most fallback implementations in the backend (independently of whether we enable enzyme or not). When trying to use autodiff on nightly, we will now fail with a nice error if and only if we fail to load libEnzyme-21.so in our backend.

Verified:
Build as nightly, without Enzyme
Build as nightly, with Enzyme
Build as stable (without Enzyme)

With this PR we will now run `tests/ui/autodiff` on nightly, the tests are passing.

r? `@kobzol`
2025-12-23 02:49:04 +00:00
Manuel Drehwald c34ea6e56d remove llvm_enzyme and enzyme fallbacks from most places, enable the autodiff frontend on nightly 2025-12-19 11:02:57 -08:00
Zalathar 735a980693 Move DIBuilderBox out of ffi.rs 2025-12-19 12:32:49 +11:00
Zalathar c2f8ee9bba Remove inherent methods from llvm::TypeKind 2025-12-19 12:32:49 +11:00
Jacob Pratt 641100c391 Rollup merge of #150060 - ZuseZ4:autodiff-dlopen-ice, r=Kobzol
autodiff: emit an error if we fail to find libEnzyme

Tested manually by moving libEnzyme-21.so away. We should adjust the error msg. once we have the component up.

It's the first usage within rustc of this experimental feature, but afaik we're open to dogfooding those for test purpose, right?

r? ``@Kobzol``
2025-12-16 23:10:10 -05:00
Manuel Drehwald 793d990d11 Emit a proper error if we fail to find libEnzyme 2025-12-16 21:33:28 +01:00
David Wood a56b1b9283 codegen: implement repr(scalable)
Introduces `BackendRepr::ScalableVector` corresponding to scalable
vector types annotated with `repr(scalable)` which lowers to a scalable
vector type in LLVM.

Co-authored-by: Jamie Cunliffe <Jamie.Cunliffe@arm.com>
2025-12-16 11:00:12 +00:00
sgasho ddd5aad8a3 feat: dlopen Enzyme 2025-12-16 00:31:32 +09:00
Jana Dönszelmann 9bd6b7ff72 EII: generate aliases for implementations 2025-12-12 11:32:29 +01:00
Alina Sbirlea ad73972e99 Fix for LLVM22 making lowering decisions dependent on RuntimeLibraryInfo.
LLVM reference commit:
https://github.com/llvm/llvm-project/commit/04c81a99735c04b2018eeb687e74f9860e1d0e1b.
2025-12-04 20:23:00 +00:00
Stuart Cook 2b150f2c65 Rollup merge of #147936 - Sa4dUs:offload-intrinsic, r=ZuseZ4
Offload intrinsic

This PR implements the minimal mechanisms required to run a small subset of arbitrary offload kernels without relying on hardcoded names or metadata.

- `offload(kernel, (..args))`: an intrinsic that generates the necessary host-side LLVM-IR code.
- `rustc_offload_kernel`: a builtin attribute that marks device kernels to be handled appropriately.

Example usage (pseudocode):
```rust
fn kernel(x: *mut [f64; 128]) {
    core::intrinsics::offload(kernel_1, (x,))
}

#[cfg(target_os = "linux")]
extern "C" {
    pub fn kernel_1(array_b: *mut [f64; 128]);
}

#[cfg(not(target_os = "linux"))]
#[rustc_offload_kernel]
extern "gpu-kernel" fn kernel_1(x: *mut [f64; 128]) {
    unsafe { (*x)[0] = 21.0 };
}
```
2025-11-26 23:32:03 +11:00
Marcelo Domínguez 5128ce10a0 Implement offload intrinsic 2025-11-25 20:04:27 +01:00
Manuel Drehwald 5fbe5dae42 Only try to link against offload functions if llvm.enzyme is enabled 2025-11-23 00:19:53 -08:00
Manuel Drehwald 89d50591c0 Replace the first of 4 binary invocations for offload 2025-11-21 02:41:17 -08:00
Quinn Okabayashi c7e50d0f37 Remove unused LLVMModuleRef argument 2025-11-12 15:46:08 +00:00
bors 87f9dcd5e2 Auto merge of #147935 - luca3s:add-rtsan, r=petrochenkov
Add LLVM realtime sanitizer

This is a new attempt at adding the [LLVM real-time sanitizer](https://clang.llvm.org/docs/RealtimeSanitizer.html) to rust.

Previously this was attempted in https://github.com/rust-lang/rfcs/pull/3766.

Since then the `sanitize` attribute was introduced in https://github.com/rust-lang/rust/pull/142681 and it is a lot more flexible than the old `no_santize` attribute. This allows adding real-time sanitizer without the need for a new attribute, like it was proposed in the RFC. Because i only add a new value to a existing command line flag and to a attribute i don't think an MCP is necessary.

Currently real-time santizer is usable in rust code with the [rtsan-standalone](https://crates.io/crates/rtsan-standalone) crate. This downloads or builds the sanitizer runtime and then links it into the rust binary.

The first commit adds support for more detailed sanitizer information.
The second commit then actually adds real-time sanitizer.
The third adds a warning against using real-time sanitizer with async functions, cloures and blocks because it doesn't behave as expected when used with async functions. I am not sure if this is actually wanted, so i kept it in a seperate commit.
The fourth commit adds the documentation for real-time sanitizer.
2025-11-08 12:24:15 +00:00
Lucas Baumann d198633b95 add realtime sanitizer 2025-11-06 13:20:12 +01:00
Manuel Drehwald 360b38cceb Fix device code generation, to account for an implicit dyn_ptr argument. 2025-11-06 03:34:38 -05:00
Matthias Krüger 3d671c0d54 Rollup merge of #148103 - Zalathar:compression, r=wesleywiser
cg_llvm: Pass `debuginfo_compression` through FFI as an enum

There are only three possible values, making an enum more appropriate.

This avoids string allocation on the Rust side, and avoids ad-hoc `!strcmp` to convert back to an enum on the C++ side.
2025-10-31 18:41:51 +01:00
Tomasz Miąsko 2a03a948b9 Deduce captures(none) for a return place and parameters
Extend attribute deduction to determine whether parameters using
indirect pass mode might have their address captured. Similarly to
the deduction of `readonly` attribute this information facilitates
memcpy optimizations.
2025-10-25 22:53:52 +02:00
Zalathar 73b734bf63 Pass debuginfo_compression through FFI as an enum 2025-10-25 23:58:19 +11:00
bors 96fe3c31c2 Auto merge of #147022 - Zalathar:no-args, r=wesleywiser
Remove current code for embedding command-line args in PDB

The compiler currently has code that will obtain a list of quoted command-line arguments, and pass it through to TargetMachine creation, so that the command-line args can be embedded in PDB output.

This PR removes that code, due to subtle concerns that might not have been apparent when it was originally added.

---

Those concerns include:
- The entire command-line quoting process is repeated every time a target-machine-factory is created. In incremental builds this typically occurs 500+ times, instead of happening only once. The repeated quoting constitutes a large chunk of instructions executed in the `large-workspace` benchmark.
  - See https://github.com/rust-lang/rust/pull/146804#issuecomment-3317322958 for an example of the perf consequences of skipping all that work.
  - This overhead occurs even when building for targets or configurations that don't emit PDB output.
- Command-line arguments are obtained in a way that completely bypasses the query system, which is a problem for the integrity of incremental compilation.
  - Fixing this alone is likely to inhibit incremental rebuilds for most or all CGUs, even in builds that don't emit PDB output.
- Command-line arguments and the executable path are obtained in a way that completely bypasses the compiler's path-remapping system, which is a reproducibility hazard.
  - https://github.com/rust-lang/rust/issues/128842

---

Relevant PRs:
- https://github.com/rust-lang/rust/pull/113492
- https://github.com/rust-lang/rust/pull/130446
- https://github.com/rust-lang/rust/pull/131805
- https://github.com/rust-lang/rust/pull/146700
- https://github.com/rust-lang/rust/pull/146973

Zulip thread:
- https://rust-lang.zulipchat.com/#narrow/channel/131828-t-compiler/topic/Some.20PDB.20info.20bypasses.20the.20query.20system.20and.20path.20remapping/with/541432211

---

According to rust-lang/rust#96475, one of the big motivations for embedding the command-line arguments was to enable tools like Live++. [It appears that Live++ doesn't actually support Rust yet](https://rust-lang.zulipchat.com/#narrow/channel/131828-t-compiler/topic/embeded.20compiler.20args.20and.20--remap-path-prefix/near/523800010), so it's possible that there aren't any existing workflows for this removal to break.

In the future, there could be a case for reintroducing some or all of this functionality, guarded behind an opt-in flag so that it doesn't cause problems for other users. But as it stands, the current implementation puts a disproportionate burden on other users and on compiler maintainers.
2025-10-22 00:21:08 +00:00
bors fd847d4d5d Auto merge of #142696 - ZuseZ4:offload-device1, r=oli-obk
Offload host2

r? `@oli-obk`

A follow-up to my previous gpu host PR. With this, I can (in theory) run a sufficiently simple Rust function on GPUs. I tested it on AMD, where the amdgcn tartget of rustc causes issues due to Addressspace castings, which might not be valid. If I (manually) fix them, I can run the generated IR on an AMD GPU. This should conceptually also work on NVIDIA or Intel. I updated the dev-guide acordingly: https://rustc-dev-guide.rust-lang.org/offload/usage.html

I am unhappy with the amount of standalone functions in my offload code, so in my second commit I bundled some of the code around two structs which are Rust versions of the LLVM/Offload structs which they represent. The structs themselves only have doc comments. Since I directly lower everything to llvm-ir I didn't saw a big value in modelling the struct member variables.
2025-10-20 10:17:29 +00:00
Manuel Drehwald b56d555a36 fix host code 2025-10-19 09:28:39 -07:00
Zalathar 98c95c966b Remove current code for embedding command-line args in PDB 2025-10-18 12:24:40 +11:00
Guillaume Gomez 3938f42bb1 Rollup merge of #147608 - Zalathar:debuginfo, r=nnethercote
cg_llvm: Use `LLVMDIBuilderCreateGlobalVariableExpression`

- Part of rust-lang/rust#134001
- Follow-up to rust-lang/rust#146763

---

This PR dismantles the somewhat complicated `LLVMRustDIBuilderCreateStaticVariable` function, and replaces it with equivalent calls to `LLVMDIBuilderCreateGlobalVariableExpression` and `LLVMGlobalSetMetadata`.

A key difference is that the new code does not replicate the attempted downcast of `InitVal`. As far as I can tell, those downcasts were actually dead, because `llvm::ConstantInt` and `llvm::ConstantFP` are not subclasses of `llvm::GlobalVariable`. I tried replacing those code paths with fatal errors, and was unable to induce failure in any of the relevant test suites I ran.

I have also confirmed that if the calls to `create_static_variable` are commented out, debuginfo tests will fail, demonstrating some amount of relevant test coverage.

The new `DIBuilder` methods have been added via an extension trait, not as inherent methods, to avoid impeding rust-lang/rust#142897.
2025-10-13 11:25:23 +02:00
Zalathar 1081d98551 Use LLVMDIBuilderCreateGlobalVariableExpression
Note that the code in `LLVMRustDIBuilderCreateStaticVariable` that tried to
downcast `InitVal` appears to have been dead, because `llvm::ConstantInt` and
`llvm::ConstantFP` are not subclasses of `llvm::GlobalVariable`.
2025-10-12 23:36:26 +11:00
AMS21 0abecda9ed Replace LLVMRustContextCreate with normal LLVM-C API calls
Since `LLVMRustContextCreate` can easily be replaced with a call to
`LLVMContextCreate` and `LLVMContextSetDiscardValueNames`.
2025-10-10 15:45:40 +02:00
bors 4b57d8154a Auto merge of #147519 - Zalathar:rollup-o5f16uo, r=Zalathar
Rollup of 3 pull requests

Successful merges:

 - rust-lang/rust#147446 (PassWrapper: use non-deprecated lookupTarget method)
 - rust-lang/rust#147473 (Do `x check` on various bootstrap tools in CI)
 - rust-lang/rust#147509 (remove intrinsic wrapper functions from LLVM bindings)

r? `@ghost`
`@rustbot` modify labels: rollup
2025-10-09 10:54:43 +00:00
Stuart Cook 4dfd977c8b Rollup merge of #147488 - AMS21:remove_llvm_rust_insert_private_global, r=nikic
refactor: Remove `LLVMRustInsertPrivateGlobal` and `define_private_global`

Since it can easily be implemented using the existing LLVM C API in
terms of `LLVMAddGlobal` and `LLVMSetLinkage` and `define_private_global`
was only used in one place.

Work towards https://github.com/rust-lang/rust/issues/46437
2025-10-09 18:43:26 +11:00
AMS21 064e3b8212 remove intrinsic wrapper functions from LLVM bindings 2025-10-09 09:26:44 +02:00