Commit Graph

458 Commits

Author SHA1 Message Date
Jonathan Brouwer dec8d6ebcf Rollup merge of #150780 - fzakaria:fzakaria/section-threshold, r=jackh726
Add -Z large-data-threshold

This flag allows specifying the threshold size for placing static data in large data sections when using the medium code model on x86-64.

When using -Ccode-model=medium, data smaller than this threshold uses RIP-relative addressing (32-bit offsets), while larger data uses absolute 64-bit addressing. This allows the compiler to generate more efficient code for smaller data while still supporting data larger than 2GB.

This mirrors the -mlarge-data-threshold flag available in GCC and Clang. The default threshold is 65536 bytes (64KB) if not specified, matching LLVM's default behavior.
2026-01-23 11:07:55 +01:00
Jacob Pratt 2206d935f7 Rollup merge of #149209 - lto_refactors8, r=jackh726
Move LTO to OngoingCodegen::join

This will make it easier to in the future move all this code to link_binary.

Follow up to https://github.com/rust-lang/rust/pull/147810
Part of https://github.com/rust-lang/compiler-team/issues/908
2026-01-21 02:04:01 -05:00
Marcelo Domínguez 307a4fcdf8 Add scalar support for both host and device 2026-01-19 22:28:42 +01:00
Farid Zakaria 93f2e80f4a Add -Z large-data-threshold
This flag allows specifying the threshold size for placing static data
in large data sections when using the medium code model on x86-64.

When using -Ccode-model=medium, data smaller than this threshold uses
RIP-relative addressing (32-bit offsets), while larger data uses
absolute 64-bit addressing. This allows the compiler to generate more
efficient code for smaller data while still supporting data larger than
2GB.

This mirrors the -mlarge-data-threshold flag available in GCC and Clang.
The default threshold is 65536 bytes (64KB) if not specified, matching
LLVM's default behavior.
2026-01-07 11:57:48 -08:00
Wesley Wiser f74896fc01 Cleanup debuginfo_compression unstable flag
It isn't necessary to declare the option as a top-level flag when
it is accessible from `unstable_opts`.
2026-01-01 19:30:02 -06:00
dianqk fe075ad212 Removes the serde dependency in rustc_codegen_llvm 2025-12-28 15:52:20 +08:00
Manuel Drehwald 3fdc6da2aa adding proper error handling for offload 2025-12-23 05:20:11 -08:00
Manuel Drehwald dfef2e96fe Remove the need to call clang for std::offload usages 2025-12-23 05:20:07 -08:00
bors 5c53093374 Auto merge of #150133 - ZuseZ4:enzyme-frontend-nightly, r=jieyouxu
remove llvm_enzyme and enzyme fallbacks from most places

Using dlopen to get symbols has the nice benefit that rustc itself doesn't depend on libenzyme symbols anymore. We can therefore delete most fallback implementations in the backend (independently of whether we enable enzyme or not). When trying to use autodiff on nightly, we will now fail with a nice error if and only if we fail to load libEnzyme-21.so in our backend.

Verified:
Build as nightly, without Enzyme
Build as nightly, with Enzyme
Build as stable (without Enzyme)

With this PR we will now run `tests/ui/autodiff` on nightly, the tests are passing.

r? `@kobzol`
2025-12-23 02:49:04 +00:00
Matthias Krüger 508c382080 Rollup merge of #149788 - Sa4dUs:offload-cleanup, r=ZuseZ4
Move shared offload globals and define per-kernel globals once

This PR moves the shared LLVM global variables logic out of the `offload` intrinsic codegen and generates kernel-specific variables only ont he first call of the intrinsic.

r? `@ZuseZ4`

tracking:
- https://github.com/rust-lang/rust/issues/131513
2025-12-19 23:38:57 +01:00
Manuel Drehwald c34ea6e56d remove llvm_enzyme and enzyme fallbacks from most places, enable the autodiff frontend on nightly 2025-12-19 11:02:57 -08:00
Marcelo Domínguez 8bafb63202 Remove outdated comment 2025-12-19 13:27:14 +01:00
sgasho 4d12cb0fb8 refactor: initialize EnzymeWrapper in LlvmCodegenBackend::init 2025-12-16 00:32:25 +09:00
sgasho ddd5aad8a3 feat: dlopen Enzyme 2025-12-16 00:31:32 +09:00
Urgau 8cbfb26383 Overhaul filename handling for cross-compiler consistency
This commit refactors `SourceMap` and most importantly `RealFileName` to
make it self-contained in order to achieve cross-compiler consistency.

This is achieved:
 - by making `RealFileName` immutable
 - by only having `SourceMap::to_real_filename` create `RealFileName`
 - by also making `RealFileName` holds it's working directory,
   it's embeddable name and the remapped scopes
 - by making most `FileName` and `RealFileName` methods take a scope as
   an argument

In order for `SourceMap::to_real_filename` to know which scopes to apply
`FilePathMapping` now takes the current remapping scopes to apply, which
makes `FileNameDisplayPreference` and company useless and are removed.

The scopes type `RemapPathScopeComponents` was moved from
`rustc_session::config` to `rustc_span`.

The previous system for scoping the local/remapped filenames
`RemapFileNameExt::for_scope` is no longer useful as it's replaced by
methods on `FileName` and `RealFileName`.
2025-12-12 07:33:09 +01:00
Alina Sbirlea ad73972e99 Fix for LLVM22 making lowering decisions dependent on RuntimeLibraryInfo.
LLVM reference commit:
https://github.com/llvm/llvm-project/commit/04c81a99735c04b2018eeb687e74f9860e1d0e1b.
2025-12-04 20:23:00 +00:00
Stuart Cook 2b150f2c65 Rollup merge of #147936 - Sa4dUs:offload-intrinsic, r=ZuseZ4
Offload intrinsic

This PR implements the minimal mechanisms required to run a small subset of arbitrary offload kernels without relying on hardcoded names or metadata.

- `offload(kernel, (..args))`: an intrinsic that generates the necessary host-side LLVM-IR code.
- `rustc_offload_kernel`: a builtin attribute that marks device kernels to be handled appropriately.

Example usage (pseudocode):
```rust
fn kernel(x: *mut [f64; 128]) {
    core::intrinsics::offload(kernel_1, (x,))
}

#[cfg(target_os = "linux")]
extern "C" {
    pub fn kernel_1(array_b: *mut [f64; 128]);
}

#[cfg(not(target_os = "linux"))]
#[rustc_offload_kernel]
extern "gpu-kernel" fn kernel_1(x: *mut [f64; 128]) {
    unsafe { (*x)[0] = 21.0 };
}
```
2025-11-26 23:32:03 +11:00
Marcelo Domínguez 5128ce10a0 Implement offload intrinsic 2025-11-25 20:04:27 +01:00
bors 23f708107b Auto merge of #149170 - ZuseZ4:automate-offload-packager, r=oli-obk
automate gpu offloading - part 1

Automates step 1 from the rustc-dev-guide offload section:
https://rustc-dev-guide.rust-lang.org/offload/usage.html#compile-instructions
`"clang-offload-packager" "-o" "host.out" "--image=file=device.bc,triple=amdgcn-amd-amdhsa,arch=gfx90a,kind=openmp"`

Verified on an MI 250X

cc `@jhuber6,` `@kevinsala,` `@jdoerfert,` `@Sa4dUs`

r? oli-obk
2025-11-23 10:45:30 +00:00
bjorn3 2d7c571391 Remove SharedEmitter from CodegenContext 2025-11-23 10:33:52 +00:00
bjorn3 b93b4b003e Remove opts field from CodegenContext 2025-11-23 10:33:52 +00:00
bjorn3 7f7b3488c0 Introduce InlineAsmError type 2025-11-21 14:16:12 +00:00
Manuel Drehwald 89d50591c0 Replace the first of 4 binary invocations for offload 2025-11-21 02:41:17 -08:00
Quinn Okabayashi c7e50d0f37 Remove unused LLVMModuleRef argument 2025-11-12 15:46:08 +00:00
bors 87f9dcd5e2 Auto merge of #147935 - luca3s:add-rtsan, r=petrochenkov
Add LLVM realtime sanitizer

This is a new attempt at adding the [LLVM real-time sanitizer](https://clang.llvm.org/docs/RealtimeSanitizer.html) to rust.

Previously this was attempted in https://github.com/rust-lang/rfcs/pull/3766.

Since then the `sanitize` attribute was introduced in https://github.com/rust-lang/rust/pull/142681 and it is a lot more flexible than the old `no_santize` attribute. This allows adding real-time sanitizer without the need for a new attribute, like it was proposed in the RFC. Because i only add a new value to a existing command line flag and to a attribute i don't think an MCP is necessary.

Currently real-time santizer is usable in rust code with the [rtsan-standalone](https://crates.io/crates/rtsan-standalone) crate. This downloads or builds the sanitizer runtime and then links it into the rust binary.

The first commit adds support for more detailed sanitizer information.
The second commit then actually adds real-time sanitizer.
The third adds a warning against using real-time sanitizer with async functions, cloures and blocks because it doesn't behave as expected when used with async functions. I am not sure if this is actually wanted, so i kept it in a seperate commit.
The fourth commit adds the documentation for real-time sanitizer.
2025-11-08 12:24:15 +00:00
Lucas Baumann d198633b95 add realtime sanitizer 2025-11-06 13:20:12 +01:00
Manuel Drehwald 360b38cceb Fix device code generation, to account for an implicit dyn_ptr argument. 2025-11-06 03:34:38 -05:00
bors b01cc1cf01 Auto merge of #148516 - bjorn3:target_feature_parsing_improvements, r=WaffleLapkin
Move warning reporting from flag_to_backend_features to cfg_target_feature

This way warnings are emitted even in a check build.
2025-11-05 17:56:16 +00:00
bjorn3 1d34478147 Move warning reporting from flag_to_backend_features to cfg_target_feature
This way warnings are emitted even in a check build.
2025-11-05 10:48:29 +00:00
bors 8e0b68e63c Auto merge of #148507 - Zalathar:rollup-vvz4knr, r=Zalathar
Rollup of 6 pull requests

Successful merges:

 - rust-lang/rust#147355 (Add alignment parameter to `simd_masked_{load,store}`)
 - rust-lang/rust#147925 (Fix tests for big-endian)
 - rust-lang/rust#148341 (compiler: Fix a couple issues around cargo feature unification)
 - rust-lang/rust#148371 (Dogfood `trim_{suffix|prefix}` in compiler)
 - rust-lang/rust#148495 (Implement Path::is_empty)
 - rust-lang/rust#148502 (rustc-dev-guide subtree update)

r? `@ghost`
`@rustbot` modify labels: rollup
2025-11-05 07:25:39 +00:00
Tamir Duberstein 270e49b307 rustc_target: introduce Arch
Improve type safety by using an enum rather than strings.
2025-11-04 21:27:22 -05:00
Stuart Cook 21da1760da Rollup merge of #148371 - yotamofek:pr/dogfood-trim-prefix-suffix, r=fee1-dead
Dogfood `trim_{suffix|prefix}` in compiler

cc rust-lang/rust#142312
2025-11-05 10:59:19 +11:00
Yotam Ofek d28dcc7e79 Dogfood trim_{suffix|prefix} in compiler 2025-11-02 16:55:18 +02:00
bjorn3 964d3af793 Move a special case for rust_eh_personality from reachable_non_generics to cg_llvm
This is a workaround for an LLVM bug, so put it in cg_llvm.
2025-11-02 10:46:13 +00:00
Zalathar 73b734bf63 Pass debuginfo_compression through FFI as an enum 2025-10-25 23:58:19 +11:00
bors 4d94478977 Auto merge of #147826 - Muscraft:update-typos, r=Noratrieb
Update typos

I saw that `typos` was a few versions out of date and figured it would be a good idea to update it. Upgrading to `1.38.1` adds the [July](https://github.com/crate-ci/typos/issues/1331), [August](https://github.com/crate-ci/typos/issues/1345), and [September](https://github.com/crate-ci/typos/issues/1370) dictionary updates. As part of this change, I also sorted the configuration file.
2025-10-22 13:11:47 +00:00
Scott Schafer 12f6b9697f chore: Update typos to 1.38.1 2025-10-20 12:20:15 -06:00
Zalathar 98c95c966b Remove current code for embedding command-line args in PDB 2025-10-18 12:24:40 +11:00
Zalathar 3831b60ef3 Remove inherent methods from llvm::Type 2025-10-04 18:47:18 +10:00
Matthias Krüger c29fb2e57e Rollup merge of #144197 - KMJ-007:type-tree, r=ZuseZ4
TypeTree support in autodiff

# TypeTrees for Autodiff

## What are TypeTrees?
Memory layout descriptors for Enzyme. Tell Enzyme exactly how types are structured in memory so it can compute derivatives efficiently.

## Structure
```rust
TypeTree(Vec<Type>)

Type {
    offset: isize,  // byte offset (-1 = everywhere)
    size: usize,    // size in bytes
    kind: Kind,     // Float, Integer, Pointer, etc.
    child: TypeTree // nested structure
}
```

## Example: `fn compute(x: &f32, data: &[f32]) -> f32`

**Input 0: `x: &f32`**
```rust
TypeTree(vec![Type {
    offset: -1, size: 8, kind: Pointer,
    child: TypeTree(vec![Type {
        offset: -1, size: 4, kind: Float,
        child: TypeTree::new()
    }])
}])
```

**Input 1: `data: &[f32]`**
```rust
TypeTree(vec![Type {
    offset: -1, size: 8, kind: Pointer,
    child: TypeTree(vec![Type {
        offset: -1, size: 4, kind: Float,  // -1 = all elements
        child: TypeTree::new()
    }])
}])
```

**Output: `f32`**
```rust
TypeTree(vec![Type {
    offset: -1, size: 4, kind: Float,
    child: TypeTree::new()
}])
```

## Why Needed?
- Enzyme can't deduce complex type layouts from LLVM IR
- Prevents slow memory pattern analysis
- Enables correct derivative computation for nested structures
- Tells Enzyme which bytes are differentiable vs metadata

## What Enzyme Does With This Information:

Without TypeTrees (current state):
```llvm
; Enzyme sees generic LLVM IR:
define float ``@distance(ptr*`` %p1, ptr* %p2) {
; Has to guess what these pointers point to
; Slow analysis of all memory operations
; May miss optimization opportunities
}
```

With TypeTrees (our implementation):
```llvm
define "enzyme_type"="{[]:Float@float}" float ``@distance(``
    ptr "enzyme_type"="{[]:Pointer}" %p1,
    ptr "enzyme_type"="{[]:Pointer}" %p2
) {
; Enzyme knows exact type layout
; Can generate efficient derivative code directly
}
```

# TypeTrees - Offset and -1 Explained

## Type Structure

```rust
Type {
    offset: isize, // WHERE this type starts
    size: usize,   // HOW BIG this type is
    kind: Kind,    // WHAT KIND of data (Float, Int, Pointer)
    child: TypeTree // WHAT'S INSIDE (for pointers/containers)
}
```

## Offset Values

### Regular Offset (0, 4, 8, etc.)
**Specific byte position within a structure**

```rust
struct Point {
    x: f32, // offset 0, size 4
    y: f32, // offset 4, size 4
    id: i32, // offset 8, size 4
}
```

TypeTree for `&Point` (internal representation):
```rust
TypeTree(vec![
    Type { offset: 0, size: 4, kind: Float },   // x at byte 0
    Type { offset: 4, size: 4, kind: Float },   // y at byte 4
    Type { offset: 8, size: 4, kind: Integer }  // id at byte 8
])
```

Generates LLVM:
```llvm
"enzyme_type"="{[]:Float@float}"
```

### Offset -1 (Special: "Everywhere")
**Means "this pattern repeats for ALL elements"**

#### Example 1: Array `[f32; 100]`
```rust
TypeTree(vec![Type {
    offset: -1, // ALL positions
    size: 4,    // each f32 is 4 bytes
    kind: Float, // every element is float
}])
```

Instead of listing 100 separate Types with offsets `0,4,8,12...396`

#### Example 2: Slice `&[i32]`
```rust
// Pointer to slice data
TypeTree(vec![Type {
    offset: -1, size: 8, kind: Pointer,
    child: TypeTree(vec![Type {
        offset: -1, // ALL slice elements
        size: 4,    // each i32 is 4 bytes
        kind: Integer
    }])
}])
```

#### Example 3: Mixed Structure
```rust
struct Container {
    header: i64,        // offset 0
    data: [f32; 1000],  // offset 8, but elements use -1
}
```

```rust
TypeTree(vec![
    Type { offset: 0, size: 8, kind: Integer }, // header
    Type { offset: 8, size: 4000, kind: Pointer,
        child: TypeTree(vec![Type {
            offset: -1, size: 4, kind: Float // ALL array elements
        }])
    }
])
```
2025-09-28 18:13:11 +02:00
Zalathar 85018f09f6 Use LLVMDisposeTargetMachine 2025-09-25 18:10:55 +10:00
Zalathar b17fb70d04 Add self-profile events for target-machine creation
These code paths are surprisingly hot in the `large-workspace` benchmark; it
would be handy to see some detailed timings and execution counts.
2025-09-21 21:37:15 +10:00
Karan Janthe e1258e79d6 autodiff: Add basic TypeTree with NoTT flag
Signed-off-by: Karan Janthe <karanjanthe@gmail.com>
2025-09-19 04:02:19 +00:00
Zalathar 8b0a254860 Move target machine command-line quoting from C++ to Rust 2025-09-18 15:25:25 +10:00
bjorn3 1991779bd4 Make llvm_enzyme a regular cargo feature
This makes it clearer that it is set by the build system rather than by
the rustc that compiles the current rustc. It also avoids bootstrap
needing to pass --check-cfg llvm_enzyme to rustc.
2025-09-15 15:31:56 +00:00
bjorn3 f2933b34a8 Remove want_summary argument from prepare_thin
It is always false nowadays. ThinLTO summary writing is instead done by
llvm_optimize.
2025-09-06 18:37:23 +00:00
bjorn3 e3d0b7d648 Remove thin_link_data method from ThinBufferMethods
It is only used within cg_llvm.
2025-09-06 18:37:23 +00:00
bjorn3 2cf94b92ca Ensure fat LTO doesn't merge everything into the allocator module 2025-09-06 13:31:41 +00:00
bjorn3 319fe230f0 Special case allocator module submission to avoid special casing it elsewhere
A lot of places had special handling just in case they would get an
allocator module even though most of these places could never get one or
would have a trivial implementation for the allocator module. Moving all
handling of the allocator module to a single place simplifies things a
fair bit.
2025-09-04 08:21:10 +00:00
Daniel Paoliello da8f230d5f Update to ar_archive_writer 0.5.1 2025-08-29 16:37:42 -07:00