As far as I can tell it was introduced to allow fat LTO with
-Clinker-plugin-lto. Later a change was made to automatically disable
ThinLTO summary generation when -Clinker-plugin-lto -Clto=fat is used,
so we can safely remove it.
abi: add a rust-preserve-none calling convention
This is the conceptual opposite of the rust-cold calling convention and is particularly useful in combination with the new `explicit_tail_calls` feature.
For relatively tight loops implemented with tail calls (`become`), each function using the regular calling convention is still responsible for restoring the initial values of the preserved registers. So it is not unusual to end up with each step in the tail-call loop spilling and reloading registers, along the lines of:
foo:
    push r12
    ; do things
    pop r12
    jmp next_step
This adds up quickly, especially when most of the clobberable registers are already used to pass arguments or for other purposes.
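To make the loop shape concrete, here is a minimal sketch in stable Rust. It uses a trampoline as a stand-in for `become`, so it does not exercise the new calling convention itself. The `State`, `Next`, `advance` and step names are hypothetical, and the comments about how the nightly version would look are assumptions rather than something taken from this change.

```rust
// Hypothetical interpreter-style state; the names here are illustrative only.
struct State {
    acc: u64,
    remaining: u32,
}

// What a step hands back to the driver: the next step to run, or a stop signal.
enum Next {
    Step(fn(&mut State) -> Next),
    Done,
}

// Assumption: with `explicit_tail_calls`, a step like this would end in
// `become double_step(..)` instead of returning to a driver loop. That is the
// situation in which every step saving and restoring preserved registers
// becomes pure overhead.
fn add_step(s: &mut State) -> Next {
    s.acc += 2;
    advance(s, double_step)
}

fn double_step(s: &mut State) -> Next {
    s.acc *= 2;
    advance(s, add_step)
}

// Shared bookkeeping: stop once the step budget is used up.
fn advance(s: &mut State, next: fn(&mut State) -> Next) -> Next {
    if s.remaining == 0 {
        return Next::Done;
    }
    s.remaining -= 1;
    Next::Step(next)
}

// Trampoline standing in for the tail-call chain on stable Rust.
fn run(mut s: State) -> u64 {
    let mut step: fn(&mut State) -> Next = add_step;
    loop {
        match step(&mut s) {
            Next::Step(next) => step = next,
            Next::Done => return s.acc,
        }
    }
}
```

Each step does very little work before handing control to the next one, which is why any fixed per-step prologue and epilogue cost shows up so clearly in this style of code.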
I was thinking of making the name of this ABI a little less LLVM-derived and more like a conceptual inverse of `rust-cold`, but could not come up with a great name (`rust-cold` is itself not a great name: cold in what context? From which perspective? Is it supposed to mean that the function is rarely called?)
Add -Z large-data-threshold
This flag allows specifying the threshold size for placing static data in large data sections when using the medium code model on x86-64.
When using -Ccode-model=medium, data smaller than this threshold uses RIP-relative addressing (32-bit offsets), while larger data uses absolute 64-bit addressing. This allows the compiler to generate more efficient code for smaller data while still supporting data larger than 2GB.
This mirrors the -mlarge-data-threshold flag available in GCC and Clang. The default threshold is 65536 bytes (64KB) if not specified, matching LLVM's default behavior.
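The size-based split described above can be sketched as a small decision function. This is only an illustration of the rule as worded here, not the compiler's or LLVM's implementation. The `Addressing` enum and `addressing_for` function are hypothetical names, and how data exactly at the threshold is treated is an assumption, since the description only covers "smaller" and "larger".

```rust
// Default from the description above: 65536 bytes (64KB).
const DEFAULT_LARGE_DATA_THRESHOLD: u64 = 65536;

// Hypothetical labels for the two addressing strategies.
#[derive(Debug, PartialEq, Eq)]
enum Addressing {
    // Small data: RIP-relative, 32-bit offsets.
    RipRelative32,
    // Large data: absolute 64-bit addresses, placed in large data sections.
    Absolute64,
}

// Assumption: data whose size equals the threshold is treated as small.
fn addressing_for(size_in_bytes: u64, threshold: u64) -> Addressing {
    if size_in_bytes > threshold {
        Addressing::Absolute64
    } else {
        Addressing::RipRelative32
    }
}
```

With the default threshold, a 1 KiB static would stay RIP-relative, while a 1 MiB static would go to a large data section.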
Add scalar support for offload
This PR adds scalar support to the offload feature. The scalar management has two main parts:
On the host side, each scalar argument is cast to an `ix` type, zero-extended to `i64`, and passed to the kernel in that form.
On the device, each scalar argument (an `i64` at that point) is truncated to `ix` and then cast back to its original type.
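The round trip described above can be mimicked in plain Rust. This is a sketch of the idea only: the actual PR emits LLVM IR, and the helper names below are hypothetical. Here `ix` is taken to mean the integer type with the same bit width as the scalar (`u32` for `f32`, `u16` for `i16`).

```rust
// Host side: reinterpret the scalar as a same-width integer (`ix`),
// then zero-extend it to 64 bits.
fn pack_f32(x: f32) -> u64 {
    x.to_bits() as u64
}

// Device side: truncate back to `ix`, then reinterpret as the original type.
fn unpack_f32(raw: u64) -> f32 {
    f32::from_bits(raw as u32)
}

// Same scheme for a signed 16-bit integer. Going through `u16` first ensures
// the widening is a zero extension rather than a sign extension.
fn pack_i16(x: i16) -> u64 {
    (x as u16) as u64
}

fn unpack_i16(raw: u64) -> i16 {
    (raw as u16) as i16
}
```

Zero extension keeps the upper bits of the 64-bit value at zero, so the truncation on the device side recovers exactly the original bit pattern.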
r? @ZuseZ4
Allow inline calls to offload intrinsic
Removes explicit insertion point handling and recovers the pointer at the end of the saved basic block.
r? `@ZuseZ4`
fixes: https://github.com/rust-lang/rust/issues/150413
remove llvm_enzyme and enzyme fallbacks from most places
Using dlopen to get symbols has the nice benefit that rustc itself doesn't depend on libenzyme symbols anymore. We can therefore delete most fallback implementations in the backend (independently of whether we enable enzyme or not). When trying to use autodiff on nightly, we will now fail with a nice error if and only if we fail to load libEnzyme-21.so in our backend.
Verified:
Build as nightly, without Enzyme
Build as nightly, with Enzyme
Build as stable (without Enzyme)
With this PR we will now run `tests/ui/autodiff` on nightly; the tests are passing.
r? `@kobzol`
autodiff: emit an error if we fail to find libEnzyme
Tested manually by moving libEnzyme-21.so away. We should adjust the error message once we have the component up.
It's the first usage within rustc of this experimental feature, but afaik we're open to dogfooding those for testing purposes, right?
r? `@Kobzol`
Introduces `BackendRepr::ScalableVector` corresponding to scalable
vector types annotated with `repr(scalable)` which lowers to a scalable
vector type in LLVM.
Co-authored-by: Jamie Cunliffe <Jamie.Cunliffe@arm.com>
Add LLVM realtime sanitizer
This is a new attempt at adding the [LLVM real-time sanitizer](https://clang.llvm.org/docs/RealtimeSanitizer.html) to Rust.
Previously this was attempted in https://github.com/rust-lang/rfcs/pull/3766.
Since then the `sanitize` attribute was introduced in https://github.com/rust-lang/rust/pull/142681, and it is a lot more flexible than the old `no_sanitize` attribute. This allows adding real-time sanitizer without the need for a new attribute, as was proposed in the RFC. Because I only add a new value to an existing command line flag and to an attribute, I don't think an MCP is necessary.
Currently real-time sanitizer is usable in Rust code with the [rtsan-standalone](https://crates.io/crates/rtsan-standalone) crate. This downloads or builds the sanitizer runtime and then links it into the Rust binary.
The first commit adds support for more detailed sanitizer information.
The second commit then actually adds real-time sanitizer.
The third adds a warning against using real-time sanitizer with async functions, closures and blocks, because it doesn't behave as expected when used with async functions. I am not sure if this is actually wanted, so I kept it in a separate commit.
The fourth commit adds the documentation for real-time sanitizer.
cg_llvm: Pass `debuginfo_compression` through FFI as an enum
There are only three possible values, making an enum more appropriate.
This avoids string allocation on the Rust side, and avoids ad-hoc `!strcmp` to convert back to an enum on the C++ side.
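A minimal sketch of the pattern, assuming the three values are none, zlib and zstd. The type and function names below are hypothetical and the discriminant values are illustrative. The real definitions live in the compiler and its C++ wrapper, which must agree on the numbering.

```rust
// Hypothetical stand-in for the session-level setting.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DebugInfoCompression {
    None,
    Zlib,
    Zstd,
}

// Hypothetical FFI-side enum. `repr(C)` with explicit discriminants gives it
// a fixed layout that a matching C++ enum can mirror.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum CompressionKind {
    None = 0,
    Zlib = 1,
    Zstd = 2,
}

// The conversion is a plain match: no string allocation on the Rust side and
// no string comparison on the C++ side.
fn to_ffi(setting: DebugInfoCompression) -> CompressionKind {
    match setting {
        DebugInfoCompression::None => CompressionKind::None,
        DebugInfoCompression::Zlib => CompressionKind::Zlib,
        DebugInfoCompression::Zstd => CompressionKind::Zstd,
    }
}
```

Because the match is exhaustive, adding a fourth compression kind later would cause a compile error here until the conversion is updated.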
Extend attribute deduction to determine whether parameters using
indirect pass mode might have their address captured. Similarly to
the deduction of the `readonly` attribute, this information facilitates
memcpy optimizations.
Offload host2
r? `@oli-obk`
A follow-up to my previous GPU host PR. With this, I can (in theory) run a sufficiently simple Rust function on GPUs. I tested it on AMD, where the amdgcn target of rustc causes issues due to address space casts, which might not be valid. If I (manually) fix them, I can run the generated IR on an AMD GPU. This should conceptually also work on NVIDIA or Intel. I updated the dev-guide accordingly: https://rustc-dev-guide.rust-lang.org/offload/usage.html
I am unhappy with the number of standalone functions in my offload code, so in my second commit I bundled some of the code around two structs, which are Rust versions of the LLVM/Offload structs they represent. The structs themselves only have doc comments. Since I directly lower everything to LLVM IR, I didn't see much value in modelling the struct member variables.
cg_llvm: Use `LLVMDIBuilderCreateGlobalVariableExpression`
- Part of rust-lang/rust#134001
- Follow-up to rust-lang/rust#146763
---
This PR dismantles the somewhat complicated `LLVMRustDIBuilderCreateStaticVariable` function, and replaces it with equivalent calls to `LLVMDIBuilderCreateGlobalVariableExpression` and `LLVMGlobalSetMetadata`.
A key difference is that the new code does not replicate the attempted downcast of `InitVal`. As far as I can tell, those downcasts were actually dead, because `llvm::ConstantInt` and `llvm::ConstantFP` are not subclasses of `llvm::GlobalVariable`. I tried replacing those code paths with fatal errors, and was unable to induce failure in any of the relevant test suites I ran.
I have also confirmed that if the calls to `create_static_variable` are commented out, debuginfo tests will fail, demonstrating some amount of relevant test coverage.
The new `DIBuilder` methods have been added via an extension trait, not as inherent methods, to avoid impeding rust-lang/rust#142897.
Note that the code in `LLVMRustDIBuilderCreateStaticVariable` that tried to
downcast `InitVal` appears to have been dead, because `llvm::ConstantInt` and
`llvm::ConstantFP` are not subclasses of `llvm::GlobalVariable`.
refactor: Remove `LLVMRustInsertPrivateGlobal` and `define_private_global`
Since it can easily be implemented using the existing LLVM C API in
terms of `LLVMAddGlobal` and `LLVMSetLinkage` and `define_private_global`
was only used in one place.
Work towards https://github.com/rust-lang/rust/issues/46437