Store defids instead of symbol names in the aliases list
I was honestly surprised this worked in the past. Storing symbol names causes a cycle error: we now compute a symbol name in `codegen_attrs`, and computing a symbol name in turn calls `codegen_attrs`.
It only worked when there weren't any codegen attributes to begin with, which caused symbol name computation to skip the call to `codegen_attrs`.
Storing `DefId`s instead avoids the problem, since the symbol name no longer has to be computed inside `codegen_attrs`.
r? @bjorn3
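A minimal sketch of the cycle described above, assuming a toy query engine. `Ctxt`, `Query`, and `eager_alias_names` are hypothetical names for illustration and are not rustc's actual API; the alias list is simplified to refer to the item itself.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct DefId(u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Query {
    SymbolName(DefId),
    CodegenAttrs(DefId),
}

/// Toy query context; `active` is the stack of queries being computed.
struct Ctxt {
    active: Vec<Query>,
    /// `true` models the old behaviour: `codegen_attrs` eagerly turns
    /// its aliases into symbol names instead of storing `DefId`s.
    eager_alias_names: bool,
}

impl Ctxt {
    fn new(eager_alias_names: bool) -> Self {
        Ctxt { active: Vec::new(), eager_alias_names }
    }

    /// Runs `f` as query `q`, reporting an error if `q` is already active.
    fn run<T>(
        &mut self,
        q: Query,
        f: impl FnOnce(&mut Self) -> Result<T, String>,
    ) -> Result<T, String> {
        if self.active.contains(&q) {
            return Err(format!("cycle detected when computing {:?}", q));
        }
        self.active.push(q);
        let result = f(self);
        self.active.pop();
        result
    }

    /// Symbol names depend on the codegen attributes of the item.
    fn symbol_name(&mut self, id: DefId) -> Result<String, String> {
        self.run(Query::SymbolName(id), |cx| {
            cx.codegen_attrs(id)?;
            Ok(format!("sym_{}", id.0))
        })
    }

    /// Returns the alias list as `DefId`s (simplified: the item itself).
    fn codegen_attrs(&mut self, id: DefId) -> Result<Vec<DefId>, String> {
        self.run(Query::CodegenAttrs(id), |cx| {
            let aliases = vec![id];
            if cx.eager_alias_names {
                // Old behaviour: asking for a symbol name here re-enters
                // `symbol_name`, which is already on the query stack.
                for &alias in &aliases {
                    cx.symbol_name(alias)?;
                }
            }
            Ok(aliases)
        })
    }
}
```

With `eager_alias_names` set, `symbol_name` re-enters itself through `codegen_attrs` and the toy engine reports a cycle; with `DefId`s stored and names resolved later, both queries complete.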
`c_variadic`: provide our own `va_arg` implementation for more targets
tracking issue: https://github.com/rust-lang/rust/issues/44930
Provide our own implementations in order to guarantee the behavior of `va_arg`. We will only be able to stabilize `c_variadic` on targets where we know and guarantee the properties of `va_arg`.
r? workingjubilee
tests/ui/runtime/on-broken-pipe/with-rustc_main.rs: Not needed so remove
related: https://github.com/rust-lang/rust/issues/145899#issuecomment-3705550673
print error from EnzymeWrapper::get_or_init(sysroot) as a note
r? @ZuseZ4
e.g.
1. when libEnzyme is not found
```shell
$ rustc +stage1 -Z autodiff=Enable -C lto=fat src/main.rs
error: autodiff backend not found in the sysroot: failed to find a `libEnzyme-21` folder in the sysroot candidates:
* /Volumes/WD_BLACK_SN850X_HS_1TB/rust-lang/rust/build/aarch64-apple-darwin/stage1/lib
|
= note: it will be distributed via rustup in the future
```
2. when libEnzyme could not be loaded successfully
```shell
$ rustc +stage1 -Z autodiff=Enable -C lto=fat src/main.rs
error: failed to load our autodiff backend: DlOpen { source: "dlopen(/Volumes/WD_BLACK_SN850X_HS_1TB/rust-lang/rust/build/aarch64-apple-darwin/stage1/lib/rustlib/aarch64-apple-darwin/lib/libEnzyme-21.dylib, 0x0005): tried: \'/Volumes/WD_BLACK_SN850X_HS_1TB/rust-lang/rust/build/aarch64-apple-darwin/stage1/lib/rustlib/aarch64-apple-darwin/lib/libEnzyme-21.dylib\' (slice is not valid mach-o file), \'/System/Volumes/Preboot/Cryptexes/OS/Volumes/WD_BLACK_SN850X_HS_1TB/rust-lang/rust/build/aarch64-apple-darwin/stage1/lib/rustlib/aarch64-apple-darwin/lib/libEnzyme-21.dylib\' (no such file), \'/Volumes/WD_BLACK_SN850X_HS_1TB/rust-lang/rust/build/aarch64-apple-darwin/stage1/lib/rustlib/aarch64-apple-darwin/lib/libEnzyme-21.dylib\' (slice is not valid mach-o file)" }
```
RISC-V: Implement (Zkne or Zknd) intrinsics correctly
In rust-lang/stdarch#1765, it was pointed out that two RISC-V (64-bit only) intrinsics that perform AES key scheduling have the wrong target features.
The `aes64ks1i` and `aes64ks2` instructions require *either* the Zkne (scalar cryptography: AES encryption) or the Zknd (scalar cryptography: AES decryption) extension (or both), but the corresponding Rust intrinsics (in `core::arch::riscv64`) required *both* Zkne and Zknd.
An excerpt from the original intrinsics:
```rust
#[target_feature(enable = "zkne", enable = "zknd")]
```
To fix that, we need to:
1. Represent a condition where *either* Zkne or Zknd is available and
2. Work around an issue: the `llvm.riscv.aes64ks1i` / `llvm.riscv.aes64ks2` LLVM intrinsics require either the Zkne or the Zknd extension.
This PR attempts to resolve them by:
1. Adding a perma-unstable RISC-V target feature: `zkne_or_zknd` (implied from both `zkne` and `zknd`) and
2. Using inline assembly to construct machine code directly (because `zkne_or_zknd` alone implies neither Zkne nor Zknd, we cannot use LLVM intrinsics).
The author confirmed that we can construct an AES key scheduling function with decent performance using the fixed `aes64ks1i` and `aes64ks2` intrinsics (with optimization enabled).
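A minimal sketch of the "implied from both" relationship, assuming a toy implication table. The `implied`, `expand_features`, and `aes64ks_available` helpers are hypothetical and only model the idea; they are not how rustc stores target features.

```rust
use std::collections::BTreeSet;

/// Hypothetical implication table: each of `zkne` and `zknd` implies
/// the combined `zkne_or_zknd` feature, but not the other way around.
fn implied(feature: &str) -> &'static [&'static str] {
    match feature {
        "zkne" => &["zkne_or_zknd"],
        "zknd" => &["zkne_or_zknd"],
        _ => &[],
    }
}

/// Expands the explicitly enabled features with everything they imply.
fn expand_features(enabled: &[&str]) -> BTreeSet<String> {
    let mut out = BTreeSet::new();
    let mut stack: Vec<String> = enabled.iter().map(|s| s.to_string()).collect();
    while let Some(feature) = stack.pop() {
        if out.insert(feature.clone()) {
            stack.extend(implied(&feature).iter().map(|s| s.to_string()));
        }
    }
    out
}

/// The key scheduling intrinsics only need the combined feature.
fn aes64ks_available(enabled: &[&str]) -> bool {
    expand_features(enabled).contains("zkne_or_zknd")
}
```

Enabling either `zkne` or `zknd` makes `aes64ks_available` return `true`, while `zkne_or_zknd` on its own never pulls in `zkne` or `zknd`, which is why the LLVM intrinsics cannot be used behind it.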
Allow inline calls to offload intrinsic
Removes explicit insertion point handling and recovers the pointer at the end of the saved basic block.
r? `@ZuseZ4`
fixes: https://github.com/rust-lang/rust/issues/150413
Intrinsics only need a fraction of the functionality offered by
BuilderMethods::call and in particular don't need the FnAbi to be
computed other than (currently) as a step towards computing the
function value type.
The `f16` type works on the LoongArch target starting from LLVM 21.
However, the current minimum supported external LLVM version is 20,
so `f16` must not be enabled on LoongArch for LLVM version < 21.
MGCA: Support struct expressions without intermediary anon consts
r? oli-obk
tracking issue: rust-lang/rust#132980
Fixes rust-lang/rust#127972
Fixes rust-lang/rust#137888
Fixes rust-lang/rust#140275
due to delaying a bug instead of ICEing in HIR ty lowering.
### High level goal
Under `feature(min_generic_const_args)` this PR adds another kind of const argument. A struct/variant construction const arg kind. We represent the values of the fields as themselves being const arguments which allows for uses of generic parameters subject to the existing restrictions present in `min_generic_const_args`:
```rust
fn foo<const N: Option<u32>>() {}

trait Trait {
    #[type_const]
    const ASSOC: u32;
}

fn bar<T: Trait, const N: u32>() {
    // the initializer of `_0` is `N`, which is a legal const argument,
    // so this is ok.
    foo::<{ Some::<u32> { 0: N } }>();

    // this is allowed as mgca supports uses of assoc consts in the
    // type system, i.e. `<T as Trait>::ASSOC` is a legal const argument
    foo::<{ Some::<u32> { 0: <T as Trait>::ASSOC } }>();

    // this on the other hand is not allowed as `N + 1` is not a legal
    // const argument
    foo::<{ Some::<u32> { 0: N + 1 } }>();
}
```
This PR does not support uses of const ctors, e.g. `None`, nor tuple constructors, e.g. `Some(N)`. I believe that it would not be difficult to add support for such functionality after this PR lands so have left it out deliberately.
We currently require that all generic parameters on the type being constructed be explicitly specified. I haven't really looked into why that is but it doesn't seem desirable to me as it should be legal to write `Some { ... }` in a const argument inside of a body and have that desugar to `Some::<_> { ... }`. Regardless this can definitely be a follow-up PR and I assume this is some underlying consistency with the way that elided args are handled with type paths elsewhere.
This PR's implementation of supporting struct expressions is somewhat incomplete. We don't handle `Foo { ..expr }` at all and aren't handling privacy/stability. The printing of `ConstArgKind::Struct` HIR nodes doesn't really exist either :')
I've tried to keep the implementation here somewhat deliberately incomplete as I think a number of these issues are actually quite small and self contained after this PR lands and I'm hoping it could be a good set of issues to mentor newer contributors on 🤔 I just wanted the "bare minimum" required to actually demonstrate that the previous changes are "necessary".
### `ValTree` now recurses through `ty::Const`
In order to actually represent struct/variant construction in `ty::Const` without going through an anon const we would need to introduce some new `ConstKind` variant. Let's say some hypothetical `ConstKind::ADT(Ty<'tcx>, List<Const<'tcx>>)`.
This variant would represent things the same way that `ValTree` does, with the first element representing the `VariantIdx` of the enum (if it's an enum), followed by a list of field values in definition order.
This *could* work but there are a few reasons why it's suboptimal.
First it would mean we have a second kind of `Const` that can be normalized. Right now we only have `ConstKind::Unevaluated` which possibly needs normalization. Similarly with `TyKind` we *only* have `TyKind::Alias`. If we introduced `ConstKind::ADT` it would need to be normalized to a `ConstKind::Value` eventually. This feels to me like it has the potential to cause bugs in the long run where only `ConstKind::Unevaluated` is handled by some code paths.
Secondly it would make type equality/inference be kind of... weird... It's desirable for `Some { 0: ?x } eq Some { 0: 1_u32 }` to result in `?x=1_u32`. I can't see a way for this to work with this `ConstKind::ADT` design under the current architecture for how we represent types/consts and generally do equality operations.
We would need to wholly special case these two variants in type equality and have a custom recursive walker separate from the existing architecture for doing type equality. It would also be somewhat unique in that it's a non-rigid `ty::Const` (it can be normalized more later on in type inference) while also having somewhat "structural" equality behaviour.
Lastly, it's worth noting that it's not *actually* `ConstKind::ADT` that we want. It's desirable to extend this setup to also support tuples and arrays, or even references if we wind up supporting those in const generics. Therefore this isn't really `ConstKind::ADT` but a more general `ConstKind::ShallowValue` or something to that effect. It represents at least one "layer" of a type's value :')
Instead of going with this design, we change `ValTree::Branch`:
```rust
enum ValTree<'tcx> {
    Leaf(ScalarInt),
    // Before this PR:
    Branch(Box<[ValTree<'tcx>]>),
    // After this PR:
    Branch(Box<[Const<'tcx>]>),
}
```
The representation for so called "shallow values" is now the same as the representation for the *entire* full value. The desired inference/type equality behaviour just falls right out of this. We also don't wind up with these shallow values actually being non-rigid. And `ValTree` *already* supports references/tuples/arrays so we can handle those just fine.
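To illustrate how the inference behaviour "falls out", here is a toy model, not the compiler's real types: `Const`, `ValTree`, `resolve`, `unify`, and `some` below are simplified stand-ins, inference variables are plain `u32` ids, and the enum layout (variant index first, then fields) follows the description above.

```rust
use std::collections::HashMap;

/// Toy stand-in for `ty::Const`: an inference variable or a value.
#[derive(Clone, Debug, PartialEq)]
enum Const {
    Infer(u32),
    Value(ValTree),
}

/// Toy `ValTree` whose branches recurse through `Const`.
#[derive(Clone, Debug, PartialEq)]
enum ValTree {
    Leaf(u128),
    Branch(Vec<Const>),
}

/// Follows bindings of inference variables until an unbound one or a value.
fn resolve(c: &Const, subst: &HashMap<u32, Const>) -> Const {
    let mut cur = c.clone();
    while let Const::Infer(v) = cur {
        match subst.get(&v) {
            Some(next) => cur = next.clone(),
            None => break,
        }
    }
    cur
}

/// Structural equality that binds inference variables along the way.
/// There is no occurs check; a real unifier would need one.
fn unify(a: &Const, b: &Const, subst: &mut HashMap<u32, Const>) -> bool {
    let (a, b) = (resolve(a, subst), resolve(b, subst));
    if a == b {
        return true;
    }
    match (a, b) {
        (Const::Infer(v), other) | (other, Const::Infer(v)) => {
            subst.insert(v, other);
            true
        }
        (Const::Value(ValTree::Branch(xs)), Const::Value(ValTree::Branch(ys))) => {
            xs.len() == ys.len()
                && xs.iter().zip(ys.iter()).all(|(x, y)| unify(x, y, subst))
        }
        // Two different leaves, or a leaf against a branch.
        _ => false,
    }
}

/// `Some { 0: field }`, assuming variant index 1 for `Some`.
fn some(field: Const) -> Const {
    Const::Value(ValTree::Branch(vec![Const::Value(ValTree::Leaf(1)), field]))
}
```

Unifying `some(Const::Infer(0))` with `some(Const::Value(ValTree::Leaf(1)))` walks both branches element by element and records `?0 = 1` in the substitution, without any special casing for the "shallow" value.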
I think in the future it might be worth considering inlining `ValTree` into `ty::ConstKind`. E.g:
```rust
enum ConstKind<'tcx> {
    Scalar(Ty<'tcx>, ScalarInt),
    ShallowValue(Ty<'tcx>, List<Const<'tcx>>),
    Unevaluated(UnevaluatedConst<'tcx>),
    // ...
}
```
This would imply that the usage of `ValTree`s in patterns would now be using `ty::Const` but they already kind of are anyway and I think that's probably okay in the long run. It also would mean that the set of things we *could* represent in const patterns is greater which may be desirable in the long run for supporting things such as const patterns of const generic parameters.
Regardless, this PR doesn't actually inline `ValTree` into `ty::ConstKind`, it only changes `Branch` to recurse through `Const`. This change could be split out of this PR if desired.
I'm not sure if there'll be a perf impact from this change. It's somewhat plausible as now all const pattern values that have nesting will be interning a lot more `Ty`s. We shall see :>
### Forbidding generic parameters under mgca
Under mgca we now allow all const arguments to resolve paths to generic parameters. We then *later* actually validate that the const arg should be allowed to access generic parameters if it did wind up resolving to any.
This winds up just being a lot simpler to implement than trying to make name resolution "keep track" of whether we're inside of a non-anon-const const arg and then encounter a `const { ... }` indicating we should now stop allowing resolving to generic parameters.
It's also somewhat in line with what we'll need for a `feature(generic_const_args)` where we'll want to decide whether an anon const should have any generic parameters based off syntactically whether any generic parameters were used. Though that design is entirely hypothetical at this point :)
### Followup Work
- Make HIR ty lowering check whether lowering generic parameters is supported and if not lower to an error type/const. Should make the code cleaner, fix some other bugs, and maybe(?) recover perf since we'll be accessing fewer queries, which I think is part of the perf regression of this PR
- Make the ValTree setup less scuffed. We should find a new name for `ConstKind::Value` and the `Val` part of `ValTree` and `ty::Value` as they no longer correspond to a fully normalized structure. It may also be worth looking into inlining `ValTreeKind` into `ConstKind` or at least into `ty::Value` or something 🤔
- Support tuple constructors and const constructors not just struct expressions.
- Reduce code duplication between HIR ty lowering's handling of struct expressions, and HIR typeck's handling of struct expressions
- Try to fix perf https://github.com/rust-lang/rust/pull/149114#issuecomment-3668038853. Maybe this will clear up once we clean up `ValTree` a bit and stop doing double interning and whatnot
remove llvm_enzyme and enzyme fallbacks from most places
Using dlopen to get symbols has the nice benefit that rustc itself doesn't depend on libenzyme symbols anymore. We can therefore delete most fallback implementations in the backend (independently of whether we enable enzyme or not). When trying to use autodiff on nightly, we will now fail with a nice error if and only if we fail to load libEnzyme-21.so in our backend.
Verified:
Build as nightly, without Enzyme
Build as nightly, with Enzyme
Build as stable (without Enzyme)
With this PR we will now run `tests/ui/autodiff` on nightly, and the tests are passing.
r? `@kobzol`
rustc_codegen_llvm: Tidying of `update_target_reliable_float_cfg`
This PR simplifies floating type handling through `update_target_reliable_float_cfg` based on several facts:
1. Major changes in behavior normally occur only on major LLVM upgrades.
2. The first release of LLVM 20.x.x is 20.1.0.
Due to the first fact, we can normally ignore minor and patch releases of LLVM and we can remove obscure variables like `lt_xx_x_x`.
The second fact was missed when the minimum LLVM version was raised to LLVM 20 (cf. rust-lang/rust#145071), so one "fixed in LLVM 20" case can be safely removed (another cannot be removed since it is fixed in LLVM 20.1.1).
This PR also reorders certain `match` arms by architecture where reordering causes no problems.
Note that an LLVM issue on MIPS is fixed in LLVM 20.1.**0** and another on AArch64 is fixed in LLVM 20.1.**1**.
Originally, both were considered fixed in LLVM 20.1.**1**, but the author separated them into two cases (so that the MIPS bug check can be removed).
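A minimal sketch of the version reasoning, assuming versions are modelled as `(major, minor, patch)` tuples. The names `LlvmVersion`, `MIN_SUPPORTED_LLVM`, `mips_issue_fixed`, `aarch64_issue_fixed`, and `fixed_since_major` are hypothetical and not taken from `rustc_codegen_llvm`.

```rust
/// (major, minor, patch); tuples compare lexicographically.
type LlvmVersion = (u32, u32, u32);

/// The first LLVM 20 release is 20.1.0, so a minimum of "LLVM 20"
/// already means at least 20.1.0.
const MIN_SUPPORTED_LLVM: LlvmVersion = (20, 1, 0);

/// Fixed in 20.1.0: always true for supported versions, so the check
/// can be dropped entirely.
fn mips_issue_fixed(version: LlvmVersion) -> bool {
    version >= (20, 1, 0)
}

/// Fixed in 20.1.1: this is the one case that still needs a patch-level
/// comparison.
fn aarch64_issue_fixed(version: LlvmVersion) -> bool {
    version >= (20, 1, 1)
}

/// The common case: only the major version matters.
fn fixed_since_major(version: LlvmVersion, major: u32) -> bool {
    version.0 >= major
}
```

Under these assumptions `mips_issue_fixed(MIN_SUPPORTED_LLVM)` is already `true`, while `aarch64_issue_fixed((20, 1, 0))` is `false`, which is why the two cases have to be told apart.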
This commit simplifies floating type handling through
`update_target_reliable_float_cfg` based on several facts:
1. Major changes in behavior normally occur only
on major LLVM upgrades.
2. The first release of LLVM 20.x.x is 20.1.0.
Due to the first fact, we can normally ignore minor and patch releases
of LLVM and we can remove obscure variables like `lt_xx_x_x` (still,
there is a case where checking for patch version is required).
The second fact was missed when the minimum LLVM version was raised to
LLVM 20, so one "fixed in LLVM 20" case can be safely removed.
... in `update_target_reliable_float_cfg`, based on the actual changes.
The AArch64 issue is fixed on LLVM 20.1.1 while the MIPS issue is fixed
on LLVM 20.1.0 (the first LLVM 20 release).
This commit distinguishes two separate cases.
Move shared offload globals and define per-kernel globals once
This PR moves the logic for the shared LLVM global variables out of the `offload` intrinsic codegen and generates kernel-specific variables only on the first call of the intrinsic.
r? `@ZuseZ4`
tracking:
- https://github.com/rust-lang/rust/issues/131513
Because some AES key scheduling instructions require *either* Zkne or
Zknd extension, we must have a target feature to represent
`(Zkne || Zknd)`.
This commit adds a (perma-unstable) target feature to the RISC-V
architecture: `zkne_or_zknd` for this purpose.
Helped-by: sayantn <sayantn05@gmail.com>
Implement va_arg for Hexagon targets
Implements proper variadic argument handling for hexagon-unknown-linux-musl targets using a 3-pointer VaList structure compatible with LLVM's HexagonBuiltinVaList implementation.
* Handles register save area vs overflow area transition
* Provides proper 4-byte and 8-byte alignment for arguments
* Only activates for hexagon+musl targets via Arch::Hexagon & Env::Musl
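A simplified model of the register-save-area to overflow-area transition, using byte slices and offsets instead of raw pointers. `VaListModel` and its methods are hypothetical names for illustration only; the real implementation emits codegen IR and follows the target ABI, which this sketch does not attempt to reproduce in detail.

```rust
/// Toy model of a 3-pointer va_list: a current offset into the register
/// save area, the end of that area (the slice length), and a current
/// offset into the overflow area.
struct VaListModel<'a> {
    reg_save: &'a [u8],
    overflow: &'a [u8],
    reg_cur: usize,
    overflow_cur: usize,
}

/// Rounds `offset` up to a multiple of `align` (a power of two).
fn align_up(offset: usize, align: usize) -> usize {
    (offset + align - 1) & !(align - 1)
}

impl<'a> VaListModel<'a> {
    fn new(reg_save: &'a [u8], overflow: &'a [u8]) -> Self {
        VaListModel { reg_save, overflow, reg_cur: 0, overflow_cur: 0 }
    }

    /// Fetches the next argument of `size` bytes (4 or 8 in this model).
    fn next_bytes(&mut self, size: usize) -> Option<&'a [u8]> {
        let (reg, ovf): (&'a [u8], &'a [u8]) = (self.reg_save, self.overflow);
        let align = if size > 4 { 8 } else { 4 };
        let start = align_up(self.reg_cur, align);
        if start + size <= reg.len() {
            // The argument still fits in the register save area.
            self.reg_cur = start + size;
            return Some(&reg[start..start + size]);
        }
        // Transition: from now on every argument comes from the overflow area.
        self.reg_cur = reg.len();
        let start = align_up(self.overflow_cur, align);
        let end = start + size;
        if end > ovf.len() {
            return None;
        }
        self.overflow_cur = end;
        Some(&ovf[start..end])
    }

    fn arg_u32(&mut self) -> Option<u32> {
        self.next_bytes(4).map(|bytes| {
            let mut buf = [0u8; 4];
            buf.copy_from_slice(bytes);
            u32::from_le_bytes(buf)
        })
    }

    fn arg_u64(&mut self) -> Option<u64> {
        self.next_bytes(8).map(|bytes| {
            let mut buf = [0u8; 8];
            buf.copy_from_slice(bytes);
            u64::from_le_bytes(buf)
        })
    }
}
```

In this model, an 8-byte argument that no longer fits in the register save area after alignment is read from the overflow area, and all later arguments follow it there.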
autodiff: emit an error if we fail to find libEnzyme
Tested manually by moving libEnzyme-21.so away. We should adjust the error msg. once we have the component up.
It's the first usage within rustc of this experimental feature, but afaik we're open to dogfooding those for testing purposes, right?
r? `@Kobzol`
Introduces `BackendRepr::ScalableVector` corresponding to scalable
vector types annotated with `repr(scalable)` which lowers to a scalable
vector type in LLVM.
Co-authored-by: Jamie Cunliffe <Jamie.Cunliffe@arm.com>