Specifically:
- `HashStable` -> `StableHash` (trait)
- `HashStable` -> `StableHash` (derive)
- `HashStable_NoContext` -> `StableHash_NoContext` (derive)
Note: there are some names in `compiler/rustc_macros/src/hash_stable.rs`
that are still to be renamed, e.g. `HashStableMode`.
Part of MCP 983.
`std::hash::Hash` looks like this:
```
pub trait Hash {
fn hash<H>(&self, state: &mut H)
where H: Hasher;
...
}
```
The method is generic.
In contrast, `HashStable` looks like this:
```
pub trait HashStable<Hcx> {
fn hash_stable(&self, hcx: &mut Hcx, hasher: &mut StableHasher);
}
```
and impls look like this (in crates upstream of `rustc_middle`):
```
impl<Hcx: HashStableContext> HashStable<Hcx> for Path {
fn hash_stable(&self, hcx: &mut Hcx, hasher: &mut StableHasher) {
...
}
}
```
or this (in `rustc_middle` and crates downstream of `rustc_middle`):
```
impl<'tcx> HashStable<StableHashingContext<'tcx>> for rustc_feature::Features {
fn hash_stable(&self, hcx: &mut StableHashingContext<'tcx>, hasher: &mut StableHasher) {
...
}
}
```
Differences to `std::hash::Hash`:
- The trait is generic, rather than the method.
- The way impls are written depends their position in the crate graph.
- This explains why we have both `derive(HashStable)` and
`derive(HashStable_Generic)`. The former is for the
downstream-of-`rustc_middle` case, the latter is for the upstream of
`rustc_middle` case.
Why the differences? It all boils down to `HashStable` and
`HashStableContext` being in different crates. But the previous commit
fixed that, which means `HashStable` can be simplified to this, with a
generic method:
```
pub trait HashStable {
fn hash_stable<Hcx: HashStableContext>(&self, hcx: &mut Hcx, hasher: &mut StableHasher);
}
```
and all impls look like this:
```
impl HashStable for Path {
fn hash_stable<Hcx: HashStableContext>(&self, hcx: &mut Hcx, hasher: &mut StableHasher) {
...
}
}
```
Other consequences:
- `derive(HashStable_Generic)` is no longer needed; `derive(HashStable)`
can be used instead.
- In this commit, `derive(HashStable_Generic` is made a synonym of
`derive(HashStable)`. The next commit will remove this synonym,
because it's a change that touches many lines.
- `#[stable_hash_generic]` is no longer needed (for `newtype_index`);
`#[stable_hash]` can be used instead.
- `#[stable_hash_no_context]` was already a synonym of
`#[stable_hash_generic]`, so it's also removed in favour of just
`#[stable_hash]`.
- The difference between `derive(HashStable)` and
`derive(HashStable_NoContext)` now comes down to the difference
between `synstructure::AddBounds::Generics` and
`synstructure::AddBounds::Fields`, which is basically "vanilla derive"
vs "(near) perfect derive".
- I have improved the comments on `HashStableMode` to better
explaining this subtle difference.
- `rustc_middle/src/ich/impls_syntax.rs` is no longer needed; the
relevant impls can be defined in the crate that defines the relevant
type.
- Occurrences of `for<'a> HashStable<StableHashingContext<'a>>` are
replaced with with `HashStable`, hooray.
- The commit adds a `HashStableContext::hashing_controls` method, which
is no big deal. (It's necessary for `AdtDefData::hash_stable`, which
calls `hashing_controls` and used to have an `hcx` that was a
concrete `StableHashingContext` but now has an `hcx` that is just
`Hcx: HashStableContext`.)
Overall this is a big simplification, removing a lot of confusing
complexity in stable hashing traits.
Prefer `-1` for `None`
Currently we pick "weird" numbers like `1114112` for `None::<char>`. While that's not *wrong*, it's kinda *unnatural* -- a human wouldn't make that choice.
This PR instead picks `-1` for thinge like `None::<char>` -- like [clang's `WEOF`](https://github.com/llvm/llvm-project/blob/63ae74b78a11f6c61136dbc445652929389eb9ab/libc/include/llvm-libc-macros/wchar-macros.h#L15) -- and `None::<bool>` and such.
Any enums with more than one niched value (so not `Result` nor `Option`) remain as they were before. Also we continue to use `0` when that's possible -- `-1` is only preferred when zero *isn't* possible.
---
Inspired when someone in discord posted an example like this <https://rust.godbolt.org/z/W94s9qdYW> and I thought it was odd that we're currently picking `-9223372036854775808` to be the value to store to mark an `Option<Vec<_>>` as `None`. (Especially since that needs an 8-byte immediate on x64, and writing `-1` is only a 4-byte immediate.)
Currently we pick "weird" numbers like `1114112` for `None::<char>`. While that's not *wrong*, it's kinda *unnatural* -- a human wouldn't make that choice.
This PR instead picks `-1` for thinge like `None::<char>` -- like clang's `WEOF` -- and `None::<bool>` and such.
Any enums with more than one niched value (so not `Result` nor `Option`) remain as they were before.
Add intrinsic for launch-sized workgroup memory on GPUs
Workgroup memory is a memory region that is shared between all
threads in a workgroup on GPUs. Workgroup memory can be allocated
statically or after compilation, when launching a gpu-kernel.
The intrinsic added here returns the pointer to the memory that is
allocated at launch-time.
# Interface
With this change, workgroup memory can be accessed in Rust by
calling the new `gpu_launch_sized_workgroup_mem<T>() -> *mut T`
intrinsic.
It returns the pointer to workgroup memory guaranteeing that it is
aligned to at least the alignment of `T`.
The pointer is dereferencable for the size specified when launching the
current gpu-kernel (which may be the size of `T` but can also be larger
or smaller or zero).
All calls to this intrinsic return a pointer to the same address.
See the intrinsic documentation for more details.
## Alternative Interfaces
It was also considered to expose dynamic workgroup memory as extern
static variables in Rust, like they are represented in LLVM IR.
However, due to the pointer not being guaranteed to be dereferencable
(that depends on the allocated size at runtime), such a global must be
zero-sized, which makes global variables a bad fit.
# Implementation Details
Workgroup memory in amdgpu and nvptx lives in address space 3.
Workgroup memory from a launch is implemented by creating an
external global variable in address space 3. The global is declared with
size 0, as the actual size is only known at runtime. It is defined
behavior in LLVM to access an external global outside the defined size.
There is no similar way to get the allocated size of launch-sized
workgroup memory on amdgpu an nvptx, so users have to pass this
out-of-band or rely on target specific ways for now.
Tracking issue: rust-lang/rust#135516
Workgroup memory is a memory region that is shared between all
threads in a workgroup on GPUs. Workgroup memory can be allocated
statically or after compilation, when launching a gpu-kernel.
The intrinsic added here returns the pointer to the memory that is
allocated at launch-time.
# Interface
With this change, workgroup memory can be accessed in Rust by
calling the new `gpu_launch_sized_workgroup_mem<T>() -> *mut T`
intrinsic.
It returns the pointer to workgroup memory guaranteeing that it is
aligned to at least the alignment of `T`.
The pointer is dereferencable for the size specified when launching the
current gpu-kernel (which may be the size of `T` but can also be larger
or smaller or zero).
All calls to this intrinsic return a pointer to the same address.
See the intrinsic documentation for more details.
## Alternative Interfaces
It was also considered to expose dynamic workgroup memory as extern
static variables in Rust, like they are represented in LLVM IR.
However, due to the pointer not being guaranteed to be dereferencable
(that depends on the allocated size at runtime), such a global must be
zero-sized, which makes global variables a bad fit.
# Implementation Details
Workgroup memory in amdgpu and nvptx lives in address space 3.
Workgroup memory from a launch is implemented by creating an
external global variable in address space 3. The global is declared with
size 0, as the actual size is only known at runtime. It is defined
behavior in LLVM to access an external global outside the defined size.
There is no similar way to get the allocated size of launch-sized
workgroup memory on amdgpu an nvptx, so users have to pass this
out-of-band or rely on target specific ways for now.
preserve SIMD element type information
Preserve the SIMD element type and provide it to LLVM for better optimization.
This is relevant for AArch64 types like `int16x4x2_t`, see also https://github.com/llvm/llvm-project/issues/181514. Such types are defined like so:
```rust
#[repr(simd)]
struct int16x4_t([i16; 4]);
#[repr(C)]
struct int16x4x2_t(pub int16x4_t, pub int16x4_t);
```
Previously this would be translated to the opaque `[2 x <8 x i8>]`, with this PR it is instead `[2 x <4 x i16>]`. That change is not relevant for the ABI, but using the correct type prevents bitcasts that can (indeed, do) confuse the LLVM pattern matcher.
This change will make it possible to implement the deinterleaving loads on AArch64 in a portable way (without neon-specific intrinsics), which means that e.g. Miri or the cranelift backend can run them without additional support.
discussion at [#t-compiler > loss of vector element type information](https://rust-lang.zulipchat.com/#narrow/channel/131828-t-compiler/topic/loss.20of.20vector.20element.20type.20information/with/584483611)
Previously `sve_cast`'s implementation was abstracted to power both
`sve_cast` and `simd_cast` which supported scalable and non-scalable
vectors respectively. In anticipation of having to do this for another
`simd_*` intrinsic, `sve_cast` is removed and `simd_cast` is changed to
accept both scalable and non-scalable intrinsics, an approach that will
scale better to the other intrinsics.
Invert dependency between `rustc_errors` and `rustc_abi`.
Currently, `rustc_errors` depends on `rustc_abi`, which depends on `rustc_error_messages`. This is a bit odd.
`rustc_errors` depends on `rustc_abi` for a single reason: `rustc_abi` defines a type `TargetDataLayoutErrors` and `rustc_errors` impls `Diagnostic` for that type.
We can get a more natural relationship by inverting the dependency, moving the `Diagnostic` trait upstream. Then `rustc_abi` defines `TargetDataLayoutErrors` and also impls `Diagnostic` for it. `rustc_errors` is already pretty far upstream in the crate graph, it doesn't hurt to push it a little further because errors are a very low-level concept.
r? @davidtwco
Instead of just using regular struct lowering for these types, which
results in an incorrect ABI (e.g. returning indirectly), use
`BackendRepr::ScalableVector` which will lower to the correct type and
be passed in registers.
This also enables some simplifications for generating alloca of scalable
vectors and greater re-use of `scalable_vector_parts`.
A LLVM codegen test demonstrating the changed IR this generates is
included in the next commit alongside some intrinsics that make these
tuples usable.
`derive(HashStable_Generic)` generates impls like this:
```
impl<__CTX> HashStable<__CTX> for ExpnKind
where
__CTX: crate::HashStableContext
{
fn hash_stable(&self, hcx : &mut __CTX, __hasher: &mut StableHasher) {
...
}
}
```
This is used for crates that are upstream of `rustc_middle`.
The `crate::HashStableContext` bound means every crate that uses
`derive(HashStable_Generic)` must provide (or import) a trait
`HashStableContext` which `rustc_middle` then impls. In `rustc_span`
this trait is sensible, with three methods. In other crates, this trait
is empty, and there is the following trait hierarchy:
```
rustc_session::HashStableContext
| |
| rustc_hir::HashStableContext
| / \
rustc_ast::HashStableContext rustc_abi::HashStableContext
|
rustc_span::HashStableContext
```
All very strange and unnecessary. This commit changes
`derive(HashStable_Generic)` to use `rustc_span::HashStableContext`
instead of `crate::HashStableContext`. This eliminates the need for all
the empty `HashStableContext` traits and impls. Much better.
Error enum names should not be plural. Even though there are multiple
possible errors, each instance of an error enum describes a single
error. There are dozens of singular error enum names, and only two
plural error enum names. This commit makes them both singular.
Currently, `rustc_errors` depends on `rustc_abi`, which depends on
`rustc_error_messages`. This is a bit odd.
`rustc_errors` depends on `rustc_abi` for a single reason: `rustc_abi`
defines a type `TargetDataLayoutErrors` and `rustc_errors` impls
`Diagnostic` for that type.
We can get a more natural relationship by inverting the dependency,
moving the `Diagnostic` trait upstream. Then `rustc_abi` defines
`TargetDataLayoutErrors` and also impls `Diagnostic` for it.
`rustc_errors` is already pretty far upstream in the crate graph, it
doesn't hurt to push it a little further because errors are a very
low-level concept.
interpret: go back to regular string interpolation for error messages
Using the translatable diagnostic infrastructure adds a whole lot of boilerplate which isn't actually useful for const-eval errors, so let's get rid of it. This effectively reverts https://github.com/rust-lang/rust/pull/111677. That PR effectively added 1000 lines and this PR only removes around 600 -- the difference is caused by (a) keeping some of the types around for validation, where we can use them to share error strings and to trigger the extra help for pointer byte shenanigans during CTFE, and (b) this not being a full revert of rust-lang/rust#111677; I am not touching diagnostics outside the interpreter such as all the const-checking code which also got converted to fluent in the same PR.
The last commit does something similar for `LayoutError`, which also helps deduplicate a bunch of error strings. I can make that into a separate PR if you prefer.
r? @oli-obk
Fixes https://github.com/rust-lang/rust/issues/113117
Fixes https://github.com/rust-lang/rust/issues/116764
Fixes https://github.com/rust-lang/rust/issues/112618
Instead of defaulting to `None` it now defaults to `Align::ONE` i.e.
no alignment restriction. Codegen test changes are due to us now skipping
`align 1` annotations (they are useless; not skipping them makes all the
raw pointers gain an `align 1` annotation which doesn't seem any good)
Make `size`/`align` always correct rather than conditionally on the
`safe` field. This makes it less error prone and easier to work with for
`MaybeDangling` / potential future pointer kinds like `Aligned<_>`.
abi: add a rust-preserve-none calling convention
This is the conceptual opposite of the rust-cold calling convention and is particularly useful in combination with the new `explicit_tail_calls` feature.
For relatively tight loops implemented with tail calling (`become`) each of the function with the regular calling convention is still responsible for restoring the initial value of the preserved registers. So it is not unusual to end up with a situation where each step in the tail call loop is spilling and reloading registers, along the lines of:
foo:
push r12
; do things
pop r12
jmp next_step
This adds up quickly, especially when most of the clobberable registers are already used to pass arguments or other uses.
I was thinking of making the name of this ABI a little less LLVM-derived and more like a conceptual inverse of `rust-cold`, but could not come with a great name (`rust-cold` is itself not a great name: cold in what context? from which perspective? is it supposed to mean that the function is rarely called?)
This is the conceptual opposite of the rust-cold calling convention and
is particularly useful in combination with the new `explicit_tail_calls`
feature.
For relatively tight loops implemented with tail calling (`become`) each
of the function with the regular calling convention is still responsible
for restoring the initial value of the preserved registers. So it is not
unusual to end up with a situation where each step in the tail call loop
is spilling and reloading registers, along the lines of:
foo:
push r12
; do things
pop r12
jmp next_step
This adds up quickly, especially when most of the clobberable registers
are already used to pass arguments or other uses.
I was thinking of making the name of this ABI a little less LLVM-derived
and more like a conceptual inverse of `rust-cold`, but could not come
with a great name (`rust-cold` is itself not a great name: cold in what
context? from which perspective? is it supposed to mean that the
function is rarely called?)