Introduce `Unnormalized` wrapper
This is the first step of the [eager normalization](https://rust-lang.zulipchat.com/#narrow/channel/364551-t-types.2Ftrait-system-refactor/topic/Eager.20normalization.2C.20ahoy.21/with/582996293) series.
This PR introduces an `Unnormalized` wrapper and makes most normalization routines consume it. The purpose is to make normalization explicit.
This PR contains no behavior change.
API changes are in the first two commits.
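As a rough illustration of the idea, here is a minimal sketch of such a wrapper (illustrative names only, much simpler than rustc's real type):

```rust
// A newtype that a value must pass through before use, so normalization
// becomes explicit and visible at every call site.
struct Unnormalized<T>(T);

impl<T> Unnormalized<T> {
    fn new(value: T) -> Self {
        Unnormalized(value)
    }

    // The only way to the inner value is an explicit normalization step.
    fn normalize(self, normalize_fn: impl FnOnce(T) -> T) -> T {
        normalize_fn(self.0)
    }
}

fn main() {
    let ty = Unnormalized::new("<Vec<u8> as IntoIterator>::Item");
    let normalized = ty.normalize(|_ty| "u8"); // stand-in for real normalization
    println!("{normalized}");
}
```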
There are some normalization routines left untouched:
- `normalize` in the type checker of borrowck: better done together with making `field.ty()` return `Unnormalized`.
- `normalize_with_depth`: only used inside the old solver. Can be done later.
- `query_normalize`: rarely used.
- misc local normalization helpers.
The resulting compiler errors were mostly fixed via `ast-grep`, with the exceptions handled manually.
Refactor FnDecl and FnSig non-type fields into a new wrapper type
#### Why this Refactor?
This PR is part of an initial cleanup for the [arg splat experiment](https://github.com/rust-lang/rust/issues/153629), but it's a useful refactor by itself.
It refactors the non-type fields of `FnDecl`, `FnSig`, and `FnHeader` into new packed wrapper types, based on this comment in the `splat` experiment PR:
https://github.com/rust-lang/rust/pull/153697#discussion_r3004637413
It also refactors some common `FnSig` creation settings into their own methods. I did this instead of creating a struct with defaults.
#### Relationship to `splat` Experiment
I don't think we can use functional struct updates (`..default()`) to create `FnDecl` and `FnSig`, because we need the bit-packing for the `splat` experiment.
Bit-packing will avoid breaking "type is small" assertions for commonly used types when `splat` is added.
This PR packs these types (see the sketch after the list):
- `ExternAbi`: enum + `unwind` variants (38) -> 6 bits
- `ImplicitSelfKind`: enum variants (5) -> 3 bits
- `lifetime_elision_allowed`, `safety`, `c_variadic`: bool -> 1 bit
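A rough sketch of the packing scheme; the offsets, names, and accessors here are illustrative assumptions, not the actual rustc layout:

```rust
// All flags packed into a single u16: 6 bits of ABI, 3 bits of
// implicit-self kind, then one bit per bool.
#[derive(Copy, Clone)]
struct FnSigFlags(u16);

impl FnSigFlags {
    fn new(abi: u8, implicit_self: u8, safety: bool, c_variadic: bool) -> Self {
        debug_assert!(abi < 1 << 6);
        debug_assert!(implicit_self < 1 << 3);
        let bits = (abi as u16)
            | (implicit_self as u16) << 6
            | (safety as u16) << 9
            | (c_variadic as u16) << 10;
        FnSigFlags(bits)
    }

    fn abi(self) -> u8 {
        (self.0 & 0b11_1111) as u8
    }

    fn c_variadic(self) -> bool {
        self.0 & (1 << 10) != 0
    }
}

fn main() {
    let flags = FnSigFlags::new(3, 1, true, false);
    assert_eq!(flags.abi(), 3);
    assert!(!flags.c_variadic());
}
```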
#### Minor Changes
Fixes some typos, and applies rustfmt to clippy files that got skipped somehow.
preserve SIMD element type information
Preserve the SIMD element type and provide it to LLVM for better optimization.
This is relevant for AArch64 types like `int16x4x2_t`, see also https://github.com/llvm/llvm-project/issues/181514. Such types are defined like so:
```rust
#[repr(simd)]
struct int16x4_t([i16; 4]);
#[repr(C)]
struct int16x4x2_t(pub int16x4_t, pub int16x4_t);
```
Previously this was translated to the opaque `[2 x <8 x i8>]`; with this PR it is instead `[2 x <4 x i16>]`. That change is not relevant for the ABI, but using the correct type prevents bitcasts that can (indeed, do) confuse the LLVM pattern matcher.
This change will make it possible to implement the deinterleaving loads on AArch64 in a portable way (without neon-specific intrinsics), which means that e.g. Miri or the cranelift backend can run them without additional support.
Discussion at [#t-compiler > loss of vector element type information](https://rust-lang.zulipchat.com/#narrow/channel/131828-t-compiler/topic/loss.20of.20vector.20element.20type.20information/with/584483611).
Implement EII for statics
This PR implements EII for statics. I've tried to mirror the implementation for functions in a few places; this causes some duplicated code, but I'm also not really sure whether there's a clean way to merge the implementations.
This does not implement defaults for static EIIs yet; I will do that in a follow-up PR.
Use closures more consistently in `dep_graph.rs`.
This file has several methods that take a `FnOnce() -> R` closure:
- `DepGraph::with_ignore`
- `DepGraph::with_query_deserialization`
- `DepGraph::with_anon_task`
- `DepGraphData::with_anon_task_inner`
It also has two methods that take a faux closure via an `A` argument and a `fn(TyCtxt<'tcx>, A) -> R` argument:
- `DepGraph::with_task`
- `DepGraphData::with_task`
The rationale is that the faux closure exercises tight control over what state the task has access to. This seems silly when (a) it is passed a `TyCtxt`, and (b) similar nearby functions take real closures. Faux closures are also more awkward to use, e.g. requiring multiple arguments to be gathered into a tuple. This commit changes the faux closures to real closures.
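A minimal sketch of the two styles (illustrative signatures, not the actual `dep_graph.rs` ones):

```rust
struct Context { counter: u32 }

// Before: a "faux closure", i.e. an extra data argument plus a plain fn pointer.
fn with_task_faux<A, R>(arg: A, task: fn(&Context, A) -> R, cx: &Context) -> R {
    task(cx, arg)
}

// After: a real closure, so callers don't have to bundle state into a tuple.
fn with_task<R>(cx: &Context, task: impl FnOnce(&Context) -> R) -> R {
    task(cx)
}

fn main() {
    let cx = Context { counter: 1 };
    let a = with_task_faux((2, 3), |cx, (x, y)| cx.counter + x + y, &cx);
    let b = with_task(&cx, |cx| cx.counter + 2 + 3);
    assert_eq!(a, b);
}
```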
r? @Zalathar
Building on the previous change, support scalable vectors with
`simd_select`. Previous patches already landed the necessary changes in
the implementation of this intrinsic, but didn't allow scalable vector
arguments to be passed in.
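As a scalar model of `simd_select`'s lanewise behavior (a fixed-width
illustration; the real intrinsic operates on whole vectors, now
including scalable ones):

```rust
// Each result lane is taken from `if_true` where the mask lane is set,
// otherwise from `if_false`.
fn select_lanes(mask: [bool; 4], if_true: [i32; 4], if_false: [i32; 4]) -> [i32; 4] {
    let mut out = [0; 4];
    for i in 0..4 {
        out[i] = if mask[i] { if_true[i] } else { if_false[i] };
    }
    out
}

fn main() {
    let r = select_lanes([true, false, true, false], [1, 2, 3, 4], [5, 6, 7, 8]);
    assert_eq!(r, [1, 6, 3, 8]);
}
```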
Previously, `sve_cast`'s implementation was abstracted to power both
`sve_cast` and `simd_cast`, which supported scalable and non-scalable
vectors respectively. In anticipation of having to do this for another
`simd_*` intrinsic, `sve_cast` is removed and `simd_cast` is changed to
accept both scalable and non-scalable vectors, an approach that will
scale better to the other intrinsics.
hwaddress: automatically add `-Ctarget-feature=+tagged-globals`
Note that since HWAddressSanitizer is/should be a target modifier, we do not have to worry about whether this LLVM target feature changes the ABI.
Fixes: rust-lang/rust#148185
Use convergent attribute on functions for GPU targets
On targets with convergent operations, we need to add the convergent attribute to all functions that run convergent operations. Following clang, we can conservatively apply the attribute to all functions when compiling for such a target and rely on LLVM optimizing away the attribute in cases where it is not necessary.
This affects the amdgpu and nvptx targets.
cc @kjetilkjeka, @kulst for nvptx
cc @ZuseZ4
r? @nnethercote, as you already reviewed this in the other PR
Split out from rust-lang/rust#149637, the part here should be uncontroversial.
Because the things in this module aren't MIR and don't use anything
from `rustc_middle::mir`. Also, modules that use `mono` often don't use
anything else from `rustc_middle::mir`.
Various LTO cleanups
* Move some special casing of thin local LTO into a single location.
* Move `lto_import_only_modules` handling for fat LTO earlier. There is no reason to keep it separate until right before passing the LTO modules to the codegen backend. For thin LTO this introduces `ThinLtoInput` to correctly handle incr comp caching.
* Remove the `Linker` type from cg_llvm. It previously helped deduplicate code for `-Zcombine-cgus`, but that flag no longer exists.
Part of https://github.com/rust-lang/compiler-team/issues/908
Abstract over the existing `simd_cast` intrinsic to implement a new
`sve_cast` intrinsic - this is better than allowing scalable vectors to
be used with all of the generic `simd_*` intrinsics.
Clang changed to representing tuples of scalable vectors as
structs rather than as wide vectors (that is, scalable vector types
where the `N` part of the `<vscale x N x ty>` type was multiplied by
the number of vectors). rustc mirrored this in the initial implementation
of scalable vectors.
Earlier versions of our patches used the wide vector representation and
our intrinsic patches used the legacy
`llvm.aarch64.sve.tuple.{create,get,set}{2,3,4}` intrinsics for creating
these tuples/getting/setting the vectors, which were only supported
due to LLVM's `AutoUpgrade` pass converting these intrinsics into
`llvm.vector.insert`. `AutoUpgrade` only supports these legacy intrinsics
with the wide vector representation.
With the current struct representation, Clang has special handling in
codegen to generate `insertvalue`/`extractvalue` instructions for these
operations, which rustc's codegen must replicate for our intrinsics to
work. This patch implements new intrinsics in
`core::intrinsics::scalable` (mirroring the structure of
`core::intrinsics::simd`) which rustc lowers to the appropriate
`insertvalue`/`extractvalue` instructions.
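As a value-level model of what those lowered instructions do (plain
Rust with made-up names; this sketch does not show the actual intrinsic
signatures in `core::intrinsics::scalable`):

```rust
// A two-element tuple held by value, so "get" and "set" operate on
// values in registers rather than through memory, in the same spirit
// as extractvalue/insertvalue.
#[derive(Copy, Clone)]
struct VecTuple2<T: Copy>([T; 2]);

// Models extractvalue: read one element of the tuple.
fn tuple_get<T: Copy>(t: VecTuple2<T>, idx: usize) -> T {
    t.0[idx]
}

// Models insertvalue: return a copy of the tuple with one element replaced.
fn tuple_set<T: Copy>(mut t: VecTuple2<T>, idx: usize, v: T) -> VecTuple2<T> {
    t.0[idx] = v;
    t
}

fn main() {
    let t = VecTuple2([1u32, 2]);
    let t = tuple_set(t, 0, 9);
    assert_eq!(tuple_get(t, 0), 9);
}
```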
Instead of just using regular struct lowering for these types, which
results in an incorrect ABI (e.g. returning indirectly), use
`BackendRepr::ScalableVector`, which will lower to the correct type and
be passed in registers.
This also enables some simplifications for generating alloca of scalable
vectors and greater re-use of `scalable_vector_parts`.
A LLVM codegen test demonstrating the changed IR this generates is
included in the next commit alongside some intrinsics that make these
tuples usable.
Specifically, this emits an error when:
- a custom target is used
- the ABI matches `RiscV32 if self.llvm_abiname == "ilp32e"`; this ABI
  is used for 32-bit embedded targets, and clang/llvm document that it
  may change in the future
This enables `packed-stack`, just as `-mpacked-stack` does in clang and
gcc. `packed-stack` is needed on s390x for kernel development.
Co-authored-by: Ralf Jung <post@ralfj.de>
simd_fmin/fmax: make semantics and name consistent with scalar intrinsics
This is the SIMD version of https://github.com/rust-lang/rust/pull/153343: change the documented semantics of the SIMD float min/max intrinsics to those of the scalar intrinsics, and also make the names consistent. The overall semantic change amounts to restricting the non-determinism: the old semantics effectively mean "when one input is an SNaN, the result is non-deterministically a NaN or the other input"; the new semantics say that in this case the other input must be returned. For all other cases, old and new semantics are equivalent. This means all users of these intrinsics that were correct with the old semantics are still correct: the overall set of possible behaviors has become smaller, and no new possible behaviors are added.
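As a scalar model of the new semantics (a sketch of IEEE 754 minimumNumber; the `nsz` part additionally allows either zero to be returned when the inputs are `+0.0` and `-0.0`):

```rust
// A NaN input (even a signaling one) is dropped in favor of the other
// input; only two NaNs produce a NaN.
fn min_num(a: f32, b: f32) -> f32 {
    match (a.is_nan(), b.is_nan()) {
        (true, true) => f32::NAN,
        (true, false) => b,
        (false, true) => a,
        // With `nsz`, which zero wins for +0.0 vs -0.0 is
        // non-deterministic; this model just picks one.
        (false, false) => if a < b { a } else { b },
    }
}

fn main() {
    assert_eq!(min_num(f32::NAN, 1.0), 1.0); // NaN no longer propagates
    assert_eq!(min_num(2.0, 1.0), 1.0);
}
```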
In terms of providers of this API:
- Miri, GCC, and cranelift already implement the new semantics, so no changes are needed.
- LLVM is adjusted to use `minimumnum nsz` instead of `minnum`, thus giving us the new semantics.
In terms of consumers of this API:
- Portable SIMD almost certainly wants to match the scalar behavior, so this is strictly a bugfix here.
- Stdarch mostly stopped using the intrinsic, except on nvptx, where arguably the new semantics are closer to what we actually want than the old semantics (https://github.com/rust-lang/stdarch/issues/2056).
Q: Should there be an `f` in the intrinsic name to indicate that it is for floats? E.g., `simd_fminimum_number_nsz`?
Also see https://github.com/rust-lang/rust/issues/153395.
Merge `fabsf16/32/64/128` into `fabs::<F>`
Following [a small conversation on Zulip](https://rust-lang.zulipchat.com/#narrow/channel/131828-t-compiler/topic/Float.20intrinsics/with/521501401) (and because I'd be interested in starting to contribute to Rust), I thought I'd give merging the float intrinsics a try :)
This PR just merges `fabsf16`, `fabsf32`, `fabsf64`, `fabsf128`, as it felt like an easy first target.
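As a rough sketch of what the merged shape looks like (a hypothetical trait and free function; rustc's actual generic intrinsic machinery is different):

```rust
// fabs is "clear the sign bit" at every IEEE float width, which is why
// one generic definition can replace fabsf16/f32/f64/f128.
trait Float: Copy {
    fn fabs(self) -> Self;
}

impl Float for f32 {
    fn fabs(self) -> Self {
        f32::from_bits(self.to_bits() & !(1 << 31))
    }
}

impl Float for f64 {
    fn fabs(self) -> Self {
        f64::from_bits(self.to_bits() & !(1 << 63))
    }
}

// One generic entry point instead of one intrinsic per width.
fn fabs<F: Float>(x: F) -> F {
    x.fabs()
}

fn main() {
    assert_eq!(fabs(-1.5f32), 1.5);
    assert_eq!(fabs(-2.5f64), 2.5);
}
```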
Notes:
- I'm opening the PR for one intrinsic as it's probably easier if the shift is done one intrinsic at a time, but let me know if you'd rather I do several at a time to reduce the number of PRs.
- Currently this PR increases LoC, despite being an attempt at simplifying the intrinsics/compiler. I believe this increase is a one-time thing, as I had to define new functions and move some things around; hopefully future PRs/commits will reduce the overall LoC.
- `fabsf32` and `fabsf64` are `#[rustc_intrinsic_const_stable_indirect]`, while `fabsf16` and `fabsf128` aren't; because `f32`/`f64` expect the function to be const, the generic version must be made indirectly stable too. We'd need to check with T-lang that this change is ok; the only other intrinsics with such a mismatch are `minnum`, `maxnum`, and `copysign`.
- I haven't touched libm because I'm not familiar with how it works; any guidance would be welcome!
debuginfo: emit DW_TAG_call_site entries
Set `FlagAllCallsDescribed` on function definition DIEs so LLVM emits DW_TAG_call_site entries, letting debuggers and analysis tools track tail calls.
Add `-Zsanitize=kernel-hwaddress`
The Linux kernel has a config option called `CONFIG_KASAN_SW_TAGS` that enables `-fsanitize=kernel-hwaddress`. This is not supported by Rust.
One slightly awkward detail is that `#[sanitize(address = "off")]` applies to both `-Zsanitize=address` and `-Zsanitize=kernel-address`. This was probably done because both use the same LLVM pass. I replicated this logic here for hwaddress, but it might be undesirable.
Note that `#[sanitize(kernel_hwaddress = "off")]` could be supported as an annotation on statics, but since it's also missing for `#[sanitize(hwaddress = "off")]`, I did not add it.
MCP: https://github.com/rust-lang/compiler-team/issues/975
Tracking issue: https://github.com/rust-lang/rust/issues/154171
cc @rcvalle @maurer @ojeda
Unknown -> Unsupported compression algorithm
Both zstd and zlib are *known* compression algorithms; they just may not be supported by the backend. We shouldn't mislead users into e.g. thinking they made a typo.
cc https://github.com/rust-lang/rust/issues/120953