Rollup merge of #146181 - Flakebi:dynamic-shared-memory, r=ZuseZ4,Sa4dus,workingjubilee,RalfJung,nikic,kjetilkjeka,kulst
Add intrinsic for launch-sized workgroup memory on GPUs
Workgroup memory is a memory region that is shared between all
threads in a workgroup on GPUs. Workgroup memory can be allocated
statically or after compilation, when launching a gpu-kernel.
The intrinsic added here returns the pointer to the memory that is
allocated at launch-time.
# Interface
With this change, workgroup memory can be accessed in Rust by
calling the new `gpu_launch_sized_workgroup_mem<T>() -> *mut T`
intrinsic.
It returns the pointer to workgroup memory and guarantees that it is
aligned to at least the alignment of `T`.
The pointer is dereferenceable for the size specified when launching the
current gpu-kernel (which may be the size of `T` but can also be larger
or smaller or zero).
All calls to this intrinsic return a pointer to the same address.
See the intrinsic documentation for more details.
## Alternative Interfaces
Exposing dynamic workgroup memory as extern static variables in Rust,
like they are represented in LLVM IR, was also considered.
However, because the pointer is not guaranteed to be dereferenceable
(that depends on the allocated size at runtime), such a global must be
zero-sized, which makes global variables a bad fit.
# Implementation Details
Workgroup memory in amdgpu and nvptx lives in address space 3.
Workgroup memory from a launch is implemented by creating an
external global variable in address space 3. The global is declared with
size 0, as the actual size is only known at runtime. It is defined
behavior in LLVM to access an external global outside the defined size.
There is no similar way to get the allocated size of launch-sized
workgroup memory on amdgpu and nvptx, so users have to pass this
out-of-band or rely on target specific ways for now.
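A minimal host-side sketch of passing the size out-of-band. Everything
in it is hypothetical: `mock_launch_sized_workgroup_mem` and `FAKE_MEM`
only stand in for the intrinsic and for GPU workgroup memory so the
example runs on a CPU, and `kernel_sum` is an illustrative "kernel" that
receives the launch size as an ordinary parameter.

```rust
use std::cell::UnsafeCell;
use std::mem::{align_of, size_of};

/// Host-side stand-in for workgroup memory (hypothetical, not the real thing).
#[repr(align(16))]
struct FakeWorkgroupMem(UnsafeCell<[u8; 256]>);

// SAFETY: this sketch only ever touches the buffer from a single thread.
unsafe impl Sync for FakeWorkgroupMem {}

static FAKE_MEM: FakeWorkgroupMem = FakeWorkgroupMem(UnsafeCell::new([0; 256]));

/// Mock with the same shape as the intrinsic: every call returns the same
/// address, aligned to at least the alignment of `T`.
fn mock_launch_sized_workgroup_mem<T>() -> *mut T {
    assert!(align_of::<T>() <= 16, "mock buffer is only 16-byte aligned");
    FAKE_MEM.0.get().cast::<T>()
}

/// Illustrative "kernel": the launch size (in bytes) is passed out-of-band,
/// because the pointer alone does not say how much memory is usable.
fn kernel_sum(launch_size_bytes: usize) -> u32 {
    assert!(launch_size_bytes <= 256, "mock buffer holds 256 bytes");
    let len = launch_size_bytes / size_of::<u32>();
    let ptr = mock_launch_sized_workgroup_mem::<u32>();
    // SAFETY: `ptr` is aligned for `u32` and valid for `len` elements,
    // since `len * 4 <= launch_size_bytes <= 256`.
    let slice = unsafe { std::slice::from_raw_parts_mut(ptr, len) };
    for (i, x) in slice.iter_mut().enumerate() {
        *x = i as u32;
    }
    slice.iter().sum()
}
```

With a 16-byte launch size the slice holds four `u32` values; with a
launch size of zero the slice is empty and nothing is dereferenced.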
Tracking issue: rust-lang/rust#135516
privacy: Assert that compared visibilities are (usually) ordered
And make "greater than" (`>`) the new primary operation for comparing visibilities instead of "is at least" (`>=`).
Do not modify resolver outputs during lowering
Split from https://github.com/rust-lang/rust/pull/142830
I believe this achieves the same thing as https://github.com/rust-lang/rust/pull/153656 but in a much simpler way.
This PR forces AST->HIR lowering to stop mutating resolver outputs. Instead, it manages a few override maps that only live during lowering and are dropped afterwards.
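The override-map idea can be sketched roughly as follows. This is not the PR's actual code: `ResolverOutputs`, `LoweringView`, and the `u32`/`&str` key and value types are hypothetical stand-ins chosen to keep the example small.

```rust
use std::collections::HashMap;

/// Stand-in for the resolver's outputs; only borrowed immutably below.
struct ResolverOutputs {
    partial_res: HashMap<u32, &'static str>,
}

/// Lowering-time view: reads consult the overrides first and fall back to
/// the untouched resolver outputs.
struct LoweringView<'a> {
    resolver: &'a ResolverOutputs,
    overrides: HashMap<u32, &'static str>,
}

impl<'a> LoweringView<'a> {
    fn new(resolver: &'a ResolverOutputs) -> Self {
        LoweringView { resolver, overrides: HashMap::new() }
    }

    /// Record a lowering-only change without mutating the resolver.
    fn set(&mut self, id: u32, res: &'static str) {
        self.overrides.insert(id, res);
    }

    fn get(&self, id: u32) -> Option<&'static str> {
        self.overrides
            .get(&id)
            .or_else(|| self.resolver.partial_res.get(&id))
            .copied()
    }
}

/// Returns what lowering saw, and what the resolver holds afterwards.
fn demo() -> (Option<&'static str>, Option<&'static str>) {
    let resolver = ResolverOutputs {
        partial_res: HashMap::from([(1, "original")]),
    };
    let seen_during_lowering = {
        let mut view = LoweringView::new(&resolver);
        view.set(1, "overridden");
        view.get(1)
    }; // `view` and its overrides are dropped here
    (seen_during_lowering, resolver.partial_res.get(&1).copied())
}
```

Because the view only holds a shared reference, the borrow checker rules out mutating the resolver's map while lowering is in progress.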
r? @petrochenkov
cc @aerooneqq
Pass fields to `is_tuple_fields` instead of `SBValue` object
Straightforward fix for a logic error. `is_tuple_fields` expects a `list`, so we pass that in instead of the value object.
Coincidentally, this also fixes one of the 3 DI tests that fail on `x86_64-pc-windows-gnu` (`tests/debuginfo/union-smoke.rs`)
Account for `GetSyntheticValue` failures
`GetSyntheticValue` returns an invalid `SBValue` if no synthetic is present. That wasn't a problem before when we were attaching synthetics to every type, but it won't be the case once github.com/rust-lang/rust/pull/155336 or similar lands. Additionally, codelldb subverts `lldb_commands` to apply similar behavior that doesn't attach synthetics to every type, so this fixes a regression there too.
Additionally, I removed 1 useless instance of `GetSyntheticValue`, since pointers should always be `IndirectionSyntheticProvider`, not `DefaultSyntheticProvider`.
bootstrap: Don't clone submodules unconditionally in dry-run
This made it very annoying to debug bootstrap itself, because every `--dry-run` invocation would start out by cloning LLVM, which is almost never necessary. Instead change a few Steps to properly support dry_run when no submodule is checked out.
I tested this by running all of `check`, `build`, `doc`, `dist`, `install`, `vendor`, `clippy`, `fix`, and `miri` with `--dry-run`.
Handle index projections in call destinations in DSE
Since call destinations are evaluated after call arguments, we can't turn copy arguments into moves if the same local is later used as an index projection in the call destination.
DSE call arg optimization: rust-lang/rust#113758
r? @cjgillot
cc @RalfJung
Avoid redundant clone suggestions in borrowck diagnostics
Fixes rust-lang/rust#153886
Removed redundant `.clone()` suggestions.
I found that there are two patterns to handle this issue while I was implementing:
- Should suggest only UFCS
- Should suggest only simple `.clone()`
For the target issue, we can just remove the UFCS (`<Option<String> as Clone>::clone(&selection.1)`) side.
However, for the `BorrowedContentSource::OverloadedDeref` pattern like `Rc<Vec<i32>>`, for instance the `borrowck-move-out-of-overloaded-auto-deref.rs` test case, I think we need to employ the UFCS way. The actual test case is:
```rust
//@ run-rustfix
use std::rc::Rc;
pub fn main() {
    let _x = Rc::new(vec![1, 2]).into_iter();
    //~^ ERROR [E0507]
}
```
And another error will be shown if we simply use the simple `.clone()` pattern. Like:
```rust
use std::rc::Rc;
pub fn main() {
    let _x = Rc::new(vec![1, 2]).clone().into_iter();
}
```
then we will get
```
error[E0507]: cannot move out of an `Rc`
--> src/main.rs:5:14
|
5 | let _x = Rc::new(vec![1, 2]).clone().into_iter();
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^ ----------- value moved due to this method call
| |
| move occurs because value has type `Vec<i32>`, which does not implement the `Copy` trait
|
note: `into_iter` takes ownership of the receiver `self`, which moves value
--> /playground/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/iter/traits/collect.rs:310:18
|
310 | fn into_iter(self) -> Self::IntoIter;
| ^^^^
help: you can `clone` the value and consume it, but this might not be your desired behavior
|
5 - let _x = Rc::new(vec![1, 2]).clone().into_iter();
5 + let _x = <Vec<i32> as Clone>::clone(&Rc::new(vec![1, 2])).into_iter();
|
For more information about this error, try `rustc --explain E0507`.
```
[Rust Playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2024&gist=7e767bed3f1c573c03642f20f454ed03)
In this case, `Rc::clone` only increments the reference count and returns a new `Rc<Vec<i32>>`; it does not grant ownership of the inner `Vec<i32>`. As a result, calling `into_iter()` attempts to move the `Vec<i32>`, leading to the same E0507 error again.
On the other hand, in UFCS form:
```
<Vec<i32> as Clone>::clone(&Rc::new(vec![1, 2])).into_iter()
```
This explicitly calls `<Vec<i32> as Clone>::clone`, and the argument `&Rc<Vec<i32>>` is treated as `&Vec<i32>` via Rc’s `Deref` implementation. As a result, the `Vec<i32>` itself is cloned, yielding an owned `Vec<i32>`, which allows `into_iter()` to succeed, if my understanding is correct.
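A small sketch of the difference between the two forms, using only the standard library. The helper names `clone_inner` and `sum_via_ufcs` are made up for illustration.

```rust
use std::rc::Rc;

/// Clones the inner `Vec`, not the `Rc`: the `&Rc<Vec<i32>>` argument is
/// deref-coerced to `&Vec<i32>` because the UFCS call expects `&Vec<i32>`.
fn clone_inner(rc: &Rc<Vec<i32>>) -> Vec<i32> {
    <Vec<i32> as Clone>::clone(rc)
}

fn sum_via_ufcs() -> i32 {
    let rc = Rc::new(vec![1, 2]);

    // Method-call `.clone()` resolves to `Rc::clone`: a second handle to
    // the same allocation, so the `Vec` still cannot be moved out.
    let shared: Rc<Vec<i32>> = rc.clone();
    assert_eq!(Rc::strong_count(&rc), 2);
    drop(shared);

    // The UFCS form yields an owned `Vec<i32>`, which `into_iter` can consume.
    let owned: Vec<i32> = clone_inner(&rc);
    owned.into_iter().sum()
}
```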
I addressed the issue as far as I could find the edge cases, but please advise me if I'm overlooking something.
Rollup of 3 pull requests
Successful merges:
- rust-lang/rust#155754 (make the `core::ffi::va_list` module private)
- rust-lang/rust#155522 (cmse: test returning `MaybeUninit<T>`)
- rust-lang/rust#155741 (std: Refactor BufWriter::flush to use the `?` operator)
cmse: test returning `MaybeUninit<T>`
tracking issue: https://github.com/rust-lang/rust/issues/81391
tracking issue: https://github.com/rust-lang/rust/issues/75835
Some tests from https://github.com/rust-lang/rust/pull/147697 that already work and are useful. Extracting them shrinks that (currently blocked) PR.
The code in `tests/ui/cmse-nonsecure/cmse-nonsecure-call/return-via-stack.rs` checks that `MaybeUninit<T>` is considered abi-compatible with `T`. The code in `tests/ui/cmse-nonsecure/cmse-nonsecure-entry/params-via-stack.rs` really only tests that no errors/warnings are emitted.
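As a rough host-side illustration (not the cmse tests themselves, and not a check of the cmse calling convention), the standard library documents that `MaybeUninit<T>` has the same size and alignment as `T`. The helpers `same_layout` and `roundtrip` below are hypothetical names.

```rust
use std::mem::{align_of, size_of, MaybeUninit};

/// Checks the documented layout guarantee for one concrete `T`.
fn same_layout<T>() -> bool {
    size_of::<MaybeUninit<T>>() == size_of::<T>()
        && align_of::<MaybeUninit<T>>() == align_of::<T>()
}

/// Wraps a value and unwraps it again.
fn roundtrip(x: u64) -> u64 {
    let wrapped: MaybeUninit<u64> = MaybeUninit::new(x);
    // SAFETY: `wrapped` was initialized by `MaybeUninit::new` just above.
    unsafe { wrapped.assume_init() }
}
```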
r? davidtwco
Permit `{This}` in diagnostic attribute format literals
My motivation was that yesterday I wanted to write something like this and reference `$name` in the string literal.
```rust
pub mod sym {
// stuff here
}
macro_rules! my_macro {
($name:ident $(,)?) => {{
#[diagnostic::on_unknown(
message = "this is not present in symbol table",
note = "you must add it to rustc_span::symbol::symbol!"
)]
use sym::$name as name;
// ...
}}
}
```
That is (as far as I can tell) impossible or at least very unergonomic. This adds the ability to just reference the name of the item the attribute is on. I imagine that's useful for use inside macros generally, so it's also added for some other attributes.
The affected attributes are all unstable. It is not implemented for `diagnostic::on_unimplemented` (will do in its own PR).
Note that `{This}` is already usable in `#[rustc_on_unimplemented]`, so this does not implement it but just enables some more.
This PR also migrates one lint away from AttributeLintKind, and improves the messages for that lint.
Rollup of 12 pull requests
Successful merges:
- rust-lang/rust#149452 (Refactor out common code into a `IndexItem::new` constructor)
- rust-lang/rust#155621 (Document #[diagnostic::on_move] in the unstable book.)
- rust-lang/rust#155635 (delegation: rename `Self` generic param to `This` in recursive delegations)
- rust-lang/rust#155730 (Some cleanups around per parent disambiguators)
- rust-lang/rust#153537 (rustc_codegen_ssa: Define ELF flag value for sparc-unknown-linux-gnu)
- rust-lang/rust#155219 (Do not suggest borrowing enclosing calls for nested where-clause obligations)
- rust-lang/rust#155408 (rustdoc: Fix Managarm C Library name in cfg pretty printer)
- rust-lang/rust#155571 (Enable AddressSanitizer on arm-unknown-linux-gnueabihf and armv7-unknown-linux-gnueabihf)
- rust-lang/rust#155713 (test: Add a regression test for Apple platforms aborting on `free`)
- rust-lang/rust#155723 (Fix tier level for 5 thumb bare-metal ARM targets)
- rust-lang/rust#155735 (Fix typo by removing extra 'to')
- rust-lang/rust#155736 (Remove `AllVariants` workaround for rust-analyzer)
Remove `AllVariants` workaround for rust-analyzer
Part of https://github.com/rust-lang/rust/issues/155677
Removes the `ALL_VARIANTS` alias added to work around rust-analyzer not supporting `#![feature(macro_derive)]`, which has since been fixed (rust-lang/rust-analyzer/issues/21043).
Fix typo by removing extra 'to'
Fixes rust-lang/rust#155695
Fix a typo in the `std::convert` module documentation by removing an extra "to" in the module-level docs.
Fix tier level for 5 thumb bare-metal ARM targets
The spec files for 5 Thumb-mode bare-metal ARM targets incorrectly set `tier: Some(2)`, while the documentation correctly lists them as Tier 3. This mismatch was introduced in PR #150556: the intent was Tier 2 eventually, but these targets should sit at Tier 3 until a proper Tier 3 → Tier 2 promotion MCP is submitted and approved.
This PR changes `tier: Some(2)` to `tier: Some(3)` in the following spec files, making them consistent with the docs:
- thumbv7a-none-eabi
- thumbv7a-none-eabihf
- thumbv7r-none-eabi
- thumbv7r-none-eabihf
- thumbv8r-none-eabihf
PS: No doc changes needed — they already correctly state Tier 3.
r?
test: Add a regression test for Apple platforms aborting on `free`
Add a regression test for https://github.com/rust-lang/rust/issues/150898 to make users aware that if this test fails, they may encounter unusual behavior elsewhere.
Enable AddressSanitizer on arm-unknown-linux-gnueabihf and armv7-unknown-linux-gnueabihf
Add SanitizerSet::ADDRESS to the supported_sanitizers for the arm-unknown-linux-gnueabihf and armv7-unknown-linux-gnueabihf targets.
The AddressSanitizer is already enabled on the armv7-linux-androideabi platform, which shares the same ARM architecture. There is no reason these Linux GNU targets should not also support it, as the underlying LLVM support for ASan on 32-bit ARM is already in place.
Do not suggest borrowing enclosing calls for nested where-clause obligations
In rust-lang/rust#155088, the compiler was blaming the whole call expr instead of the value that actually failed the trait bound, so for `foo(&[String::from("a")])` it was suggesting stuff like `&foo(...)`. I changed the suggestion logic so it only emits borrow help if the expr it found actually matches the failed self type, and used the same check for the “similar impl exists” help too. So now the compiler should give the normal error + required bound note.
Fixes rust-lang/rust#155088
Some cleanups around per parent disambiguators
r? @petrochenkov
follow-up to rust-lang/rust#155547
The two remaining uses are
* resolve_bound_vars, where it is a reasonable way to do it instead of having another field in the visitor that needs to get scoped (set & reset) every time we visit an opaque type. May still change that at some point, but it's not really an issue
* `create_def` in the resolver: will get removed together with my other refactorings for `node_id_to_def_id` (making that per-owner)
delegation: rename `Self` generic param to `This` in recursive delegations
This PR renames the `Self` generic parameter to `This` in recursive delegations. This allows `This` to be propagated, since we rely on the `Self` name to check whether a parameter is the implicit `Self` of a trait. A comment with a more detailed explanation is in `uplift_delegation_generic_params`. Part of rust-lang/rust#118212.
r? @petrochenkov