Add intrinsic for launch-sized workgroup memory on GPUs
Workgroup memory is a memory region that is shared between all
threads in a workgroup on GPUs. Workgroup memory can be allocated
statically or after compilation, when launching a gpu-kernel.
The intrinsic added here returns the pointer to the memory that is
allocated at launch-time.
# Interface
With this change, workgroup memory can be accessed in Rust by
calling the new `gpu_launch_sized_workgroup_mem<T>() -> *mut T`
intrinsic.
It returns the pointer to workgroup memory guaranteeing that it is
aligned to at least the alignment of `T`.
The pointer is dereferencable for the size specified when launching the
current gpu-kernel (which may be the size of `T` but can also be larger
or smaller or zero).
All calls to this intrinsic return a pointer to the same address.
See the intrinsic documentation for more details.
## Alternative Interfaces
It was also considered to expose dynamic workgroup memory as extern
static variables in Rust, like they are represented in LLVM IR.
However, due to the pointer not being guaranteed to be dereferencable
(that depends on the allocated size at runtime), such a global must be
zero-sized, which makes global variables a bad fit.
# Implementation Details
Workgroup memory in amdgpu and nvptx lives in address space 3.
Workgroup memory from a launch is implemented by creating an
external global variable in address space 3. The global is declared with
size 0, as the actual size is only known at runtime. It is defined
behavior in LLVM to access an external global outside the defined size.
There is no similar way to get the allocated size of launch-sized
workgroup memory on amdgpu an nvptx, so users have to pass this
out-of-band or rely on target specific ways for now.
Tracking issue: rust-lang/rust#135516
Fix typo by removing extra 'to'
Fixesrust-lang/rust#155695
Fix a typo in the `std::convert` module documentation by removing an extra "to" in the module-level docs.
Workgroup memory is a memory region that is shared between all
threads in a workgroup on GPUs. Workgroup memory can be allocated
statically or after compilation, when launching a gpu-kernel.
The intrinsic added here returns the pointer to the memory that is
allocated at launch-time.
# Interface
With this change, workgroup memory can be accessed in Rust by
calling the new `gpu_launch_sized_workgroup_mem<T>() -> *mut T`
intrinsic.
It returns the pointer to workgroup memory guaranteeing that it is
aligned to at least the alignment of `T`.
The pointer is dereferencable for the size specified when launching the
current gpu-kernel (which may be the size of `T` but can also be larger
or smaller or zero).
All calls to this intrinsic return a pointer to the same address.
See the intrinsic documentation for more details.
## Alternative Interfaces
It was also considered to expose dynamic workgroup memory as extern
static variables in Rust, like they are represented in LLVM IR.
However, due to the pointer not being guaranteed to be dereferencable
(that depends on the allocated size at runtime), such a global must be
zero-sized, which makes global variables a bad fit.
# Implementation Details
Workgroup memory in amdgpu and nvptx lives in address space 3.
Workgroup memory from a launch is implemented by creating an
external global variable in address space 3. The global is declared with
size 0, as the actual size is only known at runtime. It is defined
behavior in LLVM to access an external global outside the defined size.
There is no similar way to get the allocated size of launch-sized
workgroup memory on amdgpu an nvptx, so users have to pass this
out-of-band or rely on target specific ways for now.
Generalize IO Traits for `Arc<T>` where `&T: IoTrait`
ACP: https://github.com/rust-lang/libs-team/issues/755
Tracking issue: https://github.com/rust-lang/rust/issues/154046
Related: rust-lang/rust#94744
## Description
After experimenting with rust-lang/rust#155625, I noticed `Seek` and `SeekFrom` can almost be moved to `core::io`. Unfortunately, the implementation of `Seek` for `Arc<File>` is a blocker for such a move, since `Arc` is not a fundamental type. This PR attempts to resolve this potential blocker by replacing the implementation with a more general alternative. An internal trait `IoHandle` has been added which types can implement to opt-in to `Read`/`Write`/`Seek` implementations for `Arc<Self>` as long as `&Self` implements said trait. Note that `BufRead` is excluded as the signature for `fill_buf` would require returning from a temporary.
Since this "blanket" implementation only applies to a single type which already implements the same traits, I believe this should have no user-facing impact.
If this PR was merged, rust-lang/rust#134190 could be replaced with a 2 line PR:
```rust
impl IoHandle for TcpStream {}
impl IoHandle for UnixStream {}
```
Likewise for any other types, a table of which can be found [here](https://github.com/rust-lang/libs-team/issues/504#issuecomment-2539569736). This is out of scope for this PR to avoid the need for an ACP.
---
## Notes
* See [this comment](https://github.com/rust-lang/rust/issues/154046#issuecomment-4303975612) for further details.
* No AI tooling of any kind was used during the creation of this PR.
Added a marker trait `IoHandle` which can be used by the standard library to opt-in types to a blanket implementation of the various IO traits on `Arc<T>` where `&T: IoTrait` for some `IoTrait`.
The marker is required to avoid types like `Arc<[u8]>` being included, since they don't have interior mutability and would not give expected results.
Emscripten and OpenBSD exit out of the build script early. Since
02014b06c1a3 ("c-b: Turn `mem-unaligned` from a feature to a cfg"), this
meant that the exit happened before all `rustc-check-cfg`s had been
emitted.
Rework the logic so these only skip the C build rather than the rest of
configuration.
Expand `Path::is_empty` docs
Give some reasons why you might want to check if a path is empty. The `Path::join` behaviour can be surprising if you're not aware it might happen.
Improved assumptions relating to isqrt
Improved various assumptions relating to values yielded by `isqrt`.
Does not solve but does improve rust-lang/rust#132763.
Re-openeing of rust-lang/rust#154115
Added assumptions are:
* if `x` is nonzero then `x.isqrt()` is nonzero
* `x.isqrt() <= x`
* `x.isqrt() * x.isqrt() <= x`
Document precision considerations of `Duration`-float methods
A `Duration` is essentially a 94-bit value (64-bit sec and ~30-bit ns),
so there's some inherent loss when converting to floating-point for
`mul_f64` and `div_f64`. We could go to greater lengths to compute these
with more accuracy, like rust-lang/rust#150933 or rust-lang/rust#154107,
but it's not clear that it's worth the effort. The least we can do is
document that some rounding is to be expected, which this commit does
with simple examples that only multiply or divide by `1.0`.
This also changes the `f32` methods to just forward to `f64`, so we keep
more of that duration precision, as the range is otherwise much more
limited there.
This commit performs some minor update within the standard library for
the `wasm32-wasip3` target. This target is a tier 3 target currently due
to the WASIp3 specification not being officially released. This commit
adds a dependency from the standard library on the `wasip3` crate in the
same manner as the `wasip1` and `wasip2` crates that it already depends
on. The use-sites, for randomness and environment variables, are then
updated to handle the wasip2/wasip3 multiplexing.
Fix redundant boolean comparison in `Mutex::try_lock`
Simplify boolean return in `Mutex::try_lock`.
Replace `expr == false` with `!expr` for cleaner code.
Rollup of 11 pull requests
Successful merges:
- rust-lang/rust#154654 (Move `std::io::ErrorKind` to `core::io`)
- rust-lang/rust#145270 (Fix an ICE observed with an explicit tail-call in a default trait method)
- rust-lang/rust#154895 (borrowck: Apply `user_arg_index` nomenclature more broadly)
- rust-lang/rust#155213 (resolve: Make sure visibilities of import declarations make sense)
- rust-lang/rust#155346 (`single_use_lifetimes`: respect `anonymous_lifetime_in_impl_trait`)
- rust-lang/rust#155517 (Add a test for Mach-O `#[link_section]` API inherited from LLVM)
- rust-lang/rust#155549 (Remove some unnecessary lifetimes.)
- rust-lang/rust#154248 (resolve : mark repr_simd as internal)
- rust-lang/rust#154772 (slightly optimize the `non-camel-case-types` lint)
- rust-lang/rust#155541 (Add `#[rust_analyzer::prefer_underscore_import]` to the traits in `rustc_type_ir::inherent`)
- rust-lang/rust#155544 (bootstrap: Make "detected modifications" for download-rustc less verbose)
resolve : mark repr_simd as internal
I changed ```repr_simd``` to ```internal``` and changed the position to ```feature-group-start: internal feature gates```.
closerust-lang/rust#154034
Move `std::io::ErrorKind` to `core::io`
* Update `rustdoc-html` tests for the new path
* Add `core_io` feature to control stability. This replaces the use of `core_io_borrowed_buf` on the `core::io` module itself.
* Re-export `core::io::ErrorKind` in `std::io::error`
Tweak `is_ascii_punctuation()` docs wording
These methods return `true` for characters with Unicode general category of punctuation (P), but also for those with general category of symbol (S).
@rustbot label A-docs
docs(num): fix stale link to `mem::Alignment`
This pull request updates a stale link to `mem::Alignment` in `num::IntErrorKind`.
In rust-lang/rust#153178, I added a link to `Alignment` in `IntErrorKind`, but I overlooked that `Alignment` had been moved from `core::ptr` to `core::mem`. Although it is still re-exported in `core::ptr`, this pull request points the link to its canonical location.
@rustbot label +A-docs
The `_punctuation` methods return `true` for characters with
Unicode general category of punctuation (P),
but also for those with general category
of symbol (S).
* Checking exhaustion will no longer be possible for `repr_bitpacked`. Moving `kind_from_prim` into an associated function, and setting it up to be moved into `core::io` as well.
* `ErrorKind::as_str` is private, but it's only usage is trivially replaced with `Display::fmt`
* The features io_error_inprogress, io_error_more, and io_error_uncategorized will all need to be enabled