In commit "test: Consolidate `Hexf`, `Hexi`, and the `Hex` trait" we
unintentionally lost the difference between float hex and bitwise hex
formatting; instead, the float hex was getting printed twice. Resolve
this by printing the integer hex whenever the `-` format modifier is
specified. This also makes things simpler because we no longer need to
keep track of whether an `impl DisplayHex` is a float with `.to_bits()`
available or an integer without it; we can always just try to print both
forms.
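
For illustration, a minimal sketch of dispatching on the `-` modifier.
`Hexf` below is a hypothetical `f32`-only stand-in, not the actual
test-support type, and its float branch only handles normal values. The
point is just that `Formatter::sign_minus()` can select the bitwise form.

```rust
use std::fmt;

/// Hypothetical wrapper for this sketch; not the real test-support type.
struct Hexf(f32);

impl fmt::LowerHex for Hexf {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let bits = self.0.to_bits();
        if f.sign_minus() {
            // `{:-x}`: print the raw IEEE-754 bit pattern as an integer.
            return write!(f, "{bits:#010x}");
        }
        let exp = ((bits >> 23) & 0xff) as i32;
        if exp == 0 || exp == 0xff {
            // Zero, subnormal, infinity, NaN: out of scope for this sketch.
            return write!(f, "{:?}", self.0);
        }
        let sign = if bits >> 31 == 1 { "-" } else { "" };
        // Widen the 23 mantissa bits to 24 so they fill six hex digits.
        let mant = (bits & 0x7f_ffff) << 1;
        write!(f, "{sign}0x1.{mant:06x}p{:+}", exp - 127)
    }
}
```

With this sketch, `format!("{:x}", Hexf(1.0))` gives the float hex form
and `format!("{:-x}", Hexf(1.0))` gives the bit pattern.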
As a result, we can use a common error message for all validation
checks.
Implement `__clear_cache` for Hexagon targets in Rust. Hexagon has
separate instruction and data caches, so this flushes dirty data cache
lines with `dccleaninva`, invalidates stale instruction cache lines
with `icinva`, then issues an `isync` barrier.
Based on the compiler-rt implementation from
llvm/llvm-project#188411.
constify DoubleEndedIterator
The only functions that can't be constified are `advance_back_by` (requires const range or const-hack), `nth_back` (requires `advance_back_by`), and `rfind` (requires const closures).
I've put it under `const_iter`. I can open a separate tracking issue, but I think tracking all `Iterator` traits separately would be quite annoying, and we would probably prefer to constify them together anyway.
Rollup of 4 pull requests
Successful merges:
- rust-lang/rust#154460 (Deduplication: Pulled common logic out from lower_const_arg_struct)
- rust-lang/rust#154609 (Enable `#[diagnostic::on_const]` for local impls)
- rust-lang/rust#154678 (Introduce #[diagnostic::on_move] on `Rc`)
- rust-lang/rust#154902 (rustdoc: Inherit inline attributes for declarative macros)
Explicitly forget the zero remaining elements in `vec::IntoIter::fold()`.
[Original description:] ~~This seems to help LLVM notice that dropping the elements in the destructor of `IntoIter` is not necessary. In cases it doesn’t help, it should be cheap since it is just one assignment.~~
This PR adds a function to `vec::IntoIter` which is used by `fold()` and `spec_extend()`, when those operations complete, to forget the zero remaining elements and only deallocate the allocation, ensuring that there will never be a useless loop to drop zero remaining elements when the iterator is dropped.
This is my first ever attempt at this kind of codegen micro-optimization in the standard library, so please let me know what should go into the PR or what sort of additional systematic testing might indicate this is a good or bad idea.
library: no `cfg(target_arch)` on scalable intrinsics
These intrinsics don't technically need to be limited to a specific architecture. They'll probably only make sense to use on AArch64, but gating them just makes it harder to use them in stdarch where it is appropriate (such as on `arm64ec`): it requires a rustc patch to land and be on nightly before stdarch work can proceed. So let's just not `cfg` them at all; they're perma-unstable anyway.
Fixes CI failure in rust-lang/stdarch#2071
c-b: Export inverse hyperbolic trigonometric functions
Since a1feab1638 ("Use libm for acosh and asinh"), the standard library may link these functions to get a more accurate approximation; however, some targets do not have the needed symbols available. Add them to the compiler-builtins export list to make sure the fallback is usable.
Closes: https://github.com/rust-lang/rust/issues/154919
library: std: motor: use OS' process::exit in abort_internal
abort_internal() is used in panics; if it calls core::intrinsics::abort(), the process executes an invalid opcode (on x86_64), which is a much harder "abort" than a user-controlled exit via a panic.
Most other OSes don't use core::intrinsics::abort() here, but either libc::abort(), or a native OS abort/exit API.
Add more info about where autodiff can be applied
It's taken quite a few years, but we finally have a PR open to distribute Enzyme: https://github.com/rust-lang/rust/pull/154754
I therefore went over the docs once more and noticed we don't explain a lot of the most basic features, which we added over the years and have since taken for granted.
@Sa4dUs, do you think there are more interesting cases that we are missing?
Generally, there's still a lot of complexity in it, especially for people who haven't used Enzyme before.
To some extent, that's just a result of my general design goal to expose all performance-relevant features of Enzyme, and let users explore nice abstractions on top of it, via crates. Since we don't have those nightly users yet, nobody has had time to build nicer abstractions on top of it.
I also feel like a more guided book would be a better first introduction to Enzyme, but for now I just focused on the list of features.
r? @oli-obk
coretests: add argument order regression tests for min_by/max_by/minmax_by
PR rust-lang/rust#136307 introduced a regression in min_by, max_by, and minmax_by:
the compare closure received arguments as (v2, v1) instead of (v1, v2),
contrary to the documented contract.
Although this was fixed in rust-lang/rust#139357, no regression tests were added.
This PR adds regression tests for all three functions, verifying that compare
always receives arguments in the documented order (v1, v2).
As a bonus: first coretests coverage for minmax_by.
Fix pin docs
Split a long sentence to improve readability.
The original sentence required multiple readings for me to understand as a non-native speaker. The revised version is clearer and more readable, and likely easier for others as well.
These intrinsics don't technically need to be limited to a specific
architecture. They'll probably only make sense to use on AArch64,
but gating them just makes it harder to use them in stdarch where it is
appropriate (such as on `arm64ec`), requiring a rustc patch to land and
be on nightly before stdarch work can proceed. So just don't `cfg` them
at all.
Since a1feab1638 ("Use libm for acosh and asinh"), the standard
library may link these functions to get a more accurate approximation;
however, some targets do not have the needed symbols available. Add them
to the compiler-builtins export list to make sure the fallback is
usable.
Add an assembly implementation for `roundeven` which also works for
`rint`, similar to the existing `ceil` and `floor` implementations. This
resolves cases where values close to the *.5 boundary would round in the
wrong direction, such as -519629176421.49976 (tested in
`case_list`).
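
As a sketch of the expected behavior (not the assembly implementation
itself), the standard library's `f64::round_ties_even` has the same
round-half-to-even semantics. The `roundeven` wrapper name below is only
for illustration.

```rust
/// Illustrative wrapper: round to the nearest integer, ties to even.
fn roundeven(x: f64) -> f64 {
    x.round_ties_even()
}

/// The value from the commit message sits just below a .5 boundary,
/// so it must round toward zero rather than away from it.
fn near_boundary_case() -> f64 {
    roundeven(-519629176421.49976)
}
```

Exact ties go to the even neighbor (`2.5` becomes `2.0`, `3.5` becomes
`4.0`), while the near-boundary value above is not a tie and simply
rounds to the nearest integer.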
This annotates the `Rc` type with the diagnostic attribute
`#[diagnostic::on_move]`. Now when a moved `Rc` is borrowed,
a suggestion to clone it is made, with a label explaining why.
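
A small sketch of the pattern the suggestion points toward. The function
names are hypothetical, and the exact diagnostic wording is not reproduced
here. Cloning the `Rc` hands the callee its own handle, so the original
stays usable afterwards.

```rust
use std::rc::Rc;

/// Takes ownership of one `Rc` handle and drops it on return.
fn consume(data: Rc<String>) -> usize {
    data.len()
}

/// Returns (length seen by callee, length seen afterwards, strong count).
fn share_then_use() -> (usize, usize, usize) {
    let data = Rc::new(String::from("hello"));
    // Passing `data` directly would move it, and the borrows below
    // would then be rejected; cloning only bumps the reference count.
    let len_in_callee = consume(Rc::clone(&data));
    (len_in_callee, data.len(), Rc::strong_count(&data))
}
```

The clone is cheap: it copies the pointer and increments the strong
count, which drops back to one once `consume` returns.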
This fixes a stable-to-stable regression where constants of type
`ManuallyDrop<T>` would not be allowed to be used as a pattern due to
`MaybeDangling<T>` in `ManuallyDrop<T>` not implementing
`StructuralPartialEq`.