Commit Graph

1027 Commits

Author SHA1 Message Date
Guillaume Gomez aad3710227 Rollup merge of #153380 - pitaj:stabilize-new_range_from_api, r=tgross35
stabilize new RangeFrom type and iterator

```rust
// in core and std
pub mod range;

// in core::range

pub struct RangeFrom<Idx> {
    pub start: Idx,
}

impl<Idx: fmt::Debug> fmt::Debug for RangeFrom<Idx> { /* ... */ }

impl<Idx: PartialOrd<Idx>> RangeFrom<Idx> {
    pub const fn contains<U>(&self, item: &U) -> bool
    where
        Idx: [const] PartialOrd<U>,
        U: ?Sized + [const] PartialOrd<Idx>;
}

impl<Idx: Step> RangeFrom<Idx> {
    pub fn iter(&self) -> RangeFromIter<Idx>;
}

impl<T> const RangeBounds<T> for RangeFrom<T> { /* ... */ }
impl<T> const RangeBounds<T> for RangeFrom<&T> { /* ... */ }

impl<T> const From<RangeFrom<T>> for legacy::RangeFrom<T> { /* ... */ }
impl<T> const From<legacy::RangeFrom<T>> for RangeFrom<T> { /* ... */ }

pub struct RangeFromIter<A>(/* ... */);

// `RangeFromIter::remainder` left unstable

impl<A: Step> Iterator for RangeFromIter<A> {
    type Item = A;
    /* ... */
}

impl<A: Step> FusedIterator for RangeFromIter<A> { }
impl<A: Step> IntoIterator for RangeFrom<A> {
    type Item = A;
    type IntoIter = RangeFromIter<A>;
    /* ... */
}

unsafe impl<T> const SliceIndex<[T]> for range::RangeFrom<usize> {
    type Output = [T];
    /* ... */
}
unsafe impl const SliceIndex<str> for range::RangeFrom<usize> {
    type Output = str;
    /* ... */
}

impl ops::Index<range::RangeFrom<usize>> for CStr {
    type Output = CStr;
    /* ... */
}
```

Tracking issue: https://github.com/rust-lang/rust/issues/125687
2026-03-29 00:06:49 +01:00
Peter Jaszkowiak 085dff4944 stabilize new RangeFrom type and iterator
stabilizes `core::range::RangeFrom`
stabilizes `core::range::RangeFromIter`

add examples for `remainder` method on range iterators
`RangeFromIter::remainder` was not stabilized (see issue 154458)
2026-03-28 12:00:10 -06:00
Lars Schumann 6e2522e6e2 constify Step trait and all of its implementations 2026-03-13 11:54:06 +00:00
Josh Stone 78157ddde9 Replace version placeholders with 1.95.0
(cherry picked from commit bad24ccbec)
2026-03-07 10:42:01 -08:00
Peter Jaszkowiak bc4ceaddcd stabilize new RangeToInclusive type
stabilizes `core::range::RangeToInclusive`
add missing trait impls for new RangeToInclusive
add missing trait impls for new RangeFrom
2026-02-28 21:44:18 -07:00
Jonathan Brouwer 9f0a410096 Revert "Stabilize str_as_str" 2026-02-22 10:10:42 +01:00
Matthias Krüger a919df8b1b Rollup merge of #151603 - GrigorenkoPV:stabilize/str_as_str, r=jhpratt
Stabilize `str_as_str`

- Tracking issue: rust-lang/rust#130366
- Needs FCP
- `ByteStr` methods remain gated behind `bstr` feature gate (rust-lang/rust#134915)

Closes rust-lang/rust#130366
2026-02-21 13:03:28 +01:00
Karl Meakin f8cf68f35a Give into_range more consistent name
Rename `into_range` to `try_into_slice_range`:
- Prepend `try_` to show that it returns `None` on error, like `try_range`
- add `_slice` to make it consistent with `into_slice_range`
2026-02-10 23:19:01 +00:00
Karl Meakin 625b18027d Optimize SliceIndex::get impl for RangeInclusive
The checks for `self.end() == usize::MAX` and `self.end() + 1 > slice.len()`
can be replaced with `self.end() >= slice.len()`, since
`self.end() < slice.len()` implies both
`self.end() <= slice.len()` and
`self.end() < usize::MAX`.
2026-02-10 20:29:45 +00:00
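The equivalence claimed in 625b18027d can be checked with a small sketch. The two function names below are hypothetical and only mirror the conditions quoted in the commit message; they are not the actual `SliceIndex` code.

```rust
/// Old form: two conditions, as quoted in the commit message.
/// The `||` short-circuits, so `end + 1` is never evaluated when
/// `end == usize::MAX` and cannot overflow.
fn out_of_bounds_old(end: usize, len: usize) -> bool {
    end == usize::MAX || end + 1 > len
}

/// New form: a single comparison.
/// `end < len` already implies `end < usize::MAX`, because `len <= usize::MAX`.
fn out_of_bounds_new(end: usize, len: usize) -> bool {
    end >= len
}

/// Compares both forms on a handful of edge cases (illustrative, not exhaustive).
fn forms_agree_on_edge_cases() -> bool {
    let samples = [0, 1, 2, 7, usize::MAX - 1, usize::MAX];
    samples.iter().all(|&end| {
        samples
            .iter()
            .all(|&len| out_of_bounds_old(end, len) == out_of_bounds_new(end, len))
    })
}
```

Here `end` is the inclusive upper bound of the range, so the index is valid exactly when `end < len`.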
Matthias Krüger 144b77aad2 Rollup merge of #151613 - cuviper:array-windows-parity, r=Amanieu
Align `ArrayWindows` trait impls with `Windows`

With `slice::ArrayWindows` getting ready to stabilize in 1.94, I noticed that it currently has some differences in trait implementations compared to `slice::Windows`, and I think we should align these.

- Remove `derive(Copy)` -- we generally don't want `Copy` for iterators at all, as this is seen as a footgun (e.g. rust-lang/rust#21809). This is obviously a breaking change though, so we should only remove this if we also backport the removal before it's stable. Otherwise, it should at least be replaced by a manual impl without requiring `T: Copy`.
- Manually `impl Clone`, simply to avoid requiring `T: Clone`.
- `impl FusedIterator`, because it is trivially so. The `since = "1.94.0"` assumes we'll backport this, otherwise we should change that to the "current" placeholder.
- `impl TrustedLen`, because we can trust our implementation.
- `impl TrustedRandomAccess`, because the required `__iterator_get_unchecked` method is straightforward.

r? libs-api

@rustbot label beta-nominated
(at least for the `Copy` removal, but we could be more selective about the rest).
2026-02-09 18:39:39 +01:00
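The point about implementing `Clone` manually can be illustrated with a minimal sketch. `PairWindows` is a hypothetical stand-in for a by-reference window iterator, not the real `ArrayWindows`; it only shows why a derived `Clone` would add an unnecessary `T: Clone` bound.

```rust
/// Hypothetical iterator over overlapping pairs of a slice.
/// It only holds a shared reference, so cloning it never clones a `T`.
struct PairWindows<'a, T> {
    rest: &'a [T],
}

// `#[derive(Clone)]` would generate `impl<T: Clone> Clone`.
// Writing the impl by hand avoids that bound.
impl<'a, T> Clone for PairWindows<'a, T> {
    fn clone(&self) -> Self {
        PairWindows { rest: self.rest }
    }
}

impl<'a, T> Iterator for PairWindows<'a, T> {
    type Item = &'a [T; 2];

    fn next(&mut self) -> Option<Self::Item> {
        // Copy the reference out first so the window keeps the full `'a` lifetime.
        let rest: &'a [T] = self.rest;
        let window = rest.first_chunk::<2>()?;
        self.rest = &rest[1..];
        Some(window)
    }
}
```

With this impl the iterator can be cloned even when the element type implements neither `Clone` nor `Copy`.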
Peter Jaszkowiak d2020fbf7c stabilize new inclusive range type and iter
stabilizes `core::range::RangeInclusive`
and `core::range::RangeInclusiveIter`
and the `core::range` module
2026-02-06 21:36:15 -07:00
Stuart Cook a57663ea65 Rollup merge of #151756 - Voultapher:fix-box-retag-in-sort, r=Mark-Simulacrum
Avoid miri error in `slice::sort` under Stacked Borrows

See comment in code.

Fixes: https://github.com/rust-lang/rust/issues/151728
2026-02-02 10:28:29 +11:00
Jacob Pratt 4e21b69e5a Rollup merge of #151812 - scottmcm:slice-shift, r=jhpratt
Add `shift_{left,right}` on slices

ACP approval: https://github.com/rust-lang/libs-team/issues/717#issuecomment-3807205664
cc tracking issue rust-lang/rust#151772
2026-01-30 22:52:21 -05:00
Jacob Pratt b60d3d9f1d Rollup merge of #151726 - scottmcm:delete-duplicated-code, r=jhpratt
Remove duplicated code in `slice/index.rs`

Looks like `const fn` is far enough along now that we can just not have these two copies any more, and have one call the other.
2026-01-30 22:52:20 -05:00
Scott McMurray a3f169c75b Use Bound::copied instead of Bound::cloned 2026-01-30 12:46:25 -08:00
Scott McMurray 4264da6869 Add shift_{left,right} on slices
cc tracking issue 151772
2026-01-29 11:12:54 -08:00
Stuart Cook d49f50ff4a Rollup merge of #151775 - calebzulawski:sync-from-portable-simd-2026-01-28, r=folkertdev
Portable SIMD subtree update

cc @folkertdev @programmerjake
2026-01-29 22:34:07 +11:00
Caleb Zulawski b71ff51277 Update std and tests to match std::simd API (remove LaneCount bound and rename to_int to to_simd) 2026-01-28 18:35:17 -05:00
bors 1e5065a4d9 Auto merge of #150945 - scottmcm:tweak-slice-partial-eq, r=Mark-Simulacrum
Tweak `SlicePartialEq` to allow MIR-inlining the `compare_bytes` call

rust-lang/rust#150265 disabled this because doing so was a net perf win, but let's see if we can tweak the structure of this to allow more inlining on this side while still not MIR-inlining the loop when it's not just `memcmp` and thus hopefully preserving the perf win.

This should also allow MIR-inlining the length check, which was previously blocked, and thus might allow some obvious non-matches to optimize away as well.
2026-01-28 14:31:41 +00:00
Lukas Bergdoll ce03e7b33a Avoid miri error in slice::sort under Stacked Borrows
See comment in code.

Fixes: https://github.com/rust-lang/rust/pull/131065
2026-01-28 14:55:29 +01:00
Scott McMurray 51de309db2 Tweak SlicePartialEq to allow MIR-inlining the compare_bytes call
150265 disabled this because doing so was a net perf win, but let's see if we can tweak the structure of this to allow more inlining on this side while still not MIR-inlining the loop when it's not just `memcmp`.

This should also allow MIR-inlining the length check, which was previously blocked.
2026-01-27 00:10:12 -08:00
Stuart Cook 1c892e829c Rollup merge of #147436 - okaneco:eq_ignore_ascii_autovec, r=scottmcm
slice/ascii: Optimize `eq_ignore_ascii_case` with auto-vectorization

- Refactor the current functionality into a helper function
- Use `as_chunks` to encourage auto-vectorization in the optimized chunk processing function
- Add a codegen test checking for vectorization and no panicking
- Add benches for `eq_ignore_ascii_case`

---

The optimized function is initially only enabled for x86_64 which has `sse2` as part of its baseline, but none of the code is platform specific. Other platforms with SIMD instructions may also benefit from this implementation.

Performance improvements only manifest for slices of 16 bytes or longer, so the optimized path is gated behind a length check for greater than or equal to 16.

Benchmarks - Cases below 16 bytes are unaffected, cases above all show sizeable improvements.
```
before:
    str::eq_ignore_ascii_case::bench_large_str_eq         4942.30ns/iter +/- 48.20
    str::eq_ignore_ascii_case::bench_medium_str_eq         632.01ns/iter +/- 16.87
    str::eq_ignore_ascii_case::bench_str_17_bytes_eq        16.28ns/iter  +/- 0.45
    str::eq_ignore_ascii_case::bench_str_31_bytes_eq        35.23ns/iter  +/- 2.28
    str::eq_ignore_ascii_case::bench_str_of_8_bytes_eq       7.56ns/iter  +/- 0.22
    str::eq_ignore_ascii_case::bench_str_under_8_bytes_eq    2.64ns/iter  +/- 0.06
after:
    str::eq_ignore_ascii_case::bench_large_str_eq         611.63ns/iter +/- 28.29
    str::eq_ignore_ascii_case::bench_medium_str_eq         77.10ns/iter +/- 19.76
    str::eq_ignore_ascii_case::bench_str_17_bytes_eq        3.49ns/iter  +/- 0.39
    str::eq_ignore_ascii_case::bench_str_31_bytes_eq        3.50ns/iter  +/- 0.27
    str::eq_ignore_ascii_case::bench_str_of_8_bytes_eq      7.27ns/iter  +/- 0.09
    str::eq_ignore_ascii_case::bench_str_under_8_bytes_eq   2.60ns/iter  +/- 0.05
```
2026-01-27 17:36:35 +11:00
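The structure described in #147436 (a length gate, fixed-size chunk processing, and a scalar helper for the tail) might look roughly like the sketch below. The function names are hypothetical, and `chunks_exact` is used in place of `as_chunks` so the example builds on older toolchains; whether the compiler actually vectorizes the inner loop depends on the target and optimization level.

```rust
/// Straightforward byte-by-byte comparison, used for short inputs and tails.
fn eq_ignore_ascii_case_simple(a: &[u8], b: &[u8]) -> bool {
    a.len() == b.len() && a.iter().zip(b).all(|(x, y)| x.eq_ignore_ascii_case(y))
}

/// Chunked comparison: inputs of 16 bytes or more are processed in
/// fixed-size chunks, and the remainder falls back to the simple helper.
fn eq_ignore_ascii_case_chunked(a: &[u8], b: &[u8]) -> bool {
    const N: usize = 16;
    if a.len() != b.len() {
        return false;
    }
    if a.len() < N {
        return eq_ignore_ascii_case_simple(a, b);
    }
    let mut a_chunks = a.chunks_exact(N);
    let mut b_chunks = b.chunks_exact(N);
    for (ca, cb) in (&mut a_chunks).zip(&mut b_chunks) {
        // Accumulate the result for the whole chunk instead of exiting on the
        // first mismatch, which keeps the inner loop free of early returns.
        let mut equal = true;
        for (x, y) in ca.iter().zip(cb) {
            equal &= x.to_ascii_lowercase() == y.to_ascii_lowercase();
        }
        if !equal {
            return false;
        }
    }
    eq_ignore_ascii_case_simple(a_chunks.remainder(), b_chunks.remainder())
}
```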
Scott McMurray 50a9b17d7c Remove duplicated code in slice/index.rs
Looks like `const fn` is far enough along now that we can just not have these two copies any more, and have one call the other.
2026-01-26 22:07:44 -08:00
Stuart Cook 956ebbde20 Rollup merge of #151383 - cyrgani:no-internal-deprecation, r=scottmcm
remove `#[deprecated]` from unstable & internal `SipHasher13` and `24` types

These types are unstable and `doc(hidden)` (under the internal feature `hashmap_internals`). Deprecating them only adds noise (`#[allow(deprecated)]`) to all places where they are used, so this PR removes the deprecation attributes from them.

It also includes a few other small cleanups in separate commits, including one I overlooked in rust-lang/rust#151228.
2026-01-27 12:50:52 +11:00
Andreas Liljeqvist dbc870afec Mark is_ascii_sse2 as #[inline] 2026-01-25 20:05:08 +01:00
Andreas Liljeqvist a72f68e801 Fix is_ascii performance on x86_64 with explicit SSE2 intrinsics
Use explicit SSE2 intrinsics to avoid LLVM's broken AVX-512
auto-vectorization which generates ~31 kshiftrd instructions.

Performance
- AVX-512: 34-48x faster
- SSE2: 1.5-2x faster

Improves on the earlier PR.
2026-01-24 22:03:58 +01:00
Josh Stone 129d552d3f impl TrustedRandomAccess for ArrayWindows 2026-01-24 11:19:34 -08:00
Josh Stone 8f7d556c75 impl TrustedLen for ArrayWindows 2026-01-24 11:13:51 -08:00
Josh Stone 08cd2ac33d impl FusedIterator for ArrayWindows 2026-01-24 11:10:40 -08:00
Josh Stone 90521553e6 Manually impl Clone for ArrayWindows
This implementation doesn't need the derived `T: Clone`.
2026-01-24 11:08:25 -08:00
Josh Stone 2bae85ec52 Remove derive(Copy) on ArrayWindows
The derived `T: Copy` constraint is not appropriate for an iterator by
reference, but we generally do not want `Copy` on iterators anyway.
2026-01-24 11:06:07 -08:00
Pavel Grigorenko 06398554bf Stabilize str_as_str 2026-01-24 21:32:31 +03:00
Jonathan Brouwer 13f0399a57 Rollup merge of #151259 - bonega:fix-is-ascii-avx512, r=folkertdev
Fix is_ascii performance regression on AVX-512 CPUs when compiling with -C target-cpu=native

## Summary

This PR fixes a severe performance regression in `slice::is_ascii` on AVX-512 CPUs when compiling with `-C target-cpu=native`.

On affected systems, the current implementation achieves only ~3 GB/s for large inputs, compared to ~60–70 GB/s previously (≈20–24× regression). This PR restores the original performance characteristics.

This change is intended as a **temporary workaround** for poor codegen in upstream LLVM. Once the underlying LLVM issue is fixed and Rust is able to consume that fix, this workaround should be reverted.

  ## Problem

  When `is_ascii` is compiled with AVX-512 enabled, LLVM's auto-vectorization generates ~31 `kshiftrd` instructions to extract mask bits one-by-one, instead of using the efficient `pmovmskb` instruction. This causes a **~22x performance regression**.

  Because `is_ascii` is marked `#[inline]`, it gets inlined and recompiled with the user's target settings, affecting anyone using `-C target-cpu=native` on AVX-512 CPUs.

## Root cause (upstream)

The underlying issue appears to be an LLVM vectorizer/backend bug affecting certain AVX-512 patterns.

An upstream issue has been filed by @folkertdev  to track the root cause: llvm/llvm-project#176906

Until this is resolved in LLVM and picked up by rustc, this PR avoids triggering the problematic codegen pattern.

  ## Solution

  Replace the counting loop with explicit SSE2 intrinsics (`_mm_movemask_epi8`) that force `pmovmskb` codegen regardless of CPU features.

  ## Godbolt Links (Rust 1.92)

  | Pattern | Target | Link | Result |
  |---------|--------|------|--------|
  | Counting loop (old) | Default SSE2 | https://godbolt.org/z/sE86xz4fY | `pmovmskb` |
  | Counting loop (old) | AVX-512 (znver4) | https://godbolt.org/z/b3jvMhGd3 | 31x `kshiftrd` (broken) |
  | SSE2 intrinsics (fix) | Default SSE2 | https://godbolt.org/z/hMeGfeaPv | `pmovmskb` |
  | SSE2 intrinsics (fix) | AVX-512 (znver4) | https://godbolt.org/z/Tdvdqjohn | `vpmovmskb` (fixed) |

  ## Benchmark Results

  **CPU:** AMD Ryzen 5 7500F (Zen 4 with AVX-512)

  ### Default Target (SSE2) — Mixed

  | Size | Before | After | Change |
  |------|--------|-------|--------|
  | 4 B | 1.8 GB/s | 2.0 GB/s | **+11%** |
  | 8 B | 3.2 GB/s | 5.8 GB/s | **+81%** |
  | 16 B | 5.3 GB/s | 8.5 GB/s | **+60%** |
  | 32 B | 17.7 GB/s | 15.8 GB/s | -11% |
  | 64 B | 28.6 GB/s | 25.1 GB/s | -12% |
  | 256 B | 51.5 GB/s | 48.6 GB/s | ~same |
  | 1 KB | 64.9 GB/s | 60.7 GB/s | ~same |
  | 4 KB+ | ~68-70 GB/s | ~68-72 GB/s | ~same |

  ### Native Target (AVX-512) — Up to 24x Faster

  | Size | Before | After | Speedup |
  |------|--------|-------|---------|
  | 4 B | 1.2 GB/s | 2.0 GB/s | **1.7x** |
  | 8 B | 1.6 GB/s | 5.0 GB/s | **3.3x** |
  | 16 B | ~7 GB/s | ~7 GB/s | ~same |
  | 32 B | 2.9 GB/s | 14.2 GB/s | **4.9x** |
  | 64 B | 2.9 GB/s | 23.2 GB/s | **8x** |
  | 256 B | 2.9 GB/s | 47.2 GB/s | **16x** |
  | 1 KB | 2.8 GB/s | 60.0 GB/s | **21x** |
  | 4 KB+ | 2.9 GB/s | ~68-70 GB/s | **23-24x** |

  ### Summary

  - **SSE2 (default):** Small inputs (4-16 B) 11-81% faster; 32-64 B ~11% slower; large inputs unchanged
  - **AVX-512 (native):** 21-24x faster for inputs ≥1 KB, peak ~70 GB/s (was ~3 GB/s)

  Note: this is the pure ascii path, but the story is similar for the others.
  See linked bench project.

  ## Test Plan

  - [x] Assembly test (`slice-is-ascii-avx512.rs`) verifies no `kshiftrd` with AVX-512
  - [x] Existing codegen test updated to `loongarch64`-only (auto-vectorization still used there)
  - [x] Fuzz testing confirms old/new implementations produce identical results (~53M iterations)
  - [x] Benchmarks confirm performance improvement
  - [x] Tidy checks pass

  ## Reproduction / Test Projects

  Standalone validation tools: https://github.com/bonega/is-ascii-fix-validation

  - `bench/` - Criterion benchmarks for SSE2 vs AVX-512 comparison
  - `fuzz/` - Compares old/new implementations with libfuzzer

  ## Related Issues
  - issue opened by @folkertdev llvm/llvm-project#176906
  - Regression introduced in https://github.com/rust-lang/rust/pull/130733
2026-01-24 08:18:05 +01:00
Andreas Liljeqvist 890c0fd4e8 Make is_ascii_sse2 a safe function
Remove the `#[target_feature(enable = "sse2")]` attribute and make the
function safe to call. The SSE2 requirement is already enforced by the
`#[cfg(target_feature = "sse2")]` predicate.

Individual unsafe blocks are used for intrinsic calls with appropriate
SAFETY comments.

Also adds FIXME reference to llvm#176906 for tracking when this
workaround can be removed.
2026-01-22 22:41:57 +01:00
Mark Rousskov bc611ce5f1 Replace version placeholders with 1.94 2026-01-20 21:17:10 -05:00
cyrgani f548a19d49 remove reason = "newly added" from #[unstable(...)] 2026-01-19 21:22:14 +00:00
Andreas Liljeqvist 08432c8927 Optimize small input path for is_ascii on x86_64
For inputs smaller than 32 bytes, use usize-at-a-time processing
instead of calling the SSE2 function. This avoids function call
overhead from #[target_feature(enable = "sse2")] which prevents
inlining.

Also moves CHUNK_SIZE to module level so it can be shared between
is_ascii and is_ascii_sse2.
2026-01-18 22:49:37 +01:00
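The "usize-at-a-time" processing mentioned in 08432c8927 can be sketched as follows. This is an illustrative version under the assumption that unaligned word reads are done by copying into a buffer; the function name is hypothetical and the real implementation may differ.

```rust
/// Checks ASCII-ness one machine word at a time.
/// A byte is non-ASCII exactly when its high bit (0x80) is set, so a word
/// contains a non-ASCII byte exactly when it overlaps the 0x80...80 mask.
fn is_ascii_wordwise(bytes: &[u8]) -> bool {
    const WORD: usize = std::mem::size_of::<usize>();
    const HIGH_BITS: usize = usize::from_ne_bytes([0x80; WORD]);

    let mut chunks = bytes.chunks_exact(WORD);
    for chunk in &mut chunks {
        // Copy into a fixed-size buffer to avoid any alignment requirement.
        let mut buf = [0u8; WORD];
        buf.copy_from_slice(chunk);
        if usize::from_ne_bytes(buf) & HIGH_BITS != 0 {
            return false;
        }
    }
    // Fewer than WORD bytes remain; check them individually.
    chunks.remainder().iter().all(|b| b.is_ascii())
}
```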
Andreas Liljeqvist a0f9a15b4a Fix is_ascii performance regression on AVX-512 CPUs
When `[u8]::is_ascii()` is compiled with `-C target-cpu=native` on
AVX-512 CPUs, LLVM generates inefficient code. Because `is_ascii` is
marked `#[inline]`, it gets inlined and recompiled with the user's
target settings. The previous implementation used a counting loop that
LLVM auto-vectorizes to `pmovmskb` on SSE2, but with AVX-512 enabled,
LLVM uses k-registers and extracts bits individually with ~31
`kshiftrd` instructions.

This fix replaces the counting loop with explicit SSE2 intrinsics
(`_mm_loadu_si128`, `_mm_or_si128`, `_mm_movemask_epi8`) for x86_64.
`_mm_movemask_epi8` compiles to `pmovmskb`, forcing efficient codegen
regardless of CPU features.

Benchmark results on AMD Ryzen 5 7500F (Zen 4 with AVX-512):
- Default build: ~73 GB/s → ~74 GB/s (no regression)
- With -C target-cpu=native: ~3 GB/s → ~67 GB/s (22x improvement)

The loongarch64 implementation retains the original counting loop
since it doesn't have this issue.

Regression from: https://github.com/rust-lang/rust/pull/130733
2026-01-17 17:38:51 +01:00
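A rough sketch of the approach described in a0f9a15b4a, using the three intrinsics named in the commit message. The function names and the 32-byte chunk size are assumptions made for illustration, and a scalar fallback is included so the example also builds on targets without SSE2; this is not the code that landed.

```rust
/// Scalar check, used for the tail and on targets without SSE2.
fn is_ascii_scalar(bytes: &[u8]) -> bool {
    bytes.iter().all(|&b| b < 0x80)
}

#[cfg(all(target_arch = "x86_64", target_feature = "sse2"))]
fn is_ascii_sketch(bytes: &[u8]) -> bool {
    use std::arch::x86_64::{__m128i, _mm_loadu_si128, _mm_movemask_epi8, _mm_or_si128};

    let mut chunks = bytes.chunks_exact(32);
    for chunk in &mut chunks {
        // SAFETY: `chunk` is exactly 32 bytes long, so both unaligned
        // 16-byte loads stay in bounds, and SSE2 is statically enabled
        // by the `cfg` on this function.
        let mask = unsafe {
            let ptr = chunk.as_ptr();
            let lo = _mm_loadu_si128(ptr as *const __m128i);
            let hi = _mm_loadu_si128(ptr.add(16) as *const __m128i);
            // OR the halves together, then collect the high bit of every byte.
            _mm_movemask_epi8(_mm_or_si128(lo, hi))
        };
        if mask != 0 {
            return false;
        }
    }
    is_ascii_scalar(chunks.remainder())
}

#[cfg(not(all(target_arch = "x86_64", target_feature = "sse2")))]
fn is_ascii_sketch(bytes: &[u8]) -> bool {
    is_ascii_scalar(bytes)
}
```

Because `_mm_movemask_epi8` maps directly to `pmovmskb`, the mask extraction does not depend on how the auto-vectorizer treats the loop.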
Matthias Krüger 807a5cefd2 Rollup merge of #147938 - const-clone-slice, r=tgross35
Add const cloning of slices and tests

As discussed in https://github.com/rust-lang/rust/pull/143628#discussion_r2390170336, splitting off slice cloning as a separate PR.
r? @tgross35
2026-01-12 00:02:52 +01:00
Urgau 65c0847f2d Rollup merge of #149318 - slice_partial_sort_unstable, r=tgross35
Implement partial_sort_unstable for slice

This refers to https://github.com/rust-lang/rust/issues/149046.
2026-01-09 23:28:15 +01:00
tison 45e0fbf7c5 Implement partial_sort_unstable for slice
Signed-off-by: tison <wander4096@gmail.com>
Co-Authored-By: Orson Peters <orsonpeters@gmail.com>
2026-01-09 09:58:08 +08:00
Scott McMurray c48df5dcf1 Move the rustc_no_mir_inline down a level 2026-01-08 17:14:02 -08:00
Scott McMurray 5932078c79 Stop emitting UbChecks on every Vec→Slice
Spotted this in PR148766's test changes.  It doesn't seem like this ubcheck would catch anything useful; let's see if skipping it helps perf.
2026-01-08 17:14:02 -08:00
wr7 5548a84c87 Stabilize slice::element_offset 2026-01-07 10:57:20 -07:00
Marijn Schouten 16b219ab3b cleanup slice iter 2 2025-12-23 18:16:59 +00:00
Boxy Uwu 90a33f69f4 replace version placeholder 2025-12-19 15:04:30 -08:00
Jonathan Brouwer 663d8432f1 Rollup merge of #145933 - GrigorenkoPV:thing_as_thing, r=Amanieu
Expand `str_as_str` to more types

Tracking issue: rust-lang/rust#130366
ACP: https://github.com/rust-lang/libs-team/issues/643

This PR expands the `str_as_str` feature and adds analogous methods to more types. Namely:
- `&CStr`
- `&[T]`, `&mut [T]`
- `&OsStr`
- `&Path`
- `&ByteStr`, `&mut ByteStr` (tracking issue:  rust-lang/rust#134915) (technically was not part of ACP)
2025-12-18 18:37:13 +01:00
Evgenii Zheltonozhskii c1472c573a Add const cloning of slices and tests 2025-12-12 06:31:00 +02:00
Matthias Krüger 8a6f82efac Rollup merge of #148814 - bend-n:stabilize_array_windows, r=scottmcm
stabilize `array_windows`

Tracking issue: rust-lang/rust#75027
Closes: rust-lang/rust#75027
FCP completed: https://github.com/rust-lang/rust/issues/75027#issuecomment-3477510526
2025-12-06 09:57:59 +01:00
Matthias Krüger 38d5d2877e Rollup merge of #146436 - hkBst:slice-iter-1, r=joboet
Slice iter cleanup
2025-12-02 22:02:28 +01:00