rust/tests/assembly-llvm at abcd22d5ed8b8efc6ce6928a852f8b7a2659c553 - rust

mirror of https://github.com/rust-lang/rust.git synced 2026-04-27 18:57:42 +03:00


Stuart Cook a6e8a31b86 Rollup merge of #151611 - bonega:improve-is-slice-is-ascii-performance, r=folkertdev

Improve is_ascii performance on x86_64 with explicit SSE2 intrinsics

# Summary

Improves `slice::is_ascii` performance on SSE2 targets by roughly 1.5-2x on larger inputs.
AVX-512 keeps similar performance characteristics.

This builds on the work already merged in rust-lang/rust#151259.
In particular, this PR improves the default SSE2 performance; I no longer consider this a temporary fix.
Thanks to @folkertdev for pointing me to consider `as_chunk` again.

# The implementation:
- Uses 64-byte chunks with 4x 16-byte SSE2 loads OR'd together
- Extracts the MSB mask with a single `pmovmskb` instruction
- Falls back to usize-at-a-time SWAR for inputs < 64 bytes
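
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code merged into `core`: the function names `is_ascii`, `is_ascii_sse2_sketch`, and `is_ascii_swar` are hypothetical, it uses `chunks_exact` rather than whatever chunking helper the PR uses, and it applies the SWAR path only to the tail left over after the 64-byte chunks (which also covers inputs shorter than 64 bytes).

```rust
#[cfg(target_arch = "x86_64")]
use core::arch::x86_64::{__m128i, _mm_loadu_si128, _mm_movemask_epi8, _mm_or_si128};

/// SWAR fallback (hypothetical helper): test one `usize` worth of bytes at a time.
/// A byte is non-ASCII exactly when its most significant bit is set.
pub fn is_ascii_swar(bytes: &[u8]) -> bool {
    const WORD: usize = core::mem::size_of::<usize>();
    // 0x8080...80: the high bit of every byte lane in a word.
    const HIGH_BITS: usize = usize::MAX / 0xFF * 0x80;

    let mut chunks = bytes.chunks_exact(WORD);
    for chunk in &mut chunks {
        let mut buf = [0u8; WORD];
        buf.copy_from_slice(chunk);
        if usize::from_ne_bytes(buf) & HIGH_BITS != 0 {
            return false;
        }
    }
    // Fewer than WORD bytes remain; check them individually.
    chunks.remainder().iter().all(|b| *b < 0x80)
}

/// SSE2 path (hypothetical name): 64-byte chunks, four 16-byte loads OR'd
/// together, then a single `pmovmskb` (`_mm_movemask_epi8`) to extract the MSBs.
#[cfg(target_arch = "x86_64")]
pub fn is_ascii_sse2_sketch(bytes: &[u8]) -> bool {
    let mut chunks = bytes.chunks_exact(64);
    for chunk in &mut chunks {
        let ptr = chunk.as_ptr() as *const __m128i;
        // SAFETY: SSE2 is part of the x86_64 baseline, and `chunk` is exactly
        // 64 bytes long, so four unaligned 16-byte loads stay in bounds.
        let mask = unsafe {
            let a = _mm_loadu_si128(ptr);
            let b = _mm_loadu_si128(ptr.add(1));
            let c = _mm_loadu_si128(ptr.add(2));
            let d = _mm_loadu_si128(ptr.add(3));
            let combined = _mm_or_si128(_mm_or_si128(a, b), _mm_or_si128(c, d));
            _mm_movemask_epi8(combined)
        };
        // Any set bit means some byte in the chunk had its high bit set.
        if mask != 0 {
            return false;
        }
    }
    is_ascii_swar(chunks.remainder())
}

/// Dispatch: SSE2 on x86_64, plain SWAR everywhere else.
pub fn is_ascii(bytes: &[u8]) -> bool {
    #[cfg(target_arch = "x86_64")]
    return is_ascii_sse2_sketch(bytes);
    #[cfg(not(target_arch = "x86_64"))]
    return is_ascii_swar(bytes);
}
```

OR-ing the four vectors before extracting the mask means each 64-byte chunk costs one `pmovmskb` and one branch instead of four, which is presumably where much of the gain on larger inputs comes from.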

# Performance impact (vs before rust-lang/rust#151259):
- AVX-512: 34-48x faster
- SSE2: 1.5-2x faster

  <details>
  <summary>Benchmark Results (click to expand)</summary>

  Benchmarked on an AMD Ryzen 9 9950X (AVX-512 capable). Values show relative time per benchmark row (1.00 = fastest, higher is slower).
  Throughput tops out at 139 GB/s for large inputs.

  ### early_non_ascii

  | Input Size | new_avx512 | new_sse2 | old_avx512 | old_sse2 |
  |------------|------------|----------|------------|----------|
  | 64 | 1.01 | **1.00** | 13.45 | 1.13 |
  | 1024 | 1.01 | **1.00** | 13.53 | 1.14 |
  | 65536 | 1.01 | **1.00** | 13.99 | 1.12 |
  | 1048576 | 1.02 | **1.00** | 13.29 | 1.12 |

  ### late_non_ascii

  | Input Size | new_avx512 | new_sse2 | old_avx512 | old_sse2 |
  |------------|------------|----------|------------|----------|
  | 64 | **1.00** | 1.01 | 13.37 | 1.13 |
  | 1024 | 1.10 | **1.00** | 42.42 | 1.95 |
  | 65536 | **1.00** | 1.06 | 42.22 | 1.73 |
  | 1048576 | **1.00** | 1.03 | 34.73 | 1.46 |

  ### pure_ascii

  | Input Size | new_avx512 | new_sse2 | old_avx512 | old_sse2 |
  |------------|------------|----------|------------|----------|
  | 4 | 1.03 | **1.00** | 1.75 | 1.32 |
  | 8 | **1.00** | 1.14 | 3.89 | 2.06 |
  | 16 | **1.00** | 1.04 | 1.13 | 1.62 |
  | 32 | 1.07 | 1.19 | 5.11 | **1.00** |
  | 64 | **1.00** | 1.13 | 13.32 | 1.57 |
  | 128 | **1.00** | 1.01 | 19.97 | 1.55 |
  | 256 | **1.00** | 1.02 | 27.77 | 1.61 |
  | 1024 | **1.00** | 1.02 | 41.34 | 1.84 |
  | 4096 | 1.02 | **1.00** | 45.61 | 1.98 |
  | 16384 | 1.01 | **1.00** | 48.67 | 2.04 |
  | 65536 | **1.00** | 1.03 | 43.86 | 1.77 |
  | 262144 | **1.00** | 1.06 | 41.44 | 1.79 |
  | 1048576 | 1.02 | **1.00** | 35.36 | 1.44 |

  </details>

## Reproduction / Test Projects

Standalone validation tools: https://github.com/bonega/is-ascii-fix-validation

- `bench/` - Criterion benchmarks for SSE2 vs AVX-512 comparison
- `fuzz/` - Compares old/new implementations with libfuzzer

Relates to: https://github.com/llvm/llvm-project/issues/176906

2026-01-26 14:36:21 +11:00

asm

s390x: support f16 and f16x8 in inline assembly

2026-01-09 18:42:46 +01:00

auxiliary

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

compiletest-self-test

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

libs

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

naked-functions

naked functions: emit .private_extern on macos

2026-01-06 16:48:04 +01:00

nvptx-kernel-abi

Nvptx: Use llbc as default linker

2025-12-19 21:39:48 +01:00

sanitizer/kcfi

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

simd

Ignore intrinsic calls in cross-crate-inlining cost model

2025-09-05 20:44:49 -04:00

stack-protector

Rollup merge of #148849 - saethlin:windows-stack-protectors, r=wesleywiser

2025-12-18 18:37:14 +01:00

targets

Add ARMv6 bare-metal targets

2026-01-24 17:29:25 +00:00

aarch64-pointer-auth.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

aarch64-xray.rs

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

align_offset.rs

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

breakpoint.rs

Ignore intrinsic calls in cross-crate-inlining cost model

2025-09-05 20:44:49 -04:00

c-variadic-arm.rs

implement va_arg for arm in rustc itself

2025-09-08 13:46:28 +02:00

closure-inherit-target-feature.rs

sess: default to v0 symbol mangling

2025-11-19 11:55:09 +00:00

cmse.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

cstring-merging.rs

Fix cstring-merging test for Hexagon target

2026-01-23 23:45:36 -06:00

dwarf4.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

dwarf5.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

dwarf-mixed-versions-lto.rs

Fix tests/assembly-llvm/dwarf-mixed-versions-lto.rs test failure on riscv64

2025-07-23 11:14:07 +00:00

emit-intel-att-syntax.rs

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

force-target-feature.rs

Add an experimental unsafe(force_target_feature) attribute.

2025-08-22 01:26:26 +02:00

is_aligned.rs

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

issue-83585-small-pod-struct-equality.rs

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

large_data_threshold.rs

Add -Z large-data-threshold

2026-01-07 11:57:48 -08:00

loongarch-float-struct-abi.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

manual-eq-efficient.rs

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

niche-prefer-zero.rs

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

nvptx-arch-default.rs

Nvptx: Use llbc as default linker

2025-12-19 21:39:48 +01:00

nvptx-arch-emit-asm.rs

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

nvptx-arch-link-arg.rs

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

nvptx-arch-target-cpu.rs

Nvptx: Use llbc as default linker

2025-12-19 21:39:48 +01:00

nvptx-atomics.rs

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

nvptx-c-abi-arg-v7.rs

Nvptx: Use llbc as default linker

2025-12-19 21:39:48 +01:00

nvptx-c-abi-ret-v7.rs

Nvptx: Use llbc as default linker

2025-12-19 21:39:48 +01:00

nvptx-internalizing.rs

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

nvptx-linking-binary.rs

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

nvptx-linking-cdylib.rs

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

nvptx-safe-naming.rs

Nvptx: Use llbc as default linker

2025-12-19 21:39:48 +01:00

panic-no-unwind-no-uwtable.rs

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

panic-unwind-no-uwtable.rs

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

pic-relocation-model.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

pie-relocation-model.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

powerpc64-struct-abi.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

reg-struct-return.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

regparm-module-flag.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

riscv-float-struct-abi.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

riscv-soft-abi-with-float-features.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

rust-abi-arg-attr.rs

adding minicore to test file to avoid duplicating lang error

2026-01-09 02:30:33 +00:00

s390x-backchain-toggle.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

s390x-vector-abi.rs

stabilize s390x_target_feature_vector

2025-11-06 12:49:48 +01:00

simd-bitmask.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

simd-intrinsic-gather.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

simd-intrinsic-mask-load.rs

Add alignment parameter to simd_masked_{load,store}

2025-11-04 02:30:59 +05:30

simd-intrinsic-mask-reduce.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

simd-intrinsic-mask-store.rs

Add alignment parameter to simd_masked_{load,store}

2025-11-04 02:30:59 +05:30

simd-intrinsic-scatter.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

simd-intrinsic-select.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

slice-is_ascii.rs

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

slice-is-ascii.rs

Mark is_ascii_sse2 as #[inline]

2026-01-25 20:05:08 +01:00

slp-vectorize-closure.rs

Add regression test for closure loop vectorization

2025-12-09 23:02:14 +09:00

small_data_threshold.rs

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

sparc-struct-abi.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

stack-probes.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

static-relocation-model.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

strict_provenance.rs

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

tail-call-infinite-recursion.rs

add assembly test for infinite recursion with become

2025-11-13 16:57:02 +01:00

target-feature-multiple.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

wasm_exceptions.rs

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

x86_64-array-pair-load-store-merge.rs

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

x86_64-bigint-helpers.rs

x86_64-bigint-helpers test: update test assertion

2025-10-09 12:28:06 +00:00

x86_64-cmp.rs

Merge similar output checks in assembly-llvm/x86_64-cmp

2025-09-16 11:49:21 -07:00

x86_64-floating-point-clamp.rs

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

x86_64-fortanix-unknown-sgx-lvi-generic-load.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

x86_64-fortanix-unknown-sgx-lvi-generic-ret.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

x86_64-fortanix-unknown-sgx-lvi-inline-assembly.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

x86_64-function-return.rs

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

x86_64-indirect-branch-cs-prefix.rs

Add -Zindirect-branch-cs-prefix option

2025-08-17 16:51:42 +02:00

x86_64-mcount.rs

Test instrument-mcount

2025-08-26 13:44:00 +00:00

x86_64-no-jump-tables.rs

Stabilize -Zjump-tables=<bool> into -Cjump-table=<bool>

2025-11-03 08:12:16 -06:00

x86_64-sse_crc.rs

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

x86_64-typed-swap.rs

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

x86_64-windows-float-abi.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

x86_64-windows-i128-abi.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00

x86_64-xray.rs

Rename tests/assembly into tests/assembly-llvm

2025-07-22 14:27:48 +02:00

x86-return-float.rs

compiletest: rename add-core-stubs to add-minicore

2025-11-02 16:20:06 +01:00