Rollup merge of #155856 - imazen:winarm-stable-features, r=Amanieu

std_detect: support detecting more features on aarch64 Windows

Wires up `IsProcessorFeaturePresent` checks for the `PF_ARM_*` constants exposed in Windows SDK 26100 (Windows 11 24H2), plus an architectural derivation for `rdm`. All eight feature names have been stable in `is_aarch64_feature_detected!` on Linux/Darwin/BSD since Rust 1.60; this change lets the Windows backend detect them too.

| Feature | Source on Windows |
|---|---|
| `fp16`  | `PF_ARM_V82_FP16_INSTRUCTIONS_AVAILABLE`  (value 67) |
| `i8mm`  | `PF_ARM_V82_I8MM_INSTRUCTIONS_AVAILABLE`  (value 66) |
| `bf16`  | `PF_ARM_V86_BF16_INSTRUCTIONS_AVAILABLE`  (value 68) |
| `sha3`  | `PF_ARM_SHA3_INSTRUCTIONS_AVAILABLE` (value 64) **AND** `PF_ARM_SHA512_INSTRUCTIONS_AVAILABLE` (value 65) |
| `lse2`  | `PF_ARM_LSE2_AVAILABLE`                   (value 62) |
| `f32mm` | `PF_ARM_SVE_F32MM_INSTRUCTIONS_AVAILABLE` (value 58) |
| `f64mm` | `PF_ARM_SVE_F64MM_INSTRUCTIONS_AVAILABLE` (value 59) |
| `rdm`   | derived from `PF_ARM_V82_DP_INSTRUCTIONS_AVAILABLE` (see below) |

`PF_ARM_SVE_F32MM_INSTRUCTIONS_AVAILABLE` and `PF_ARM_SVE_F64MM_INSTRUCTIONS_AVAILABLE` (values 58 and 59) were already present as commented-out placeholders from rust-lang/stdarch#1749. They map directly to the stable features `f32mm` and `f64mm`, so this PR enables them. Their siblings at values 52, 53, and 57 (`SVE_BF16`, `SVE_EBF16`, `SVE_I8MM`) have no SVE-specific stdarch Feature name and stay commented out for that reason.

`sha3` requires both `PF_ARM_SHA3` (FEAT_SHA3) and `PF_ARM_SHA512` (FEAT_SHA512), matching the existing convention from rust-lang/stdarch#1749 where `sve2-aes` is set only when both `PF_ARM_SVE_AES` and `PF_ARM_SVE_PMULL128` are present.
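The two-flag rule can be illustrated with a minimal sketch. It assumes a hypothetical `resolve` helper that takes the probe as a closure, so the logic can run without calling `IsProcessorFeaturePresent`; only the constant names and values come from this PR.

```rust
// Values as listed in the table above.
const PF_ARM_LSE2_AVAILABLE: u32 = 62;
const PF_ARM_SHA3_INSTRUCTIONS_AVAILABLE: u32 = 64;
const PF_ARM_SHA512_INSTRUCTIONS_AVAILABLE: u32 = 65;

/// Hypothetical helper, not part of std_detect: resolves two of the new
/// features through an injected probe instead of the real Win32 call.
fn resolve(probe: impl Fn(u32) -> bool) -> Vec<&'static str> {
    let mut enabled = Vec::new();
    // Single-flag feature: one probe result maps to one feature name.
    if probe(PF_ARM_LSE2_AVAILABLE) {
        enabled.push("lse2");
    }
    // Two-flag feature: `sha3` is reported only when both flags are set.
    if probe(PF_ARM_SHA3_INSTRUCTIONS_AVAILABLE)
        && probe(PF_ARM_SHA512_INSTRUCTIONS_AVAILABLE)
    {
        enabled.push("sha3");
    }
    enabled
}
```

With a probe that reports only value 64, `resolve` returns an empty list: FEAT_SHA3 without FEAT_SHA512 is not enough to report `sha3`.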

### `rdm` derivation

There is no `PF_ARM_RDM_*` constant; Microsoft has never defined one. We derive it from `PF_ARM_V82_DP_INSTRUCTIONS_AVAILABLE` (FEAT_DotProd) via the following architectural chain:

1. FEAT_DotProd is an optional v8.2-A feature, so its presence implies the core implements at least v8.1-A.
2. Per Arm ARM K.a §D17.2.91: *"In an ARMv8.1 implementation, if FEAT_AdvSIMD is implemented, FEAT_RDM is implemented."*
3. AdvSIMD is implemented on every Windows-on-ARM SKU.
4. Therefore: DotProd ⇒ v8.1-A baseline + AdvSIMD ⇒ FEAT_RDM.

This is the same derivation .NET 10 uses; its source comment is quoted verbatim below ([dotnet/runtime PR 109493](https://github.com/dotnet/runtime/pull/109493), shipped in v10.0.0 at [`src/native/minipal/cpufeatures.c`](https://github.com/dotnet/runtime/blob/v10.0.0/src/native/minipal/cpufeatures.c)):

> *"IsProcessorFeaturePresent does not have a dedicated flag for RDM, so we enable it by implication.
> 1) DP is an optional instruction set for Armv8.2, which may be included only in processors implementing at least Armv8.1.
> 2) Armv8.1 requires RDM when AdvSIMD is implemented, and AdvSIMD is a baseline requirement of .NET.
> Therefore, by documented standard, DP cannot exist here without RDM. In practice, there is only one CPU supported by Windows that includes RDM without DP, so this implication also has little practical chance of a false negative."*

The "one CPU with RDM without DP" trade-off applies equally to us: we accept a possible false negative on that single SKU rather than introducing a more aggressive heuristic.
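The trade-off can be spelled out as a small truth table. This is a sketch only: `derive_rdm`, `classify`, and `Outcome` are hypothetical names for illustration, and the real change simply passes the DotProd probe result to `enable_feature`.

```rust
/// How the derived `rdm` answer compares with what the core implements.
#[derive(Debug, PartialEq)]
enum Outcome {
    Correct,
    FalseNegative,
    FalsePositive,
}

/// The derivation itself: `rdm` is reported exactly when the DotProd flag is set.
fn derive_rdm(dotprod_flag: bool) -> bool {
    dotprod_flag
}

/// Hypothetical classifier comparing the derived answer with ground truth.
fn classify(core_has_rdm: bool, dotprod_flag: bool) -> Outcome {
    match (core_has_rdm, derive_rdm(dotprod_flag)) {
        // RDM present but DotProd absent: the single-SKU case accepted above.
        (true, false) => Outcome::FalseNegative,
        // Excluded by the architectural chain: DotProd implies RDM.
        (false, true) => Outcome::FalsePositive,
        _ => Outcome::Correct,
    }
}
```

Only the `(true, false)` row can occur on conforming hardware, which is why the derivation errs toward under-reporting rather than over-reporting.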

### Tests

Adds `println!` lines to the existing `aarch64_windows()` test in `library/std_detect/tests/cpu-detection.rs` for each newly-detected feature, mirroring the existing single-line pattern. No structural assertions added.
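From the caller's side, the newly wired features are queried the same way as on other platforms. The sketch below assumes a hypothetical `report` function; only the macro and the eight feature names come from this PR, and the fallback keeps the snippet compiling on non-AArch64 hosts.

```rust
/// Hypothetical helper: collects the runtime detection result for each of
/// the eight feature names covered by this change.
#[cfg(target_arch = "aarch64")]
fn report() -> Vec<(&'static str, bool)> {
    use std::arch::is_aarch64_feature_detected;
    // The macro requires a string literal, so each name is listed explicitly.
    vec![
        ("fp16", is_aarch64_feature_detected!("fp16")),
        ("i8mm", is_aarch64_feature_detected!("i8mm")),
        ("bf16", is_aarch64_feature_detected!("bf16")),
        ("sha3", is_aarch64_feature_detected!("sha3")),
        ("lse2", is_aarch64_feature_detected!("lse2")),
        ("f32mm", is_aarch64_feature_detected!("f32mm")),
        ("f64mm", is_aarch64_feature_detected!("f64mm")),
        ("rdm", is_aarch64_feature_detected!("rdm")),
    ]
}

/// Fallback for other architectures: nothing to report.
#[cfg(not(target_arch = "aarch64"))]
fn report() -> Vec<(&'static str, bool)> {
    Vec::new()
}
```

The values depend on the host CPU and OS, so, like the test in this PR, the sketch only reports them and makes no claim about what they should be.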

### Scope

Stable feature names only. The unstable SME family (`sme`, `sme2`, `sme2p1`, `sme_*`, `ssve_fp8*`) and other unstable additions tracked under rust-lang/rust#127764 are intentionally out of scope here to keep this PR minimal — happy to do a follow-up.

### References

- Tracking issue: rust-lang/rust#127764 (`stdarch_aarch64_feature_detection`)
- Precedent: rust-lang/stdarch#1749 (taiki-e, merged 2025-03-24) — added the SVE constants this builds on
- MS docs: [`IsProcessorFeaturePresent`](https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-isprocessorfeaturepresent) — full PF_ARM_* table

r? @Amanieu

cc @taiki-e (author of rust-lang/stdarch#1749, would appreciate your eyes on the `rdm` inference)

@rustbot label +T-libs +O-windows +O-ARM
Jonathan Brouwer
2026-04-29 23:51:34 +02:00
committed by GitHub
2 changed files with 61 additions and 5 deletions
@@ -31,8 +31,14 @@ pub(crate) fn detect_features() -> cache::Initializer {
     const PF_ARM_SVE_SHA3_INSTRUCTIONS_AVAILABLE: u32 = 55;
     const PF_ARM_SVE_SM4_INSTRUCTIONS_AVAILABLE: u32 = 56;
     // const PF_ARM_SVE_I8MM_INSTRUCTIONS_AVAILABLE: u32 = 57;
-    // const PF_ARM_SVE_F32MM_INSTRUCTIONS_AVAILABLE: u32 = 58;
-    // const PF_ARM_SVE_F64MM_INSTRUCTIONS_AVAILABLE: u32 = 59;
+    const PF_ARM_SVE_F32MM_INSTRUCTIONS_AVAILABLE: u32 = 58;
+    const PF_ARM_SVE_F64MM_INSTRUCTIONS_AVAILABLE: u32 = 59;
+    const PF_ARM_LSE2_AVAILABLE: u32 = 62;
+    const PF_ARM_SHA3_INSTRUCTIONS_AVAILABLE: u32 = 64;
+    const PF_ARM_SHA512_INSTRUCTIONS_AVAILABLE: u32 = 65;
+    const PF_ARM_V82_I8MM_INSTRUCTIONS_AVAILABLE: u32 = 66;
+    const PF_ARM_V82_FP16_INSTRUCTIONS_AVAILABLE: u32 = 67;
+    const PF_ARM_V86_BF16_INSTRUCTIONS_AVAILABLE: u32 = 68;
     unsafe extern "system" {
         fn IsProcessorFeaturePresent(ProcessorFeature: DWORD) -> BOOL;
@@ -46,9 +52,11 @@ pub(crate) fn detect_features() -> cache::Initializer {
         }
     };
-    // Some features may be supported on current CPU,
-    // but no way to detect it by OS API.
-    // Also, we require unsafe block for the extern "system" calls.
+    // Some features may be supported on the current CPU but have no
+    // detection path through the Win32 API; those report `false`.
+    // SAFETY: `IsProcessorFeaturePresent` is a Win32 entry point taking a
+    // `DWORD` by value and returning a `BOOL`. No pointer parameters,
+    // no out-parameters, no thread-safety constraints.
     unsafe {
         enable_feature(
             Feature::fp,
@@ -112,6 +120,46 @@ pub(crate) fn detect_features() -> cache::Initializer {
             Feature::sve2_sm4,
             IsProcessorFeaturePresent(PF_ARM_SVE_SM4_INSTRUCTIONS_AVAILABLE) != FALSE,
         );
+        enable_feature(
+            Feature::f32mm,
+            IsProcessorFeaturePresent(PF_ARM_SVE_F32MM_INSTRUCTIONS_AVAILABLE) != FALSE,
+        );
+        enable_feature(
+            Feature::f64mm,
+            IsProcessorFeaturePresent(PF_ARM_SVE_F64MM_INSTRUCTIONS_AVAILABLE) != FALSE,
+        );
+        enable_feature(
+            Feature::lse2,
+            IsProcessorFeaturePresent(PF_ARM_LSE2_AVAILABLE) != FALSE,
+        );
+        enable_feature(
+            Feature::fp16,
+            IsProcessorFeaturePresent(PF_ARM_V82_FP16_INSTRUCTIONS_AVAILABLE) != FALSE,
+        );
+        enable_feature(
+            Feature::i8mm,
+            IsProcessorFeaturePresent(PF_ARM_V82_I8MM_INSTRUCTIONS_AVAILABLE) != FALSE,
+        );
+        enable_feature(
+            Feature::bf16,
+            IsProcessorFeaturePresent(PF_ARM_V86_BF16_INSTRUCTIONS_AVAILABLE) != FALSE,
+        );
+        // stdarch `sha3` is FEAT_SHA3 + FEAT_SHA512 together; Windows
+        // exposes them as two separate flags.
+        enable_feature(
+            Feature::sha3,
+            IsProcessorFeaturePresent(PF_ARM_SHA3_INSTRUCTIONS_AVAILABLE) != FALSE
+                && IsProcessorFeaturePresent(PF_ARM_SHA512_INSTRUCTIONS_AVAILABLE) != FALSE,
+        );
+        // No PF_ARM_RDM_* constant exists. Derive FEAT_RDM from FEAT_DotProd:
+        // DotProd is an optional v8.2-A feature only present on cores that
+        // implement at least v8.1-A; v8.1-A with AdvSIMD mandates FEAT_RDM
+        // (Arm ARM K.a §D17.2.91), and AdvSIMD is universal on Windows-on-ARM.
+        // Same inference shipped in .NET 10 (dotnet/runtime PR 109493).
+        enable_feature(
+            Feature::rdm,
+            IsProcessorFeaturePresent(PF_ARM_V82_DP_INSTRUCTIONS_AVAILABLE) != FALSE,
+        );
         // PF_ARM_V8_CRYPTO_INSTRUCTIONS_AVAILABLE means aes, sha1, sha2 and
         // pmull support
         let crypto =
@@ -150,14 +150,22 @@ fn aarch64_linux() {
 fn aarch64_windows() {
     println!("asimd: {:?}", is_aarch64_feature_detected!("asimd"));
     println!("fp: {:?}", is_aarch64_feature_detected!("fp"));
+    println!("fp16: {:?}", is_aarch64_feature_detected!("fp16"));
     println!("crc: {:?}", is_aarch64_feature_detected!("crc"));
     println!("lse: {:?}", is_aarch64_feature_detected!("lse"));
+    println!("lse2: {:?}", is_aarch64_feature_detected!("lse2"));
+    println!("rdm: {:?}", is_aarch64_feature_detected!("rdm"));
     println!("dotprod: {:?}", is_aarch64_feature_detected!("dotprod"));
+    println!("i8mm: {:?}", is_aarch64_feature_detected!("i8mm"));
+    println!("bf16: {:?}", is_aarch64_feature_detected!("bf16"));
     println!("jsconv: {:?}", is_aarch64_feature_detected!("jsconv"));
     println!("rcpc: {:?}", is_aarch64_feature_detected!("rcpc"));
     println!("aes: {:?}", is_aarch64_feature_detected!("aes"));
     println!("pmull: {:?}", is_aarch64_feature_detected!("pmull"));
     println!("sha2: {:?}", is_aarch64_feature_detected!("sha2"));
+    println!("sha3: {:?}", is_aarch64_feature_detected!("sha3"));
+    println!("f32mm: {:?}", is_aarch64_feature_detected!("f32mm"));
+    println!("f64mm: {:?}", is_aarch64_feature_detected!("f64mm"));
 }
 #[test]