131 Commits

Author SHA1 Message Date
Jonathan Brouwer 3fb712c80a Rollup merge of #154719 - androm3da:hexagon-inline-asm-register-classes, r=JohnTitor
Hexagon inline asm: add reg_pair, vreg, vreg_pair, and qreg register classes

Add three new register classes for the Hexagon inline assembly backend:

* `reg_pair`: GPR double registers (r1:0 through r27:26)
* `vreg`: HVX vector registers (v0-v31)
* `qreg`: HVX predicate registers (q0-q3), clobber-only for now
2026-04-08 23:04:34 +02:00
Jonathan Brouwer b040d5493e Rollup merge of #154598 - folkertdev:windows-naked-link-section, r=mati865
test `#[naked]` with `#[link_section = "..."]` on windows

As a part of https://github.com/rust-lang/rust/pull/147811 I ran into that we actually don't match (current) LLVM output.

r? @mati865
2026-04-08 23:04:33 +02:00
Jonathan Brouwer 3e027ff0f0 Rollup merge of #154184 - folkertdev:stabilize-s390x-vector-registers, r=Amanieu
stabilize s390x vector registers

tracking issue: https://github.com/rust-lang/rust/issues/133416
reference PR: https://github.com/rust-lang/reference/pull/2215

Stabilizes s390x vector registers, e.g.

```rust
unsafe fn vreg_128(x: i128) -> i128 {
    let y;
    asm!("vlr {}, {}", out(vreg) y, in(vreg) x);
    y
}
```

The types that are accepted for vreg registers are

- all float types `f16`,  `f32`, `f64`, `f128`
- integer types `i32`, `i64` and `i128` and their unsigned counterparts
- integer vector types `i8x16`, `i16x8`, `i32x4`, `i64x2` and their unsigned counterparts
- float vector types `f16x8`, `f32x4` and `f64x2`

Support for all of these is tested in https://github.com/rust-lang/rust/blob/main/tests/assembly-llvm/asm/s390x-types.rs, and the types correspond with the LLVM definition in https://github.com/llvm/llvm-project/blob/df9eb79970c012990e829d174d181d575d414efe/llvm/lib/Target/SystemZ/SystemZRegisterInfo.td#L312-L339

The `f16`, `f16x8` and `f128` types are unstable, and so can't be used on stable in practice. They do show up in some error messages though.

`vreg` was previously only accepted as a clobber.

---

Currently the vector types in `core::arch::s390x` are still unstable. Separately stabilizing `vreg` is still useful because scalar types can also be put into `vreg`s.

## Implementation history

- https://github.com/rust-lang/rust/pull/131664
- https://github.com/rust-lang/rust/pull/150826

cc @uweigand @taiki-e
r? @Amanieu
2026-04-08 14:21:58 +02:00
Brian Cain aa9da4b859 Hexagon inline asm: add reg_pair, vreg, vreg_pair, and qreg register classes
Add new Hexagon inline asm register classes:
- reg_pair: GPR double registers (r1:0 through r27:26) for i64/f64 types
- vreg: HVX vector registers (v0-v31) for mode-dependent vector types
- vreg_pair: HVX vector pair registers (v1:0 through v31:30) for vector pairs
- qreg: HVX predicate registers (q0-q3), clobber-only

Key implementation details:
- GPR pairs use LLVM's 'd' register naming (d0-d13) for constraints
- HVX vector pairs use LLVM's 'w' register naming (w0-w15) for constraints
- Register overlap tracking for GPR pair<->single and HVX pair<->single conflicts
- HVX vector types are mode-dependent (64B vs 128B HVX length)

Note: vreg_quad (HVX vector quads) is not supported as LLVM's Hexagon
backend does not support vector quad types in inline asm constraints.
2026-04-07 09:27:32 -07:00
Folkert de Vries 72b6825828 test #[naked] with #[link_section = "..."] on windows 2026-04-06 14:58:52 +02:00
Kcang-gna 519a5559ff add regression test for Redundant memory strores with mut parameters in by-value returns 2026-04-05 19:40:48 +08:00
Eddy (Eduard) Stefes f39fa9e4c0 add rustc option -Zpacked-stack
this enables packed-stack just as -mpacked-stack in clang and gcc.
packed-stack is needed on s390x for kernel development.

Co-authored-by: Ralf Jung <post@ralfj.de>
2026-03-31 09:06:31 +02:00
Jonathan Brouwer 37bc3bcaaa Rollup merge of #152757 - jakos-sec:add-msan-tsan, r=davidtwco
Add x86_64-unknown-linux-gnu{m,t}san target which enables {M,T}San by default

Analogous to the ASan target (https://github.com/rust-lang/rust/pull/149644), this adds targets for MSan and TSan.

As suggested, in order to distribute sanitizer instrumented standard libraries without introducing new rustc flags, this adds a new dedicated target. With the target, we can distribute the instrumented standard libraries through a separate rustup component.

> A tier 2 target must have value to people other than its maintainers. (It may still be a niche target, but it must not be exclusively useful for an inherently closed group.)

The target is useful to anyone who wants to use MSan/TSan with a stable compiler or the ease to not have to recompiled all standard libraries for full coverage.

> A tier 2 target must have a designated team of developers (the “target maintainers”) available to consult on target-specific build-breaking issues, or if necessary to develop target-specific language or library implementation details. This team must have at least 2 developers.
> * The target maintainers should not only fix target-specific issues, but should use any such issue as an opportunity to educate the Rust community about portability to their target, and enhance documentation of the target.

I pledge myself and the folks from the Exploit Mitigations Project Group (rcvalle@ & 1c3t3a@) as target maintainers to fix target-specific issues and educate the Rust community about their use.

> The target must not place undue burden on Rust developers not specifically concerned with that target. Rust developers are expected to not gratuitously break a tier 2 target, but are not expected to become experts in every tier 2 target, and are not expected to provide target-specific implementations for every tier 2 target.

Understood. The target should not have negative impact for anyone not using it.

> The target must provide documentation for the Rust community explaining how to build for the target using cross-compilation, and explaining how to run tests for the target. If at all possible, this documentation should show how to run Rust programs and tests for the target using emulation, to allow anyone to do so. If the target cannot be feasibly emulated, the documentation should explain how to obtain and work with physical hardware, cloud systems, or equivalent.

`src/doc/rustc/src/platform-support/x86_64-unknown-linux-gnu{m,t}san.md` should provide the necessary documentation on how to build the target or compile programs with it. In the way the target can be emulated it should not differ from the tier 1 target `x86_64-unknown-linux-gnu`.

> The target must document its baseline expectations for the features or versions of CPUs, operating systems, libraries, runtime environments, and similar.

The baseline expectation mirror `x86_64-unknown-linux-gnu`.

> If introducing a new tier 2 or higher target that is identical to an existing Rust target except for the baseline expectations for the features or versions of CPUs, operating systems, libraries, runtime environments, and similar, then the proposed target must document to the satisfaction of the approving teams why the specific difference in baseline expectations provides sufficient value to justify a separate target.
> * Note that in some cases, based on the usage of existing targets within the Rust community, Rust developers or a target’s maintainers may wish to modify the baseline expectations of a target, or split an existing target into multiple targets with different baseline expectations. A proposal to do so will be treated similarly to the analogous promotion, demotion, or removal of a target, according to this policy, with the same team approvals required.
>     * For instance, if an OS version has become obsolete and unsupported, a target for that OS may raise its baseline expectations for OS version (treated as though removing a target corresponding to the older versions), or a target for that OS may split out support for older OS versions into a lower-tier target (treated as though demoting a target corresponding to the older versions, and requiring justification for a new target at a lower tier for the older OS versions).

This has been outlined sufficiently. We should not enable MSan/TSan in the default target and are therefore creating a new tier 2 target to bridge the gap until `build-std` stabilized.

> Tier 2 targets must not leave any significant portions of core or the standard library unimplemented or stubbed out, unless they cannot possibly be supported on the target.
> * The right approach to handling a missing feature from a target may depend on whether the target seems likely to develop the feature in the future. In some cases, a target may be co-developed along with Rust support, and Rust may gain new features on the target as that target gains the capabilities to support those features.
> * As an exception, a target identical to an existing tier 1 target except for lower baseline expectations for the OS, CPU, or similar, may propose to qualify as tier 2 (but not higher) without support for std if the target will primarily be used in no_std applications, to reduce the support burden for the standard library. In this case, evaluation of the proposed target’s value will take this limitation into account.

All of std that is supported by `x86_64-unknown-linux-gnu` is also supported.

> The code generation backend for the target should not have deficiencies that invalidate Rust safety properties, as evaluated by the Rust compiler team. (This requirement does not apply to arbitrary security enhancements or mitigations provided by code generation backends, only to those properties needed to ensure safe Rust code cannot cause undefined behavior or other unsoundness.) If this requirement does not hold, the target must clearly and prominently document any such limitations as part of the target’s entry in the target tier list, and ideally also via a failing test in the testsuite. The Rust compiler team must be satisfied with the balance between these limitations and the difficulty of implementing the necessary features.
> * For example, if Rust relies on a specific code generation feature to ensure that safe code cannot overflow the stack, the code generation for the target should support that feature.
> * If the Rust compiler introduces new safety properties (such as via new capabilities of a compiler backend), the Rust compiler team will determine if they consider those new safety properties a best-effort improvement for specific targets, or a required property for all Rust targets. In the latter case, the compiler team may require the maintainers of existing targets to either implement and confirm support for the property or update the target tier list with documentation of the missing property.

The entire point is to have more security instead of less ;) The safety properties provided are already present in the compiler, just not enabled by default.

> If the target supports C code, and the target has an interoperable calling convention for C code, the Rust target must support that C calling convention for the platform via extern "C". The C calling convention does not need to be the default Rust calling convention for the target, however.

Understood.

> The target must build reliably in CI, for all components that Rust’s CI considers mandatory.

Understood and the reason for introducing the tier 2 target.

> The approving teams may additionally require that a subset of tests pass in CI, such as enough to build a functional “hello world” program, ./x.py test --no-run, or equivalent “smoke tests”. In particular, this requirement may apply if the target builds host tools, or if the tests in question provide substantial value via early detection of critical problems.

Understood.

> Building the target in CI must not take substantially longer than the current slowest target in CI, and should not substantially raise the maintenance burden of the CI infrastructure. This requirement is subjective, to be evaluated by the infrastructure team, and will take the community importance of the target into account.

Understood.

> Tier 2 targets should, if at all possible, support cross-compiling. Tier 2 targets should not require using the target as the host for builds, even if the target supports host tools.

Understood. No need to use this target as the host (no benefit of having MSan/TSan enabled for compiling).

> In addition to the legal requirements for all targets (specified in the tier 3 requirements), because a tier 2 target typically involves the Rust project building and supplying various compiled binaries, incorporating the target and redistributing any resulting compiled binaries (e.g. built libraries, host tools if any) must not impose any onerous license requirements on any members of the Rust project, including infrastructure team members and those operating CI systems. This is a subjective requirement, to be evaluated by the approving teams.
> * As an exception to this, if the target’s primary purpose is to build components for a Free and Open Source Software (FOSS) project licensed under “copyleft” terms (terms which require licensing other code under compatible FOSS terms), such as kernel modules or plugins, then the standard libraries for the target may potentially be subject to copyleft terms, as long as such terms are satisfied by Rust’s existing practices of providing full corresponding source code. Note that anything added to the Rust repository itself must still use Rust’s standard license terms.

Understood, no legal differences between this target and `x86_64-unknown-linux-gnu`.

> Tier 2 targets must not impose burden on the authors of pull requests, or other developers in the community, to ensure that tests pass for the target. In particular, do not post comments (automated or manual) on a PR that derail or suggest a block on the PR based on tests failing for the target. Do not send automated messages or notifications (via any medium, including via @) to a PR author or others involved with a PR regarding the PR breaking tests on a tier 2 target, unless they have opted into such messages.
> * Backlinks such as those generated by the issue/PR tracker when linking to an issue or PR are not considered a violation of this policy, within reason. However, such messages (even on a separate repository) must not generate notifications to anyone involved with a PR who has not requested such notifications.

Understood.

> The target maintainers should regularly run the testsuite for the target, and should fix any test failures in a reasonably timely fashion.

Understood.

> All requirements for tier 3 apply.

Requirements for tier 3 are listed below.

> A tier 3 target must have a designated developer or developers (the "target maintainers") on record to be CCed when issues arise regarding the target. (The mechanism to track and CC such developers may evolve over time.)

I pledge to do my best maintaining it and we can also include the folks from the Exploit Mitigations Project Group (rcvalle@ & 1c3t3a@).

> Targets must use naming consistent with any existing targets; for instance, a target for the same CPU or OS as an existing Rust target should use the same name for that CPU or OS. Targets should normally use the same names and naming conventions as used elsewhere in the broader ecosystem beyond Rust (such as in other toolchains), unless they have a very good reason to diverge. Changing the name of a target can be highly disruptive, especially once the target reaches a higher tier, so getting the name right is important even for a tier 3 target.

We've chosen `x86_64-unknown-linux-gnu{m,t}san` as the name which was suggested on [#t-compiler/major changes > Create new Tier 2 targets with sanitizers… compiler-team#951 @ 💬](https://rust-lang.zulipchat.com/#narrow/channel/233931-t-compiler.2Fmajor-changes/topic/Create.20new.20Tier.202.20targets.20with.20sanitizers.E2.80.A6.20compiler-team.23951/near/564482315). We've merged `x86_64-unknown-linux-gnuasan` and are now following up with the MSan and TSan targets

> Target names should not introduce undue confusion or ambiguity unless absolutely necessary to maintain ecosystem compatibility. For example, if the name of the target makes people extremely likely to form incorrect beliefs about what it targets, the name should be changed or augmented to disambiguate it.

There should be no confusion, it's clear that it's the original target with MSan/TSan enabled.

> If possible, use only letters, numbers, dashes and underscores for the name. Periods (.) are known to cause issues in Cargo.

Only letters, numbers and dashes used.

> Tier 3 targets may have unusual requirements to build or use, but must not create legal issues or impose onerous legal terms for the Rust project or for Rust developers or users.

There are no unusual requirements to build or use it. It's the original `x86_64-unknown-linux-gnu` target with MSan/TSan enabled as a default sanitizer.

> The target must not introduce license incompatibilities.

There are no license implications.

> Anything added to the Rust repository must be under the standard Rust license (MIT OR Apache-2.0).

Given, by reusing the existing MSan/TSan code.

> The target must not cause the Rust tools or libraries built for any other host (even when supporting cross-compilation to the target) to depend on any new dependency less permissive than the Rust licensing policy. This applies whether the dependency is a Rust crate that would require adding new license exceptions (as specified by the tidy tool in the rust-lang/rust repository), or whether the dependency is a native library or binary. In other words, the introduction of the target must not cause a user installing or running a version of Rust or the Rust tools to be subject to any new license requirements.

There are no new dependencies/features required.

> Compiling, linking, and emitting functional binaries, libraries, or other code for the target (whether hosted on the target itself or cross-compiling from another target) must not depend on proprietary (non-FOSS) libraries. Host tools built for the target itself may depend on the ordinary runtime libraries supplied by the platform and commonly used by other applications built for the target, but those libraries must not be required for code generation for the target; cross-compilation to the target must not require such libraries at all. For instance, rustc built for the target may depend on a common proprietary C runtime library or console output library, but must not depend on a proprietary code generation library or code optimization library. Rust's license permits such combinations, but the Rust project has no interest in maintaining such combinations within the scope of Rust itself, even at tier 3.

It's using open source tools only.

> "onerous" here is an intentionally subjective term. At a minimum, "onerous" legal/licensing terms include but are not limited to: non-disclosure requirements, non-compete requirements, contributor license agreements (CLAs) or equivalent, "non-commercial"/"research-only"/etc terms, requirements conditional on the employer or employment of any particular Rust developers, revocable terms, any requirements that create liability for the Rust project or its developers or users, or any requirements that adversely affect the livelihood or prospects of the Rust project or its developers or users.

There are no such terms present.

> Neither this policy nor any decisions made regarding targets shall create any binding agreement or estoppel by any party. If any member of an approving Rust team serves as one of the maintainers of a target, or has any legal or employment requirement (explicit or implicit) that might affect their decisions regarding a target, they must recuse themselves from any approval decisions regarding the target's tier status, though they may otherwise participate in discussions.

Understood.

> This requirement does not prevent part or all of this policy from being cited in an explicit contract or work agreement (e.g. to implement or maintain support for a target). This requirement exists to ensure that a developer or team responsible for reviewing and approving a target does not face any legal threats or obligations that would prevent them from freely exercising their judgment in such approval, even if such judgment involves subjective matters or goes beyond the letter of these requirements.

Understood.

> Tier 3 targets should attempt to implement as much of the standard libraries as possible and appropriate (core for most targets, alloc for targets that can support dynamic memory allocation, std for targets with an operating system or equivalent layer of system-provided functionality), but may leave some code unimplemented (either unavailable or stubbed out as appropriate), whether because the target makes it impossible to implement or challenging to implement. The authors of pull requests are not obligated to avoid calling any portions of the standard library on the basis of a tier 3 target not implementing those portions.

The goal is to have MSan/TSan instrumented standard library variants of the existing `x86_64-unknown-linux-gnu` target, so all should be present.

> The target must provide documentation for the Rust community explaining how to build for the target, using cross-compilation if possible. If the target supports running binaries, or running tests (even if they do not pass), the documentation must explain how to run such binaries or tests for the target, using emulation if possible or dedicated hardware if necessary.

I think the explanation in platform support doc is enough to make this aspect clear.

> Tier 3 targets must not impose burden on the authors of pull requests, or other developers in the community, to maintain the target. In particular, do not post comments (automated or manual) on a PR that derail or suggest a block on the PR based on a tier 3 target. Do not send automated messages or notifications (via any medium, including via @) to a PR author or others involved with a PR regarding a tier 3 target, unless they have opted into such messages.
Backlinks such as those generated by the issue/PR tracker when linking to an issue or PR are not considered a violation of this policy, within reason. However, such messages (even on a separate repository) must not generate notifications to anyone involved with a PR who has not requested such notifications.

Understood.

> Patches adding or updating tier 3 targets must not break any existing tier 2 or tier 1 target, and must not knowingly break another tier 3 target without approval of either the compiler team or the maintainers of the other tier 3 target.

Understood.

> In particular, this may come up when working on closely related targets, such as variations of the same architecture with different features. Avoid introducing unconditional uses of features that another variation of the target may not have; use conditional compilation or runtime detection, as appropriate, to let each target run code supported by that target.

I don't believe this PR is affected by this.

> Tier 3 targets must be able to produce assembly using at least one of rustc's supported backends from any host target. (Having support in a fork of the backend is not sufficient, it must be upstream.)

The target should work on all rustc versions that correctly compile for `x86_64-unknown-linux-gnu`.
2026-03-26 15:20:09 +01:00
Jonathan Brouwer 0cd8de3843 Rollup merge of #153049 - Darksonn:kasan-sw-tags, r=fmease
Add `-Zsanitize=kernel-hwaddress`

The Linux kernel has a config option called `CONFIG_KASAN_SW_TAGS`  that enables `-fsanitize=kernel-hwaddress`. This is not supported by Rust.

One slightly awkward detail is that `#[sanitize(address = "off")]` applies to both `-Zsanitize=address` and `-Zsanitize=kernel-address`. Probably it was done this way because both are the same LLVM pass. I replicated this logic here for hwaddress, but it might be undesirable.

Note that `#[sanitize(kernel_hwaddress = "off")]` could be supported as an annotation on statics, but since it's also missing for `#[sanitize(hwaddress = "off")]`, I did not add it.

MCP: https://github.com/rust-lang/compiler-team/issues/975
Tracking issue: https://github.com/rust-lang/rust/issues/154171

cc @rcvalle @maurer @ojeda
2026-03-25 19:52:49 +01:00
Jonathan Brouwer 898b62f590 Rollup merge of #154094 - folkertdev:aarch64-arm-load-store, r=sayantn
add neon load/store assembly test

I'm adding this test because it was requested for the beta backport of https://github.com/rust-lang/rust/issues/153336. We'd like to test this with Miri, but currently there is no load/store pair that roundtrips because one or the other still uses the platform-specific intrinsics.

r? sayantn

I believe test-various runs some arm and android tests?

@bors try job=test-various
2026-03-24 16:22:49 +01:00
Jonathan Brouwer 8b69918e72 Rollup merge of #153069 - blueshift-gg:BPF_unaligned, r=chenyukang
[BPF] add target feature allows-misaligned-mem-access

This PR adds the allows-misaligned-mem-access target feature to the BPF target. The feature can enable misaligned memory access support in the LLVM backend, aligning Rust’s BPF target behavior with the corresponding LLVM update introduced in [llvm/llvm-project#167013](https://github.com/llvm/llvm-project/pull/167013) (included in LLVM 22).
2026-03-23 12:00:58 +01:00
Folkert de Vries 35f9cb6eee add neon load/store assembly test 2026-03-22 18:39:15 +01:00
Folkert de Vries e182fef7ad stabilize s390x vector registers 2026-03-21 20:58:15 +01:00
Stuart Cook 845fd53500 Rollup merge of #152909 - davidtwco:branch-protection-target-modifier, r=jackh726
sess: `-Zbranch-protection` is a target modifier

`-Zbranch-protection` only makes sense if the entire crate graph has the option set, otherwise the security properties that branch protection provides won't be effective - hence a target modifier. This flag is unstable so I don't think this warrants an MCP.
2026-03-20 15:33:04 +11:00
Alice Ryhl c679e3daf2 Simplify tests and fix test tidy issue 2026-03-17 20:24:05 +00:00
Alice Ryhl a197752e88 Add kernel-hwaddress sanitizer
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
2026-03-17 20:23:59 +00:00
Folkert de Vries 19e9ec7560 test classify-runtime-const for f16 2026-03-17 00:52:02 +01:00
Josh Stone 52dfa94cdc Update the minimum external LLVM to 21 2026-03-12 16:45:42 -07:00
Jakob Koschel de351814b3 Create x86_64-unknown-linux-gnu{m,t}san target which enables {M,T}SAN by default
Similar like we've done it for `x86_64-unknown-linux-gnuasan`, in order
to distribute sanitizer instrumented standard libraries without
introducing new rustc flags, this adds a new dedicated target. With the
target, we can distribute the instrumented standard libraries through
a separate rustup component.
2026-03-10 13:57:40 +00:00
David Wood cca28656e5 sess: -Zbranch-protection is a target modifier
`-Zbranch-protection` only makes sense if the entire crate graph has
the option set, otherwise the security properties that branch protection
provides won't be effective. This flag is unstable so I don't think this
warrants an MCP.
2026-03-10 12:37:16 +00:00
Folkert de Vries e6cf5a22e7 test u128 passing on linux and windows 2026-02-27 10:51:55 +01:00
Folkert de Vries 31ae3d2be8 guaranteed tail calls: support indirect arguments 2026-02-27 10:24:39 +01:00
Claire Fan f62b2f3b4d [BPF] add target feature allows-misaligned-mem-access 2026-02-25 21:31:51 +08:00
Jonathan Brouwer e6ca590153 Rollup merge of #152404 - durin42:llvm-23-instcombine-shrink-constant, r=Mark-Simulacrum
tests: adapt align-offset.rs for InstCombine improvements in LLVM 23

Upstream [has improved InstCombine](https://github.com/llvm/llvm-project/commit/8d2078332c23b10dcf3571adc1a186e5c65f82df) so that it can shrink added constants using known zeroes, which caused a little bit of change in this test. As far as I can tell either output is fine, so we just accept both.

@rustbot label: +llvm-main
2026-02-14 22:11:54 +01:00
Jacob Pratt b1b6533077 Rollup merge of #142680 - beetrees:sparc64-float-struct-abi, r=tgross35
Fix passing/returning structs with the 64-bit SPARC ABI

Fixes the 64-bit SPARC part of rust-lang/rust#115609 by replacing the current implementation with a new implementation modelled on the RISC-V calling convention code ([SPARC ABI reference](https://sparc.org/wp-content/uploads/2014/01/SCD.2.4.1.pdf.gz)).

Pinging `sparcv9-sun-solaris` target maintainers: @psumbera @kulikjak
Fixes rust-lang/rust#115336
Fixes rust-lang/rust#115399
Fixes rust-lang/rust#122620
Fixes https://github.com/rust-lang/rust/issues/147883
r? @workingjubilee
2026-02-12 00:41:05 -05:00
Folkert de Vries c9b5c934ca Fix passing/returning structs with the 64-bit SPARC ABI
Co-authored-by: beetrees <b@beetr.ee>
2026-02-10 12:39:45 +01:00
Augie Fackler aefb9a9ae2 tests: adapt align-offset.rs for InstCombine improvements in LLVM 23
Upstream has improved InstCombine so that it can shrink added constants
using known zeroes, which caused a little bit of change in this test. As
far as I can tell either output is fine, so we just accept both.
2026-02-09 15:53:38 -05:00
Eddy (Eduard) Stefes 51affa0394 add tests for s390x-unknown-none-softfloat
tests will check:
- correct emit of assembly for softfloat target
- incompatible set features will emit warnings/errors
- incompatible target tripples in crates will not link
2026-02-09 09:29:16 +01:00
Eddy (Eduard) Stefes 2b1dc3144b add a new s390x-unknown-none-softfloat target
This target is intended to be used for kernel development. Becasue on s390x
float and vector registers overlap we have to disable the vector extension.

The default s390x-unknown-gnu-linux target will not allow use of
softfloat.

Co-authored-by: Jubilee <workingjubilee@gmail.com>
2026-02-09 09:28:54 +01:00
ltdk 28feae0c87 Move bigint helper tracking issues 2026-02-02 18:45:26 -05:00
Nikita Popov e015fc820d Adjust loongarch assembly test
This generates different code on loongarch32r now.
2026-01-27 12:09:39 +01:00
Jonathan Pallant 6ecb3f33f0 Adds two new Tier 3 targets - aarch64v8r-unknown-none and aarch64v8r-unknown-none-softfloat.
The existing `aarch64-unknown-none` target assumes Armv8.0-A as a baseline. However, Arm recently released the Arm Cortex-R82 processor which is the first to implement the Armv8-R AArch64 mode architecture. This architecture is similar to Armv8-A AArch64, however it has a different set of mandatory features, and is based off of Armv8.4. It is largely unrelated to the existing Armv8-R architecture target (`armv8r-none-eabihf`), which only operates in AArch32 mode.

The second `aarch64v8r-unknown-none-softfloat` target allows for possible Armv8-R AArch64 CPUs with no FPU, or for use-cases where FPU register stacking is not desired. As with the existing `aarch64-unknown-none` target we have coupled FPU support and Neon support together - there is no 'has FPU but does not have NEON' target proposed even though the architecture technically allows for it.

This PR was developed by Ferrous Systems on behalf of Arm. Arm is the owner of these changes.
2026-01-26 12:43:52 +00:00
Stuart Cook a6e8a31b86 Rollup merge of #151611 - bonega:improve-is-slice-is-ascii-performance, r=folkertdev
Improve is_ascii performance on x86_64 with explicit SSE2 intrinsics

# Summary

Improves `slice::is_ascii` performance for SSE2 target roughly 1.5-2x on larger inputs.
AVX-512 keeps similiar performance characteristics.

This is building on the work already merged in rust-lang/rust#151259.
In particular this PR improves the default SSE2 performance, I don't consider this a temporary fix anymore.
Thanks to @folkertdev for pointing me to consider `as_chunk` again.

# The implementation:
- Uses 64-byte chunks with 4x 16-byte SSE2 loads OR'd together
- Extracts the MSB mask with a single `pmovmskb` instruction
- Falls back to usize-at-a-time SWAR for inputs < 64 bytes

# Performance impact (vs before rust-lang/rust#151259):
- AVX-512: 34-48x faster
- SSE2: 1.5-2x faster

  <details>
  <summary>Benchmark Results (click to expand)</summary>

  Benchmarked on AMD Ryzen 9 9950X (AVX-512 capable). Values show relative performance (1.00 = fastest).
  Tops out at 139GB/s for large inputs.

  ### early_non_ascii

  | Input Size | new_avx512 | new_sse2 | old_avx512 | old_sse2 |
  |------------|------------|----------|------------|----------|
  | 64 | 1.01 | **1.00** | 13.45 | 1.13 |
  | 1024 | 1.01 | **1.00** | 13.53 | 1.14 |
  | 65536 | 1.01 | **1.00** | 13.99 | 1.12 |
  | 1048576 | 1.02 | **1.00** | 13.29 | 1.12 |

  ### late_non_ascii

  | Input Size | new_avx512 | new_sse2 | old_avx512 | old_sse2 |
  |------------|------------|----------|------------|----------|
  | 64 | **1.00** | 1.01 | 13.37 | 1.13 |
  | 1024 | 1.10 | **1.00** | 42.42 | 1.95 |
  | 65536 | **1.00** | 1.06 | 42.22 | 1.73 |
  | 1048576 | **1.00** | 1.03 | 34.73 | 1.46 |

  ### pure_ascii

  | Input Size | new_avx512 | new_sse2 | old_avx512 | old_sse2 |
  |------------|------------|----------|------------|----------|
  | 4 | 1.03 | **1.00** | 1.75 | 1.32 |
  | 8 | **1.00** | 1.14 | 3.89 | 2.06 |
  | 16 | **1.00** | 1.04 | 1.13 | 1.62 |
  | 32 | 1.07 | 1.19 | 5.11 | **1.00** |
  | 64 | **1.00** | 1.13 | 13.32 | 1.57 |
  | 128 | **1.00** | 1.01 | 19.97 | 1.55 |
  | 256 | **1.00** | 1.02 | 27.77 | 1.61 |
  | 1024 | **1.00** | 1.02 | 41.34 | 1.84 |
  | 4096 | 1.02 | **1.00** | 45.61 | 1.98 |
  | 16384 | 1.01 | **1.00** | 48.67 | 2.04 |
  | 65536 | **1.00** | 1.03 | 43.86 | 1.77 |
  | 262144 | **1.00** | 1.06 | 41.44 | 1.79 |
  | 1048576 | 1.02 | **1.00** | 35.36 | 1.44 |

  </details>

## Reproduction / Test Projects

Standalone validation tools: https://github.com/bonega/is-ascii-fix-validation

- `bench/` - Criterion benchmarks for SSE2 vs AVX-512 comparison
- `fuzz/` - Compares old/new implementations with libfuzzer

Relates to: https://github.com/llvm/llvm-project/issues/176906
2026-01-26 14:36:21 +11:00
Andreas Liljeqvist dbc870afec Mark is_ascii_sse2 as #[inline] 2026-01-25 20:05:08 +01:00
Andreas Liljeqvist cbcd8694c6 Remove x86_64 assembly test for is_ascii
The SSE2 helper function is not inlined across crate boundaries,
so we cannot verify the codegen in an assembly test. The fix is
still verified by the absence of performance regression.
2026-01-25 09:44:04 +01:00
Andreas Liljeqvist a72f68e801 Fix is_ascii performance on x86_64 with explicit SSE2 intrinsics
Use explicit SSE2 intrinsics to avoid LLVM's broken AVX-512
auto-vectorization which generates ~31 kshiftrd instructions.

Performance
- AVX-512: 34-48x faster
- SSE2: 1.5-2x faster

Improves on earlier pr
2026-01-24 22:03:58 +01:00
Matthias Krüger c11be675f4 Rollup merge of #151571 - androm3da:bcain/cstr_merge, r=tgross35
Fix cstring-merging test for Hexagon target

Hexagon assembler uses `.string` directive instead of `.asciz` for null-terminated strings. Both are equivalent but the test was only checking for `.asciz`.

Update the CHECK patterns to accept both directives using `.{{asciz|string}}` regex pattern.
2026-01-24 21:04:17 +01:00
Jonathan 'theJPster' Pallant 96897f016e Add ARMv6 bare-metal targets
Three targets, covering A32 and T32 instructions, and soft-float and
hard-float ABIs. Hard-float not available in Thumb mode. Atomics
in Thumb mode require __sync* functions from compiler-builtins.
2026-01-24 17:29:25 +00:00
Jonathan Brouwer 13f0399a57 Rollup merge of #151259 - bonega:fix-is-ascii-avx512, r=folkertdev
Fix is_ascii performance regression on AVX-512 CPUs when compiling with -C target-cpu=native

## Summary

This PR fixes a severe performance regression in `slice::is_ascii` on AVX-512 CPUs when compiling with `-C target-cpu=native`.

On affected systems, the current implementation achieves only ~3 GB/s for large inputs, compared to ~60–70 GB/s previously (≈20–24× regression). This PR restores the original performance characteristics.

This change is intended as a **temporary workaround** for upstream LLVM poor codegen. Once the underlying LLVM issue is fixed and Rust is able to consume that fix, this workaround should be reverted.

  ## Problem

  When `is_ascii` is compiled with AVX-512 enabled, LLVM's auto-vectorization generates ~31 `kshiftrd` instructions to extract mask bits one-by-one, instead of using the efficient `pmovmskb`
  instruction. This causes a **~22x performance regression**.

  Because `is_ascii` is marked `#[inline]`, it gets inlined and recompiled with the user's target settings, affecting anyone using `-C target-cpu=native` on AVX-512 CPUs.

## Root cause (upstream)

The underlying issue appears to be an LLVM vectorizer/backend bug affecting certain AVX-512 patterns.

An upstream issue has been filed by @folkertdev  to track the root cause: llvm/llvm-project#176906

Until this is resolved in LLVM and picked up by rustc, this PR avoids triggering the problematic codegen pattern.

  ## Solution

  Replace the counting loop with explicit SSE2 intrinsics (`_mm_movemask_epi8`) that force `pmovmskb` codegen regardless of CPU features.

  ## Godbolt Links (Rust 1.92)

  | Pattern | Target | Link | Result |
  |---------|--------|------|--------|
  | Counting loop (old) | Default SSE2 | https://godbolt.org/z/sE86xz4fY | `pmovmskb` |
  | Counting loop (old) | AVX-512 (znver4) | https://godbolt.org/z/b3jvMhGd3 | 31x `kshiftrd` (broken) |
  | SSE2 intrinsics (fix) | Default SSE2 | https://godbolt.org/z/hMeGfeaPv | `pmovmskb` |
  | SSE2 intrinsics (fix) | AVX-512 (znver4) | https://godbolt.org/z/Tdvdqjohn | `vpmovmskb` (fixed) |

  ## Benchmark Results

  **CPU:** AMD Ryzen 5 7500F (Zen 4 with AVX-512)

  ### Default Target (SSE2) — Mixed

  | Size | Before | After | Change |
  |------|--------|-------|--------|
  | 4 B | 1.8 GB/s | 2.0 GB/s | **+11%** |
  | 8 B | 3.2 GB/s | 5.8 GB/s | **+81%** |
  | 16 B | 5.3 GB/s | 8.5 GB/s | **+60%** |
  | 32 B | 17.7 GB/s | 15.8 GB/s | -11% |
  | 64 B | 28.6 GB/s | 25.1 GB/s | -12% |
  | 256 B | 51.5 GB/s | 48.6 GB/s | ~same |
  | 1 KB | 64.9 GB/s | 60.7 GB/s | ~same |
  | 4 KB+ | ~68-70 GB/s | ~68-72 GB/s | ~same |

  ### Native Target (AVX-512) — Up to 24x Faster

  | Size | Before | After | Speedup |
  |------|--------|-------|---------|
  | 4 B | 1.2 GB/s | 2.0 GB/s | **1.7x** |
  | 8 B | 1.6 GB/s | 5.0 GB/s | **3.3x** |
  | 16 B | ~7 GB/s | ~7 GB/s | ~same |
  | 32 B | 2.9 GB/s | 14.2 GB/s | **4.9x** |
  | 64 B | 2.9 GB/s | 23.2 GB/s | **8x** |
  | 256 B | 2.9 GB/s | 47.2 GB/s | **16x** |
  | 1 KB | 2.8 GB/s | 60.0 GB/s | **21x** |
  | 4 KB+ | 2.9 GB/s | ~68-70 GB/s | **23-24x** |

  ### Summary

  - **SSE2 (default):** Small inputs (4-16 B) 11-81% faster; 32-64 B ~11% slower; large inputs unchanged
  - **AVX-512 (native):** 21-24x faster for inputs ≥1 KB, peak ~70 GB/s (was ~3 GB/s)

  Note: this is the pure ascii path, but the story is similar for the others.
  See linked bench project.

  ## Test Plan

  - [x] Assembly test (`slice-is-ascii-avx512.rs`) verifies no `kshiftrd` with AVX-512
  - [x] Existing codegen test updated to `loongarch64`-only (auto-vectorization still used there)
  - [x] Fuzz testing confirms old/new implementations produce identical results (~53M iterations)
  - [x] Benchmarks confirm performance improvement
  - [x] Tidy checks pass

  ## Reproduction / Test Projects

  Standalone validation tools: https://github.com/bonega/is-ascii-fix-validation

  - `bench/` - Criterion benchmarks for SSE2 vs AVX-512 comparison
  - `fuzz/` - Compares old/new implementations with libfuzzer

  ## Related Issues
  - issue opened by @folkertdev llvm/llvm-project#176906
  - Regression introduced in https://github.com/rust-lang/rust/pull/130733
2026-01-24 08:18:05 +01:00
Jonathan Brouwer 42c3cae5e7 Rollup merge of #150556 - thejpster:add-thumbv7a-thumbv7r-thumbv8r, r=petrochenkov
Add Tier 3 Thumb-mode targets for Armv7-A, Armv7-R and Armv8-R

We currently have targets for bare-metal Armv7-R, Armv7-A and Armv8-R, but only in Arm mode. This PR adds five new targets enabling bare-metal support on these architectures in Thumb mode.

This has been tested using https://github.com/rust-embedded/aarch32/compare/main...thejpster:aarch32:support-thumb-mode-v7-v8?expand=1 and they all seem to work as expected.

However, I wasn't sure what to do with the maintainer lists as these are five new targets, but they share the docs page with the existing Arm versions. I can ask the Embedded Devices WG Arm Team about taking on these ones too, but whether Arm themselves want to take them on I guess is a bigger question.
2026-01-24 08:18:05 +01:00
Brian Cain e558544565 Fix cstring-merging test for Hexagon target
Hexagon assembler uses `.string` directive instead of `.asciz` for
null-terminated strings. Both are equivalent but the test was only
checking for `.asciz`.

Update the CHECK patterns to accept both directives using
`.{{asciz|string}}` regex pattern.
2026-01-23 23:45:36 -06:00
Jonathan Brouwer dec8d6ebcf Rollup merge of #150780 - fzakaria:fzakaria/section-threshold, r=jackh726
Add -Z large-data-threshold

This flag allows specifying the threshold size for placing static data in large data sections when using the medium code model on x86-64.

When using -Ccode-model=medium, data smaller than this threshold uses RIP-relative addressing (32-bit offsets), while larger data uses absolute 64-bit addressing. This allows the compiler to generate more efficient code for smaller data while still supporting data larger than 2GB.

This mirrors the -mlarge-data-threshold flag available in GCC and Clang. The default threshold is 65536 bytes (64KB) if not specified, matching LLVM's default behavior.
2026-01-23 11:07:55 +01:00
Andreas Liljeqvist c609cce8cf Merge is_ascii codegen tests using revisions
Combine the x86_64 and loongarch64 is_ascii tests into a single file
using compiletest revisions. Both now test assembly output:

- X86_64: Verifies no broken kshiftrd/kshiftrq instructions (AVX-512 fix)
- LA64: Verifies vmskltz.b instruction is used (auto-vectorization)
2026-01-22 22:18:00 +01:00
Jonathan 'theJPster' Pallant 96647dde77 Add Thumb-mode targets for Armv7-R, Armv7-A and Armv8-R. 2026-01-22 18:37:52 +00:00
Jakob Koschel c222a00e79 Create x86_64-unknown-linux-gnuasan target which enables ASAN by default
As suggested, in order to distribute sanitizer instrumented standard
libraries without introducing new rustc flags, this adds a new dedicated
target. With the target, we can distribute the instrumented standard
libraries through a separate rustup component.
2026-01-20 09:21:53 +00:00
Andreas Liljeqvist a0f9a15b4a Fix is_ascii performance regression on AVX-512 CPUs
When `[u8]::is_ascii()` is compiled with `-C target-cpu=native` on
AVX-512 CPUs, LLVM generates inefficient code. Because `is_ascii` is
marked `#[inline]`, it gets inlined and recompiled with the user's
target settings. The previous implementation used a counting loop that
LLVM auto-vectorizes to `pmovmskb` on SSE2, but with AVX-512 enabled,
LLVM uses k-registers and extracts bits individually with ~31
`kshiftrd` instructions.

This fix replaces the counting loop with explicit SSE2 intrinsics
(`_mm_loadu_si128`, `_mm_or_si128`, `_mm_movemask_epi8`) for x86_64.
`_mm_movemask_epi8` compiles to `pmovmskb`, forcing efficient codegen
regardless of CPU features.

Benchmark results on AMD Ryzen 5 7500F (Zen 4 with AVX-512):
- Default build: ~73 GB/s → ~74 GB/s (no regression)
- With -C target-cpu=native: ~3 GB/s → ~67 GB/s (22x improvement)

The loongarch64 implementation retains the original counting loop
since it doesn't have this issue.

Regression from: https://github.com/rust-lang/rust/pull/130733
2026-01-17 17:38:51 +01:00
Jonathan Brouwer 002b68d628 Rollup merge of #150826 - s390x-asm-f16-vector, r=uweigand,tgross35
Add `f16` inline ASM support for s390x

tracking issue: https://github.com/rust-lang/rust/issues/116909
cc https://github.com/rust-lang/rust/issues/125398

Support the `f16x8` type in inline assembly. Only with the `nnp-assist` feature are there any instructions that make use of this type. Based on the riscv implementation I now cast to `i16x8` when that feature is not enabled.

As far as I'm aware there are no instructions operating on `f16` scalar values. Should we still add support for using them in inline assembly?

r? @tgross35
cc @uweigand
2026-01-13 09:01:29 +01:00
Matthias Krüger f417f55e62 Rollup merge of #150368 - minicore-ordering, r=workingjubilee
adding Ordering enum to minicore.rs, importing minicore in "tests/assembly-llvm/rust-abi-arg-attr.rs" test file

this adds the `Ordering` enum to `minicore.rs`.

consequently, this updates `tests/assembly-llvm/rust-abi-arg-attr.rs` to import `minicore` directly. previously, this test file contained traits like `Copy` `Clone` `PointeeSized`, which were giving a duplicate lang item error, so replace those by importing `minicore` completely.
2026-01-11 09:56:38 +01:00
Folkert de Vries 6f12b86e9c s390x: support f16 and f16x8 in inline assembly 2026-01-09 18:42:46 +01:00
paradoxicalguy 484ea769d3 adding minicore to test file to avoid duplicating lang error 2026-01-09 02:30:33 +00:00