Add `-Zsanitize=kernel-hwaddress`
The Linux kernel has a config option called `CONFIG_KASAN_SW_TAGS` that enables `-fsanitize=kernel-hwaddress`. This is not supported by Rust.
One slightly awkward detail is that `#[sanitize(address = "off")]` applies to both `-Zsanitize=address` and `-Zsanitize=kernel-address`. Probably it was done this way because both are the same LLVM pass. I replicated this logic here for hwaddress, but it might be undesirable.
Note that `#[sanitize(kernel_hwaddress = "off")]` could be supported as an annotation on statics, but since it's also missing for `#[sanitize(hwaddress = "off")]`, I did not add it.
MCP: https://github.com/rust-lang/compiler-team/issues/975
Tracking issue: https://github.com/rust-lang/rust/issues/154171
cc @rcvalle @maurer @ojeda
Rename `target.abi` to `target.cfg_abi` and enum-ify llvm_abiname
See [Zulip](https://rust-lang.zulipchat.com/#narrow/channel/131828-t-compiler/topic/De-spaghettifying.20ABI.20controls/with/578893542) for more context. Discussed a bit in https://github.com/rust-lang/rust/pull/153769#discussion_r2934399038 too.
This renames `target.abi` to `target.cfg_abi` to make it less likely that someone will use it to determine things about the actual ccABI, i.e. the calling convention used on the target. `target.abi` does not control that calling convention, it just *sometimes* informs the user about that calling convention (and also about other aspects of the ABI).
Also turn llvm_abiname into an enum to make it more natural to match on.
Cc @workingjubilee @madsmtm
remove -Csoft-float
This fixes https://github.com/rust-lang/rust/issues/129893 by removing the offending flag.
The flag has been added in pre-1.0 times (https://github.com/rust-lang/rust/pull/9617) without much discussion, probably with the intent to mirror `-msoft-float` in C compilers. It never properly mirrored clang though because it only affected the LLVM float ABI setting, not the "soft-float" target feature (the clang flag sets both). It is also blatantly unsound because it affects how float arguments are passed, making it UB to invoke parts of the standard library.
The flag got deprecated with Rust 1.83 (released November 2024), and in the year since then, nobody spoke up in rust-lang/rust#129893 to have it preserved. I think it is time to remove it. I can totally imagine us bringing back a similar flag, properly registered as a target modifier to preserve soundness, but that should come with a coherent design that works for more than one architecture (the flag only does anything on ARM).
Blocked on approval of https://github.com/rust-lang/compiler-team/issues/971.
Fixes https://github.com/rust-lang/rust/issues/154106
sess: `-Zbranch-protection` is a target modifier
`-Zbranch-protection` only makes sense if the entire crate graph has the option set, otherwise the security properties that branch protection provides won't be effective - hence a target modifier. This flag is unstable so I don't think this warrants an MCP.
Couple of driver interface improvements
* Pass Session to `make_codegen_backend` callback. This simplifies some code in miri.
* Move env/file_depinfo from ParseSess to Session. There is no reason it has to be in ParseSess rather than Session.
* Rename hash_untracked_state to track_state to indicate that it isn't just used for hashing state, but also for adding env vars and files to be tracked through the dep info file.
`-Zbranch-protection` only makes sense if the entire crate graph has
the option set, otherwise the security properties that branch protection
provides won't be effective. This flag is unstable so I don't think this
warrants an MCP.
Fallback to no LTO doesn't work in practice as Cargo asks rustc to
produce LTO-only rlibs with -Clinker-plugin-lto without providing any
indication if they will be used for thin or fat LTO, so we can't disable
-Clinker-plugin-lto for ThinLTO when using cg_gcc.
Optimize dependency file search
I tried to look into the slowdown reported in https://github.com/rust-lang/cargo/issues/16665.
I created a Rust hello world program, and used this Python script to create a directory containing 200k files:
```python
from pathlib import Path
dir = Path("deps")
dir.mkdir(parents=True, exist_ok=True)
for i in range(200000):
path = dir / f"file{i:07}.o"
with open(path, "w") as f:
f.write("\n")
```
Then I tried to do various small microoptimalizations and simplifications to the code that iterates the search directories. Each individual commit improved performance, with the third one having the biggest effect.
Here are the results on `main` vs the last commit with the stage1 compiler on Linux, using `hyperfine "rustc +stage1 src/main.rs -L deps" -r 30` (there's IO involved, so it's good to let it run for a while):
```bash
Benchmark 1: rustc +stage1 src/main.rs -L deps
Time (mean ± σ): 299.4 ms ± 2.7 ms [User: 161.9 ms, System: 144.9 ms]
Range (min … max): 294.8 ms … 307.1 ms 30 runs
Benchmark 1: rustc +stage1 src/main.rs -L deps
Time (mean ± σ): 208.1 ms ± 4.5 ms [User: 87.3 ms, System: 128.7 ms]
Range (min … max): 202.4 ms … 219.6 ms 30 runs
```
Would be cool if someone could try this on macOS (maybe @ehuss - not sure if you have macOS or you only commented about its behavior on the Cargo issue :) ).
I also tried to prefilter the paths (not in this PR); right now we load everything and then we filter files with given prefixes, that's wasteful. Filtering just files starting with `lib` would get us down to ~150ms here. (The baseline without `-L` is ~80ms on my PC). The rest of the 70ms is essentially allocations from iterating the directory entries and sorting. That would be very hard to change - iterating the directory entries (de)allocates a lot of intermediate paths :( We'd have to implement the iteration by hand with either arena allocation, or at least some better management of memory.
r? @nnethercote
As far as I can tell it was introduced to allow fat LTO with
-Clinker-plugin-lto. Later a change was made to automatically disable
ThinLTO summary generation when -Clinker-plugin-lto -Clto=fat is used,
so we can safely remove it.
It was just a dummy implementation to workarround the fact that thin
local lto is the default in rustc. By adding a thin_lto_supported thin
local lto can be automatically disabled for cg_gcc, removing the need
for this dummy implementation. This makes improvements to the LTO
handling on the cg_ssa side a lot easier.
skip codegen for intrinsics with big fallback bodies if backend does not need them
This hopefully fixes the perf regression from https://github.com/rust-lang/rust/pull/148478. I only added the intrinsics with big fallback bodies to the list; it doesn't seem worth the effort of going through the entire list.
Fixes https://github.com/rust-lang/rust/issues/149945
Cc @scottmcm @bjorn3