remove -Csoft-float
This fixes https://github.com/rust-lang/rust/issues/129893 by removing the offending flag.
The flag has been added in pre-1.0 times (https://github.com/rust-lang/rust/pull/9617) without much discussion, probably with the intent to mirror `-msoft-float` in C compilers. It never properly mirrored clang though because it only affected the LLVM float ABI setting, not the "soft-float" target feature (the clang flag sets both). It is also blatantly unsound because it affects how float arguments are passed, making it UB to invoke parts of the standard library.
The flag got deprecated with Rust 1.83 (released November 2024), and in the year since then, nobody spoke up in rust-lang/rust#129893 to have it preserved. I think it is time to remove it. I can totally imagine us bringing back a similar flag, properly registered as a target modifier to preserve soundness, but that should come with a coherent design that works for more than one architecture (the flag only does anything on ARM).
Blocked on approval of https://github.com/rust-lang/compiler-team/issues/971.
Fixes https://github.com/rust-lang/rust/issues/154106
sess: `-Zbranch-protection` is a target modifier
`-Zbranch-protection` only makes sense if the entire crate graph has the option set, otherwise the security properties that branch protection provides won't be effective - hence a target modifier. This flag is unstable so I don't think this warrants an MCP.
Couple of driver interface improvements
* Pass Session to `make_codegen_backend` callback. This simplifies some code in miri.
* Move env/file_depinfo from ParseSess to Session. There is no reason it has to be in ParseSess rather than Session.
* Rename hash_untracked_state to track_state to indicate that it isn't just used for hashing state, but also for adding env vars and files to be tracked through the dep info file.
`-Zbranch-protection` only makes sense if the entire crate graph has
the option set, otherwise the security properties that branch protection
provides won't be effective. This flag is unstable so I don't think this
warrants an MCP.
Fallback to no LTO doesn't work in practice as Cargo asks rustc to
produce LTO-only rlibs with -Clinker-plugin-lto without providing any
indication if they will be used for thin or fat LTO, so we can't disable
-Clinker-plugin-lto for ThinLTO when using cg_gcc.
Optimize dependency file search
I tried to look into the slowdown reported in https://github.com/rust-lang/cargo/issues/16665.
I created a Rust hello world program, and used this Python script to create a directory containing 200k files:
```python
from pathlib import Path
dir = Path("deps")
dir.mkdir(parents=True, exist_ok=True)
for i in range(200000):
path = dir / f"file{i:07}.o"
with open(path, "w") as f:
f.write("\n")
```
Then I tried to do various small microoptimalizations and simplifications to the code that iterates the search directories. Each individual commit improved performance, with the third one having the biggest effect.
Here are the results on `main` vs the last commit with the stage1 compiler on Linux, using `hyperfine "rustc +stage1 src/main.rs -L deps" -r 30` (there's IO involved, so it's good to let it run for a while):
```bash
Benchmark 1: rustc +stage1 src/main.rs -L deps
Time (mean ± σ): 299.4 ms ± 2.7 ms [User: 161.9 ms, System: 144.9 ms]
Range (min … max): 294.8 ms … 307.1 ms 30 runs
Benchmark 1: rustc +stage1 src/main.rs -L deps
Time (mean ± σ): 208.1 ms ± 4.5 ms [User: 87.3 ms, System: 128.7 ms]
Range (min … max): 202.4 ms … 219.6 ms 30 runs
```
Would be cool if someone could try this on macOS (maybe @ehuss - not sure if you have macOS or you only commented about its behavior on the Cargo issue :) ).
I also tried to prefilter the paths (not in this PR); right now we load everything and then we filter files with given prefixes, that's wasteful. Filtering just files starting with `lib` would get us down to ~150ms here. (The baseline without `-L` is ~80ms on my PC). The rest of the 70ms is essentially allocations from iterating the directory entries and sorting. That would be very hard to change - iterating the directory entries (de)allocates a lot of intermediate paths :( We'd have to implement the iteration by hand with either arena allocation, or at least some better management of memory.
r? @nnethercote
As far as I can tell it was introduced to allow fat LTO with
-Clinker-plugin-lto. Later a change was made to automatically disable
ThinLTO summary generation when -Clinker-plugin-lto -Clto=fat is used,
so we can safely remove it.
It was just a dummy implementation to workarround the fact that thin
local lto is the default in rustc. By adding a thin_lto_supported thin
local lto can be automatically disabled for cg_gcc, removing the need
for this dummy implementation. This makes improvements to the LTO
handling on the cg_ssa side a lot easier.
skip codegen for intrinsics with big fallback bodies if backend does not need them
This hopefully fixes the perf regression from https://github.com/rust-lang/rust/pull/148478. I only added the intrinsics with big fallback bodies to the list; it doesn't seem worth the effort of going through the entire list.
Fixes https://github.com/rust-lang/rust/issues/149945
Cc @scottmcm @bjorn3
Add a `documentation` remapping path scope for rustdoc usage
This PR adds a new remapping path scope for rustdoc usage: `documentation`, instead of rustdoc abusing the other scopes for it's usage.
Like remapping paths in rustdoc, this scope is unstable. (rustdoc doesn't even have yet an equivalent to [rustc `--remap-path-scope`](https://doc.rust-lang.org/nightly/rustc/remap-source-paths.html#--remap-path-scope)).
I also took the opportunity to add a bit of documentation in rustdoc book.
Add -Z large-data-threshold
This flag allows specifying the threshold size for placing static data in large data sections when using the medium code model on x86-64.
When using -Ccode-model=medium, data smaller than this threshold uses RIP-relative addressing (32-bit offsets), while larger data uses absolute 64-bit addressing. This allows the compiler to generate more efficient code for smaller data while still supporting data larger than 2GB.
This mirrors the -mlarge-data-threshold flag available in GCC and Clang. The default threshold is 65536 bytes (64KB) if not specified, matching LLVM's default behavior.