Fallback to no LTO doesn't work in practice as Cargo asks rustc to
produce LTO-only rlibs with -Clinker-plugin-lto without providing any
indication if they will be used for thin or fat LTO, so we can't disable
-Clinker-plugin-lto for ThinLTO when using cg_gcc.
Optimize dependency file search
I tried to look into the slowdown reported in https://github.com/rust-lang/cargo/issues/16665.
I created a Rust hello world program, and used this Python script to create a directory containing 200k files:
```python
from pathlib import Path
dir = Path("deps")
dir.mkdir(parents=True, exist_ok=True)
for i in range(200000):
path = dir / f"file{i:07}.o"
with open(path, "w") as f:
f.write("\n")
```
Then I tried to do various small microoptimalizations and simplifications to the code that iterates the search directories. Each individual commit improved performance, with the third one having the biggest effect.
Here are the results on `main` vs the last commit with the stage1 compiler on Linux, using `hyperfine "rustc +stage1 src/main.rs -L deps" -r 30` (there's IO involved, so it's good to let it run for a while):
```bash
Benchmark 1: rustc +stage1 src/main.rs -L deps
Time (mean ± σ): 299.4 ms ± 2.7 ms [User: 161.9 ms, System: 144.9 ms]
Range (min … max): 294.8 ms … 307.1 ms 30 runs
Benchmark 1: rustc +stage1 src/main.rs -L deps
Time (mean ± σ): 208.1 ms ± 4.5 ms [User: 87.3 ms, System: 128.7 ms]
Range (min … max): 202.4 ms … 219.6 ms 30 runs
```
Would be cool if someone could try this on macOS (maybe @ehuss - not sure if you have macOS or you only commented about its behavior on the Cargo issue :) ).
I also tried to prefilter the paths (not in this PR); right now we load everything and then we filter files with given prefixes, that's wasteful. Filtering just files starting with `lib` would get us down to ~150ms here. (The baseline without `-L` is ~80ms on my PC). The rest of the 70ms is essentially allocations from iterating the directory entries and sorting. That would be very hard to change - iterating the directory entries (de)allocates a lot of intermediate paths :( We'd have to implement the iteration by hand with either arena allocation, or at least some better management of memory.
r? @nnethercote
As far as I can tell it was introduced to allow fat LTO with
-Clinker-plugin-lto. Later a change was made to automatically disable
ThinLTO summary generation when -Clinker-plugin-lto -Clto=fat is used,
so we can safely remove it.
It was just a dummy implementation to workarround the fact that thin
local lto is the default in rustc. By adding a thin_lto_supported thin
local lto can be automatically disabled for cg_gcc, removing the need
for this dummy implementation. This makes improvements to the LTO
handling on the cg_ssa side a lot easier.
skip codegen for intrinsics with big fallback bodies if backend does not need them
This hopefully fixes the perf regression from https://github.com/rust-lang/rust/pull/148478. I only added the intrinsics with big fallback bodies to the list; it doesn't seem worth the effort of going through the entire list.
Fixes https://github.com/rust-lang/rust/issues/149945
Cc @scottmcm @bjorn3
Add a `documentation` remapping path scope for rustdoc usage
This PR adds a new remapping path scope for rustdoc usage: `documentation`, instead of rustdoc abusing the other scopes for it's usage.
Like remapping paths in rustdoc, this scope is unstable. (rustdoc doesn't even have yet an equivalent to [rustc `--remap-path-scope`](https://doc.rust-lang.org/nightly/rustc/remap-source-paths.html#--remap-path-scope)).
I also took the opportunity to add a bit of documentation in rustdoc book.
Add -Z large-data-threshold
This flag allows specifying the threshold size for placing static data in large data sections when using the medium code model on x86-64.
When using -Ccode-model=medium, data smaller than this threshold uses RIP-relative addressing (32-bit offsets), while larger data uses absolute 64-bit addressing. This allows the compiler to generate more efficient code for smaller data while still supporting data larger than 2GB.
This mirrors the -mlarge-data-threshold flag available in GCC and Clang. The default threshold is 65536 bytes (64KB) if not specified, matching LLVM's default behavior.
Allow invoking all help options at once
Implements rust-lang/rust#150442
Help messages are printed in the order they are invoked, but only once (e.g. `--help -C help --help` prints regular help then codegen help).
Lint groups (`-Whelp`) are always printed last due to necessarily coming later in the compiler pipeline.
Adds `help` flags to `-C` and `-Z` to avoid an error in `build_session_options`.
cc @bjorn3 [#t-compiler/major changes > Allow combining `--help -C help -Z help -… compiler-team#954 @ 💬](https://rust-lang.zulipchat.com/#narrow/channel/233931-t-compiler.2Fmajor-changes/topic/Allow.20combining.20.60--help.20-C.20help.20-Z.20help.20-.E2.80.A6.20compiler-team.23954/near/564358190) - this currently maintains the behaviour of unrecognized options always failing-fast, as this happens within `getopt`s parsing, before we can even check if help options are present.
Destabilise `target-spec-json`
Per rust-lang/compiler-team#944:
> Per https://github.com/rust-lang/rust/issues/71009, the ability to load target spec JSONs was stabilised accidentally. Within the team, we've always considered the format to be unstable and have changed it freely. This has been feasible as custom targets can only be used with core, like any other target, and so custom targets de-facto require nightly to be used (i.e. to build core manually or use Cargo's -Zbuild-std).
>
> Current build-std RFCs (https://github.com/rust-lang/rfcs/pull/3873, https://github.com/rust-lang/rfcs/pull/3874) propose a mechanism for building core on stable (at the request of Rust for Linux), which combined with a stable target-spec-json format, permit the current format to be used much more widely on stable toolchains. This would prevent us from improving the format - making it less tied to LLVM, switching to TOML, enabling keys in the spec to be stabilised individually, etc.
>
> De-stabilising the format gives us the opportunity to improve the format before it is too challenging to do so. Internal company toolchains and projects like Rust for Linux already use target-spec-json, but must use nightly at some point while doing so, so while it could be inconvenient for those users to destabilise this, it is hoped that an minimal alternative that we could choose to stabilise can be proposed relatively quickly.
This flag allows specifying the threshold size for placing static data
in large data sections when using the medium code model on x86-64.
When using -Ccode-model=medium, data smaller than this threshold uses
RIP-relative addressing (32-bit offsets), while larger data uses
absolute 64-bit addressing. This allows the compiler to generate more
efficient code for smaller data while still supporting data larger than
2GB.
This mirrors the -mlarge-data-threshold flag available in GCC and Clang.
The default threshold is 65536 bytes (64KB) if not specified, matching
LLVM's default behavior.
Port `#[rustc_must_implement_one_of]` to attribute parser
Stumbled upon a weird (bug ?) behaviour while making this PR
it seems like it is possible to reach `check_attr.rs` checks without the attribute allowed target checks having already been finished, I added a comment about how to reproduce this in `check_attr.rs`
otherwise good to note is that a bunch of code was moved from `compiler/rustc_hir_analysis/src/collect.rs` to `check_attr.rs`
r? `@JonathanBrouwer`