mirror of
https://github.com/rust-lang/rust.git
synced 2026-04-27 18:57:42 +03:00
e8b594b427
Skip stack_start_aligned for immediate-abort

This improves startup performance by 16%, shown by an optimized hello-world program. glibc's `pthread_getattr_np` performs expensive syscalls when reading `/proc/self/maps`. That is all wasted with `panic = immediate-abort` active because `init()` immediately discards the return value from `install_main_guard()`. A similar improvement can be seen in environments that don't have `/proc`. This change is safe because the immediately succeeding comment says that we rely on Linux's "own stack-guard mechanism".

Tracking issue: https://github.com/rust-lang/rust/issues/147286

# Benchmark

Set it up with `cargo new hello-world2`, and replace these files:

```toml
# Cargo.toml
cargo-features = ["panic-immediate-abort"]

[package]
name = "hello-world"
version = "0.1.0"
edition = "2024"

[profile.release]
lto = true
panic = "immediate-abort"
codegen-units = 1
opt-level = "z"
strip = true

# .cargo/config.toml
[unstable]
build-std = ["std"]
```

## Before

```console
home@daniel-desktop3:~/CLionProjects/hello-world2$ hyperfine -N target/release/hello-world2
Benchmark 1: target/release/hello-world2
  Time (mean ± σ):     524.8 µs ±  65.1 µs    [User: 276.1 µs, System: 187.0 µs]
  Range (min … max):   446.4 µs … 975.5 µs    3996 runs

home@daniel-desktop3:~/CLionProjects/hello-world2$ hyperfine -N target/release/hello-world2
Benchmark 1: target/release/hello-world2
  Time (mean ± σ):     519.4 µs ±  65.8 µs    [User: 282.1 µs, System: 177.7 µs]
  Range (min … max):   443.2 µs … 830.5 µs    3612 runs

home@daniel-desktop3:~/CLionProjects/hello-world2$ hyperfine -N target/release/hello-world2
Benchmark 1: target/release/hello-world2
  Time (mean ± σ):     520.0 µs ±  64.3 µs    [User: 277.1 µs, System: 182.1 µs]
  Range (min … max):   447.1 µs … 1001.3 µs    3804 runs
```

For a visualization of the problem, run `cargo +stage1 build --release && perf record --call-graph dwarf -F max ./target/release/hello-world2 && perf script | inferno-collapse-perf | inferno-flamegraph > flamegraph.svg`:

<img width="3832" height="1216" alt="flamegraph with 17.41% __pthread_getattr_np" src="https://github.com/user-attachments/assets/acc2286e-1582-4772-9e3b-68b5c35e3e70" />

## After

```console
home@daniel-desktop3:~/CLionProjects/hello-world2$ hyperfine -N target/release/hello-world2
Benchmark 1: target/release/hello-world2
  Time (mean ± σ):     444.7 µs ±  57.3 µs    [User: 257.4 µs, System: 130.2 µs]
  Range (min … max):   379.4 µs … 1289.3 µs    3893 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

home@daniel-desktop3:~/CLionProjects/hello-world2$ hyperfine -N target/release/hello-world2
Benchmark 1: target/release/hello-world2
  Time (mean ± σ):     452.3 µs ±  60.7 µs    [User: 261.5 µs, System: 133.5 µs]
  Range (min … max):   374.9 µs … 1512.4 µs    4177 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

home@daniel-desktop3:~/CLionProjects/hello-world2$ hyperfine -N target/release/hello-world2
Benchmark 1: target/release/hello-world2
  Time (mean ± σ):     441.2 µs ±  56.1 µs    [User: 256.2 µs, System: 128.8 µs]
  Range (min … max):   375.0 µs … 760.4 µs    4032 runs
```
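
## Checking the headline number

The message does not say how the 16% figure was derived, so the following is only a sketch of one plausible way to relate it to the hyperfine means above. The helper names (`mean`, `time_reduction_pct`, `speedup_pct`) are made up for this illustration and are not part of the change; the input values are the six reported means.

```rust
/// Arithmetic mean of a slice of samples (here: hyperfine means in µs).
fn mean(samples: &[f64]) -> f64 {
    samples.iter().sum::<f64>() / samples.len() as f64
}

/// Percentage of wall time saved: (before - after) / before.
fn time_reduction_pct(before: f64, after: f64) -> f64 {
    (before - after) / before * 100.0
}

/// Percentage by which the old time exceeds the new one: before / after - 1.
fn speedup_pct(before: f64, after: f64) -> f64 {
    (before / after - 1.0) * 100.0
}

fn main() {
    // Mean times (µs) copied from the three runs in each section above.
    let before = mean(&[524.8, 519.4, 520.0]);
    let after = mean(&[444.7, 452.3, 441.2]);

    println!("before: {before:.1} µs, after: {after:.1} µs");
    println!("time reduction: {:.1}%", time_reduction_pct(before, after));
    println!("speedup: {:.1}%", speedup_pct(before, after));
}
```

With these inputs the two conventions give roughly 14% less wall time and a speedup of roughly 17%, so the quoted 16% sits between them. Which definition the author used is an assumption here, and the standard deviations of about 60 µs mean the exact percentage should not be read too precisely.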