Make retags an implicit part of typed copies
Ever since Stacked Borrows was first implemented in Miri, that was done with `Retag` statements: given a place (usually a local variable), those statements find all references stored inside the place and refresh their tags to ensure the aliasing requirements are upheld. However, this is a somewhat unsatisfying approach for multiple reasons:
- It leaves open the [question](https://github.com/rust-lang/unsafe-code-guidelines/issues/371) of where to even put `Retag` statements. Over time, the AddRetag pass settled on one possible answer to this, but it wasn't very canonical.
- For assignments of the form `*ptr = expr`, if the assignment involves copying a reference, we probably want to do a retag -- but if we do a `Retag(*ptr)` as the next instruction, it can be non-trivial to argue that this even retags the right value, so we refrained from doing retags in that case. This has [come up](https://github.com/llvm/llvm-project/pull/160913#issuecomment-3341908717) as a potential issue for Rust making better use of LLVM "captures" annotations. (That said, there might be [other ways](https://github.com/rust-lang/unsafe-code-guidelines/issues/593#issuecomment-4328112031) to obtain this desired optimization.)
- Normal compilation avoids generating retags, but we still generate LLVM IR with `noalias`. What does that even mean? How do MIR optimization passes interact with retags? These are questions we have to figure out to make better use of aliasing information, but currently we can't even really ask such questions.
I think we should resolve all that by making retags part of what happens during a typed copy (a concept and interpreter infrastructure that did not exist yet when retags were initially introduced). Under this proposal, when executing a MIR assignment statement, what conceptually happens is as follows:
- We evaluate the LHS to a place.
- We evaluate the RHS to a value. This does a typed load from memory if needed, raising UB if memory does not contain a valid representation of the assignment's type.
- We walk that value, identify all references inside of it, and retag them. If this happens as part of passing a function argument, this is a protecting retag.
- We store (a representation of) the value into the place.
However, this semantics doesn't fully work: there's a mandatory MIR pass that turns expressions like `&mut ***ptr` into intermediate deref's. Those must *not* do any retags. So far this happened because the AddRetag pass did not add retags for assignments to deref temporaries, but that information is not recorded in cross-crate MIR. Therefore I instead added a field to `Rvalue::Use` to indicate whether this value should be retagged or not. A non-retagging copy seems like a sufficiently canonical primitive that we should be able to express it. Dealing with the fallout from that is a large chunk of the overall diff. (I also considered adding this field to `StatementKind::Assign` instead, but decided against that as we only actually need it for `Rvalue::Use`. I am not sure if this was the right call...)
This neatly answers the question of when retags should occur, and handles cases like `*ptr = expr`. It avoids traversing values twice in Miri. It makes codegen's use of `noalias` sound wrt the actual MIR that it is working on. It also gives us a target semantics to evaluate MIR opts against. However, I did not carefully check all MIR opts -- in particular, GVN needs a thorough look under the new semantics; it currently can turn alias-correct code into alias-incorrect code. (But this PR doesn't make things any worse for normal compilation where the retag indicator is anyway ignored.)
Another side-effect of this PR is that `-Zmiri-disable-validation` now also disables alias checking. It'd be nicer to keep them orthogonal but I find this an acceptable price to pay.
- [rustc benchmark results](https://github.com/rust-lang/rust/pull/154341#issuecomment-4125313290)
- [miri benchmark results](https://github.com/rust-lang/rust/pull/154341#issuecomment-4129840926)
Do not depend on typeck to borrow-check inline consts.
Instead fetch the type from the inline-const's MIR, which comes from exactly the same typeck information.
Add extra symbol check for `.to_owned()`
Follow up of rust-lang/rust#154646
Previously when adding a suggestion for using `Cow::into_owned()` instead of `ToOwned::to_owned()`, the compiler would just convert the methods `Span` into a `String` and do checks on that `String`. This PR adds an extra guard to that suggestion by checking if the method is `sym::to_owned_method`.
r? @davidtwco since you added the review to the previous PR
Previously when adding a suggestion for using `Cow::into_owned()`
instead of `ToOwned::to_owned()`, the compiler would just convert the
methods `Span` into a `String` and do checks on that `String`. This PR
adds an extra guard to that suggestion by checking if the method is
`sym::to_owned_method`.
compiler: Print valid `-Zmir-enable-passes` names if invalid name is used
If a user passes an invalid name to `-Zmir-enable-passes`, print the valid names as a note.
To avoid the annoyance of having to keep blessing test output, completely normalize away the list of names in the test.
The diagnostic is duplicated, but that is not introduced by this commit. In other words, we don't make matters worse. The existing "is unknown and will be ignored" diagnostic is already duplicated, as can be seen in the existing test stderr.
The output is on one long line, but that makes normalization in tests easier, and I don't think we have to overdo this. We can let terminals wrap the line.
<details>
<summary>Click to expand current output</summary>
`note: valid MIR pass names are: AbortUnwindingCalls, AddCallGuards, AddMovesForPackedDrops, AddRetag, CheckAlignment, CheckCallRecursion, CheckConstItemMutation, CheckDropRecursion, CheckEnums, CheckForceInline, CheckInlineAlwaysTargetFeature, CheckLiveDrops, CheckNull, CheckPackedRef, CleanupPostBorrowck, CopyProp, CtfeLimit, DataflowConstProp, DeadStoreElimination-final, DeadStoreElimination-initial, Derefer, DestinationPropagation, EarlyOtherwiseBranch, ElaborateBoxDerefs, ElaborateDrops, EnumSizeOpt, EraseDerefTemps, ForceInline, FunctionItemReferences, GVN, ImpossiblePredicates, Inline, InstSimplify-after-simplifycfg, InstSimplify-before-inline, InstrumentCoverage, JumpThreading, KnownPanicsLint, LowerIntrinsics, LowerSliceLenCalls, Marker, MatchBranchSimplification, MentionedItems, MultipleReturnTerminators, PostAnalysisNormalize, PreCodegen, PromoteTemps, ReferencePropagation, RemoveNoopLandingPads, RemovePlaceMention, RemoveStorageMarkers, RemoveUninitDrops, RemoveUnneededDrops, RemoveZsts, ReorderBasicBlocks, ReorderLocals, RequiredConstsVisitor, SanityCheck, ScalarReplacementOfAggregates, SimplifyCfg-after-unreachable-enum-branching, SimplifyCfg-final, SimplifyCfg-initial, SimplifyCfg-make_shim, SimplifyCfg-post-analysis, SimplifyCfg-pre-optimizations, SimplifyCfg-promote-consts, SimplifyCfg-remove-false-edges, SimplifyComparisonIntegral, SimplifyConstCondition-after-const-prop, SimplifyConstCondition-after-inst-simplify, SimplifyConstCondition-final, SimplifyLocals-after-value-numbering, SimplifyLocals-before-const-prop, SimplifyLocals-final, SingleUseConsts, SsaRangePropagation, StateTransform, StripDebugInfo, Subtyper, UnreachableEnumBranching, UnreachablePropagation, Validator`
</details>
Reuse CTFE MIR for constructors.
For constructors, we manually build the MIR shim we want. We can just have `optimized_mir` call `mir_for_ctfe` instead of rebuilding the shim.
Make `diverging_type_vars` a vec of `TyVid`
r? @lcnr
The following changes, in separate commits:
* Make its elements a `TyVid`, since there should never by any other types than `TyVars` in there
* Make it a vec, since it being a set doesn't make much sense. You never really should do a `contains` on it, since you should normalize the tyvids in the set to their root var first.
Add a `Local::arg(i)` helper constructor
While reading through stuff I was noticing just how many `+1` fixes there were in various places (and comments explaining those fixups), so this adds a new inherent helper on `Local` for making an argument to help make this clearer.
r? mir
Clean up some traits
I was looking at various traits and found some unnecessary trait bounds, and some unnecessary traits. Details in individual commits.
r? @Nadrieril
Improve source code for `librustdoc/visit_ast.rs`
While working on this part of rustdoc, got annoyed at the tuples and decided to transform them into types. Code is now much easier to follow. :3
r? @Urgau
Make stable hashing names consistent (part 1)
This PR starts the implementation of MCP 893. It renames the things that appear in the `HashStable` trait, changing this:
```
pub trait HashStable {
fn hash_stable<Hcx: HashStableContext>(&self, hcx: &mut Hcx, hasher: &mut StableHasher);
}
```
to this:
```
pub trait StableHash {
fn stable_hash<Hcx: StableHashCtxt>(&self, hcx: &mut Hcx, hasher: &mut StableHasher);
}
```
Details in individual commits.
This is the biggest part of the renaming. A follow-up PR will rename the remaining things.
r? @jieyouxu
Specifically:
- `HashStable` -> `StableHash` (trait)
- `HashStable` -> `StableHash` (derive)
- `HashStable_NoContext` -> `StableHash_NoContext` (derive)
Note: there are some names in `compiler/rustc_macros/src/hash_stable.rs`
that are still to be renamed, e.g. `HashStableMode`.
Part of MCP 983.
If a user passes an invalid name to `-Zmir-enable-passes`, print the
valid names as a note.
The diagnostic is duplicated, but that is not introduced by this commit.
In other words, we don't make matters worse. The existing "is unknown
and will be ignored" diagnostic is already duplicated.
To avoid the annoyance of having to keep blessing test output,
completely normalize away the list of names in the test.
Rollup of 4 pull requests
Successful merges:
- rust-lang/rust#154571 (Fix alias path for rustdoc)
- rust-lang/rust#155749 (`-Znext-solver` Ignore region constraints from the nested goals in leakcheck)
- rust-lang/rust#156026 (`bufreader::Buffer`: Remove leftover note about `initialized` field)
- rust-lang/rust#156063 (Map `WSAESHUTDOWN` to `io::ErrorKind::BrokenPipe`)
`bufreader::Buffer`: Remove leftover note about `initialized` field
Just a boring little doc fix! : v)
https://github.com/rust-lang/rust/pull/150129 reworked the `initialized` field to be a `bool` instead of a `usize`.
And then https://github.com/rust-lang/rust/pull/155314 reworked this field's comment (among other things).
But, there's still a leftover note in the comment, which no longer makes sense:
``Note that while this often the same as `filled`, it doesn't need to be.``
This is referencing that back when `initialized` was a `usize`, it was common for it to have the same value as `filled`.
----
Fun fact: there's a typo in the note too! It's missing an "is" before or after "often".
Fix alias path for rustdoc
I ran this command:
```console
x build --stage 1 --skip rustdoc
Building bootstrap
Compiling bootstrap v0.0.0 (/Users/yukang/rust/src/bootstrap)
Finished `dev` profile [unoptimized] target(s) in 1.08s
/Users/yukang/rust/build/aarch64-apple-darwin/ci-llvm/bin/llvm-strip does not exist; skipping copy
Building stage1 compiler artifacts (stage0 -> stage1, aarch64-apple-darwin)
Finished `release` profile [optimized + debuginfo] target(s) in 0.45s
Creating a sysroot for stage1 compiler (use `rustup toolchain link 'name' build/host/stage1`)
Building stage1 library artifacts{alloc, compiler_builtins, core, panic_abort, panic_unwind, proc_macro, rustc-std-workspace-core, std, std_detect, sysroot, test, unwind} (stage1 -> stage1, aarch64-apple-darwin)
Finished `dist` profile [optimized + debuginfo] target(s) in 0.10s
Skipping Set({build::src/tools/rustdoc}) because it is excluded
Building stage1 rustdoc_tool_binary (stage0 -> stage1, aarch64-apple-darwin)
Finished `release` profile [optimized + debuginfo] target(s) in 0.19s
```
expect all `rustdoc` related compiling phase will be exlcuded, but from the log we can see `rustdoc_tool_binary` is still compiled.
`src/tools/rustdoc` and `src/librustdoc` are documented as aliases in path set here:
https://github.com/rust-lang/rust/blob/a25435bcf7cfc9b953d356eda3a51db8da9e3386/src/bootstrap/src/core/builder/mod.rs#L355-L360
we can also see here two paths are treated as alias:
https://github.com/rust-lang/rust/blob/a25435bcf7cfc9b953d356eda3a51db8da9e3386/src/bootstrap/src/core/build_steps/check.rs#L818-L822
but bootstrap registered them as two different sets with `.path(..).path(...)`:
https://github.com/rust-lang/rust/blob/a25435bcf7cfc9b953d356eda3a51db8da9e3386/src/bootstrap/src/core/build_steps/tool.rs#L691-L693
That meant commands like `x build --skip rustdoc`, `--skip src/tools/rustdoc`, or `--skip src/librustdoc` only excluded one side.
linker-messages is warn-by-default again
cc rust-lang/rust#136096
I ended up keeping it a lint and adding an option for lints to ignore `-Dwarnings` (there was already a lint that did that actually, it was just hard-coded in rustc_middle instead of in rustc_lint_defs like I'd expect). This allows people to actually see the warnings without them failing the build in CI.
It being a `Set` does not really make sense. You never really should do a `contains` on it, since you should normalize the tyvid to its root var first.
Return a single diagnostic from `lex_token_trees`.
It currently returns a `Vec` but in practice it always has one diagnostic in it.
LLM disclosure: Claude Code identified this when I asked it to review `tokentrees.rs`. I made the change by hand and tested it myself.
r? @chenyukang
-Zembed-source: also embed external source
Hi,
I've been using `-Zembed-source` for a while and noticed that some sources are not embedded when compiling in release mode. See the minimal reproducer below for missing embedded source code.
The current implementation only emits source from the `src` field in `SourceFile`, but for multi-codegen-unit scenarios the source might only be available via the `external_src` field.
I've updated the LLVM and Cranelift backends to fall back to `external_src` if `src` is not available.
Minimal reproducer:
```console
$ cat Cargo.toml
[package]
name = "repro"
version = "0.0.1"
edition = "2021"
[profile.release]
strip = "none"
debug = true
$ cat src/lib.rs
// marker comment
pub fn foo() -> u64 { 42 }
$ cat src/main.rs
fn main() {
println!("{}", repro::foo());
}
$ RUSTFLAGS="-g -Zembed-source=yes -Zdwarf-version=5" cargo build -r
Compiling repro v0.0.1 (/tmp/repro)
Finished `release` profile [optimized + debuginfo] target(s) in 0.08s
$ # Note: Source for lib.rs is missing here:
$ llvm-dwarfdump --debug-line target/release/repro | grep "marker comment"
$ RUSTFLAGS="-g -Zembed-source=yes -Zdwarf-version=5" cargo +stage1 build -r
Finished `release` profile [optimized + debuginfo] target(s) in 0.04s
$ llvm-dwarfdump --debug-line target/release/repro | grep "marker comment"
source: "// marker comment\npub fn foo() -> u64 { 42 }\n"
```
Thanks!
Adds a couple UI tests for polonius
I went through all the open issues labeled `fixed-by-polonius` and from that created two UI tests based on issues that seemed a little novel.
One for rust-lang/rust#92038 and another for rust-lang/rust#70044.
Both tests fail under NLL but pass with polonius (legacy and alpha).