Do not run jump-threading for GPUs
GPU targets have convergent operations that must not be duplicated or
moved in or out of control-flow.
An example convergent operation is a barrier/syncthreads.
The only MIR pass affected by this is jump-threading, it can duplicate
calls. Disable jump-hreading for GPU targets to prevent generating
incorrect code.
This affects the amdgpu and nvptx targets.
Fixesrust-lang/rust#137086, see this issue for details.
Tracking issue: rust-lang/rust#135024
cc @RDambrosio016 @kjetilkjeka for nvptx
cc @ZuseZ4
`rustc_middle` is enormous and it's always good to move things out of it
where possible. `maybe_loop_headers` is easy to move because it has a
single use in `jump_threading.rs`.
GPU targets have convergent operations that must not be duplicated or
moved in or out of control-flow.
An example convergent operation is a barrier/syncthreads.
The only MIR pass affected by this is jump-threading, it can duplicate
calls. Disable jump-hreading for GPU targets to prevent generating
incorrect code.
This affects the amdgpu and nvptx targets.
JumpThreading: compute place and value indices on-demand
Profiling JumpThreading reveals that a large part of the runtime happens constructing the place and value `Map`. This is unfortunate, as jump-threading may end up not even doing anything.
The cause for this large up-front cost is following: `Map` attempts to create a `PlaceIndex` for each place that *may* hold a relevant value. This means all places that appear in MIR, but also all places whose value is accessed by a projection of a copy of a larger place.
This PR refactors the creation of `Map` to happen on-demand: place and value indices are created when threading computation happens.
The up-front mode is still relevant for DataflowConstProp, so is not touched.
Validate CopyForDeref and DerefTemps better and remove them from runtime MIR
(split from my WIP rust-lang/rust#145344)
This PR:
- Removes `Rvalue::CopyForDeref` and `LocalInfo::DerefTemp` from runtime MIR
- Using a new mir pass `EraseDerefTemps`
- `CopyForDeref(x)` is turned into `Use(Copy(x))`
- `DerefTemp` is turned into `Boring`
- Not sure if this part is actually necessary, it made more sense in rust-lang/rust#145344 with `DerefTemp` storing actual data that I wanted to keep from having to be kept in sync with the rest of the body in runtime MIR
- Checks in validation that `CopyForDeref` and `DerefTemp` are only used together
- Removes special handling for `CopyForDeref` from many places
- Removes `CopyForDeref` from `custom_mir` reverting rust-lang/rust#111587
- In runtime MIR simple copies can be used instead
- In post cleanup analysis MIR it was already wrong to use due to the lack of support for creating `DerefTemp` locals
- Possibly this should be its own PR?
- Adds an argument to `deref_finder` to avoid creating new `DerefTemp`s and `CopyForDeref` in runtime MIR.
- Ideally we would just avoid making intermediate derefs instead of fixing it at the end of a pass / during shim building
- Removes some usages of `deref_finder` that I found out don't actually do anything
r? oli-obk
Use a closure instead of three chained iterators
Fixes the perf regression from #123948
That PR had chained a third option to the iterator which apparently didn't optimize well
Introduce Arena::try_alloc_from_iter.
`alloc_from_iter` already collects the iterator for reentrancy. So adding an early exit for a fallible iterator integrates naturally into the code. This avoids the other solution to allocate and dump the allocation.
take 2
open up coroutines
tweak the wordings
the lint works up until 2021
We were missing one case, for ADTs, which was
causing `Result` to yield incorrect results.
only include field spans with significant types
deduplicate and eliminate field spans
switch to emit spans to impl Drops
Co-authored-by: Niko Matsakis <nikomat@amazon.com>
collect drops instead of taking liveness diff
apply some suggestions and add explantory notes
small fix on the cache
let the query recurse through coroutine
new suggestion format with extracted variable name
fine-tune the drop span and messages
bugfix on runtime borrows
tweak message wording
filter out ecosystem types earlier
apply suggestions
clippy
check lint level at session level
further restrict applicability of the lint
translate bid into nop for stable mir
detect cycle in type structure
the behavior of the type system not only depends on the current
assumptions, but also the currentnphase of the compiler. This is
mostly necessary as we need to decide whether and how to reveal
opaque types. We track this via the `TypingMode`.
- Replace non-standard names like 's, 'p, 'rg, 'ck, 'parent, 'this, and
'me with vanilla 'a. These are cases where the original name isn't
really any more informative than 'a.
- Replace names like 'cx, 'mir, and 'body with vanilla 'a when the lifetime
applies to multiple fields and so the original lifetime name isn't
really accurate.
- Put 'tcx last in lifetime lists, and 'a before 'b.
In all cases the struct can own the relevant thing instead of having a
reference to it. This makes the code simpler, and in some cases removes
a struct lifetime.
Because that's now the only crate that uses it.
Moving stuff out of `rustc_middle` is always welcome.
I chose to use `impl crate::MirPass`/`impl crate::MirLint` (with
explicit `crate::`) everywhere because that's the only mention of
`MirPass`/`MirLint` used in all of these files. (Prior to this change,
`MirPass` was mostly imported via `use rustc_middle::mir::*` items.)
Remove `#[macro_use] extern crate tracing`, round 4
Because explicit importing of macros via use items is nicer (more standard and readable) than implicit importing via #[macro_use]. Continuing the work from #124511, #124914, and #125434. After this PR no `rustc_*` crates use `#[macro_use] extern crate tracing` except for `rustc_codegen_gcc` which is a special case and I will do separately.
r? ```@jieyouxu```