[style] rustfmt `match`es with comments in or-patterns
Using https://github.com/rust-lang/rustfmt/pull/6893, I reformatted the whole codebase. As a result, `match`es that would have been formatted under normal circumstances, but were being skipped, now get their expected format. These match expressions were skipped in their entirety because they contain or-patterns with comments between the patterns, which causes rustfmt to bail out. The or-patterns with comments themselves remain untouched, but under that PR the match arm bodies and the other patterns without comments do get formatted.
Because the rustfmt fix hasn't landed yet, I reworked some of the or-patterns with comments so that formatting doesn't regress. I tried to do this only in larger blocks, which are more likely to regress in the meantime.
(Introduced and) removed a bunch of stray backticks \` likely left after an editor autoclosed the intended closing \`, resulting in <code>\`name\`\`</code> in comments.
DestinationPropagation: compute liveness as ranges instead of traveling bitsets
The current implementation of `save_as_liveness` is very slow: it consists of inserting a traveling bitset into an interval set.
As `MaybeLiveLocals` has a gen-kill property, we can leverage it to make this faster. "Gen" creates a new interval. "Kill" ends that interval, which is then ready to be saved in the interval set.
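A minimal sketch of the gen/kill idea, not the actual pass: the `Event` type, the `live_ranges` function and the use of plain `usize` points and locals are all assumptions made for illustration. Points are walked backwards, as in a liveness analysis; a "gen" opens a range and a "kill" closes it and saves it.

```rust
use std::collections::HashMap;

/// Hypothetical event seen at a program point while walking backwards.
#[derive(Clone, Copy)]
pub enum Event {
    /// The local becomes live at this point (and earlier, until a kill).
    Gen(usize),
    /// The local is not live at this point or earlier (until another gen).
    Kill(usize),
}

/// Returns, per local, the inclusive `(start, end)` ranges of points where it
/// is live. `events[p]` lists the events at point `p`.
pub fn live_ranges(events: &[Vec<Event>]) -> HashMap<usize, Vec<(usize, usize)>> {
    // local -> last point of the range that is currently open
    let mut open: HashMap<usize, usize> = HashMap::new();
    let mut ranges: HashMap<usize, Vec<(usize, usize)>> = HashMap::new();

    for point in (0..events.len()).rev() {
        for event in &events[point] {
            match *event {
                // "Gen": open a new range unless one is already open.
                Event::Gen(local) => {
                    open.entry(local).or_insert(point);
                }
                // "Kill": close the open range and save it, skipping empty ranges.
                Event::Kill(local) => {
                    if let Some(end) = open.remove(&local) {
                        if point < end {
                            ranges.entry(local).or_default().push((point + 1, end));
                        }
                    }
                }
            }
        }
    }

    // Ranges still open at the first point extend down to point 0.
    for (local, end) in open {
        ranges.entry(local).or_default().push((0, end));
    }
    ranges
}
```

Each range is written to the result once, when it closes, instead of a whole bitset being inserted at every point.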
Change `SwitchInt` handling in dataflow analysis.
We call `get_switch_int_data` once for the switch and then pass that data to `apply_switch_int_edge_effect` for each switch target.
The only case in practice is `MaybePlacesSwitchIntData` which does an awkward thing, maintaining an index into the discriminants and updating it on each call to `apply_switch_int_edge_effect`.
This commit changes things to do more work up front in `get_switch_int_data`, in order to then do less work in `apply_switch_int_edge_effect`. This avoids the need for the `variants` and `next_discr` methods and the discriminants index. Overall it's a little simpler.
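A rough sketch of the "more work up front" shape, under stated assumptions: `SwitchData`, `get_switch_data` and `variant_for_edge` are hypothetical stand-ins, not the compiler's types or signatures, and variants and discriminants are plain integers here.

```rust
/// Hypothetical stand-in for the per-switch data computed once up front.
pub struct SwitchData {
    /// `variants[i]` is the variant matched by the i-th switch target.
    pub variants: Vec<usize>,
}

/// Resolve every switch target value to its variant once, so that per-edge
/// processing needs only an index lookup and no running discriminant index.
/// `discriminants` holds `(variant, discriminant value)` pairs.
pub fn get_switch_data(
    discriminants: &[(usize, u128)],
    target_values: &[u128],
) -> Option<SwitchData> {
    let variants = target_values
        .iter()
        .map(|value| {
            discriminants
                .iter()
                .find(|(_, discr)| discr == value)
                .map(|(variant, _)| *variant)
        })
        // Bail out with `None` if any target value matches no variant.
        .collect::<Option<Vec<_>>>()?;
    Some(SwitchData { variants })
}

/// Per-edge work: a plain lookup into the precomputed table.
pub fn variant_for_edge(data: &SwitchData, target_index: usize) -> usize {
    data.variants[target_index]
}
```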
r? @cjgillot
Fix some comments about dataflow analysis.
Mostly in the examples in `initialized.rs`. In particular, the `EverInitializedPlaces` example currently doesn't cover how it's initialization sites that are tracked, rather than local variables (that's the `b_0`/`b_1` distinction in the example.)
r? @cjgillot
Change the `SmallVec` size from 4 to 1, because that's sufficient in the
vast majority of cases. (This doesn't affect performance in practice, so
it's more of a code clarity change than a performance change.)
JumpThreading: compute place and value indices on-demand
Profiling JumpThreading reveals that a large part of the runtime is spent constructing the place and value `Map`. This is unfortunate, as jump-threading may end up not even doing anything.
The cause of this large up-front cost is the following: `Map` attempts to create a `PlaceIndex` for each place that *may* hold a relevant value. This means all places that appear in MIR, but also all places whose value is accessed by a projection of a copy of a larger place.
This PR refactors the creation of `Map` to happen on-demand: place and value indices are created when threading computation happens.
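As an illustration of on-demand index creation in general (not the real `Map`), here is a small sketch. `Place`, `LazyMap` and `index_of` are hypothetical names, and a place is modelled as a local plus a list of field indices.

```rust
use std::collections::HashMap;

/// Hypothetical place: a local plus a projection path of field indices.
#[derive(Clone, PartialEq, Eq, Hash, Debug)]
pub struct Place {
    pub local: usize,
    pub fields: Vec<usize>,
}

/// Assigns indices to places only when they are first asked for.
#[derive(Default)]
pub struct LazyMap {
    indices: HashMap<Place, usize>,
    places: Vec<Place>,
}

impl LazyMap {
    /// Return the index of `place`, creating it on first use.
    pub fn index_of(&mut self, place: &Place) -> usize {
        if let Some(&index) = self.indices.get(place) {
            return index;
        }
        let index = self.places.len();
        self.places.push(place.clone());
        self.indices.insert(place.clone(), index);
        index
    }

    /// Number of places that have actually been requested so far.
    pub fn len(&self) -> usize {
        self.places.len()
    }
}
```

If the pass never queries a place, no index is ever allocated for it, which is the saving the description is after.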
The up-front mode is still relevant for DataflowConstProp, so it is not touched.
Compute jump threading opportunities in a single pass
The current implementation of jump threading walks the MIR CFG backwards from each `SwitchInt` terminator. This PR replaces this with a single postorder traversal of the MIR. In theory, we could do a full fixpoint dataflow analysis, but this has low returns as we forbid threading through a loop header.
The first commit in this PR modifies the carried state to a lighter data structure. The current implementation uses some kind of `IndexVec<ValueIndex, &[Condition]>`. This is needlessly heavy, as the state rarely ever carries more than a few `Condition`s. This commit replaces that state with a simpler `&[Condition]`, and puts the corresponding `ValueIndex` inside `Condition`.
The three later commits are perf tweaks.
The sixth commit is the main change. Instead of carrying the goto target inside the condition, we maintain a set of conditions associated with each block, and their consequences in following blocks. Think: if this condition is fulfilled in this block, then that condition is fulfilled in that block. This makes the threading algorithm much easier to implement, without the extra bookkeeping of `ThreadingOpportunity` we had.
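To make the "if this condition is fulfilled in this block, then that condition is fulfilled in that block" idea concrete, here is a loose sketch. `Cond`, the `implies` table and `consequences` are hypothetical and much simpler than the pass itself; they only show how recorded consequences can be followed from block to block.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical condition: "in `block`, value number `value` equals `constant`".
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct Cond {
    pub block: usize,
    pub value: usize,
    pub constant: u128,
}

/// Collect every condition known to hold once `start` is fulfilled, by
/// following the recorded consequences into the following blocks.
pub fn consequences(start: Cond, implies: &HashMap<Cond, Vec<Cond>>) -> HashSet<Cond> {
    let mut fulfilled = HashSet::new();
    let mut worklist = vec![start];
    while let Some(cond) = worklist.pop() {
        // Skip conditions that were already visited.
        if !fulfilled.insert(cond) {
            continue;
        }
        if let Some(next) = implies.get(&cond) {
            worklist.extend(next.iter().copied());
        }
    }
    fulfilled
}
```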
Later commits modify that algorithm to shrink the set of duplicated blocks, by propagating fulfilled conditions down the CFG and trimming costly threads.
Add `overflow_checks` intrinsic
This adds an intrinsic which allows code in a pre-built library to inherit the overflow checks option from a crate depending on it. This enables code in the standard library to explicitly change behavior based on whether `overflow_checks` are enabled, regardless of the setting used when the standard library was compiled.
This is very similar to the `ub_checks` intrinsic, and refactors the two to use a common mechanism.
The primary use case for this is to allow the new `RangeFrom` iterator to yield the maximum element before overflowing, as requested [here](https://github.com/rust-lang/rust/issues/125687#issuecomment-2151118208). This PR includes a working `IterRangeFrom` implementation based on this new intrinsic that exhibits the desired behavior.
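The desired iterator behavior can be sketched roughly as follows. This is not the `IterRangeFrom` implementation from the PR: `CountFrom` is a hypothetical type over `u8`, and the `overflow_checks` field is a runtime stand-in for what the intrinsic would report.

```rust
/// Hypothetical counting iterator that yields the maximum element before
/// overflowing.
pub struct CountFrom {
    next: u8,
    /// Set once `u8::MAX` has been yielded.
    exhausted: bool,
    /// Stand-in for the value the intrinsic would report (an assumption).
    overflow_checks: bool,
}

impl CountFrom {
    pub fn new(start: u8, overflow_checks: bool) -> Self {
        CountFrom { next: start, exhausted: false, overflow_checks }
    }
}

impl Iterator for CountFrom {
    type Item = u8;

    fn next(&mut self) -> Option<u8> {
        if self.exhausted {
            // The previous call yielded `u8::MAX`; only now does it overflow.
            if self.overflow_checks {
                panic!("attempt to add with overflow");
            }
            // Without overflow checks, wrap around to the minimum value.
            self.exhausted = false;
            self.next = u8::MIN;
        }
        let current = self.next;
        match current.checked_add(1) {
            Some(n) => self.next = n,
            None => self.exhausted = true,
        }
        Some(current)
    }
}
```

Deferring the overflow to the call after the maximum element is what lets `u8::MAX` itself be yielded in both modes.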
[Prior discussion on Zulip](https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/Ability.20to.20select.20code.20based.20on.20.60overflow_checks.60.3F)
`Results` used to contain an `Analysis`, but it was removed in #140234.
That change made sense because the analysis was mutable but the entry
states were immutable and it was good to separate them so the mutability
of the different pieces was clear.
Now that analyses are immutable there is no need for the separation.
Lots of analysis+results pairs can be combined, and the names are going
back to what they were before:
- `Results` -> `EntryStates`
- `AnalysisAndResults` -> `Results`
The `state: A::Domain` value is the primary thing that's modified when
performing an analysis. The `Analysis` impl is immutable in every case
but one (`MaybeRequiredStorage`) and it now uses interior mutability.
As well as changing many `&mut A` arguments to `&A`, this also:
- lets `CowMut` be replaced with the simpler `SimpleCow` in `cursor.rs`;
- removes the need for the `RefCell` in `Formatter`;
- removes the need for `MaybeBorrowedLocals` to impl `Clone`, because
it's a unit type and it's now clear that its constructor can be used
directly instead of being put into a local variable and cloned.
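
A minimal sketch of the `&mut A` to `&A` pattern, with made-up names: this `Analysis` trait, `Stateless`, `WithScratch` and `run` are illustrative assumptions, not the compiler's definitions. The state is the only thing passed mutably, and the one analysis that needs scratch data keeps it behind a `RefCell`.

```rust
use std::cell::RefCell;

/// Hypothetical, heavily simplified analysis trait taking `&self`.
pub trait Analysis {
    type Domain;
    fn apply_effect(&self, state: &mut Self::Domain, statement: usize);
}

/// A unit-type analysis: it has nothing to mutate, so it can be built in place.
pub struct Stateless;

impl Analysis for Stateless {
    type Domain = Vec<usize>;
    fn apply_effect(&self, state: &mut Vec<usize>, statement: usize) {
        state.push(statement);
    }
}

/// An analysis that needs scratch data and uses interior mutability for it.
pub struct WithScratch {
    seen: RefCell<usize>,
}

impl WithScratch {
    pub fn new() -> Self {
        WithScratch { seen: RefCell::new(0) }
    }

    pub fn seen_count(&self) -> usize {
        *self.seen.borrow()
    }
}

impl Analysis for WithScratch {
    type Domain = Vec<usize>;
    fn apply_effect(&self, state: &mut Vec<usize>, statement: usize) {
        // Scratch data is updated through the `RefCell`, not through `&mut self`.
        *self.seen.borrow_mut() += 1;
        state.push(statement);
    }
}

/// The driver only needs a shared reference to the analysis.
pub fn run<A: Analysis>(analysis: &A, state: &mut A::Domain, statements: &[usize]) {
    for &statement in statements {
        analysis.apply_effect(state, statement);
    }
}
```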