replace box_new with lower-level intrinsics
The `box_new` intrinsic is super special: during THIR construction it turns into an `ExprKind::Box` (formerly known as the `box` keyword), which then during MIR building turns into a special instruction sequence that invokes the exchange_malloc lang item (which has a name from a different time) and a special MIR statement to represent a shallowly-initialized `Box` (which raises [interesting opsem questions](https://github.com/rust-lang/rust/issues/97270)).
This PR is the n-th attempt to get rid of `box_new`. That's non-trivial because it usually causes a perf regression: replacing `box_new` by naive unsafe code will incur extra copies in debug builds, making the resulting binaries a lot slower, and will generate a lot more MIR, making compilation measurably slower. Furthermore, `vec!` is a macro, so the exact code it expands to is highly relevant for borrow checking, type inference, and temporary scopes.
To avoid those problems, this PR does its best to make the MIR almost exactly the same as what it was before. `box_new` is used in two places, `Box::new` and `vec!`:
- For `Box::new` that is fairly easy: the `move_by_value` intrinsic is basically all we need. However, to avoid the extra copy that would usually be generated for the argument of a function call, we need to special-case this intrinsic during MIR building. That's what the first commit does.
- `vec!` is a lot more tricky. As a macro, its details leak to stable code, so almost every variant I tried broke either type inference or the lifetimes of temporaries in some ui test or ended up accepting unsound code due to the borrow checker not enforcing all the constraints I hoped it would enforce. I ended up with a variant that involves a new intrinsic, `fn write_box_via_move<T>(b: Box<MaybeUninit<T>>, x: T) -> Box<MaybeUninit<T>>`, that writes a value into a `Box<MaybeUninit<T>>` and returns that box again. In exchange we can get rid of somewhat similar code in the lowering for `ExprKind::Box`, and the `exchange_malloc` lang item. (We can also get rid of `Rvalue::ShallowInitBox`; I didn't include that in this PR -- I think @cjgillot has a commit for this somewhere [around here](https://github.com/rust-lang/rust/pull/147862/commits).)
See [here](https://github.com/rust-lang/rust/pull/148190#issuecomment-3457454814) for the latest perf numbers. Most of the regressions are in deep-vector which consists entirely of an invocation of `vec!`, so any change to that macro affects this benchmark disproportionally.
This is my first time even looking at MIR building code, so I am very low confidence in that part of the patch, in particular when it comes to scopes and drops and things like that.
I also had do nerf some clippy tests because clippy gets confused by the new expansion of `vec!` so it makes fewer suggestions when `vec!` is involved.
### `vec!` FAQ
- Why does `write_box_via_move` return the `Box` again? Because we need to expand `vec!` to a bunch of method invocations without any blocks or let-statements, or else the temporary scopes (and type inference) don't work out.
- Why is `box_assume_init_into_vec_unsafe` (unsoundly!) a safe function? Because we can't use an unsafe block in `vec!` as that would necessarily also include the `$x` (due to it all being one big method invocation) and therefore interpret the user's code as being inside `unsafe`, which would be bad (and 10 years later, we still don't have safe blocks for macros like this).
- Why does `write_box_via_move` use `Box` as input/output type, and not, say, raw pointers? Because that is the only way to get the correct behavior when `$x` panics or has control effects: we need the `Box` to be dropped in that case. (As a nice side-effect this also makes the intrinsic safe, which is imported as explained in the previous bullet.)
- Can't we make it safe by having `write_box_via_move` return `Box<T>`? Yes we could, but there's no easy way for the intrinsic to convert its `Box<MaybeUninit<T>>` to a `Box<T>`. Transmuting would be unsound as the borrow checker would no longer properly enforce that lifetimes involved in a `vec!` invocation behave correctly.
- Is this macro truly cursed? Yes, yes it is.
const-eval: full support for pointer fragments
This fixes https://github.com/rust-lang/const-eval/issues/72 and makes `swap_nonoverlapping` fully work in const-eval by enhancing per-byte provenance tracking with tracking of *which* of the bytes of the pointer this one is. Later, if we see all the same bytes in the exact same order, we can treat it like a whole pointer again without ever risking a leak of the data bytes (that encode the offset into the allocation). This lifts the limitation that was discussed quite a bit in https://github.com/rust-lang/rust/pull/137280.
For a concrete piece of code that used to fail and now works properly consider this example doing a byte-for-byte memcpy in const without using intrinsics:
```rust
use std::{mem::{self, MaybeUninit}, ptr};
type Byte = MaybeUninit<u8>;
const unsafe fn memcpy(dst: *mut Byte, src: *const Byte, n: usize) {
let mut i = 0;
while i < n {
*dst.add(i) = *src.add(i);
i += 1;
}
}
const _MEMCPY: () = unsafe {
let ptr = &42;
let mut ptr2 = ptr::null::<i32>();
// Copy from ptr to ptr2.
memcpy(&mut ptr2 as *mut _ as *mut _, &ptr as *const _ as *const _, mem::size_of::<&i32>());
assert!(*ptr2 == 42);
};
```
What makes this code tricky is that pointers are "opaque blobs" in const-eval, we cannot just let people look at the individual bytes since *we don't know what those bytes look like* -- that depends on the absolute address the pointed-to object will be placed at. The code above "breaks apart" a pointer into individual bytes, and then puts them back together in the same order elsewhere. This PR implements the logic to properly track how those individual bytes relate to the original pointer, and to recognize when they are in the right order again.
We still reject constants where the final value contains a not-fully-put-together pointer: I have no idea how one could construct an LLVM global where one byte is defined as "the 3rd byte of a pointer to that other global over there" -- and even if LLVM supports this somehow, we can leave implementing that to a future PR. It seems unlikely to me anyone would even want this, but who knows.^^
This also changes the behavior of Miri, by tracking the order of bytes with provenance and only considering a pointer to have valid provenance if all bytes are in the original order again. This is related to https://github.com/rust-lang/unsafe-code-guidelines/issues/558. It means one cannot implement XOR linked lists with strict provenance any more, which is however only of theoretical interest. Practically I am curious if anyone will show up with any code that Miri now complains about - that would be interesting data. Cc `@rust-lang/opsem`
Add runtime check to avoid overwrite arg in `Diag`
## Origin PR description
At first, I set up a `debug_assert` check for the arg method to make sure that `args` in `Diag` aren't easily overwritten, and I added the `remove_arg()` method, so that if you do need to overwrite an arg, then you can explicitly call `remove_arg()` to remove it first, then call `arg()` to overwrite it.
For the code before the rust-lang/rust#142015 change, it won't compile because it will report an error
```
arg `instance`already exists.
```
This PR also modifies all diagnostics that fail the check to pass the check. There are two cases of check failure:
1. ~~Between *the parent diagnostic and the subdiagnostic*, or *between the subdiagnostics* have the same field between them. In this case, I renamed the conflicting fields.~~
2. ~~For subdiagnostics stored in `Vec`, the rendering may iteratively write the same arg over and over again. In this case, I changed the auto-generation with `derive(SubDiagnostic)` to manually implementing `SubDiagnostic` and manually rendered it with `eagerly_translate()`, similar to https://github.com/rust-lang/rust/issues/142031#issuecomment-2984812090, and after rendering it I manually deleted useless arg with the newly added `remove_arg` method.~~
## Final Decision
After trying and discussing, we made a final decision.
For `#[derive(Subdiagnostic)]`, This PR made two changes:
1. After the subdiagnostic is rendered, remove all args of this subdiagnostic, which allows for usage like `Vec<Subdiag>`.
2. Store `diag.args` before setting arguments, so that you can restore the contents of the main diagnostic after deleting the arguments after subdiagnostic is rendered, to avoid deleting the main diagnostic's arg when they have the same name args.
This will allow us to eagerly translate messages on a top-level
diagnostic, such as a `LintDiagnostic`. As a bonus, we can remove the
awkward closure passed into Subdiagnostic and make better use of
`Into`.
Previously, we included a redundant prefix on the panic message and a postfix of the location of the panic. The prefix didn't carry any additional information beyond "something failed", and the location of the panic is redundant with the diagnostic's span, which gets printed out even if its code is not shown.
```
error[E0080]: evaluation of constant value failed
--> $DIR/assert-type-intrinsics.rs:11:9
|
LL | MaybeUninit::<!>::uninit().assume_init();
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ evaluation panicked: aborted execution: attempted to instantiate uninhabited type `!`
```
```
error[E0080]: evaluation of `Fail::<i32>::C` failed
--> $DIR/collect-in-dead-closure.rs:9:19
|
LL | const C: () = panic!();
| ^^^^^^^^ evaluation panicked: explicit panic
|
= note: this error originates in the macro
`$crate::panic::panic_2015` which comes from the expansion of the macro
`panic` (in Nightly builds, run with -Z macro-backtrace for more info)
```
```
error[E0080]: evaluation of constant value failed
--> $DIR/uninhabited.rs:41:9
|
LL | assert!(false);
| ^^^^^^^^^^^^^^ evaluation panicked: assertion failed: false
|
= note: this error originates in the macro `assert` (in Nightly builds, run with -Z macro-backtrace for more info)
```
---
When the primary span for a const error is the same as the first frame in the const error report, skip it.
```
error[E0080]: evaluation of constant value failed
--> $DIR/issue-88434-removal-index-should-be-less.rs:3:24
|
LL | const _CONST: &[u8] = &f(&[], |_| {});
| ^^^^^^^^^^^^^^ evaluation panicked: explicit panic
|
note: inside `f::<{closure@$DIR/issue-88434-removal-index-should-be-less.rs:3:31: 3:34}>`
--> $DIR/issue-88434-removal-index-should-be-less.rs:10:5
|
LL | panic!()
| ^^^^^^^^ the failure occurred here
= note: this error originates in the macro `$crate::panic::panic_2015` which comes from the expansion of the macro `panic` (in Nightly builds, run with -Z macro-backtrace for more info)
```
instead of
```
error[E0080]: evaluation of constant value failed
--> $DIR/issue-88434-removal-index-should-be-less.rs:10:5
|
LL | panic!()
| ^^^^^^^^ explicit panic
|
note: inside `f::<{closure@$DIR/issue-88434-removal-index-should-be-less.rs:3:31: 3:34}>`
--> $DIR/issue-88434-removal-index-should-be-less.rs:10:5
|
LL | panic!()
| ^^^^^^^^
note: inside `_CONST`
--> $DIR/issue-88434-removal-index-should-be-less.rs:3:24
|
LL | const _CONST: &[u8] = &f(&[], |_| {});
| ^^^^^^^^^^^^^^
= note: this error originates in the macro `$crate::panic::panic_2015` which comes from the expansion of the macro `panic` (in Nightly builds, run with -Z macro-backtrace for more info)
```
---
Revert order of constant evaluation errors
Point at the code the user wrote first and std functions last.
```
error[E0080]: evaluation of constant value failed
--> $DIR/const-errs-dont-conflict-103369.rs:5:25
|
LL | impl ConstGenericTrait<{my_fn(1)}> for () {}
| ^^^^^^^^ evaluation panicked: Some error occurred
|
note: called from `my_fn`
--> $DIR/const-errs-dont-conflict-103369.rs:10:5
|
LL | panic!("Some error occurred");
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
= note: this error originates in the macro `$crate::panic::panic_2015` which comes from the expansion of the macro `panic` (in Nightly builds, run with -Z macro-backtrace for more info)
```
instead of
```
error[E0080]: evaluation of constant value failed
--> $DIR/const-errs-dont-conflict-103369.rs:10:5
|
LL | panic!("Some error occurred");
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Some error occurred
|
note: called from `<() as ConstGenericTrait<{my_fn(1)}>>::{constant#0}`
--> $DIR/const-errs-dont-conflict-103369.rs:5:25
|
LL | impl ConstGenericTrait<{my_fn(1)}> for () {}
| ^^^^^^^^
= note: this error originates in the macro `$crate::panic::panic_2015` which comes from the expansion of the macro `panic` (in Nightly builds, run with -Z macro-backtrace for more info)
```
Make it so that every structured error annotated with `#[derive(Diagnostic)]` that has a field of type `Ty<'_>`, the printing of that value into a `String` will look at the thread-local storage `TyCtxt` in order to shorten to a length appropriate with the terminal width. When this happen, the resulting error will have a note with the file where the full type name was written to.
```
error[E0618]: expected function, found `((..., ..., ..., ...), ..., ..., ...)``
--> long.rs:7:5
|
6 | fn foo(x: D) { //~ `x` has type `(...
| - `x` has type `((..., ..., ..., ...), ..., ..., ...)`
7 | x(); //~ ERROR expected function, found `(...
| ^--
| |
| call expression requires function
|
= note: the full name for the type has been written to 'long.long-type-14182675702747116984.txt'
= note: consider using `--verbose` to print the full type name to the console
```
We cannot produce anything useful if asked to compile unknown targets.
We should handle the error immediately at the point of discovery instead
of propagating it upward, and preferably in the simplest way: Die.
This allows cleaning up our "error-handling" spread across 5 crates.
```
error: `size_of_val` is not yet stable as a const intrinsic
--> $DIR/const-unstable-intrinsic.rs:17:9
|
LL | unstable_intrinsic::size_of_val(&x);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= help: add `#![feature(unstable)]` to the crate attributes to enable
help: add `#![feature(unstable)]` to the crate attributes to enable
|
LL + #![feature("unstable")]
|
```