Mirrors/rust - rust - Gitea @ Femelysm.ru

mirror of https://github.com/rust-lang/rust.git synced 2026-06-01 14:10:03 +03:00

Author	SHA1	Message	Date
Ralf Jung	e402c1edf0	miri: remove retag statements, make typed copies retag implicitly instead	2026-05-02 17:40:33 +02:00
Jonathan Brouwer	716bccea07	Rollup merge of #156028 - scottmcm:local-arg, r=wesleywiser Add a `Local::arg(i)` helper constructor While reading through stuff I was noticing just how many `+1` fixes there were in various places (and comments explaining those fixups), so this adds a new inherent helper on `Local` for making an argument to help make this clearer. r? mir	2026-05-02 10:18:28 +02:00
Scott McMurray	548cdc1412	Add a `Local::arg(i)` helper constructor While reading through stuff I was noticing just how many `+1` fixes there were in various places (and comments explaining that fixup), so this adds a new inherent helper on `Local` for making an argument to help make this clearer.	2026-04-30 20:52:02 -07:00
Jacob Pratt	0ab1a47246	Rollup merge of #155821 - folkertdev:doc-va-list-clone, r=joshtriplett c-variadic: document `Clone` and `Drop` instances and require `VaArgSafe: Copy` tracking issue: https://github.com/rust-lang/rust/issues/44930 Fixing some things that came up in the stabilization PR r? tgross35 cc @kpreid	2026-04-30 22:28:32 -04:00
Folkert de Vries	ac12c696b8	remove custom `va_end` implementation in the LLVM backend it should use the fallback body instead	2026-04-30 20:32:15 +02:00
Hood Chatham	b072d24e26	Fix: On wasm targets, call `panic_in_cleanup` if panic occurs in cleanup Previously this was not correctly implemented. Each funclet may need its own terminate block, so this changes the `terminate_block` into a `terminate_blocks` `IndexVec` which can have a terminate_block for each funclet. We key on the first basic block of the funclet -- in particular, this is the start block for the old case of the top level terminate function. Rather than using a catchswitch/catchpad pair, I used a cleanuppad. The reason for the pair is to avoid catching foreign exceptions on MSVC. On wasm, it seems that the catchswitch/catchpad pair is optimized back into a single cleanuppad and a catch_all instruction is emitted which will catch foreign exceptions. Because the new logic is only used on wasm, it seemed better to take the simpler approach seeing as they do the same thing.	2026-04-28 09:56:22 -07:00
Jonathan Brouwer	dde4886801	Rollup merge of #146181 - Flakebi:dynamic-shared-memory, r=ZuseZ4,Sa4dus,workingjubilee,RalfJung,nikic,kjetilkjeka,kulst Add intrinsic for launch-sized workgroup memory on GPUs Workgroup memory is a memory region that is shared between all threads in a workgroup on GPUs. Workgroup memory can be allocated statically or after compilation, when launching a gpu-kernel. The intrinsic added here returns the pointer to the memory that is allocated at launch-time. # Interface With this change, workgroup memory can be accessed in Rust by calling the new `gpu_launch_sized_workgroup_mem<T>() -> *mut T` intrinsic. It returns the pointer to workgroup memory guaranteeing that it is aligned to at least the alignment of `T`. The pointer is dereferenceable for the size specified when launching the current gpu-kernel (which may be the size of `T` but can also be larger or smaller or zero). All calls to this intrinsic return a pointer to the same address. See the intrinsic documentation for more details. ## Alternative Interfaces It was also considered to expose dynamic workgroup memory as extern static variables in Rust, like they are represented in LLVM IR. However, due to the pointer not being guaranteed to be dereferenceable (that depends on the allocated size at runtime), such a global must be zero-sized, which makes global variables a bad fit. # Implementation Details Workgroup memory in amdgpu and nvptx lives in address space 3. Workgroup memory from a launch is implemented by creating an external global variable in address space 3. The global is declared with size 0, as the actual size is only known at runtime. It is defined behavior in LLVM to access an external global outside the defined size. There is no similar way to get the allocated size of launch-sized workgroup memory on amdgpu and nvptx, so users have to pass this out-of-band or rely on target specific ways for now. Tracking issue: rust-lang/rust#135516	2026-04-25 23:07:48 +02:00
Flakebi	13ec3de673	Add intrinsic for launch-sized workgroup memory on GPUs Workgroup memory is a memory region that is shared between all threads in a workgroup on GPUs. Workgroup memory can be allocated statically or after compilation, when launching a gpu-kernel. The intrinsic added here returns the pointer to the memory that is allocated at launch-time. # Interface With this change, workgroup memory can be accessed in Rust by calling the new `gpu_launch_sized_workgroup_mem<T>() -> *mut T` intrinsic. It returns the pointer to workgroup memory guaranteeing that it is aligned to at least the alignment of `T`. The pointer is dereferenceable for the size specified when launching the current gpu-kernel (which may be the size of `T` but can also be larger or smaller or zero). All calls to this intrinsic return a pointer to the same address. See the intrinsic documentation for more details. ## Alternative Interfaces It was also considered to expose dynamic workgroup memory as extern static variables in Rust, like they are represented in LLVM IR. However, due to the pointer not being guaranteed to be dereferenceable (that depends on the allocated size at runtime), such a global must be zero-sized, which makes global variables a bad fit. # Implementation Details Workgroup memory in amdgpu and nvptx lives in address space 3. Workgroup memory from a launch is implemented by creating an external global variable in address space 3. The global is declared with size 0, as the actual size is only known at runtime. It is defined behavior in LLVM to access an external global outside the defined size. There is no similar way to get the allocated size of launch-sized workgroup memory on amdgpu and nvptx, so users have to pass this out-of-band or rely on target specific ways for now.	2026-04-24 10:03:45 +02:00
bors	f676c20edd	Auto merge of #155343 - dianqk:indirect-by-ref, r=nikic codegen: Copy to an alloca when the argument is neither by-val nor by-move for indirect pointer. Fixes https://github.com/rust-lang/rust/issues/155241. When a value is passed via an indirect pointer, the value needs to be copied to a new alloca. On x86_64-unknown-linux-gnu, `Thing` is such a case: ```rust #[derive(Clone, Copy)] struct Thing(usize, usize, usize); pub fn foo() { let thing = Thing(0, 0, 0); bar(thing); assert_eq!(thing.0, 0); } #[inline(never)] #[unsafe(no_mangle)] pub fn bar(mut thing: Thing) { thing.0 = 1; } ``` Before `thing` is passed to `bar`, it needs to be copied to an alloca, and that alloca is what gets passed to `bar`. ```llvm %0 = alloca [24 x i8], align 8 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %0, ptr align 8 %thing, i64 24, i1 false) call void @bar(ptr %0) ``` This patch applies the rule to the untupled arguments as well. ```rust #![feature(fn_traits)] #[derive(Clone, Copy)] struct Thing(usize, usize, usize); #[inline(never)] #[unsafe(no_mangle)] pub fn foo() { let thing = (Thing(0, 0, 0),); (|mut thing: Thing| { thing.0 = 1; }).call(thing); assert_eq!(thing.0.0, 0); } ``` For this case, this patch changes from ```llvm ; call example::foo::{closure#0} call void @_RNCNvCs15qdZVLwHPA_7example3foo0B3_(ptr ..., ptr %thing) ``` to ```llvm %0 = alloca [24 x i8], align 8 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %0, ptr align 8 %thing, i64 24, i1 false) ; call example::foo::{closure#0} call void @_RNCNvCs15qdZVLwHPA_7example3foo0B3_(ptr ..., ptr %0) ``` However, the same rule cannot be applied to tail calls: that would be unsound, because the caller's stack frame is overwritten by the callee's stack frame. Fortunately, https://github.com/rust-lang/rust/pull/151143 has already handled the special case. We must not copy again. No copy is needed for by-move arguments, because the argument is passed to the callee "in-place". No copy is needed for by-val arguments either, because the attribute implies that a hidden copy of the pointee is made between the caller and the callee. NOTE: The patch has a trick for tail calls that we pass by-move. We can choose to copy to an alloca even for by-move arguments, but tail calls require MUST-by-move.	2026-04-22 15:47:21 +00:00
dianqk	10d8329061	codegen: Copy to an alloca when the argument is neither by-val nor by-move for indirect pointer.	2026-04-22 17:37:17 +08:00
Folkert de Vries	41afd5f8d6	handle `uefi` and test assembly versus regular functions	2026-04-16 11:37:46 +02:00
Folkert de Vries	7787bd915b	fix macho section specifier & windows test	2026-04-16 11:37:46 +02:00
Folkert de Vries	bc4aad37ca	naked-functions: properly document the -Zfunction-sections windows status	2026-04-16 11:37:45 +02:00
Folkert de Vries	872301bfdd	naked functions: respect `function_sections` on linux/macos	2026-04-16 11:37:45 +02:00
Folkert de Vries	0e522d6c62	naked functions: respect `function_sections` on windows For `gnu` function_sections is off by default.	2026-04-16 11:37:45 +02:00
Jacob Pratt	803a7227fb	Rollup merge of #155005 - folkertdev:simd-element-type-llvm, r=nnethercote preserve SIMD element type information Preserve the SIMD element type and provide it to LLVM for better optimization. This is relevant for AArch64 types like `int16x4x2_t`, see also https://github.com/llvm/llvm-project/issues/181514. Such types are defined like so: ```rust #[repr(simd)] struct int16x4_t([i16; 4]); #[repr(C)] struct int16x4x2_t(pub int16x4_t, pub int16x4_t); ``` Previously this would be translated to the opaque `[2 x <8 x i8>]`, with this PR it is instead `[2 x <4 x i16>]`. That change is not relevant for the ABI, but using the correct type prevents bitcasts that can (indeed, do) confuse the LLVM pattern matcher. This change will make it possible to implement the deinterleaving loads on AArch64 in a portable way (without neon-specific intrinsics), which means that e.g. Miri or the cranelift backend can run them without additional support. discussion at [#t-compiler > loss of vector element type information](https://rust-lang.zulipchat.com/#narrow/channel/131828-t-compiler/topic/loss.20of.20vector.20element.20type.20information/with/584483611)	2026-04-14 00:37:24 -04:00
Folkert de Vries	6f428df8df	preserve SIMD element type information and provide it to LLVM for better optimization	2026-04-13 13:26:50 +02:00
David Wood	da948999eb	cg_ssa: transmute between scalable vectors Like regular SIMD vectors, we can support casting between scalable vectors of integral or floating-point types without needing a temporary.	2026-04-13 04:24:28 +00:00
Jonathan Brouwer	b040d5493e	Rollup merge of #154598 - folkertdev:windows-naked-link-section, r=mati865 test `#[naked]` with `#[link_section = "..."]` on windows As a part of https://github.com/rust-lang/rust/pull/147811 I found that we don't actually match (current) LLVM output. r? @mati865	2026-04-08 23:04:33 +02:00
Nicholas Nethercote	3ff4201fd1	Move `rustc_middle::mir::mono` to `rustc_middle::mono` Because the things in this module aren't MIR and don't use anything from `rustc_middle::mir`. Also, modules that use `mono` often don't use anything else from `rustc_middle::mir`.	2026-04-07 08:33:54 +10:00
Folkert de Vries	72b6825828	test `#[naked]` with `#[link_section = "..."]` on windows	2026-04-06 14:58:52 +02:00
David Wood	a2f7f3c1eb	ty_utils: lower tuples to `ScalableVector` repr Instead of just using regular struct lowering for these types, which results in an incorrect ABI (e.g. returning indirectly), use `BackendRepr::ScalableVector` which will lower to the correct type and be passed in registers. This also enables some simplifications for generating alloca of scalable vectors and greater re-use of `scalable_vector_parts`. A LLVM codegen test demonstrating the changed IR this generates is included in the next commit alongside some intrinsics that make these tuples usable.	2026-04-03 10:27:30 +00:00
Wesley Wiser	c9d3a00cd1	Revert "Fix: On wasm targets, call `panic_in_cleanup` if panic occurs in cleanup" This reverts commit `acbfd79acf`.	2026-04-01 21:29:42 -05:00
teor	55d9f7cb6c	Fix typos and outdated comments	2026-03-20 11:22:53 +10:00
bors	b2fabe39bd	Auto merge of #153673 - JonathanBrouwer:rollup-cGOKonI, r=JonathanBrouwer Rollup of 7 pull requests Successful merges: - rust-lang/rust#153560 (Introduce granular tidy_ctx's check in extra_checks) - rust-lang/rust#153666 (Add a regression test for rust-lang/rust#153599) - rust-lang/rust#153493 (Remove `FromCycleError` trait) - rust-lang/rust#153549 (tests/ui/binop: add annotations for reference rules) - rust-lang/rust#153641 (Move `Spanned`.) - rust-lang/rust#153663 (Remove `TyCtxt::node_lint` method and `rustc_middle::lint_level` function) - rust-lang/rust#153664 (Add test for rust-lang/rust#109804)	2026-03-11 05:12:10 +00:00
Jonathan Brouwer	3ed43bb774	Rollup merge of #153663 - GuillaumeGomez:migrate-diag, r=JonathanBrouwer Remove `TyCtxt::node_lint` method and `rustc_middle::lint_level` function Part of https://github.com/rust-lang/rust/issues/153099. With this PR, we can finally get rid of `lint_level`. \o/ r? @JonathanBrouwer	2026-03-10 22:46:57 +01:00
Nicholas Nethercote	c12ab08c14	Move `Spanned`. It's defined in `rustc_span::source_map` which doesn't make any sense because it has nothing to do with source maps. This commit moves it to the crate root, a more sensible spot for something this basic.	2026-03-11 06:25:23 +11:00
Guillaume Gomez	916d760c47	Replace `TyCtxt::node_lint` call with `TyCtxt::emit_node_lint` in `rustc_codegen_ssa`	2026-03-10 17:14:01 +01:00
David Wood	db5e2dc248	abi: s/ScalableVector/SimdScalableVector Renaming to remove any ambiguity as to what "vector" refers to in this context	2026-03-10 11:52:22 +00:00
bors	64b72a1fa5	Auto merge of #150447 - WaffleLapkin:maybe-dangling-semantics, r=RalfJung Implement `MaybeDangling` compiler support Tracking issue: https://github.com/rust-lang/rust/issues/118166 cc @RalfJung	2026-03-05 12:21:27 +00:00
Waffle Lapkin	312055fad5	refactor `PointeeInfo` Make `size`/`align` always correct rather than conditionally on the `safe` field. This makes it less error prone and easier to work with for `MaybeDangling` / potential future pointer kinds like `Aligned<_>`.	2026-03-05 11:53:38 +01:00
Folkert de Vries	391a7554f3	enable `PassMode::Indirect { on_stack: true }` tail call arguments	2026-03-04 19:43:12 +01:00
Jonathan Brouwer	ad4b2c01a1	Rollup merge of #153046 - bjorn3:cg_ssa_cleanups, r=TaKO8Ki Couple of cg_ssa refactorings These should help a bit with using cg_ssa in cg_clif at some point in the future.	2026-03-02 09:49:22 +01:00
Folkert de Vries	e6cf5a22e7	test u128 passing on linux and windows	2026-02-27 10:51:55 +01:00
Folkert de Vries	31ae3d2be8	guaranteed tail calls: support indirect arguments	2026-02-27 10:24:39 +01:00
Jacob Pratt	cb78bc4dd4	Rollup merge of #151771 - hoodmane:wasm-double-panic, r=bjorn3 Fix: On wasm targets, call `panic_in_cleanup` if panic occurs in cleanup Previously this was not correctly implemented. Each funclet may need its own terminate block, so this changes the `terminate_block` into a `terminate_blocks` `IndexVec` which can have a terminate_block for each funclet. We key on the first basic block of the funclet -- in particular, this is the start block for the old case of the top level terminate function. I also fixed the `terminate` handler to not be invoked when a foreign exception is raised, mimicking the behavior from msvc. On wasm, in order to avoid generating a `catch_all` we need to call `llvm.wasm.get.exception` and `llvm.wasm.get.ehselector`.	2026-02-25 21:42:53 -05:00
bjorn3	df4b228c71	Merge const_data_from_alloc into static_addr_of In Cranelift a Value can't hold arbitrarily sized values.	2026-02-25 11:11:06 +00:00
Hood Chatham	acbfd79acf	Fix: On wasm targets, call `panic_in_cleanup` if panic occurs in cleanup Previously this was not correctly implemented. Each funclet may need its own terminate block, so this changes the `terminate_block` into a `terminate_blocks` `IndexVec` which can have a terminate_block for each funclet. We key on the first basic block of the funclet -- in particular, this is the start block for the old case of the top level terminate function. Rather than using a catchswitch/catchpad pair, I used a cleanuppad. The reason for the pair is to avoid catching foreign exceptions on MSVC. On wasm, it seems that the catchswitch/catchpad pair is optimized back into a single cleanuppad and a catch_all instruction is emitted which will catch foreign exceptions. Because the new logic is only used on wasm, it seemed better to take the simpler approach seeing as they do the same thing.	2026-02-24 17:47:27 +01:00
Scott McMurray	0187acccad	Include the `Align::MAX` limit in the `!range` metadata for loading alignment from a vtable	2026-02-20 20:22:31 -08:00
Camille Gillot	6d4b1b38e7	Remove ShallowInitBox.	2026-02-17 11:25:50 +00:00
Hans Wennborg	73d06be9f3	Set hidden visibility on naked functions in compiler-builtins `88b46460fa` made builtin functions hidden, but it doesn't apply to naked functions, which are generated through a different code path.	2026-02-02 14:52:34 +01:00
Matthias Krüger	3a69035338	Rollup merge of #151346 - folkertdev:simd-splat, r=workingjubilee add `simd_splat` intrinsic Add `simd_splat` which lowers to the LLVM canonical splat sequence. ```llvm insertelement <N x elem> poison, elem %x, i32 0 shufflevector <N x elem> v0, <N x elem> poison, <N x i32> zeroinitializer ``` Right now we try to fake it using one of ```rust fn splat(x: u32) -> u32x8 { u32x8::from_array([x; 8]) } ``` or (in `stdarch`) ```rust fn splat(value: $elem_type) -> $name { #[derive(Copy, Clone)] #[repr(simd)] struct JustOne([$elem_type; 1]); let one = JustOne([value]); // SAFETY: 0 is always in-bounds because we're shuffling // a simd type with exactly one element. unsafe { simd_shuffle!(one, one, [0; $len]) } } ``` Both of these can confuse the LLVM optimizer, producing sub-par code. Some examples: - https://github.com/rust-lang/rust/issues/60637 - https://github.com/rust-lang/rust/issues/137407 - https://github.com/rust-lang/rust/issues/122623 - https://github.com/rust-lang/rust/issues/97804 --- As far as I can tell there is no way to provide a fallback implementation for this intrinsic, because there is no `const` way of evaluating the number of elements (there might be issues beyond that, too). So, I added implementations for all 4 backends. Both GCC and const-eval appear to have some issues with simd vectors containing pointers. I have a workaround for GCC, but haven't yet been able to make const-eval work. See the comments below. Currently this just adds the intrinsic, it does not actually use it anywhere yet.	2026-01-24 21:04:15 +01:00
Folkert de Vries	71f34429ac	const-eval: do not call `immediate_const_vector` on vector of pointers	2026-01-24 10:40:47 +01:00
Ralf Jung	29ed211215	codegen: clarify some variable names around function calls	2026-01-21 18:01:30 +01:00
Jacob Pratt	6912c676cd	Rollup merge of #150607 - dispatch-ptr-intrinsic, r=workingjubilee Add amdgpu_dispatch_ptr intrinsic There is an ongoing discussion in rust-lang/rust#150452 about using address spaces from the Rust language in some way. As that discussion will likely not conclude soon, this PR adds one rustc_intrinsic with an addrspacecast to unblock getting basic information like launch and workgroup size and make it possible to implement something like `core::gpu`. Add a rustc intrinsic `amdgpu_dispatch_ptr` to access the kernel dispatch packet on amdgpu. The HSA kernel dispatch packet contains important information like the launch size and workgroup size. The Rust intrinsic lowers to the `llvm.amdgcn.dispatch.ptr` LLVM intrinsic, which returns a `ptr addrspace(4)`, plus an addrspacecast to `addrspace(0)`, so it can be returned as a Rust reference. The returned pointer/reference is valid for the whole program lifetime, and is therefore `'static`. The return type of the intrinsic (`&'static ()`) does not mention the struct so that rustc does not need to know the exact struct type. An alternative would be to define the struct as lang item or add a generic argument to the function. Is this ok or is there a better way (also, should it return a pointer instead of a reference)? Short version: ```rust #[cfg(target_arch = "amdgpu")] pub fn amdgpu_dispatch_ptr() -> *const (); ``` Tracking issue: rust-lang/rust#135024	2026-01-15 19:35:46 -05:00
Flakebi	91d4e40e02	Add amdgpu_dispatch_ptr intrinsic Add a rustc intrinsic `amdgpu_dispatch_ptr` to access the kernel dispatch packet on amdgpu. The HSA kernel dispatch packet contains important information like the launch size and workgroup size. The Rust intrinsic lowers to the `llvm.amdgcn.dispatch.ptr` LLVM intrinsic, which returns a `ptr addrspace(4)`, plus an addrspacecast to `addrspace(0)`, so it can be returned as a Rust reference. The returned pointer/reference is valid for the whole program lifetime, and is therefore `'static`. The return type of the intrinsic (`*const ()`) does not mention the struct so that rustc does not need to know the exact struct type. An alternative would be to define the struct as lang item or add a generic argument to the function. Short version: ```rust #[cfg(target_arch = "amdgpu")] pub fn amdgpu_dispatch_ptr() -> *const (); ```	2026-01-09 10:41:37 +01:00
Folkert de Vries	76d0843f8d	naked functions: emit `.private_extern` on macos	2026-01-06 16:48:04 +01:00
Fang He	f9007bcb87	fix condition checks for SVE <vscale x N x i1> when N != 16	2025-12-28 14:59:40 +08:00
Fang He	435a027c71	fix typo and rephrase	2025-12-28 14:59:08 +08:00
bjorn3	cb23b54eb1	Add a hack for llvm.wasm.throw As this is the only unwinding intrinsic we use and codegen_llvm_intrinsic_call currently doesn't handle unwinding intrinsics, this uses the conventional call path for llvm.wasm.throw.	2025-12-27 17:46:26 +00:00
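
The `Local::arg(i)` helper described in commits 716bccea07 and 548cdc1412 above replaces scattered `+1` adjustments. As a rough illustration only: MIR reserves local `_0` for the return place, so the `i`-th argument (zero-based) lives in local `_(i + 1)`. The `Local` type, the `RETURN_PLACE` associated constant, and the `index` accessor below are hypothetical stand-ins, not rustc's real definitions, and the actual signature of the helper may differ.

```rust
/// Hypothetical stand-in for a MIR local index; not rustc's actual `Local` type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Local(u32);

impl Local {
    /// Local `_0`, which holds the function's return value.
    /// (Placed here as an associated constant purely for this sketch.)
    pub const RETURN_PLACE: Local = Local(0);

    /// Returns the local holding the `i`-th argument (zero-based).
    /// Arguments start at `_1`, which is where the `+1` comes from.
    pub const fn arg(i: u32) -> Local {
        Local(i + 1)
    }

    /// Raw index of this local (assumed accessor for the sketch).
    pub const fn index(self) -> u32 {
        self.0
    }
}
```

With a helper like this, a call site can say `Local::arg(0)` for the first argument instead of writing the offset by hand and explaining it in a comment.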


1097 Commits