Mirrors/rust - rust - Gitea @ Femelysm.ru

mirror of https://github.com/rust-lang/rust.git synced 2026-05-31 21:47:15 +03:00

Author	SHA1	Message	Date
Stephen	8477a690b1	add support for Zprint-codegen-stats-json	2026-05-11 16:28:51 -04:00
Devon Loehr	7f697083d6	Fix formatting	2026-05-05 14:34:16 +00:00
Devon Loehr	412090247c	Narrow definitions	2026-05-05 14:18:59 +00:00
Devon Loehr	596d9853bb	Adjust getMCSubtargetInfo signature for LLVM 23+	2026-05-04 18:30:23 +00:00
Jonathan Brouwer	c6912bf401	Rollup merge of #155692 - fneddy:fix_naked-dead-code-elimination, r=folkertdev disable naked-dead-code-elimination test if no RET mnemonic is available This test emits an x86_64-specific ret asm instruction and should not be compiled on any other arch.	2026-04-28 20:24:33 +02:00
Eddy (Eduard) Stefes	2a8e588c90	Add `--print=backend-has-mnemonic` and `needs-asm-mnemonic` directive Add infrastructure to query LLVM backend for specific assembly mnemonics and use it in compiletest to conditionally run tests based on instruction availability. This fixes test failures with naked-dead-code-elimination which requires the `RET` mnemonic. Co-authored-by: Folkert de Vries <flokkievids@gmail.com>	2026-04-28 10:21:15 +02:00
Jonathan Brouwer	dde4886801	Rollup merge of #146181 - Flakebi:dynamic-shared-memory, r=ZuseZ4,Sa4dus,workingjubilee,RalfJung,nikic,kjetilkjeka,kulst Add intrinsic for launch-sized workgroup memory on GPUs Workgroup memory is a memory region that is shared between all threads in a workgroup on GPUs. Workgroup memory can be allocated statically or after compilation, when launching a gpu-kernel. The intrinsic added here returns the pointer to the memory that is allocated at launch-time. # Interface With this change, workgroup memory can be accessed in Rust by calling the new `gpu_launch_sized_workgroup_mem<T>() -> *mut T` intrinsic. It returns the pointer to workgroup memory guaranteeing that it is aligned to at least the alignment of `T`. The pointer is dereferenceable for the size specified when launching the current gpu-kernel (which may be the size of `T` but can also be larger or smaller or zero). All calls to this intrinsic return a pointer to the same address. See the intrinsic documentation for more details. ## Alternative Interfaces It was also considered to expose dynamic workgroup memory as extern static variables in Rust, like they are represented in LLVM IR. However, due to the pointer not being guaranteed to be dereferenceable (that depends on the allocated size at runtime), such a global must be zero-sized, which makes global variables a bad fit. # Implementation Details Workgroup memory in amdgpu and nvptx lives in address space 3. Workgroup memory from a launch is implemented by creating an external global variable in address space 3. The global is declared with size 0, as the actual size is only known at runtime. It is defined behavior in LLVM to access an external global outside the defined size. There is no similar way to get the allocated size of launch-sized workgroup memory on amdgpu and nvptx, so users have to pass this out-of-band or rely on target specific ways for now. Tracking issue: rust-lang/rust#135516	2026-04-25 23:07:48 +02:00
Flakebi	13ec3de673	Add intrinsic for launch-sized workgroup memory on GPUs Workgroup memory is a memory region that is shared between all threads in a workgroup on GPUs. Workgroup memory can be allocated statically or after compilation, when launching a gpu-kernel. The intrinsic added here returns the pointer to the memory that is allocated at launch-time. # Interface With this change, workgroup memory can be accessed in Rust by calling the new `gpu_launch_sized_workgroup_mem<T>() -> *mut T` intrinsic. It returns the pointer to workgroup memory guaranteeing that it is aligned to at least the alignment of `T`. The pointer is dereferenceable for the size specified when launching the current gpu-kernel (which may be the size of `T` but can also be larger or smaller or zero). All calls to this intrinsic return a pointer to the same address. See the intrinsic documentation for more details. ## Alternative Interfaces It was also considered to expose dynamic workgroup memory as extern static variables in Rust, like they are represented in LLVM IR. However, due to the pointer not being guaranteed to be dereferenceable (that depends on the allocated size at runtime), such a global must be zero-sized, which makes global variables a bad fit. # Implementation Details Workgroup memory in amdgpu and nvptx lives in address space 3. Workgroup memory from a launch is implemented by creating an external global variable in address space 3. The global is declared with size 0, as the actual size is only known at runtime. It is defined behavior in LLVM to access an external global outside the defined size. There is no similar way to get the allocated size of launch-sized workgroup memory on amdgpu and nvptx, so users have to pass this out-of-band or rely on target specific ways for now.	2026-04-24 10:03:45 +02:00
Augie Fackler	f6b8f0b6f1	rustc_llvm: update opt-level handling for LLVM 23 LLVM 23 removed Os and Oz optimization pipelines and the PR says to use O2 with optsize or minsize instead as appropriate.	2026-04-22 13:25:35 -04:00
sayantn	a5372be2a1	Add target arch verification for LLVM intrinsics	2026-04-12 23:33:27 +05:30
sayantn	c21f4ee437	Check for AutoUpgraded intrinsics, and lint on uses of deprecated intrinsics	2026-04-12 23:33:15 +05:30
Jonathan Brouwer	66a00ba2ef	Rollup merge of #153995 - Flakebi:gpu-use-convergent, r=nnethercote Use convergent attribute to funcs for GPU targets On targets with convergent operations, we need to add the convergent attribute to all functions that run convergent operations. Following clang, we can conservatively apply the attribute to all functions when compiling for such a target and rely on LLVM optimizing away the attribute in cases where it is not necessary. This affects the amdgpu and nvptx targets. cc @kjetilkjeka, @kulst for nvptx cc @ZuseZ4 r? @nnethercote, as you already reviewed this in the other PR Split out from rust-lang/rust#149637, the part here should be uncontroversial.	2026-04-08 14:21:57 +02:00
David Wood	a24ee0329e	cg_llvm/debuginfo: scalable vectors Generate debuginfo for scalable vectors, following the structure that Clang generates for scalable vectors.	2026-04-03 10:37:42 +00:00
Jakub Beránek	8b44562bc8	Revert "Rollup merge of #154200 - resrever:enable-dwarf-call-sites, r=dingxiangfei2009" This reverts commit `2f1603077b`, reversing changes made to `6e3c17424d`.	2026-03-27 20:08:24 +01:00
Jonathan Brouwer	2f1603077b	Rollup merge of #154200 - resrever:enable-dwarf-call-sites, r=dingxiangfei2009 debuginfo: emit DW_TAG_call_site entries Set `FlagAllCallsDescribed` on function definition DIEs so LLVM emits DW_TAG_call_site entries, letting debuggers and analysis tools track tail calls.	2026-03-25 19:52:50 +01:00
Scott Young	9677d7a587	debuginfo: emit DW_TAG_call_site entries	2026-03-22 08:42:21 -04:00
Alice Ryhl	a197752e88	Add kernel-hwaddress sanitizer Signed-off-by: Alice Ryhl <aliceryhl@google.com>	2026-03-17 20:23:59 +00:00
Flakebi	8e932ed79c	Use convergent attribute to funcs for GPU targets On targets with convergent operations, we need to add the convergent attribute to all functions that run convergent operations. Following clang, we can conservatively apply the attribute to all functions when compiling for such a target and rely on LLVM optimizing away the attribute in cases where it is not necessary. This affects the amdgpu and nvptx targets.	2026-03-17 10:51:31 +01:00
Ralf Jung	c7220f423b	rename min/maxnum intrinsics to min/maximum_number and fix their LLVM lowering	2026-03-15 14:53:00 +01:00
Josh Stone	52dfa94cdc	Update the minimum external LLVM to 21	2026-03-12 16:45:42 -07:00
Stuart Cook	cc0a60fd74	Rollup merge of #153446 - bjorn3:llvm_pre_link_thinlto, r=cuviper Always use the ThinLTO pipeline for pre-link optimizations When using cargo this was already effectively done for all dependencies as cargo passes -Clinker-plugin-lto without -Clto=fat/thin. -Clinker-plugin-lto assumes that ThinLTO will be used. The ThinLTO pre-link pipeline is faster than the fat LTO one. And according to the benchmarks in [^1] there is barely any runtime performance difference between executables that used fat LTO with the fat vs ThinLTO pre-link pipeline. This also helps avoid having yet another code path if we want to support Unified LTO (that is, a single bitcode file that supports being used for both fat LTO and ThinLTO when using linker plugin LTO; we already support it when rustc does LTO, as ThinLTO bitcode is enough of a superset of fat LTO bitcode that it happens to work by accident if you don't explicitly have a check preventing mixing of them for the current set of LTO features that rustc exposes). I'm currently still investigating if rustc would benefit from Unified LTO and how exactly to integrate it. [^1]: https://discourse.llvm.org/t/rfc-a-unified-lto-bitcode-frontend/61774	2026-03-08 14:01:35 +11:00
bjorn3	71a31b30d9	Always use the ThinLTO pipeline for pre-link optimizations When using cargo this was already effectively done for all dependencies as cargo passes -Clinker-plugin-lto without -Clto=fat/thin. -Clinker-plugin-lto assumes that ThinLTO will be used. The ThinLTO pre-link pipeline is faster than the fat LTO one. And according to the benchmarks in [1] there is barely any runtime performance difference between executables that used fat LTO with the fat vs ThinLTO pre-link pipeline. [1]: https://discourse.llvm.org/t/rfc-a-unified-lto-bitcode-frontend/61774	2026-03-05 17:40:58 +00:00
Daniel Paoliello	614bac581b	[win] Fix truncated unwinds for Arm64 Windows	2026-02-27 14:53:09 -08:00
bjorn3	474a7168ab	Remove explicit EmitThinLTOSummary argument In favor of passing a NULL ThinLTOSummaryBufferRef. And improve type safety on the Rust side.	2026-02-21 11:47:45 +00:00
bjorn3	a086b3617e	Remove ModuleBuffer ThinBuffer duplication	2026-02-21 11:47:45 +00:00
bjorn3	a5372d1dba	Replace LLVMRustThinLTOBuffer with separate LLVMRustBuffers for bitcode and summary	2026-02-21 11:47:45 +00:00
bjorn3	8b2c10ff82	Replace LLVMRustModuleBuffer with generic LLVMRustBuffer	2026-02-21 11:47:45 +00:00
bjorn3	c51cd0e691	Deduplicate some code in LLVMRustOptimize	2026-02-20 12:19:41 +00:00
bjorn3	6366a698e3	Remove -Zemit-thin-lto flag As far as I can tell it was introduced to allow fat LTO with -Clinker-plugin-lto. Later a change was made to automatically disable ThinLTO summary generation when -Clinker-plugin-lto -Clto=fat is used, so we can safely remove it.	2026-02-20 12:19:41 +00:00
Manuel Drehwald	c89a89bb14	Fix multi-cgu+debug builds using autodiff by delaying autodiff till lto	2026-02-11 14:08:56 -05:00
Jonathan Brouwer	dec8d6ebcf	Rollup merge of #150780 - fzakaria:fzakaria/section-threshold, r=jackh726 Add -Z large-data-threshold This flag allows specifying the threshold size for placing static data in large data sections when using the medium code model on x86-64. When using -Ccode-model=medium, data smaller than this threshold uses RIP-relative addressing (32-bit offsets), while larger data uses absolute 64-bit addressing. This allows the compiler to generate more efficient code for smaller data while still supporting data larger than 2GB. This mirrors the -mlarge-data-threshold flag available in GCC and Clang. The default threshold is 65536 bytes (64KB) if not specified, matching LLVM's default behavior.	2026-01-23 11:07:55 +01:00
Matthew Maurer	b639b0a4d8	llvm: Tolerate dead_on_return attribute changes The attribute now has a size parameter and sorts differently: * Explicitly omit size parameter during construction on 23+ * Tolerate alternate sorting in tests https://github.com/llvm/llvm-project/pull/171712	2026-01-21 23:39:03 +00:00
Nikita Popov	0be66603ac	Avoid passing addrspacecast to lifetime intrinsics Since LLVM 22 the alloca must be passed directly. Do this by stripping the addrspacecast if it exists.	2026-01-20 14:47:04 +01:00
Marcelo Domínguez	307a4fcdf8	Add scalar support for both host and device	2026-01-19 22:28:42 +01:00
Farid Zakaria	93f2e80f4a	Add -Z large-data-threshold This flag allows specifying the threshold size for placing static data in large data sections when using the medium code model on x86-64. When using -Ccode-model=medium, data smaller than this threshold uses RIP-relative addressing (32-bit offsets), while larger data uses absolute 64-bit addressing. This allows the compiler to generate more efficient code for smaller data while still supporting data larger than 2GB. This mirrors the -mlarge-data-threshold flag available in GCC and Clang. The default threshold is 65536 bytes (64KB) if not specified, matching LLVM's default behavior.	2026-01-07 11:57:48 -08:00
Jonathan Brouwer	d898dccc21	Rollup merge of #150511 - Sa4dUs:offload-inline, r=ZuseZ4 Allow inline calls to offload intrinsic Removes explicit insertion point handling and recovers the pointer at the end of the saved basic block. r? `@ZuseZ4` fixes: https://github.com/rust-lang/rust/issues/150413	2025-12-31 14:30:48 +01:00
Marcelo Domínguez	9d8b4cc70d	Restore builder at the end of saved bb	2025-12-31 13:10:29 +01:00
Jonathan Brouwer	122f02ad02	Rollup merge of #150394 - DKLoehr:passplugin, r=nikic Accommodate LLVM PassPlugin rename LLVM [recently moved](https://github.com/llvm/llvm-project/pull/173279) their `PassPlugin` files to a new folder. This PR updates our `PassWrapper` to point to the new location.	2025-12-29 17:17:56 +01:00
dianqk	fe075ad212	Removes the serde dependency in rustc_codegen_llvm	2025-12-28 15:52:20 +08:00
Devon Loehr	634251cba8	Accommodate upstream PassPlugin rename	2025-12-26 15:40:40 +00:00
Manuel Drehwald	dfef2e96fe	Remove the need to call clang for std::offload usages	2025-12-23 05:20:07 -08:00
sgasho	ddd5aad8a3	feat: dlopen Enzyme	2025-12-16 00:31:32 +09:00
Alina Sbirlea	ad73972e99	Fix for LLVM22 making lowering decisions dependent on RuntimeLibraryInfo. LLVM reference commit: https://github.com/llvm/llvm-project/commit/04c81a99735c04b2018eeb687e74f9860e1d0e1b.	2025-12-04 20:23:00 +00:00
Stuart Cook	2b150f2c65	Rollup merge of #147936 - Sa4dUs:offload-intrinsic, r=ZuseZ4 Offload intrinsic This PR implements the minimal mechanisms required to run a small subset of arbitrary offload kernels without relying on hardcoded names or metadata. - `offload(kernel, (..args))`: an intrinsic that generates the necessary host-side LLVM-IR code. - `rustc_offload_kernel`: a builtin attribute that marks device kernels to be handled appropriately. Example usage (pseudocode): ```rust fn kernel(x: mut [f64; 128]) { core::intrinsics::offload(kernel_1, (x,)) } #[cfg(target_os = "linux")] extern "C" { pub fn kernel_1(array_b: mut [f64; 128]); } #[cfg(not(target_os = "linux"))] #[rustc_offload_kernel] extern "gpu-kernel" fn kernel_1(x: mut [f64; 128]) { unsafe { (x)[0] = 21.0 }; } ```	2025-11-26 23:32:03 +11:00
Marcelo Domínguez	5128ce10a0	Implement offload intrinsic	2025-11-25 20:04:27 +01:00
Manuel Drehwald	5fbe5dae42	Only try to link against offload functions if llvm.enzyme is enabled	2025-11-23 00:19:53 -08:00
Manuel Drehwald	89d50591c0	Replace the first of 4 binary invocations for offload	2025-11-21 02:41:17 -08:00
Quinn Okabayashi	c7e50d0f37	Remove unused LLVMModuleRef argument	2025-11-12 15:46:08 +00:00
bors	87f9dcd5e2	Auto merge of #147935 - luca3s:add-rtsan, r=petrochenkov Add LLVM realtime sanitizer This is a new attempt at adding the [LLVM real-time sanitizer](https://clang.llvm.org/docs/RealtimeSanitizer.html) to Rust. Previously this was attempted in https://github.com/rust-lang/rfcs/pull/3766. Since then the `sanitize` attribute was introduced in https://github.com/rust-lang/rust/pull/142681 and it is a lot more flexible than the old `no_sanitize` attribute. This allows adding real-time sanitizer without the need for a new attribute, as was proposed in the RFC. Because I only add a new value to an existing command line flag and to an attribute I don't think an MCP is necessary. Currently real-time sanitizer is usable in Rust code with the [rtsan-standalone](https://crates.io/crates/rtsan-standalone) crate. This downloads or builds the sanitizer runtime and then links it into the Rust binary. The first commit adds support for more detailed sanitizer information. The second commit then actually adds real-time sanitizer. The third adds a warning against using real-time sanitizer with async functions, closures and blocks because it doesn't behave as expected when used with async functions. I am not sure if this is actually wanted, so I kept it in a separate commit. The fourth commit adds the documentation for real-time sanitizer.	2025-11-08 12:24:15 +00:00
Lucas Baumann	d198633b95	add realtime sanitizer	2025-11-06 13:20:12 +01:00

610 Commits