Mirrors/rust - rust - Gitea @ Femelysm.ru

mirror of https://github.com/rust-lang/rust.git synced 2026-04-26 13:01:27 +03:00

Author	SHA1	Message	Date
Flakebi	13ec3de673	Add intrinsic for launch-sized workgroup memory on GPUs Workgroup memory is a memory region that is shared between all threads in a workgroup on GPUs. Workgroup memory can be allocated statically or after compilation, when launching a gpu-kernel. The intrinsic added here returns the pointer to the memory that is allocated at launch-time. # Interface With this change, workgroup memory can be accessed in Rust by calling the new `gpu_launch_sized_workgroup_mem<T>() -> *mut T` intrinsic. It returns the pointer to workgroup memory guaranteeing that it is aligned to at least the alignment of `T`. The pointer is dereferenceable for the size specified when launching the current gpu-kernel (which may be the size of `T` but can also be larger or smaller or zero). All calls to this intrinsic return a pointer to the same address. See the intrinsic documentation for more details. ## Alternative Interfaces It was also considered to expose dynamic workgroup memory as extern static variables in Rust, like they are represented in LLVM IR. However, due to the pointer not being guaranteed to be dereferenceable (that depends on the allocated size at runtime), such a global must be zero-sized, which makes global variables a bad fit. # Implementation Details Workgroup memory in amdgpu and nvptx lives in address space 3. Workgroup memory from a launch is implemented by creating an external global variable in address space 3. The global is declared with size 0, as the actual size is only known at runtime. It is defined behavior in LLVM to access an external global outside the defined size. There is no similar way to get the allocated size of launch-sized workgroup memory on amdgpu and nvptx, so users have to pass this out-of-band or rely on target-specific ways for now.	2026-04-24 10:03:45 +02:00
sayantn	a5372be2a1	Add target arch verification for LLVM intrinsics	2026-04-12 23:33:27 +05:30
sayantn	c21f4ee437	Check for AutoUpgraded intrinsics, and lint on uses of deprecated intrinsics	2026-04-12 23:33:15 +05:30
Jonathan Brouwer	66a00ba2ef	Rollup merge of #153995 - Flakebi:gpu-use-convergent, r=nnethercote Use convergent attribute to funcs for GPU targets On targets with convergent operations, we need to add the convergent attribute to all functions that run convergent operations. Following clang, we can conservatively apply the attribute to all functions when compiling for such a target and rely on LLVM optimizing away the attribute in cases where it is not necessary. This affects the amdgpu and nvptx targets. cc @kjetilkjeka, @kulst for nvptx cc @ZuseZ4 r? @nnethercote, as you already reviewed this in the other PR Split out from rust-lang/rust#149637, the part here should be uncontroversial.	2026-04-08 14:21:57 +02:00
David Wood	a24ee0329e	cg_llvm/debuginfo: scalable vectors Generate debuginfo for scalable vectors, following the structure that Clang generates for scalable vectors.	2026-04-03 10:37:42 +00:00
Jakub Beránek	8b44562bc8	Revert "Rollup merge of #154200 - resrever:enable-dwarf-call-sites, r=dingxiangfei2009" This reverts commit `2f1603077b`, reversing changes made to `6e3c17424d`.	2026-03-27 20:08:24 +01:00
Scott Young	9677d7a587	debuginfo: emit DW_TAG_call_site entries	2026-03-22 08:42:21 -04:00
Flakebi	8e932ed79c	Use convergent attribute to funcs for GPU targets On targets with convergent operations, we need to add the convergent attribute to all functions that run convergent operations. Following clang, we can conservatively apply the attribute to all functions when compiling for such a target and rely on LLVM optimizing away the attribute in cases where it is not necessary. This affects the amdgpu and nvptx targets.	2026-03-17 10:51:31 +01:00
Ralf Jung	c7220f423b	rename min/maxnum intrinsics to min/maximum_number and fix their LLVM lowering	2026-03-15 14:53:00 +01:00
Josh Stone	52dfa94cdc	Update the minimum external LLVM to 21	2026-03-12 16:45:42 -07:00
bjorn3	a086b3617e	Remove ModuleBuffer ThinBuffer duplication	2026-02-21 11:47:45 +00:00
bjorn3	8b2c10ff82	Replace LLVMRustModuleBuffer with generic LLVMRustBuffer	2026-02-21 11:47:45 +00:00
Matthew Maurer	b639b0a4d8	llvm: Tolerate dead_on_return attribute changes The attribute now has a size parameter and sorts differently: * Explicitly omit size parameter during construction on 23+ * Tolerate alternate sorting in tests https://github.com/llvm/llvm-project/pull/171712	2026-01-21 23:39:03 +00:00
Nikita Popov	0be66603ac	Avoid passing addrspacecast to lifetime intrinsics Since LLVM 22 the alloca must be passed directly. Do this by stripping the addrspacecast if it exists.	2026-01-20 14:47:04 +01:00
Marcelo Domínguez	307a4fcdf8	Add scalar support for both host and device	2026-01-19 22:28:42 +01:00
Jonathan Brouwer	d898dccc21	Rollup merge of #150511 - Sa4dUs:offload-inline, r=ZuseZ4 Allow inline calls to offload intrinsic Removes explicit insertion point handling and recovers the pointer at the end of the saved basic block. r? `@ZuseZ4` fixes: https://github.com/rust-lang/rust/issues/150413	2025-12-31 14:30:48 +01:00
Marcelo Domínguez	9d8b4cc70d	Restore builder at the end of saved bb	2025-12-31 13:10:29 +01:00
dianqk	fe075ad212	Removes the serde dependency in rustc_codegen_llvm	2025-12-28 15:52:20 +08:00
Manuel Drehwald	dfef2e96fe	Remove the need to call clang for std::offload usages	2025-12-23 05:20:07 -08:00
sgasho	ddd5aad8a3	feat: dlopen Enzyme	2025-12-16 00:31:32 +09:00
Stuart Cook	2b150f2c65	Rollup merge of #147936 - Sa4dUs:offload-intrinsic, r=ZuseZ4 Offload intrinsic This PR implements the minimal mechanisms required to run a small subset of arbitrary offload kernels without relying on hardcoded names or metadata. - `offload(kernel, (..args))`: an intrinsic that generates the necessary host-side LLVM-IR code. - `rustc_offload_kernel`: a builtin attribute that marks device kernels to be handled appropriately. Example usage (pseudocode): ```rust fn kernel(x: mut [f64; 128]) { core::intrinsics::offload(kernel_1, (x,)) } #[cfg(target_os = "linux")] extern "C" { pub fn kernel_1(array_b: mut [f64; 128]); } #[cfg(not(target_os = "linux"))] #[rustc_offload_kernel] extern "gpu-kernel" fn kernel_1(x: mut [f64; 128]) { unsafe { (x)[0] = 21.0 }; } ```	2025-11-26 23:32:03 +11:00
Marcelo Domínguez	5128ce10a0	Implement offload intrinsic	2025-11-25 20:04:27 +01:00
Manuel Drehwald	5fbe5dae42	Only try to link against offload functions if llvm.enzyme is enabled	2025-11-23 00:19:53 -08:00
Manuel Drehwald	89d50591c0	Replace the first of 4 binary invocations for offload	2025-11-21 02:41:17 -08:00
Quinn Okabayashi	c7e50d0f37	Remove unused LLVMModuleRef argument	2025-11-12 15:46:08 +00:00
bors	87f9dcd5e2	Auto merge of #147935 - luca3s:add-rtsan, r=petrochenkov Add LLVM realtime sanitizer This is a new attempt at adding the [LLVM real-time sanitizer](https://clang.llvm.org/docs/RealtimeSanitizer.html) to Rust. Previously this was attempted in https://github.com/rust-lang/rfcs/pull/3766. Since then the `sanitize` attribute was introduced in https://github.com/rust-lang/rust/pull/142681 and it is a lot more flexible than the old `no_sanitize` attribute. This allows adding real-time sanitizer without the need for a new attribute, like it was proposed in the RFC. Because I only add a new value to an existing command-line flag and to an attribute, I don't think an MCP is necessary. Currently real-time sanitizer is usable in Rust code with the [rtsan-standalone](https://crates.io/crates/rtsan-standalone) crate. This downloads or builds the sanitizer runtime and then links it into the Rust binary. The first commit adds support for more detailed sanitizer information. The second commit then actually adds real-time sanitizer. The third adds a warning against using real-time sanitizer with async functions, closures and blocks because it doesn't behave as expected when used with async functions. I am not sure if this is actually wanted, so I kept it in a separate commit. The fourth commit adds the documentation for real-time sanitizer.	2025-11-08 12:24:15 +00:00
Lucas Baumann	d198633b95	add realtime sanitizer	2025-11-06 13:20:12 +01:00
Manuel Drehwald	360b38cceb	Fix device code generation, to account for an implicit dyn_ptr argument.	2025-11-06 03:34:38 -05:00
Matthias Krüger	3d671c0d54	Rollup merge of #148103 - Zalathar:compression, r=wesleywiser cg_llvm: Pass `debuginfo_compression` through FFI as an enum There are only three possible values, making an enum more appropriate. This avoids string allocation on the Rust side, and avoids ad-hoc `!strcmp` to convert back to an enum on the C++ side.	2025-10-31 18:41:51 +01:00
Tomasz Miąsko	2a03a948b9	Deduce captures(none) for a return place and parameters Extend attribute deduction to determine whether parameters using indirect pass mode might have their address captured. Similarly to the deduction of `readonly` attribute this information facilitates memcpy optimizations.	2025-10-25 22:53:52 +02:00
Zalathar	73b734bf63	Pass `debuginfo_compression` through FFI as an enum	2025-10-25 23:58:19 +11:00
Guillaume Gomez	3938f42bb1	Rollup merge of #147608 - Zalathar:debuginfo, r=nnethercote cg_llvm: Use `LLVMDIBuilderCreateGlobalVariableExpression` - Part of rust-lang/rust#134001 - Follow-up to rust-lang/rust#146763 --- This PR dismantles the somewhat complicated `LLVMRustDIBuilderCreateStaticVariable` function, and replaces it with equivalent calls to `LLVMDIBuilderCreateGlobalVariableExpression` and `LLVMGlobalSetMetadata`. A key difference is that the new code does not replicate the attempted downcast of `InitVal`. As far as I can tell, those downcasts were actually dead, because `llvm::ConstantInt` and `llvm::ConstantFP` are not subclasses of `llvm::GlobalVariable`. I tried replacing those code paths with fatal errors, and was unable to induce failure in any of the relevant test suites I ran. I have also confirmed that if the calls to `create_static_variable` are commented out, debuginfo tests will fail, demonstrating some amount of relevant test coverage. The new `DIBuilder` methods have been added via an extension trait, not as inherent methods, to avoid impeding rust-lang/rust#142897.	2025-10-13 11:25:23 +02:00
Zalathar	1081d98551	Use `LLVMDIBuilderCreateGlobalVariableExpression` Note that the code in `LLVMRustDIBuilderCreateStaticVariable` that tried to downcast `InitVal` appears to have been dead, because `llvm::ConstantInt` and `llvm::ConstantFP` are not subclasses of `llvm::GlobalVariable`.	2025-10-12 23:36:26 +11:00
AMS21	0abecda9ed	Replace `LLVMRustContextCreate` with normal LLVM-C API calls Since `LLVMRustContextCreate` can easily be replaced with a call to `LLVMContextCreate` and `LLVMContextSetDiscardValueNames`.	2025-10-10 15:45:40 +02:00
bors	4b57d8154a	Auto merge of #147519 - Zalathar:rollup-o5f16uo, r=Zalathar Rollup of 3 pull requests Successful merges: - rust-lang/rust#147446 (PassWrapper: use non-deprecated lookupTarget method) - rust-lang/rust#147473 (Do `x check` on various bootstrap tools in CI) - rust-lang/rust#147509 (remove intrinsic wrapper functions from LLVM bindings) r? `@ghost` `@rustbot` modify labels: rollup	2025-10-09 10:54:43 +00:00
Stuart Cook	4dfd977c8b	Rollup merge of #147488 - AMS21:remove_llvm_rust_insert_private_global, r=nikic refactor: Remove `LLVMRustInsertPrivateGlobal` and `define_private_global` Since it can easily be implemented using the existing LLVM C API in terms of `LLVMAddGlobal` and `LLVMSetLinkage` and `define_private_global` was only used in one place. Work towards https://github.com/rust-lang/rust/issues/46437	2025-10-09 18:43:26 +11:00
AMS21	064e3b8212	remove intrinsic wrapper functions from LLVM bindings	2025-10-09 09:26:44 +02:00
AMS21	036ab3a925	refactor: Remove `LLVMRustInsertPrivateGlobal` and `define_private_global` Since it can easily be implemented using the existing LLVM C API in terms of `LLVMAddGlobal` and `LLVMSetLinkage` and `define_private_global` was only used in one place.	2025-10-08 21:59:48 +02:00
AMS21	1aed495ed7	refactor: replace `LLVMRustAtomicLoad/Store` with LLVM built-in functions	2025-10-08 13:53:09 +02:00
dianqk	1bd89bd42e	codegen: Generate `dbg_value` for the ref statement	2025-10-02 14:55:51 +08:00
Zalathar	906bf49ade	Declare all "fixed" metadata kinds as `MetadataKindId`	2025-09-30 20:10:10 +10:00
Matthias Krüger	c29fb2e57e	Rollup merge of #144197 - KMJ-007:type-tree, r=ZuseZ4 TypeTree support in autodiff # TypeTrees for Autodiff ## What are TypeTrees? Memory layout descriptors for Enzyme. Tell Enzyme exactly how types are structured in memory so it can compute derivatives efficiently. ## Structure ```rust TypeTree(Vec<Type>) Type { offset: isize, // byte offset (-1 = everywhere) size: usize, // size in bytes kind: Kind, // Float, Integer, Pointer, etc. child: TypeTree // nested structure } ``` ## Example: `fn compute(x: &f32, data: &[f32]) -> f32` Input 0: `x: &f32` ```rust TypeTree(vec![Type { offset: -1, size: 8, kind: Pointer, child: TypeTree(vec![Type { offset: -1, size: 4, kind: Float, child: TypeTree::new() }]) }]) ``` Input 1: `data: &[f32]` ```rust TypeTree(vec![Type { offset: -1, size: 8, kind: Pointer, child: TypeTree(vec![Type { offset: -1, size: 4, kind: Float, // -1 = all elements child: TypeTree::new() }]) }]) ``` Output: `f32` ```rust TypeTree(vec![Type { offset: -1, size: 4, kind: Float, child: TypeTree::new() }]) ``` ## Why Needed? 
- Enzyme can't deduce complex type layouts from LLVM IR - Prevents slow memory pattern analysis - Enables correct derivative computation for nested structures - Tells Enzyme which bytes are differentiable vs metadata ## What Enzyme Does With This Information: Without TypeTrees (current state): ```llvm ; Enzyme sees generic LLVM IR: define float @distance(ptr %p1, ptr %p2) { ; Has to guess what these pointers point to ; Slow analysis of all memory operations ; May miss optimization opportunities } ``` With TypeTrees (our implementation): ```llvm define "enzyme_type"="{[]:Float@float}" float @distance( ptr "enzyme_type"="{[]:Pointer}" %p1, ptr "enzyme_type"="{[]:Pointer}" %p2 ) { ; Enzyme knows exact type layout ; Can generate efficient derivative code directly } ``` # TypeTrees - Offset and -1 Explained ## Type Structure ```rust Type { offset: isize, // WHERE this type starts size: usize, // HOW BIG this type is kind: Kind, // WHAT KIND of data (Float, Int, Pointer) child: TypeTree // WHAT'S INSIDE (for pointers/containers) } ``` ## Offset Values ### Regular Offset (0, 4, 8, etc.) 
Specific byte position within a structure ```rust struct Point { x: f32, // offset 0, size 4 y: f32, // offset 4, size 4 id: i32, // offset 8, size 4 } ``` TypeTree for `&Point` (internal representation): ```rust TypeTree(vec![ Type { offset: 0, size: 4, kind: Float }, // x at byte 0 Type { offset: 4, size: 4, kind: Float }, // y at byte 4 Type { offset: 8, size: 4, kind: Integer } // id at byte 8 ]) ``` Generates LLVM: ```llvm "enzyme_type"="{[]:Float@float}" ``` ### Offset -1 (Special: "Everywhere") Means "this pattern repeats for ALL elements" #### Example 1: Array `[f32; 100]` ```rust TypeTree(vec![Type { offset: -1, // ALL positions size: 4, // each f32 is 4 bytes kind: Float, // every element is float }]) ``` Instead of listing 100 separate Types with offsets `0,4,8,12...396` #### Example 2: Slice `&[i32]` ```rust // Pointer to slice data TypeTree(vec![Type { offset: -1, size: 8, kind: Pointer, child: TypeTree(vec![Type { offset: -1, // ALL slice elements size: 4, // each i32 is 4 bytes kind: Integer }]) }]) ``` #### Example 3: Mixed Structure ```rust struct Container { header: i64, // offset 0 data: [f32; 1000], // offset 8, but elements use -1 } ``` ```rust TypeTree(vec![ Type { offset: 0, size: 8, kind: Integer }, // header Type { offset: 8, size: 4000, kind: Pointer, child: TypeTree(vec![Type { offset: -1, size: 4, kind: Float // ALL array elements }]) } ]) ```	2025-09-28 18:13:11 +02:00
Matthias Krüger	e8578c8808	Rollup merge of #146763 - Zalathar:di-builder, r=jdonszelmann cg_llvm: Replace some DIBuilder wrappers with LLVM-C API bindings (part 5) - Part of rust-lang/rust#134001 - Follow-up to rust-lang/rust#146673 --- This is another batch of LLVMDIBuilder binding migrations, replacing some of our own LLVMRust bindings with bindings to upstream LLVM-C APIs. Some of these are a little more complex than most of the previous migrations, because they split one LLVMRust binding into multiple LLVM bindings, but nothing too fancy. This appears to be the last of the low-hanging fruit. As noted in https://github.com/rust-lang/rust/issues/134001#issuecomment-2524979268, the remaining bindings are difficult or impossible to migrate at present.	2025-09-28 09:15:23 +02:00
Josh Stone	fe440ec934	llvm: add a destructor to call releaseSerializer	2025-09-24 16:53:17 -07:00
Augie Fackler	42cf78f762	llvm: update remarks support on LLVM 22 LLVM change dfbd76bda01e removed separate remark support entirely, but it turns out we can just drop the parameter and everything appears to work fine. Fixes 146912 as far as I can tell (the test passes.) @rustbot label llvm-main	2025-09-23 13:25:04 -04:00
Folkert de Vries	3565b0699d	emit attribute for readonly non-pure inline assembly	2025-09-21 21:16:06 +02:00
Zalathar	741e1e2ec7	Remove unused `LLVMRustDIBuilder(Create\|Dispose)` These should have been removed earlier, when we switched to the corresponding LLVM-C bindings.	2025-09-20 12:48:48 +10:00
Zalathar	e39e5a0d15	Use `LLVMDIBuilderCreate(Auto\|Parameter)Variable`	2025-09-19 20:56:58 +10:00
Zalathar	9daa026cad	Use `LLVMDIBuilder(CreateExpression\|InsertDeclareRecordAtEnd)`	2025-09-19 17:15:32 +10:00
Karan Janthe	3ba5f19182	autodiff: typetree recursive depth query from enzyme with fallback Signed-off-by: Karan Janthe <karanjanthe@gmail.com>	2025-09-19 05:42:27 +00:00
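
The TypeTree layout described in the message of commit c29fb2e57e can be sketched as plain Rust data. This is only an illustration under stated assumptions: the `Kind` variants are reduced to the three named in that message, and the helpers `leaf`, `point_fields`, `f32_slice_ref` and `kind_at` are hypothetical names that do not come from the compiler sources. It shows how fixed offsets describe struct fields and how offset `-1` stands for "every position".

```rust
/// Kind of data at a location (reduced to the three kinds named in the commit message).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Kind {
    Float,
    Integer,
    Pointer,
}

/// A list of typed regions, mirroring `TypeTree(Vec<Type>)` from the message.
#[derive(Debug, Clone, PartialEq, Eq, Default)]
pub struct TypeTree(pub Vec<Type>);

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Type {
    pub offset: isize, // byte offset; -1 means "everywhere"
    pub size: usize,   // size in bytes
    pub kind: Kind,
    pub child: TypeTree, // nested layout, used for pointers
}

impl TypeTree {
    pub fn new() -> Self {
        TypeTree(Vec::new())
    }
}

/// Hypothetical helper: a region with no nested structure.
fn leaf(offset: isize, size: usize, kind: Kind) -> Type {
    Type { offset, size, kind, child: TypeTree::new() }
}

/// Field layout of `struct Point { x: f32, y: f32, id: i32 }`.
pub fn point_fields() -> TypeTree {
    TypeTree(vec![
        leaf(0, 4, Kind::Float),   // x at byte 0
        leaf(4, 4, Kind::Float),   // y at byte 4
        leaf(8, 4, Kind::Integer), // id at byte 8
    ])
}

/// Layout of `&[f32]`: a pointer whose pointee is float at every offset.
pub fn f32_slice_ref() -> TypeTree {
    TypeTree(vec![Type {
        offset: -1,
        size: 8,
        kind: Kind::Pointer,
        child: TypeTree(vec![leaf(-1, 4, Kind::Float)]),
    }])
}

/// Hypothetical query: which kind covers byte `byte` of the described memory?
pub fn kind_at(tree: &TypeTree, byte: usize) -> Option<&Kind> {
    tree.0
        .iter()
        .find(|t| match t.offset {
            -1 => true, // "everywhere" matches any byte
            o if o >= 0 => {
                let start = o as usize;
                byte >= start && byte < start + t.size
            }
            _ => false,
        })
        .map(|t| &t.kind)
}
```

With this sketch, `kind_at(&point_fields(), 8)` finds the integer `id` field, while any byte of the slice pointee resolves to `Float` through the single `-1` entry instead of one entry per element.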


299 Commits
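
The idea behind commit 73b734bf63 (passing `debuginfo_compression` through FFI as an enum rather than a string) can be illustrated with a small self-contained sketch. Everything here is an assumption for illustration: the variant names, the numeric values, and the functions `parse_compression` and `compression_enabled` are hypothetical and are not taken from the actual bindings. The foreign side is stood in for by a Rust `extern "C"` function so the example needs no C++.

```rust
/// Hypothetical C-compatible enum with three possible values.
/// Variant names and discriminants are assumptions, not the real bindings.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DebuginfoCompression {
    None = 0,
    Zlib = 1,
    Zstd = 2,
}

/// Parse the user-facing option once, on the Rust side.
/// After this point no string has to cross the FFI boundary.
pub fn parse_compression(s: &str) -> Option<DebuginfoCompression> {
    match s {
        "none" => Some(DebuginfoCompression::None),
        "zlib" => Some(DebuginfoCompression::Zlib),
        "zstd" => Some(DebuginfoCompression::Zstd),
        _ => None,
    }
}

/// Stand-in for the foreign function. A real C++ counterpart would switch on
/// the enum value instead of comparing strings with `strcmp`.
pub extern "C" fn compression_enabled(kind: DebuginfoCompression) -> bool {
    match kind {
        DebuginfoCompression::None => false,
        DebuginfoCompression::Zlib | DebuginfoCompression::Zstd => true,
    }
}
```

Because the enum is `#[repr(C)]` and `Copy`, it crosses the boundary as a plain integer, which is what removes the string allocation and the string comparison mentioned in the commit message.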