Commit Graph

66 Commits

Author SHA1 Message Date
bors d1c79458b5 Auto merge of #153379 - TKanX:refactor/149164-simplify-autodiff-rlib, r=ZuseZ4
refactor(autodiff): Simplify Autodiff Handling of `rlib` Dependencies

### Summary:

Resolves the two FIXMEs left in rust-lang/rust#149033, per @bjorn3 guidance in [the discussion](https://github.com/rust-lang/rust/pull/149033#discussion_r2535465880).

Closes rust-lang/rust#149164 

r? @ZuseZ4
cc @bjorn3
2026-03-11 02:03:25 +00:00
Jonathan Brouwer 7595e5b80d Rollup merge of #152283 - Sa4dUs:offload-handle-alloca, r=ZuseZ4
Properly pass offload sizes to kernel args

This PR prevents offload from creating an unnecessary alloca when all the arg sizes are static.
I'll implement the first dynamic-size data type (slice support) in a follow-up PR.

r? @ZuseZ4
2026-03-05 06:31:37 +01:00
Tony Kan dd9922151f refactor(autodiff): Simplify rlib dep handling; use fn_ptr_ty in adjust_activity_to_abi, drop mono-collection & cross-crate-inline workarounds 2026-03-04 03:53:52 -08:00
Marcelo Domínguez abb86d6df4 Avoid alloca for fully static sizes 2026-03-03 11:52:01 +01:00
Jonathan Brouwer 90c93ab7c1 Move rustc_ast::AutoDiffAttrs to rustc_hir::RustcAutodiff 2026-02-26 09:41:21 +01:00
Manuel Drehwald 6de0591c0b Split ol mapper into more specific to/kernel/from mapper and move init_all_rtls into global ctor 2026-02-07 17:34:39 -08:00
Marcelo Domínguez 212c8c3811 Remove dummy loads 2026-02-04 15:26:56 +01:00
Manuel Drehwald 1f11bf6649 Leave note to drop tgt_init_all_rtls in the future 2026-01-27 10:43:22 -08:00
Manuel Drehwald 7eae36f017 Add an early return if handling multiple offload calls 2026-01-27 10:43:03 -08:00
Manuel Drehwald 43111396e3 move initialization of omp/ol runtimes into global_ctor/dtor 2026-01-20 20:06:08 -05:00
Stuart Cook 1262ff906b Rollup merge of #150288 - offload-bench-fix, r=ZuseZ4
Add scalar support for offload

This PR adds scalar support to the offload feature. The scalar management has two main parts:

On the host side, each scalar arg is cast to the `ix` type, zero-extended to `i64`, and passed to the kernel in that form.
On the device, each scalar arg (an `i64` at that point) is truncated to `ix` and then cast back to the original type.
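
The host/device round trip described above can be sketched in plain Rust. This is only an illustration of the bit-level idea, not the code the PR generates: the helper names (`host_pack_f32`, `device_unpack_f32`, `host_pack_i16`, `device_unpack_i16`) are hypothetical, and `u32`/`u16`/`u64` stand in for the LLVM integer types `ix` and `i64`.

```rust
/// Host side (hypothetical helper): reinterpret the `f32` as a same-width
/// integer (`ix`, here 32 bits), then zero-extend it to 64 bits.
fn host_pack_f32(x: f32) -> u64 {
    let bits: u32 = x.to_bits(); // bit-preserving cast f32 -> i32
    bits as u64 // unsigned widening is a zero extension
}

/// Device side (hypothetical helper): truncate the 64-bit argument back to
/// `ix`, then reinterpret the bits as the original type.
fn device_unpack_f32(arg: u64) -> f32 {
    let bits = arg as u32; // truncation i64 -> i32
    f32::from_bits(bits) // bit-preserving cast i32 -> f32
}

/// The same scheme for a 16-bit signed integer: going through `u16` first
/// guarantees a zero extension rather than a sign extension.
fn host_pack_i16(x: i16) -> u64 {
    (x as u16) as u64
}

fn device_unpack_i16(arg: u64) -> i16 {
    (arg as u16) as i16
}
```

Because the upper bits are zero on the host and discarded on the device, the value survives the trip unchanged regardless of its original width.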

r? @ZuseZ4
2026-01-20 18:00:08 +11:00
Marcelo Domínguez 307a4fcdf8 Add scalar support for both host and device 2026-01-19 22:28:42 +01:00
Manuel Drehwald 5c85d522d0 Generate global openmp metadata to trigger llvm openmp-opt pass 2026-01-16 14:57:32 -05:00
Marcelo Domínguez bc751adcdb Minor doc and ty fixes 2026-01-14 11:37:31 +01:00
Manuel Drehwald fa584faca5 Update test and verify that tgt_(un)register_lib have the right type 2026-01-04 06:58:31 -08:00
Marcelo Domínguez 58e2610f71 Expose workgroup/thread dims as intrinsic args 2026-01-02 11:50:32 +01:00
Marcelo Domínguez 9d8b4cc70d Restore builder at the end of saved bb 2025-12-31 13:10:29 +01:00
Marcelo Domínguez 04c2d2be13 Remove region_id unnamed attr 2025-12-19 13:27:14 +01:00
Marcelo Domínguez 3e4944d573 Split runtime global logic and cache kernel specific one 2025-12-19 13:27:13 +01:00
Marcelo Domínguez 5128ce10a0 Implement offload intrinsic 2025-11-25 20:04:27 +01:00
Manuel Drehwald 360b38cceb Fix device code generation, to account for an implicit dyn_ptr argument. 2025-11-06 03:34:38 -05:00
bors fd847d4d5d Auto merge of #142696 - ZuseZ4:offload-device1, r=oli-obk
Offload host2

r? `@oli-obk`

A follow-up to my previous gpu host PR. With this, I can (in theory) run a sufficiently simple Rust function on GPUs. I tested it on AMD, where the amdgcn target of rustc causes issues due to address space casts, which might not be valid. If I (manually) fix them, I can run the generated IR on an AMD GPU. This should conceptually also work on NVIDIA or Intel. I updated the dev-guide accordingly: https://rustc-dev-guide.rust-lang.org/offload/usage.html

I am unhappy with the number of standalone functions in my offload code, so in my second commit I bundled some of the code around two structs which are Rust versions of the LLVM/Offload structs they represent. The structs themselves only have doc comments. Since I directly lower everything to llvm-ir, I didn't see much value in modelling the struct member variables.
2025-10-20 10:17:29 +00:00
Manuel Drehwald 5bb815a705 model offload C++ structs through Rust structs 2025-10-19 09:38:46 -07:00
Manuel Drehwald b56d555a36 fix host code 2025-10-19 09:28:39 -07:00
Manuel Drehwald 52e7917586 Use globals instead of metadata, since metadata isn't emitted in debug builds 2025-10-07 20:13:59 -04:00
Manuel Drehwald dcc36a8642 add incremental/debug test for autodiff 2025-10-07 20:13:56 -04:00
Zalathar 69a975faa9 Consistently import llvm::Type and llvm::Value 2025-10-06 13:09:16 +11:00
Manuel Drehwald ddbaca521e fix void and empty struct ret 2025-09-30 22:47:40 -04:00
Karan Janthe 375e14ef49 Add TypeTree metadata attachment for autodiff
- Add F128 support to TypeTree Kind enum
- Implement TypeTree FFI bindings and conversion functions
- Add typetree.rs module for metadata attachment to LLVM functions
- Integrate TypeTree generation with autodiff intrinsic pipeline
- Support scalar types: f32, f64, integers, f16, f128
- Attach enzyme_type attributes as LLVM string metadata for Enzyme

Signed-off-by: Karan Janthe <karanjanthe@gmail.com>
2025-09-19 04:02:19 +00:00
bors 97a987f14c Auto merge of #142544 - Sa4dUs:prevent-abi-changes, r=ZuseZ4
Prevent ABI changes from affecting EnzymeAD

This PR handles ABI changes for autodiff input arguments to improve Enzyme compatibility. Fundamentally, it adjusts activities when a function argument is lowered as a `ScalarPair`, so that there is no mismatch between diff activities and args. It also removes activities corresponding to ZSTs.
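
The adjustment can be pictured with a small standalone sketch. Everything here is hypothetical and simplified: `Activity`, `Lowering`, and `adjust_activities` are illustrative names rather than rustc's types, and the choice to repeat the source-level activity for the second half of a pair is an assumption made for the example; the real rule in the compiler may differ.

```rust
/// Simplified stand-in for a differentiation activity (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Activity {
    Const,
    Active,
    Duplicated,
}

/// Simplified stand-in for how one source-level argument is lowered (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Lowering {
    /// Zero-sized type: no LLVM-level argument is emitted.
    Ignore,
    /// Exactly one LLVM-level argument.
    Direct,
    /// Two LLVM-level arguments, e.g. a pointer and a length.
    Pair,
}

/// Produce one activity per lowered argument, so that the activity list and
/// the LLVM-level argument list have the same length.
fn adjust_activities(activities: &[Activity], lowering: &[Lowering]) -> Vec<Activity> {
    let mut adjusted = Vec::new();
    for (&activity, &mode) in activities.iter().zip(lowering) {
        match mode {
            // ZST: drop the activity together with the argument.
            Lowering::Ignore => {}
            Lowering::Direct => adjusted.push(activity),
            // Assumption for illustration: both halves share the activity.
            Lowering::Pair => {
                adjusted.push(activity);
                adjusted.push(activity);
            }
        }
    }
    adjusted
}
```

For instance, three source-level activities for a pair argument, a direct argument, and a ZST argument come out as three adjusted activities: two for the pair, one for the direct argument, and none for the ZST.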

fixes: https://github.com/rust-lang/rust/issues/144025

r? `@ZuseZ4`
2025-09-18 07:32:49 +00:00
Marcelo Domínguez e04567c363 Check ZST via PassMode 2025-09-17 13:58:17 +00:00
Marcelo Domínguez 0bf85d35ec Support ZST args 2025-09-17 12:11:27 +00:00
Marcelo Domínguez 8dbd1b014a doc and move single branch match to an if let 2025-09-17 12:01:22 +00:00
Marcelo Domínguez 466bec9029 Adjust autodiff activities for ScalarPair 2025-09-17 12:01:22 +00:00
Marijn Schouten 05659c99c9 gpu offload: change suspicious map into filter 2025-09-05 11:39:17 +00:00
Zalathar b4e97e5d86 Rename llvm::Bool aliases to standard const case
This avoids the need for `#![allow(non_upper_case_globals)]`.
2025-08-24 23:09:54 +10:00
Marcelo Domínguez c9c1c17128 Remove inlining for autodiff handling 2025-08-14 16:30:16 +00:00
Marcelo Domínguez 250d77e5d7 Complete functionality and general cleanup 2025-08-14 16:30:15 +00:00
Marcelo Domínguez 5c631041aa Basic implementation of autodiff intrinsic 2025-08-14 16:29:58 +00:00
Manuel Drehwald 4a1a5a4295 gpu host code generation 2025-07-18 16:30:42 -07:00
León Orell Valerian Liehr be5f8f299d Rollup merge of #143388 - bjorn3:lto_refactors, r=compiler-errors
Various refactors to the LTO handling code

In particular reducing the sharing of code paths between fat and thin-LTO and making the fat LTO implementation more self-contained. This also moves some autodiff handling out of cg_ssa into cg_llvm given that Enzyme only works with LLVM anyway and an implementation for another backend may do things entirely differently. This will also make it a bit easier to split LTO handling out of the coordinator thread main loop into a separate loop, which should reduce the complexity of the coordinator thread.
2025-07-17 03:58:28 +02:00
Oli Scherer 7f95f04267 Eliminate all direct uses of LLVMMDStringInContext2 2025-07-14 08:27:08 +00:00
Oli Scherer b9baf63f99 Merge typeid_metadata and create_metadata 2025-07-14 08:27:08 +00:00
Oli Scherer 84eeca2e2f Make some "safe" llvm ops actually sound 2025-07-10 07:27:41 +00:00
bjorn3 8d63c7a1d6 Remove unused config param from WriteBackendMethods::autodiff 2025-07-03 16:13:25 +00:00
Manuel Drehwald 6359123d25 add and use generic get_const_int function 2025-06-16 14:23:06 -07:00
bit-aloo 9bc04016e6 add custom enzyme markers to target methods 2025-04-25 11:09:52 +05:30
Matthias Krüger c3f811f02f Rollup merge of #139700 - EnzymeAD:autodiff-flags, r=oli-obk
Autodiff flags

Interestingly, it seems that some other projects have conflicts with exactly the same LLVM optimization passes as autodiff.
At least `LLVMRustOptimize` has exactly the flags that we need to disable problematic opt passes.

This PR enables us to compile code where users differentiate two identical functions in the same module. This has been especially common in test cases, but it's not impossible to encounter in the wild.

It also enables two new flags for testing/debugging. I consider writing an MCP to upgrade PrintPasses to be a standalone -Z flag, since it is *not* the same as `-Z print-llvm-passes`, which IMHO gives less useful output. A discussion can be found here: [#t-compiler/llvm > Print llvm passes. @ 💬](https://rust-lang.zulipchat.com/#narrow/channel/187780-t-compiler.2Fllvm/topic/Print.20llvm.20passes.2E/near/511533038)

Finally, it improves `PrintModBefore` and `PrintModAfter`. They used to work reliably, but now we just schedule Enzyme as part of an existing ModulePassManager (MPM). Since Enzyme is last in the MPM scheduling, PrintModBefore became very inaccurate. It used to print the input module that we gave to Enzyme, which was great for creating llvm-ir reproducers. However, lately the MPM would run the whole `default<O3>` pipeline, which heavily modifies the llvm module, before we pass it to Enzyme. That made it impossible to use the flag to create llvm-ir reproducers for Enzyme bugs. We now schedule a PrintModule pass just before Enzyme, solving this problem.

Based on the PrintPass output, it also _seems_ like changing `registerEnzymeAndPassPipeline(PB, true);` to `registerEnzymeAndPassPipeline(PB, false);` has no effect. In theory, the bool should tell Enzyme to schedule some helpful passes in the PassBuilder. However, since it doesn't do anything and I'm no longer 100% sure whether we really need it, I'll just disable it for now and postpone investigations.

r? ``@oli-obk``

closes #139471

Tracking:

- https://github.com/rust-lang/rust/issues/124509
2025-04-24 17:19:44 +02:00
Manuel Drehwald a68ae0cbc1 working dupv and dupvonly for fwd mode 2025-04-16 17:13:31 -04:00
Manuel Drehwald 5ea9125f37 update documentation 2025-04-12 01:36:47 -04:00