Commit Graph

361 Commits

Author SHA1 Message Date
beetrees 3aac9a37a5 Remove LLVM 18 inline ASM span fallback 2025-04-06 02:31:52 +01:00
Stuart Cook c6bf3a01ef Rollup merge of #137880 - EnzymeAD:autodiff-batching, r=oli-obk
Autodiff batching

Enzyme supports batching, which is especially known from the ML side when training neural networks.
There we would normally have a training loop, where in each iteration we would pass in some data (e.g. an image), and a target vector. Based on how close we are with our prediction we compute our loss, and then use backpropagation to compute the gradients and update our weights.
That's quite inefficient, so what you normally do is passing in a batch of 8/16/.. images and targets, and compute the gradients for those all at once, allowing better optimizations.

Enzyme supports batching in two ways, the first one (which I implemented here) just accepts a Batch size,
and then each Dual/Duplicated argument has not one, but N shadow arguments.  So instead of
```rs
for i in 0..100 {
   df(x[i], y[i], 1234);
}
```
You can now do
```rs
for i in 0..100.step_by(4) {
   df(x[i+0],x[i+1],x[i+2],x[i+3], y[i+0], y[i+1], y[i+2], y[i+3], 1234);
}
```
which will give the same results, but allows better compiler optimizations. See the testcase for details.

There is a second variant, where we can mark certain arguments and instead of having to pass in N shadow arguments, Enzyme assumes that the argument is N times longer. I.e. instead of accepting 4 slices with 12 floats each, we would accept one slice with 48 floats. I'll implement this over the next days.

I will also add more tests for both modes.

For any one preferring some more interactive explanation, here's a video of Tim's llvm dev talk, where he presents his work. https://www.youtube.com/watch?v=edvaLAL5RqU
I'll also add some other docs to the dev guide and user docs in another PR.

r? ghost

Tracking:

- https://github.com/rust-lang/rust/issues/124509
- https://github.com/rust-lang/rust/issues/135283
2025-04-05 13:18:13 +11:00
Manuel Drehwald 89d8948835 add new flag to print the module post-AD, before opts 2025-04-04 14:25:23 -04:00
Matthias Krüger 66e61c78e7 Rollup merge of #138949 - madsmtm:rename-to-darwin, r=WaffleLapkin
Rename `is_like_osx` to `is_like_darwin`

Replace `is_like_osx` with `is_like_darwin`, which more closely describes reality (OS X is the pre-2016 name for macOS, and is by now quite outdated; Darwin is the overall name for the OS underlying Apple's macOS, iOS, etc.).

``@rustbot`` label O-apple
r? compiler
2025-04-04 08:02:05 +02:00
Mads Marquart 328846c6eb Rename is_like_osx to is_like_darwin 2025-03-25 21:53:52 +01:00
Daniel Paoliello 79b9664091 Reduce visibility of most items in rustc_codegen_llvm 2025-03-25 16:36:47 +11:00
bors 0c72c0d11a Auto merge of #133250 - DianQK:embed-bitcode-pgo, r=nikic
The embedded bitcode should always be prepared for LTO/ThinLTO

Fixes #115344. Fixes #117220.

There are currently two methods for generating bitcode that used for LTO. One method involves using `-C linker-plugin-lto` to emit object files as bitcode, which is the typical setting used by cargo. The other method is through `-C embed-bitcode=yes`.

When using with `-C embed-bitcode=yes -C lto=no`, we run a complete non-LTO LLVM pipeline to obtain bitcode, then the bitcode is used for LTO. We run the Call Graph Profile Pass twice on the same module.

This PR is doing something similar to LLVM's `buildFatLTODefaultPipeline`, obtaining the bitcode for embedding after running `buildThinLTOPreLinkDefaultPipeline`.

r? nikic
2025-03-01 08:22:18 +00:00
许杰友 Jieyou Xu (Joe) 61e90040db Rollup merge of #137017 - bjorn3:ignore_invalid_bitcode, r=oli-obk
Don't error when adding a staticlib with bitcode files compiled by newer LLVM

cc https://github.com/rust-lang/rust/issues/128955#issuecomment-2657811196
2025-02-28 22:29:49 +08:00
bjorn3 9f190d764f Restore usage of io::Error 2025-02-26 13:45:35 +00:00
David Wood a5615d3c62 codegen_llvm: avoid Deref impls w/ extern type
`rustc_codegen_llvm` relied on `Deref` impls where `Deref::Target` was
or contained an extern type - in my experimental implementation of
rust-lang/rfcs#3729, this isn't possible as the `Target` associated
type's `?Sized` bound cannot be relaxed backwards compatibly (unless we
come up with some way of doing this).

In later pull requests with the rust-lang/rfcs#3729 implementation,
breakage like this could only occur for nightly users relying on the
`extern_types` feature.

Upstreaming this to avoid needing to keep carrying this patch locally,
and I think it'll necessarily need to change eventually.
2025-02-24 08:08:55 +00:00
DianQK da50297a6e Save pre-link bitcode to ModuleCodegen 2025-02-23 21:23:38 +08:00
DianQK 9431427cc3 Add new_regular and new_allocator to ModuleCodegen 2025-02-23 21:23:38 +08:00
DianQK 1a99ca8da9 The embedded bitcode should always be prepared for LTO/ThinLTO 2025-02-23 21:23:36 +08:00
Manuel Drehwald e2d250c3f6 update autodiff flags 2025-02-21 21:51:20 -05:00
Manuel Drehwald f4e2218b13 clean up autodiff code/comments 2025-02-21 21:47:48 -05:00
Oli Scherer ce7f58bd91 Merge two operations that were always performed together 2025-02-20 11:24:00 +00:00
bjorn3 736ef0a4ce Don't error when adding a staticlib with bitcode files compiled by newer LLVM 2025-02-14 10:54:21 +00:00
clubby789 2966256133 Make -O mean -C opt-level=3 2025-02-13 19:47:55 +00:00
Matthias Krüger 9e89feefb9 Rollup merge of #135549 - oli-obk:push-tmxtpnrloyqu, r=compiler-errors
Document some safety constraints and use more safe wrappers

Lots of unsafe codegen_llvm code has safe wrappers already, so I used some of them and added some where applicable. I stopped here because this diff is large enough and should probably be reviewed independently of other changes.
2025-02-12 06:07:35 +01:00
Oli Scherer dcf1e4d72b Document some safety constraints and use more safe wrappers 2025-02-11 09:47:13 +00:00
Oli Scherer 4b83038d63 Add a safe wrapper for WriteBitcodeToFile 2025-02-11 09:41:22 +00:00
Oli Scherer b2cd1b8ead Remove an unsafe closure invariant by inlining the closure wrapper into the called function 2025-02-11 09:41:22 +00:00
Jacob Pratt 6153a8dcea Rollup merge of #136721 - dpaoliello:cleanllvm2, r=Zalathar
cg_llvm: Reduce visibility of some items outside the `llvm` module

Next piece of #135502

This reduces the visibility of items (other than those in the `llvm` module) so that dead code analysis will correctly identify unused items.
2025-02-11 01:02:40 -05:00
Daniel Paoliello 5f29273921 rustc_codegen_llvm: Mark items as pub(crate) outside of the llvm module 2025-02-10 10:17:25 -08:00
Manuel Drehwald 1221cff551 move second opt run to lto phase and cleanup code 2025-02-10 01:35:22 -05:00
Manuel Drehwald 21d096184e fix non-enzyme builds 2025-02-07 22:27:46 -05:00
Ken Matsui 44e8c43976 rustc_codegen_llvm: remove outdated asm-to-obj codegen note
Remove comment about missing integrated assembler handling, which was
removed in commit 02840ca.
2025-01-22 17:58:50 -05:00
Matthias Krüger 448bad9eba Rollup merge of #133752 - klensy:cp, r=davidtwco
replace copypasted ModuleLlvm::parse

replaced code same as in https://github.com/rust-lang/rust/blob/bd36e69d2533ee750e2d805915b8ca88d2825e0f/compiler/rustc_codegen_llvm/src/lib.rs#L426-L445

except before error message was emitted via `write::llvm_err`, which returned other error kind, but it still ok?
2025-01-13 15:56:55 +01:00
Matthew Maurer fc32dd49cb llvm: Ignore error value that is always false
See llvm/llvm-project#121851

For LLVM 20+, this function (`renameModuleForThinLTO`) has no return
value. For prior versions of LLVM, this never failed, but had a
signature which allowed an error value people were handling.
2025-01-07 01:02:22 +00:00
Manuel Drehwald d753cbf779 upstream rustc_codegen_llvm changes for enzyme/autodiff 2025-01-01 21:42:45 +01:00
Ralf Jung a0dbb37ebd add llvm_floatabi field to target spec that controls FloatABIType 2024-12-30 21:59:05 +01:00
Ralf Jung fff026c8e5 rustc_llvm: expose FloatABIType target machine parameter 2024-12-30 18:10:59 +01:00
Ralf Jung 62bb35ab5d make -Csoft-float have an effect on all ARM targets 2024-12-29 11:10:36 +01:00
Nicholas Nethercote 2620eb42d7 Re-export more rustc_span::symbol things from rustc_span.
`rustc_span::symbol` defines some things that are re-exported from
`rustc_span`, such as `Symbol` and `sym`. But it doesn't re-export some
closely related things such as `Ident` and `kw`. So you can do `use
rustc_span::{Symbol, sym}` but you have to do `use
rustc_span::symbol::{Ident, kw}`, which is inconsistent for no good
reason.

This commit re-exports `Ident`, `kw`, and `MacroRulesNormalizedIdent`,
and changes many `rustc_span::symbol::` qualifiers in `compiler/` to
`rustc_span::`. This is a 200+ net line of code reduction, mostly
because many files with two `use rustc_span` items can be reduced to
one.
2024-12-18 13:38:53 +11:00
bors 903d2976fd Auto merge of #129181 - beetrees:asm-spans, r=pnkfelix,compiler-errors
Pass end position of span through inline ASM cookie

Before this PR, only the start position of the span was passed though the inline ASM cookie to diagnostics. LLVM 19 has full support for 64-bit inline ASM cookies; this PR uses that to pass the end position of the span in the upper 32 bits, meaning inline ASM diagnostics now point at the entire line the error occurred on, not just the first character of it.
2024-12-12 02:34:06 +00:00
Kornel eadea7764e Use c"lit" for CStrings without unwrap 2024-12-02 18:16:36 +00:00
klensy 694950d73c replace copypasted ModuleLlvm::parse 2024-12-02 16:02:46 +03:00
Nikita Popov d3ad000943 Respect verify-llvm-ir option in the backend
We are currently unconditionally verifying the LLVM IR in the
backend (twice), ignoring the value of the verify-llvm-ir option.
2024-11-26 15:26:03 +01:00
beetrees 68227a3777 Pass end position of span through inline ASM cookie 2024-11-26 13:00:08 +00:00
DianQK 3a23669787 embed-bitcode is no longer used in iOS 2024-11-24 15:51:47 +08:00
bjorn3 9e6d2da83d Reduce dependence on the target name
The target name can be anything with custom target specs. Matching on
fields inside the target spec is much more robust than matching on the
target name.
2024-11-03 18:29:01 +00:00
Noratrieb a26450cf81 Rename target triple to target tuple in many places in the compiler
This changes the naming to the new naming, used by `--print
target-tuple`.
It does not change all locations, but many.
2024-11-02 21:29:59 +01:00
Matthias Krüger bb544f863f Rollup merge of #131037 - madsmtm:move-llvm-target-versioning, r=petrochenkov
Move versioned Apple LLVM targets from `rustc_target` to `rustc_codegen_ssa`

Fully specified LLVM targets contain the OS version on macOS/iOS/tvOS/watchOS/visionOS, and this version depends on the deployment target environment variables like `MACOSX_DEPLOYMENT_TARGET`, `IPHONEOS_DEPLOYMENT_TARGET` etc.

We would like to move this to later in the compilation pipeline, both because it feels impure to access environment variables when fetching target information, but mostly because we need access to more information from https://github.com/rust-lang/rust/pull/130883 to do https://github.com/rust-lang/rust/issues/118204. See also https://github.com/rust-lang/rust/pull/129342#issuecomment-2335156119 for some discussion.

The first and second commit does the actual refactor, it should be a non-functional change, the third commit adds diagnostics for invalid deployment targets, which are now possible to do because we have access to the session.

Tested with the same commands as in https://github.com/rust-lang/rust/pull/130435.

r? ``````@petrochenkov``````
2024-11-02 08:33:10 +01:00
Mads Marquart e1233153ac Move versioned LLVM target creation to rustc_codegen_ssa
The OS version depends on the deployment target environment variables,
the access of which we want to move to later in the compilation pipeline
that has access to more information, for example `env_depinfo`.
2024-11-01 17:07:18 +01:00
Zalathar ce3e14a448 Remove support for -Zprofile (gcov-style coverage instrumentation) 2024-10-31 09:09:25 +11:00
Zalathar 65ff2a6ad7 Consistently use safe wrapper function set_section 2024-10-30 11:38:20 +11:00
Matthias Krüger 2707cd670c Rollup merge of #132319 - Zalathar:add-module-flag, r=jieyouxu
cg_llvm: Clean up FFI calls for setting module flags

This is a combination of several inter-related changes to how module flags are set:

- Remove some unnecessary code for setting an `"LTOPostLink"` flag, which has been obsolete since LLVM 17.
- Define our own enum instead of relying on enum values defined by LLVM's unstable C++ API.
- Use safe wrapper functions to set module flags, instead of direct `unsafe` calls.
- Consistently pass pointer/length strings instead of C strings.
- Remove or shrink some `unsafe` blocks.
2024-10-29 18:38:59 +01:00
Zalathar ba81dbf3c6 Don't set unnecessary module flag "LTOPostLink"
This module flag was an internal detail of LLVM's optimization passes, and all
code involving it was removed in LLVM 17.

<https://github.com/llvm/llvm-project/commit/200cc952a28a73687ba24d5334415df6332f2d5b>
2024-10-29 21:17:13 +11:00
Jubilee a70b90b822 Rollup merge of #132216 - klensy:c_uint, r=cuviper
correct LLVMRustCreateThinLTOData arg types

`LLVMRustCreateThinLTOData` defined in rust as
```rust
    pub fn LLVMRustCreateThinLTOData(
        Modules: *const ThinLTOModule,
        NumModules: c_uint,
        PreservedSymbols: *const *const c_char,
        PreservedSymbolsLen: c_uint,
    ) -> Option<&'static mut ThinLTOData>;
```
but in cpp as
```cpp
extern "C" LLVMRustThinLTOData *
LLVMRustCreateThinLTOData(LLVMRustThinLTOModule *modules, int num_modules,
                          const char **preserved_symbols, int num_symbols) {
```

(note `c_unit` vs `int` types). Let it be actually `size_t`.

Also fixes return type of `LLVMRustDIBuilderCreateOpLLVMFragment` to uint64_t as other similar functions around, which should be correct, i assume.
2024-10-29 03:11:42 -07:00
Jubilee 5d0f52efa4 Rollup merge of #131375 - klensy:clone_on_ref_ptr, r=cjgillot
compiler: apply clippy::clone_on_ref_ptr for CI

Apply lint https://rust-lang.github.io/rust-clippy/master/index.html#/clone_on_ref_ptr for compiler, also see https://github.com/rust-lang/rust/pull/131225#discussion_r1790109443.

Some Arc's can be misplaced with Lrc's, sorry.

https://rust-lang.zulipchat.com/#narrow/channel/131828-t-compiler/topic/enable.20more.20clippy.20lints.20for.20compiler.20.28and.5Cor.20std.29
2024-10-29 03:11:39 -07:00