Commit Graph

3410 Commits

Author SHA1 Message Date
Nicholas Nethercote e7d28d3e9b Rename the HashStable* trait/derives as StableHash*.
Specifically:
- `HashStable` -> `StableHash` (trait)
- `HashStable` -> `StableHash` (derive)
- `HashStable_NoContext` -> `StableHash_NoContext` (derive)

Note: there are some names in `compiler/rustc_macros/src/hash_stable.rs`
that are still to be renamed, e.g. `HashStableMode`.

Part of MCP 983.
2026-05-02 10:16:40 +02:00
Nicholas Nethercote 851049cf58 Rename the hash_stable method as stable_hash.
Part of MCP 983.
2026-05-02 10:15:59 +02:00
Jonathan Brouwer 6d8e1250ab Rollup merge of #155995 - Mrmaxmeier:debuginfo-embed-external-source, r=oli-obk
-Zembed-source: also embed external source

Hi,

I've been using `-Zembed-source` for a while and noticed that some sources are not embedded when compiling in release mode. See the minimal reproducer below for missing embedded source code.

The current implementation only emits source from the `src` field in `SourceFile`, but for multi-codegen-unit scenarios the source might only be available via the `external_src` field.

I've updated the LLVM and Cranelift backends to fall back to `external_src` if `src` is not available.

Minimal reproducer:
```console
$ cat Cargo.toml
[package]
name = "repro"
version = "0.0.1"
edition = "2021"
[profile.release]
strip = "none"
debug = true
$ cat src/lib.rs
// marker comment
pub fn foo() -> u64 { 42 }
$ cat src/main.rs
fn main() {
    println!("{}", repro::foo());
}
$ RUSTFLAGS="-g -Zembed-source=yes -Zdwarf-version=5" cargo build -r
   Compiling repro v0.0.1 (/tmp/repro)
    Finished `release` profile [optimized + debuginfo] target(s) in 0.08s
$ # Note: Source for lib.rs is missing here:
$ llvm-dwarfdump --debug-line target/release/repro | grep "marker comment"
$ RUSTFLAGS="-g -Zembed-source=yes -Zdwarf-version=5" cargo +stage1 build -r
    Finished `release` profile [optimized + debuginfo] target(s) in 0.04s
$ llvm-dwarfdump --debug-line target/release/repro | grep "marker comment"
         source: "// marker comment\npub fn foo() -> u64 { 42 }\n"
```

Thanks!
2026-05-01 13:10:36 +02:00
Jacob Pratt 560cc23c95 Rollup merge of #156003 - bjorn3:lto_refactors17, r=lqd
Pass Session to optimize_and_codegen_fat_lto

This is necessary to fix incremental LTO in cg_gcc as well as to do some LTO refactorings I want to do. The actual fix for cg_gcc will be done on the cg_gcc repo to test it in CI.
2026-04-30 22:28:29 -04:00
Jacob Pratt fae8c6edd1 Rollup merge of #155974 - folkertdev:c-variadic-experimental-arch, r=joshtriplett
add `c_variadic_experimental_arch` feature

tracking issue: https://github.com/rust-lang/rust/issues/155973

Based on https://hackmd.io/pIbUgMQuQcGaibJcinOcEw#Stabilize-c-variadic-function-definitions-rust155697, we'll gate niche targets where we don't control the implementation of `va_arg`, the ABI is unclear, or in general where we're not confident stabilizing the implementation.
2026-04-30 22:28:28 -04:00
Jacob Pratt 0649c3f5d5 Rollup merge of #155249 - hoodmane:wasm-panic-in-cleanup-reland, r=bjorn3
Fix: On wasm targets, call `panic_in_cleanup` if panic occurs in cleanup

Relies on https://github.com/rust-lang/llvm-project/pull/194.
Reland of https://github.com/rust-lang/rust/pull/151771.

Previously this was not correctly implemented. Each funclet may need its own terminate block, so this changes the `terminate_block` into a `terminate_blocks` `IndexVec` which can have a terminate_block for each funclet. We key on the first basic block of the funclet -- in particular, this is the start block for the old case of the top level terminate function.

Rather than using a catchswitch/catchpad pair, I used a cleanuppad. The reason for the pair is to avoid catching foreign exceptions on MSVC. On wasm, it seems that the catchswitch/catchpad pair is optimized back into a single cleanuppad and a catch_all instruction is emitted which will catch foreign exceptions. Because the new logic is only used on wasm, it seemed better to take the simpler approach seeing as they do the same thing.

- [ ] Add test for https://github.com/rust-lang/rust/issues/153948
2026-04-30 22:28:24 -04:00
bjorn3 c5e38b741d Pass Session to optimize_and_codegen_fat_lto
This is necessary to fix incremental LTO in cg_gcc as well as to do some
LTO refactorings I want to do. The actual fix for cg_gcc will be done on
the cg_gcc repo to test it in CI.
2026-04-30 14:18:33 +00:00
Mrmaxmeier 84daca4fdb debuginfo: embed external source
This allows embedding source code even when it's only available via
previously emitted metadata files.

Without this, embedded source availability is inconsistent, especially
in release builds.
2026-04-30 12:35:13 +02:00
Folkert de Vries 41796ecf1c add c_variadic_experimental_arch feature 2026-04-30 00:47:30 +02:00
Ralf Jung ad7ddbc08e simd_reduce_min/max: remove float support 2026-04-29 13:48:45 +02:00
Shoyu Vanilla (Flint) 5d26634476 Rollup merge of #152443 - kjetilkjeka:nvptx_drop_support_old_hw_and_isa, r=ZuseZ4
NVPTX: Drop support for old architectures and old ISAs

This is the implementation of [this MCP](https://github.com/rust-lang/compiler-team/issues/965#issuecomment-3837320262)

I believe it was said that no FCP was needed, but if that is incorrect then the FCP is anyway scheduled to finish in 2 days so it can in any case be merged then.
2026-04-29 10:40:46 +09:00
Jonathan Brouwer c6912bf401 Rollup merge of #155692 - fneddy:fix_naked-dead-code-elimination, r=folkertdev
disable naked-dead-code-elimination test if no RET mnemonic is available

this test emit x86_64 specific ret asm instruction and should not be compiled on any other arch.
2026-04-28 20:24:33 +02:00
Hood Chatham b072d24e26 Fix: On wasm targets, call panic_in_cleanup if panic occurs in cleanup
Previously this was not correctly implemented. Each funclet may need its own terminate
block, so this changes the `terminate_block` into a `terminate_blocks` `IndexVec` which
can have a terminate_block for each funclet. We key on the first basic block of the
funclet -- in particular, this is the start block for the old case of the top level
terminate function.

Rather than using a catchswitch/catchpad pair, I used a cleanuppad. The reason for the
pair is to avoid catching foreign exceptions on MSVC. On wasm, it seems that the
catchswitch/catchpad pair is optimized back into a single cleanuppad and a catch_all
instruction is emitted which will catch foreign exceptions. Because the new logic is
only used on wasm, it seemed better to take the simpler approach seeing as they do the
same thing.
2026-04-28 09:56:22 -07:00
Eddy (Eduard) Stefes 2a8e588c90 Add --print=backend-has-mnemonic and needs-asm-mnemonic directive
Add infrastructure to query LLVM backend for specific assembly mnemonics
and use it in compiletest to conditionally run tests based on instruction
availability.

This fixes test failures with naked-dead-code-elimination which requires
the `RET` mnemonic.

Co-authored-by: Folkert de Vries <flokkievids@gmail.com>
2026-04-28 10:21:15 +02:00
Kjetil Kjeka a2d23cec5f NVPTX: Drop support for old hw and old ISAs 2026-04-28 08:48:22 +02:00
Jonathan Brouwer dde4886801 Rollup merge of #146181 - Flakebi:dynamic-shared-memory, r=ZuseZ4,Sa4dus,workingjubilee,RalfJung,nikic,kjetilkjeka,kulst
Add intrinsic for launch-sized workgroup memory on GPUs

Workgroup memory is a memory region that is shared between all
threads in a workgroup on GPUs. Workgroup memory can be allocated
statically or after compilation, when launching a gpu-kernel.
The intrinsic added here returns the pointer to the memory that is
allocated at launch-time.

# Interface

With this change, workgroup memory can be accessed in Rust by
calling the new `gpu_launch_sized_workgroup_mem<T>() -> *mut T`
intrinsic.

It returns the pointer to workgroup memory guaranteeing that it is
aligned to at least the alignment of `T`.
The pointer is dereferencable for the size specified when launching the
current gpu-kernel (which may be the size of `T` but can also be larger
or smaller or zero).

All calls to this intrinsic return a pointer to the same address.

See the intrinsic documentation for more details.

## Alternative Interfaces

It was also considered to expose dynamic workgroup memory as extern
static variables in Rust, like they are represented in LLVM IR.
However, due to the pointer not being guaranteed to be dereferencable
(that depends on the allocated size at runtime), such a global must be
zero-sized, which makes global variables a bad fit.

# Implementation Details

Workgroup memory in amdgpu and nvptx lives in address space 3.
Workgroup memory from a launch is implemented by creating an
external global variable in address space 3. The global is declared with
size 0, as the actual size is only known at runtime. It is defined
behavior in LLVM to access an external global outside the defined size.

There is no similar way to get the allocated size of launch-sized
workgroup memory on amdgpu an nvptx, so users have to pass this
out-of-band or rely on target specific ways for now.

Tracking issue: rust-lang/rust#135516
2026-04-25 23:07:48 +02:00
Flakebi 13ec3de673 Add intrinsic for launch-sized workgroup memory on GPUs
Workgroup memory is a memory region that is shared between all
threads in a workgroup on GPUs. Workgroup memory can be allocated
statically or after compilation, when launching a gpu-kernel.
The intrinsic added here returns the pointer to the memory that is
allocated at launch-time.

# Interface

With this change, workgroup memory can be accessed in Rust by
calling the new `gpu_launch_sized_workgroup_mem<T>() -> *mut T`
intrinsic.

It returns the pointer to workgroup memory guaranteeing that it is
aligned to at least the alignment of `T`.
The pointer is dereferencable for the size specified when launching the
current gpu-kernel (which may be the size of `T` but can also be larger
or smaller or zero).

All calls to this intrinsic return a pointer to the same address.

See the intrinsic documentation for more details.

## Alternative Interfaces

It was also considered to expose dynamic workgroup memory as extern
static variables in Rust, like they are represented in LLVM IR.
However, due to the pointer not being guaranteed to be dereferencable
(that depends on the allocated size at runtime), such a global must be
zero-sized, which makes global variables a bad fit.

# Implementation Details

Workgroup memory in amdgpu and nvptx lives in address space 3.
Workgroup memory from a launch is implemented by creating an
external global variable in address space 3. The global is declared with
size 0, as the actual size is only known at runtime. It is defined
behavior in LLVM to access an external global outside the defined size.

There is no similar way to get the allocated size of launch-sized
workgroup memory on amdgpu an nvptx, so users have to pass this
out-of-band or rely on target specific ways for now.
2026-04-24 10:03:45 +02:00
Folkert de Vries 797059769e c-variadic: fix for sparc64
validated versus https://godbolt.org/z/qrM37rY6n
2026-04-23 14:46:44 +02:00
Folkert de Vries e9ab558406 va_arg: use emit_ptr_va_arg for mips 2026-04-23 01:20:59 +02:00
Jonathan Brouwer e9972d9dec Rollup merge of #155622 - folkertdev:va-arg-llvm-fixes, r=tgross35
c-variadic: `va_arg` fixes

tracking issue: https://github.com/rust-lang/rust/issues/44930

extracts some generic LLVM `va_arg` fixes from https://github.com/rust-lang/rust/pull/152576 and https://github.com/rust-lang/rust/pull/155429.

r? tgross35
2026-04-22 19:18:30 +02:00
Folkert de Vries 5149a31515 change llvm.ptrmask argument to isize
we were saying that the type is i32, but would often provide an i64.
That never failed so far, but starts failing (like, crashing LLVM) when
working with 128-bit values that are 16-byte aligned. So, we may as well
use the more robust approach now.
2026-04-22 10:32:27 +02:00
Folkert de Vries 45f98b9aa7 va_arg: remove unused argument 2026-04-22 00:07:01 +02:00
Folkert de Vries 0169aaea31 va_arg: fix potential misaligned load 2026-04-22 00:06:54 +02:00
Jonathan Brouwer 8ffe107470 Rollup merge of #155036 - bjorn3:lto_refactors16, r=TaKO8Ki
Store a PathBuf rather than SerializedModule for cached modules

In cg_gcc `ModuleBuffer` already only contains a path anyway. And for moving LTO into `-Zlink-only` we will need to serialize `MaybeLtoModules`. By storing a path cached modules we avoid writing them to the disk a second time during serialization of `MaybeLtoModules`.

Some further improvements will require changes to cg_gcc that I would prefer landing in the cg_gcc repo to actually test the LTO changes in CI.

Part of https://github.com/rust-lang/compiler-team/issues/908
2026-04-21 16:53:39 +02:00
Jacob Pratt 4c7e6565ef Rollup merge of #153411 - Sa4dUs:offload-slices, r=ZuseZ4
Offload slice support

This PR allows offload to support slice type arguments.

~NOTE: this is built on top of https://github.com/rust-lang/rust/pull/152283~

r? @ZuseZ4
2026-04-20 20:50:20 -04:00
Marcelo Domínguez af839a8a96 Add offload slice support 2026-04-20 19:15:52 +02:00
Jonathan Brouwer cccf996225 Rollup merge of #155264 - sayantn:amx-autocast, r=dianqk
Add autocast support for `x86amx`

Builds on rust-lang/rust#140763 by further adding autocasts for `x86amx` from/to vectors of size 8192 bits.

This also disables SIMD vector abi checks for the `"unadjusted"` abi because
 - This is primarily used to link with LLVM intrinsics, which don't actually lower to function calls with vector arguments. Even with other cg backends, this is true.
 - This ABI is internal and perma-unstable (and also super specific), so it is very unlikely that this will cause breakages.
 - (The primary reason) Without doing this we can't actually use 8192 bit long vectors to represent `x86amx`

> Why do we need a bypass for `x86amx`? Can't we use a `#[lang_item]` or something?

If `x86amx` was a normal LLVM type, this approach would've worked and I would also prefer it. But LLVM specifies that

> No instruction is allowed for this type. There are no arguments, arrays, pointers, vectors or constants of this type.

So we can't treat it like a normal type at all -- even if we add it like a lang-item, we would still have to special-case everywhere to check if we are passing to the correct LLVM intrinsic, and only then use the `x86amx` type. IMO this is needlessly complex, and way worse than this solution, which just adds it to the autocast list in cg_llvm

r? codegen
2026-04-20 08:14:10 +02:00
bors e22c616e4e Auto merge of #155083 - adwinwhite:introduce-unnormalized, r=lcnr
Introduce `Unnormalized` wrapper




This is the first step of the [eager normalization](https://rust-lang.zulipchat.com/#narrow/channel/364551-t-types.2Ftrait-system-refactor/topic/Eager.20normalization.2C.20ahoy.21/with/582996293) series.

This PR introduce an `Unnormalized` wrapper and make most normalization routines consume it. The purpose is to make normalization explicit. 
This PR contains no behavior change.

API changes are in the first two commit. 
There're some normalization routines left untouched:
- `normalize` in the type checker of borrowck: better do it together with `field.ty()` returning `Unnormalized`.
- `normalize_with_depth`: only used inside the old solver. Can be done later.
- `query_normalize`: rarely used.
- misc local normalization helpers.

The compiler errors are mostly fixed via `ast-grep`, with exceptions handled manually.
2026-04-19 19:18:17 +00:00
Adwin White 6279106e72 fix all errors 2026-04-20 00:18:28 +08:00
Folkert de Vries fa740c7255 refactor llvm va_arg intrinsic validation logic 2026-04-19 12:54:43 +02:00
bors 6f109d8a2d Auto merge of #155223 - teor2345:fndef-refactor, r=mati865
Refactor FnDecl and FnSig non-type fields into a new wrapper type





#### Why this Refactor?

This PR is part of an initial cleanup for the [arg splat experiment](https://github.com/rust-lang/rust/issues/153629), but it's a useful refactor by itself.

It refactors the non-type fields of `FnDecl`, `FnSig`, and `FnHeader` into a new packed wrapper types, based on this comment in the `splat` experiment PR:
https://github.com/rust-lang/rust/pull/153697#discussion_r3004637413

It also refactors some common `FnSig` creation settings into their own methods. I did this instead of creating a struct with defaults.

#### Relationship to `splat` Experiment

I don't think we can use functional struct updates (`..default()`) to create `FnDecl` and `FnSig`, because we need the bit-packing for the `splat` experiment.

Bit-packing will avoid breaking "type is small" assertions for commonly used types when `splat` is added.
This PR packs these types:
- ExternAbi: enum + `unwind` variants (38) -> 6 bits
- ImplicitSelfKind: enum variants (5) -> 3 bits
- lifetime_elision_allowed, safety, c_variadic: bool -> 1 bit

#### Minor Changes

Fixes some typos, and applies rustfmt to clippy files that got skipped somehow.
2026-04-18 23:46:37 +00:00
bors 27dbdb57a2 Auto merge of #155416 - Zalathar:rollup-D1EWnrR, r=Zalathar
Rollup of 19 pull requests

Successful merges:

 - rust-lang/rust#141633 (Suggest to bind `self.x` to `x` when field `x` may be in format string)
 - rust-lang/rust#152980 (c-variadic: fix implementation on `avr`)
 - rust-lang/rust#154491 (Extend `core::char`'s documentation of casing issues (and fix a rustdoc bug))
 - rust-lang/rust#155318 (Use mutable pointers for Unix path buffers)
 - rust-lang/rust#155335 (Bump bootstrap to 1.96 beta)
 - rust-lang/rust#155354 (Remove AttributeSafety from BUILTIN_ATTRIBUTES)
 - rust-lang/rust#154970 (rustdoc: preserve `doc(cfg)` on locally re-exported type aliases)
 - rust-lang/rust#155095 (changed the information provided by (mut x) to mut x (Fix 155030))
 - rust-lang/rust#155305 (Make `convert_while_ascii` unsafe)
 - rust-lang/rust#155358 (ImproperCTypes: Move erasing_region_normalisation into helper function)
 - rust-lang/rust#155377 (tests/debuginfo/basic-stepping.rs: Remove FIXME related to ZSTs)
 - rust-lang/rust#155383 (Rearrange `rustc_ast_pretty`)
 - rust-lang/rust#155384 (triagebot: notify on diagnostic attribute changes)
 - rust-lang/rust#155386 (Use `box_new` diagnostic item for Box::new suggestions)
 - rust-lang/rust#155391 (Small refactor of `QueryJob::latch` method)
 - rust-lang/rust#155395 (Tweak how the "copy path" rustdoc button works to allow some accessibility tool to work with rustdoc)
 - rust-lang/rust#155396 (`as_ref_unchecked` docs link fix)
 - rust-lang/rust#155411 (compiletest: Remove the `//@ should-ice` directive)
 - rust-lang/rust#155413 (fix: typo in `std::fs::hard_link` documentation)
2026-04-17 07:35:43 +00:00
Dominik Schwaiger da2bbfbbec add llvm writable attribute conditionally 2026-04-16 12:29:39 +00:00
Folkert de Vries b8ba4002f5 c-variadic: handle c_int being i16 and c_double being f32 on avr 2026-04-16 11:48:02 +02:00
teor dafb6bb801 Refactor FnDecl and FnSig flags into packed structs 2026-04-16 07:08:08 +10:00
Jacob Pratt 803a7227fb Rollup merge of #155005 - folkertdev:simd-element-type-llvm, r=nnethercote
preserve SIMD element type information

Preserve the SIMD element type and provide it to LLVM for better optimization.

This is relevant for AArch64 types like `int16x4x2_t`, see also https://github.com/llvm/llvm-project/issues/181514. Such types are defined like so:

```rust
#[repr(simd)]
struct int16x4_t([i16; 4]);

#[repr(C)]
struct int16x4x2_t(pub int16x4_t, pub int16x4_t);
```

Previously this would be translated to the opaque `[2 x <8 x i8>]`, with this PR it is instead `[2 x <4 x i16>]`. That change is not relevant for the ABI, but using the correct type prevents bitcasts that can (indeed, do) confuse the LLVM pattern matcher.

This change will make it possible to implement the deinterleaving loads on AArch64 in a portable way (without neon-specific intrinsics), which means that e.g. Miri or the cranelift backend can run them without additional support.

discussion at [#t-compiler > loss of vector element type information](https://rust-lang.zulipchat.com/#narrow/channel/131828-t-compiler/topic/loss.20of.20vector.20element.20type.20information/with/584483611)
2026-04-14 00:37:24 -04:00
sayantn 7e24cd823d Add autocast for x86_amx 2026-04-14 03:25:02 +05:30
bors 17584a1819 Auto merge of #155253 - JonathanBrouwer:rollup-lERdTAB, r=JonathanBrouwer
Rollup of 19 pull requests

Successful merges:

 - rust-lang/rust#155162 (relnotes for 1.95)
 - rust-lang/rust#140763 (Change codegen of LLVM intrinsics to be name-based, and add llvm linkage support for `bf16(xN)` and `i1xN`)
 - rust-lang/rust#153604 (Fix thread::available_parallelism on WASI targets with threads)
 - rust-lang/rust#154193 (Implement EII for statics)
 - rust-lang/rust#154389 (Add more robust handling of nested query cycles)
 - rust-lang/rust#154435 (resolve: Some import resolution cleanups)
 - rust-lang/rust#155236 (Normalize individual predicate of `InstantiatedPredicates` inside `predicates_for_generics`)
 - rust-lang/rust#155243 (cg_ssa: transmute between scalable vectors)
 - rust-lang/rust#153941 (tests/debuginfo/basic-stepping.rs: Explain why all lines are not steppable)
 - rust-lang/rust#154587 (Add --verbose-run-make-subprocess-output flag to suppress verbose run-make output for passing tests)
 - rust-lang/rust#154624 (Make `DerefPure` dyn-incompatible)
 - rust-lang/rust#154929 (Add `const Default` impls for `LazyCell` and `LazyLock`)
 - rust-lang/rust#154944 (Small refactor of `arena_cache` query values)
 - rust-lang/rust#155055 (UI automation)
 - rust-lang/rust#155062 (Move tests from `tests/ui/issues/` to appropriate directories)
 - rust-lang/rust#155131 (Stabilize feature `uint_bit_width`)
 - rust-lang/rust#155147 (Stabilize feature `int_lowest_highest_one`)
 - rust-lang/rust#155174 (Improve emission of `UnknownDiagnosticAttribute` lint)
 - rust-lang/rust#155194 (Fix manpage version replacement and use verbose version)
2026-04-13 18:32:47 +00:00
Jonathan Brouwer d2fa85cd4d Rollup merge of #154193 - JonathanBrouwer:external-static, r=jdonszelmann
Implement EII for statics

This PR implements EII for statics. I've tried to mirror the implementation for functions in a few places, this causes some duplicate code but I'm also not really sure whether there's a clean way to merge the implementations.

This does not implement defaults for static EIIs yet, I will do that in a followup PR
2026-04-13 20:19:56 +02:00
Jonathan Brouwer 2f607ee662 Rollup merge of #153997 - nnethercote:closure-consistency, r=petrochenkov
Use closures more consistently in `dep_graph.rs`.

This file has several methods that take a `FnOnce() -> R` closure:
- `DepGraph::with_ignore`
- `DepGraph::with_query_deserialization`
- `DepGraph::with_anon_task`
- `DepGraphData::with_anon_task_inner`

It also has two methods that take a faux closure via an `A` argument and a `fn(TyCtxt<'tcx>, A) -> R` argument:
- DepGraph::with_task
- DepGraphData::with_task

The rationale is that the faux closure exercises tight control over what state they have access to. This seems silly when (a) they are passed a `TyCtxt`, and (b) when similar nearby functions take real closures. And they are more awkward to use, e.g. requiring multiple arguments to be gathered into a tuple. This commit changes the faux closures to real closures.

r? @Zalathar
2026-04-13 14:02:35 +02:00
Folkert de Vries 6f428df8df preseve SIMD element type information
and provide it to LLVM for better optimization
2026-04-13 13:26:50 +02:00
David Wood 62ffc89914 cg_llvm: scalable vectors with simd_select
Building on the previous change, support scalable vectors with
`simd_select`. Previous patches already landed the necessary changes in
the implementation of this intrinsic, but didn't allow scalable vector
arguments to be passed in.
2026-04-13 01:57:55 +00:00
David Wood ca6a85155b cg_llvm: replace sve_cast with simd_cast
Previously `sve_cast`'s implementation was abstracted to power both
`sve_cast` and `simd_cast` which supported scalable and non-scalable
vectors respectively. In anticipation of having to do this for another
`simd_*` intrinsic, `sve_cast` is removed and `simd_cast` is changed to
accept both scalable and non-scalable intrinsics, an approach that will
scale better to the other intrinsics.
2026-04-13 01:57:55 +00:00
sayantn 11f350da38 Add autocast for bf16 and bf16xN 2026-04-12 23:33:27 +05:30
sayantn 5aa800af80 Add autocast for i1 vectors 2026-04-12 23:33:27 +05:30
sayantn 3d89a5be50 Add autocasts for structs 2026-04-12 23:33:27 +05:30
sayantn a5372be2a1 Add target arch verification for LLVM intrinsics 2026-04-12 23:33:27 +05:30
sayantn c21f4ee437 Check for AutoUpgraded intrinsics, and lint on uses of deprecated intrinsics 2026-04-12 23:33:15 +05:30
sayantn aa1a5f8a93 Codegen non-overloaded LLVM intrinsics using their name 2026-04-12 19:00:17 +05:30
Jonathan Brouwer c7e194f215 Fix codegen for add_static_aliases 2026-04-11 10:17:44 +02:00