Rollup merge of #140763 - sayantn:test-amx, r=dianqk

Change codegen of LLVM intrinsics to be name-based, and add llvm linkage support for `bf16(xN)` and `i1xN`

*[View all comments](https://triagebot.infra.rust-lang.org/gh-comments/rust-lang/rust/pull/140763)*

This PR changes how calls to LLVM intrinsics are codegenned

# Explanation of the changes

## Current procedure

This is the same for all functions; LLVM intrinsics are _not_ treated specially
 - We derive the LLVM type of a function directly from its Rust signature. For example, the following function
   ```rust
   #[link_name = "llvm.sqrt.f32"]
   fn sqrtf32(a: f32) -> f32;
   ```
   will get the LLVM type `f32 (f32)`, derived purely from the Rust signature

### Pros

 - Simpler to implement, no extra complexity involved due to LLVM intrinsics

### Cons

 - LLVM intrinsics have a well-defined signature, determined entirely by their name (and, if overloaded, their type parameters). So this process of converting Rust signatures to LLVM signatures may not produce the correct type. For example, the following code generates LLVM IR without any complaint
   ```rust
   #[link_name = "llvm.sqrt.f32"]
   fn sqrtf32(a: i32) -> f32;
   ```
   but the generated LLVM IR is invalid, because it has the wrong signature for the intrinsic ([Godbolt](https://godbolt.org/z/6ff9hrcd5); adding `-Zverify-llvm-ir` makes compilation fail). I would expect this code to not compile at all instead of generating invalid IR.
 - LLVM intrinsics that have types in their signature that can't be accessed from Rust (notable examples are the AMX intrinsics that use the `x86amx` type, and (almost) all intrinsics that take vectors of `i1`) can't be linked to at all. This is a (major?) roadblock for AMX and AVX512 support in stdarch.
 - If code uses a non-existent LLVM intrinsic, even `-Zverify-llvm-ir` won't complain; the build only fails later with an undefined-symbol error (courtesy of the linker). I don't think this is behavior we want.

## What this PR does

 - When linking to **non-overloaded** intrinsics, we use the function `LLVMIntrinsicGetType` to directly get the function type of the intrinsic from LLVM.
 - We then use this LLVM definition to _verify_ the Rust signature, and emit a proper error if it doesn't match, instead of silently emitting invalid IR.
 - Lint if linking to deprecated or invalid LLVM intrinsics

> [!NOTE]
> This PR only focuses on non-overloaded intrinsics; overloaded intrinsics can be handled in a future PR

Regardless, the following checks work for **all** intrinsics:

 - If we can't find the intrinsic, we check if it has been `AutoUpgrade`d by LLVM. If not, that means it is an invalid intrinsic, and we error out.
 - Intrinsics from other architectures can no longer be declared, e.g. we error out if an AArch64 intrinsic is declared while compiling for x86
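As an illustration of this arch check (a simplified sketch, not the PR's exact code), the architecture component can be read straight off the intrinsic name, since target-specific intrinsics are named `llvm.<arch>.<rest>`:

```rust
/// Extract the architecture component of a target-specific intrinsic name.
/// Simplified sketch: we skip the `llvm.` prefix and take everything up to
/// the next `.`. Non-target-specific names like `llvm.assume` yield `None`
/// here; the real check only runs for intrinsics LLVM reports as
/// target-specific.
fn intrinsic_arch(name: &str) -> Option<&str> {
    let rest = name.strip_prefix("llvm.")?;
    let (arch, _) = rest.split_once('.')?;
    Some(arch)
}

fn main() {
    // Declaring this while compiling for x86 would now be an error.
    assert_eq!(intrinsic_arch("llvm.aarch64.neon.smax.v4i32"), Some("aarch64"));
    assert_eq!(intrinsic_arch("llvm.x86.avx512.kadd.b"), Some("x86"));
}
```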

### Pros

 - It is now not possible (or at least, it would require _significantly_ more leaps and bounds) to introduce invalid IR using **non-overloaded** LLVM intrinsics.
 - As we are now doing the matching of Rust signatures to LLVM intrinsics ourselves, we can now add bypasses to enable linking to such non-Rust types (e.g. matching 8192-bit vectors to `x86amx` and injecting `llvm.x86.cast.vector.to.tile` and `llvm.x86.cast.tile.to.vector`s at the call site)

> [!NOTE]
> I don't intend for these bypasses to be permanent. A better approach would be introducing a `bf16` type in Rust, and allowing `repr(simd)` with `bool`s to get Rust-native `i1xN`s. These are meant to be short-term, as I mentioned, "bypass"es. They shouldn't cause any major breakage even if removed, as `link_llvm_intrinsics` is perma-unstable.

   This PR adds bypasses for `bf16` (via `i16`), `bf16xN` (via `i16xN`) and `i1xN` (via `iM`, where `M` is the smallest power of 2 s.t. `M >= N`, unless `N <= 4`, in which case we use `M = 8`). This will unblock AVX512-VP2INTERSECT and a lot of bf16 intrinsics in stdarch. This PR also automatically destructures structs if the types don't exactly match (this is required for us to start emitting hard errors on mismatches).
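The width rule for `i1xN` described above is just `next_power_of_two` clamped to at least 8; a minimal sketch:

```rust
/// Width `M` of the integer that stands in for an `i1xN` mask: the smallest
/// power of two >= N, clamped to at least 8 (so N <= 4 uses M = 8).
fn mask_int_width(n: u64) -> u64 {
    n.next_power_of_two().max(8)
}

fn main() {
    assert_eq!(mask_int_width(2), 8);   // <2 x i1>  passed as i8
    assert_eq!(mask_int_width(8), 8);   // <8 x i1>  passed as i8
    assert_eq!(mask_int_width(16), 16); // <16 x i1> passed as i16
    assert_eq!(mask_int_width(33), 64); // rounds up to the next power of two
}
```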

### Cons

 - This only works for non-overloaded intrinsics (at least for now). Improving this to work with overloaded intrinsics too will involve significantly more work.

# Possible ways to extend this to overloaded intrinsics (future)

## Parse the mangled intrinsic name to get the type parameters

LLVM has a stable mangling of intrinsic names with type parameters (in `LLVMIntrinsicCopyOverloadedName2`), so we can parse the name to get the type parameters, and then just do the same thing.
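For the common case this parsing is straightforward; here is a hedged sketch (not LLVM's actual demangler, which also encodes structs, pointers, and `TargetExt` types into the suffix):

```rust
/// Given a known intrinsic base name, recover the type-parameter suffixes
/// from a mangled name. Naive sketch that only handles simple `.`-separated
/// suffixes like `v4i32` or `i64`.
fn type_params<'a>(mangled: &'a str, base: &str) -> Option<Vec<&'a str>> {
    let rest = mangled.strip_prefix(base)?;
    if rest.is_empty() {
        return Some(vec![]); // non-overloaded intrinsic, no type parameters
    }
    Some(rest.strip_prefix('.')?.split('.').collect())
}

fn main() {
    assert_eq!(type_params("llvm.ctlz.v4i32", "llvm.ctlz"), Some(vec!["v4i32"]));
    assert_eq!(
        type_params("llvm.memcpy.p0.p0.i64", "llvm.memcpy"),
        Some(vec!["p0", "p0", "i64"])
    );
    assert_eq!(type_params("llvm.assume", "llvm.assume"), Some(vec![]));
}
```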

### Pros
 - For _most_ intrinsics, this will work perfectly, and is an easy way to implement this.

### Cons
 - The LLVM mangling is not perfectly reversible. When we have `TargetExt` types or identified structs, their name is a part of the mangling, making it impossible to reverse. Even more complexities arise when there are unnamed identified structs, as LLVM adds more mangling to the names.
 - @nikic's work on LLVM intrinsics will remove the name mangling, making this approach impossible.

## Use the `IITDescriptor` table and the Rust function signature

We can use the base name to get the `IITDescriptor`s of the corresponding intrinsic, and then manually implement the _matching_ logic based on the Rust signature.

### Pros

 - Doesn't have the above-mentioned limitation of the parsing approach; it behaves correctly even when there are identified structs and `TargetExt` types. Also, fun fact: Rust exports all struct types as literal structs (unless it is emitting textual LLVM IR, in which case it always uses named identified structs with mangled names)

### Cons

 - **Doesn't** actually use the type parameters in the name, only the base name and the Rust signature, to get the LLVM signature (although we _can_ check that it is the correct name). It means there would be no way to (for example) link against `llvm.sqrt.bf16` until we have `bf16` types in Rust. Because if we are using `u16`s (or any other type) as `bf16`s, then the matcher will deduce that the signature is `u16 (u16)` not `bf16 (bf16)` (which would lead to an error because `u16` is not a valid type parameter for `llvm.sqrt`), even though the intended type parameter is specified in the name.
 - Much more complex, and hard to maintain as LLVM gets new `IITDescriptorKind`s

These two approaches might give different results for the same function. Consider:
```rust
#[link_name = "llvm.is.constant.bf16"]
fn foo(a: u16) -> bool;
```
The name-based approach will decide that the type parameter is `bf16`, and the LLVM signature is `i1 (bf16)`, and will inject some bitcasts at the call site.
The `IITDescriptor`-based approach will decide that the LLVM signature is `i1 (u16)`, and will see that the name given doesn't match the expected name (`llvm.is.constant.u16`), and will error out.
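The expected-name check in the second approach can be sketched like this (hypothetical helper name; the real matcher works off `IITDescriptor`s and LLVM's own mangling):

```rust
/// Rebuild the mangled name from the base name and the type parameter the
/// matcher deduced from the Rust signature, then compare it to the name the
/// user actually linked against.
fn expected_name(base: &str, deduced_param: &str) -> String {
    format!("{base}.{deduced_param}")
}

fn main() {
    // The matcher deduces `u16` from the Rust signature, so the rebuilt name
    // does not match the `bf16` the user wrote, and we would error out.
    let linked = "llvm.is.constant.bf16";
    assert_ne!(expected_name("llvm.is.constant", "u16"), linked);
    assert_eq!(expected_name("llvm.sqrt", "f32"), "llvm.sqrt.f32");
}
```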

Reviews are welcome, as this is my first time _actually_ contributing to `rustc`.

@rustbot label T-compiler A-codegen A-LLVM
r? codegen
This commit is contained in:
Jonathan Brouwer
2026-04-13 20:19:54 +02:00
committed by GitHub
18 changed files with 602 additions and 29 deletions
@@ -211,3 +211,32 @@ pub(crate) struct FixedX18InvalidArch<'a> {
"enabling both `-Zpacked-stack` and the `backchain` target feature is incompatible with the default s390x ABI. Switch to s390x-unknown-none-softfloat if you need both attributes"
)]
pub(crate) struct PackedStackBackchainNeedsSoftfloat;
#[derive(Diagnostic)]
#[diag(
"intrinsic signature mismatch for `{$name}`: expected signature `{$llvm_fn_ty}`, found `{$rust_fn_ty}`"
)]
pub(crate) struct IntrinsicSignatureMismatch<'a> {
pub name: &'a str,
pub llvm_fn_ty: &'a str,
pub rust_fn_ty: &'a str,
#[primary_span]
pub span: Span,
}
#[derive(Diagnostic)]
#[diag("unknown LLVM intrinsic `{$name}`")]
pub(crate) struct UnknownIntrinsic<'a> {
pub name: &'a str,
#[primary_span]
pub span: Span,
}
#[derive(Diagnostic)]
#[diag("intrinsic `{$name}` cannot be used with target arch `{$target_arch}`")]
pub(crate) struct IntrinsicWrongArch<'a> {
pub name: &'a str,
pub target_arch: &'a str,
#[primary_span]
pub span: Span,
}
@@ -1,6 +1,6 @@
use std::cmp::Ordering;
use std::ffi::c_uint;
use std::{assert_matches, ptr};
use std::{assert_matches, iter, ptr};
use rustc_abi::{
Align, BackendRepr, ExternAbi, Float, HasDataLayout, NumScalableVectors, Primitive, Size,
@@ -21,10 +21,11 @@
use rustc_middle::ty::{self, GenericArgsRef, Instance, SimdAlign, Ty, TyCtxt, TypingEnv};
use rustc_middle::{bug, span_bug};
use rustc_session::config::CrateType;
use rustc_session::lint::builtin::DEPRECATED_LLVM_INTRINSIC;
use rustc_span::{Span, Symbol, sym};
use rustc_symbol_mangling::{mangle_internal_symbol, symbol_name_for_instance_in_crate};
use rustc_target::callconv::PassMode;
use rustc_target::spec::Os;
use rustc_target::spec::{Arch, Os};
use tracing::debug;
use crate::abi::FnAbiLlvmExt;
@@ -36,7 +37,8 @@
use crate::context::CodegenCx;
use crate::declare::declare_raw_fn;
use crate::errors::{
AutoDiffWithoutEnable, AutoDiffWithoutLto, OffloadWithoutEnable, OffloadWithoutFatLTO,
AutoDiffWithoutEnable, AutoDiffWithoutLto, IntrinsicSignatureMismatch, IntrinsicWrongArch,
OffloadWithoutEnable, OffloadWithoutFatLTO, UnknownIntrinsic,
};
use crate::llvm::{self, Type, Value};
use crate::type_of::LayoutLlvmExt;
@@ -818,7 +820,7 @@ fn codegen_llvm_intrinsic_call(
&mut self,
instance: ty::Instance<'tcx>,
args: &[OperandRef<'tcx, Self::Value>],
is_cleanup: bool,
_is_cleanup: bool,
) -> Self::Value {
let tcx = self.tcx();
@@ -847,42 +849,29 @@ fn codegen_llvm_intrinsic_call(
llargument_tys.push(arg_layout.immediate_llvm_type(self));
}
let fn_ty = self.type_func(&llargument_tys, llreturn_ty);
let fn_ptr = if let Some(&llfn) = self.intrinsic_instances.borrow().get(&instance) {
llfn
} else {
let sym = tcx.symbol_name(instance).name;
// FIXME use get_intrinsic
let llfn = if let Some(llfn) = self.get_declared_value(sym) {
llfn
} else {
// Function addresses in Rust are never significant, allowing functions to
// be merged.
let llfn = declare_raw_fn(
self,
sym,
llvm::CCallConv,
llvm::UnnamedAddr::Global,
llvm::Visibility::Default,
fn_ty,
);
llfn
intrinsic_fn(self, sym, llreturn_ty, llargument_tys, instance)
};
self.intrinsic_instances.borrow_mut().insert(instance, llfn);
llfn
};
let fn_ty = self.get_type_of_global(fn_ptr);
let mut llargs = vec![];
for arg in args {
match arg.val {
OperandValue::ZeroSized => {}
OperandValue::Immediate(_) => llargs.push(arg.immediate()),
OperandValue::Immediate(a) => llargs.push(a),
OperandValue::Pair(a, b) => {
llargs.push(a);
llargs.push(b);
@@ -908,24 +897,38 @@ fn codegen_llvm_intrinsic_call(
}
debug!("call intrinsic {:?} with args ({:?})", instance, llargs);
let args = self.check_call("call", fn_ty, fn_ptr, &llargs);
for (dest_ty, arg) in iter::zip(self.func_params_types(fn_ty), &mut llargs) {
let src_ty = self.val_ty(arg);
assert!(
can_autocast(self, src_ty, dest_ty),
"Cannot match `{dest_ty:?}` (expected) with `{src_ty:?}` (found) in `{fn_ptr:?}`"
);
*arg = autocast(self, arg, src_ty, dest_ty);
}
let llret = unsafe {
llvm::LLVMBuildCallWithOperandBundles(
self.llbuilder,
fn_ty,
fn_ptr,
args.as_ptr() as *const &llvm::Value,
args.len() as c_uint,
llargs.as_ptr(),
llargs.len() as c_uint,
ptr::dangling(),
0,
c"".as_ptr(),
)
};
if is_cleanup {
self.apply_attrs_to_cleanup_callsite(llret);
}
llret
let src_ty = self.val_ty(llret);
let dest_ty = llreturn_ty;
assert!(
can_autocast(self, dest_ty, src_ty),
"Cannot match `{src_ty:?}` (expected) with `{dest_ty:?}` (found) in `{fn_ptr:?}`"
);
autocast(self, llret, src_ty, dest_ty)
}
fn abort(&mut self) {
@@ -976,6 +979,239 @@ fn va_end(&mut self, va_list: &'ll Value) -> &'ll Value {
}
}
fn llvm_arch_for(rust_arch: &Arch) -> Option<&'static str> {
Some(match rust_arch {
Arch::AArch64 | Arch::Arm64EC => "aarch64",
Arch::AmdGpu => "amdgcn",
Arch::Arm => "arm",
Arch::Bpf => "bpf",
Arch::Hexagon => "hexagon",
Arch::LoongArch32 | Arch::LoongArch64 => "loongarch",
Arch::Mips | Arch::Mips32r6 | Arch::Mips64 | Arch::Mips64r6 => "mips",
Arch::Nvptx64 => "nvvm",
Arch::PowerPC | Arch::PowerPC64 => "ppc",
Arch::RiscV32 | Arch::RiscV64 => "riscv",
Arch::S390x => "s390",
Arch::SpirV => "spv",
Arch::Wasm32 | Arch::Wasm64 => "wasm",
Arch::X86 | Arch::X86_64 => "x86",
_ => return None, // fallback for unknown archs
})
}
fn can_autocast<'ll>(cx: &CodegenCx<'ll, '_>, rust_ty: &'ll Type, llvm_ty: &'ll Type) -> bool {
if rust_ty == llvm_ty {
return true;
}
match cx.type_kind(llvm_ty) {
// Some LLVM intrinsics return **non-packed** structs, but they can't be mimicked from Rust
// due to auto field-alignment in non-packed structs (packed structs are represented in LLVM
// as, well, packed structs, so they won't match with those either)
TypeKind::Struct if cx.type_kind(rust_ty) == TypeKind::Struct => {
let rust_element_tys = cx.struct_element_types(rust_ty);
let llvm_element_tys = cx.struct_element_types(llvm_ty);
if rust_element_tys.len() != llvm_element_tys.len() {
return false;
}
iter::zip(rust_element_tys, llvm_element_tys).all(
|(rust_element_ty, llvm_element_ty)| {
can_autocast(cx, rust_element_ty, llvm_element_ty)
},
)
}
TypeKind::Vector => {
let llvm_element_ty = cx.element_type(llvm_ty);
let element_count = cx.vector_length(llvm_ty) as u64;
if llvm_element_ty == cx.type_bf16() {
rust_ty == cx.type_vector(cx.type_i16(), element_count)
} else if llvm_element_ty == cx.type_i1() {
let int_width = element_count.next_power_of_two().max(8);
rust_ty == cx.type_ix(int_width)
} else {
false
}
}
TypeKind::BFloat => rust_ty == cx.type_i16(),
_ => false,
}
}
fn autocast<'ll>(
bx: &mut Builder<'_, 'll, '_>,
val: &'ll Value,
src_ty: &'ll Type,
dest_ty: &'ll Type,
) -> &'ll Value {
if src_ty == dest_ty {
return val;
}
match (bx.type_kind(src_ty), bx.type_kind(dest_ty)) {
// re-pack structs
(TypeKind::Struct, TypeKind::Struct) => {
let mut ret = bx.const_poison(dest_ty);
for (idx, (src_element_ty, dest_element_ty)) in
iter::zip(bx.struct_element_types(src_ty), bx.struct_element_types(dest_ty))
.enumerate()
{
let elt = bx.extract_value(val, idx as u64);
let casted_elt = autocast(bx, elt, src_element_ty, dest_element_ty);
ret = bx.insert_value(ret, casted_elt, idx as u64);
}
ret
}
// cast from the i1xN vector type to the primitive type
(TypeKind::Vector, TypeKind::Integer) if bx.element_type(src_ty) == bx.type_i1() => {
let vector_length = bx.vector_length(src_ty) as u64;
let int_width = vector_length.next_power_of_two().max(8);
let val = if vector_length == int_width {
val
} else {
// zero-extends vector
let shuffle_indices = match vector_length {
0 => unreachable!("zero length vectors are not allowed"),
1 => vec![0, 1, 1, 1, 1, 1, 1, 1],
2 => vec![0, 1, 2, 2, 2, 2, 2, 2],
3 => vec![0, 1, 2, 3, 3, 3, 3, 3],
4.. => (0..int_width as i32).collect(),
};
let shuffle_mask =
shuffle_indices.into_iter().map(|i| bx.const_i32(i)).collect::<Vec<_>>();
bx.shuffle_vector(val, bx.const_null(src_ty), bx.const_vector(&shuffle_mask))
};
bx.bitcast(val, dest_ty)
}
// cast from the primitive type to the i1xN vector type
(TypeKind::Integer, TypeKind::Vector) if bx.element_type(dest_ty) == bx.type_i1() => {
let vector_length = bx.vector_length(dest_ty) as u64;
let int_width = vector_length.next_power_of_two().max(8);
let intermediate_ty = bx.type_vector(bx.type_i1(), int_width);
let intermediate = bx.bitcast(val, intermediate_ty);
if vector_length == int_width {
intermediate
} else {
let shuffle_mask: Vec<_> =
(0..vector_length).map(|i| bx.const_i32(i as i32)).collect();
bx.shuffle_vector(
intermediate,
bx.const_poison(intermediate_ty),
bx.const_vector(&shuffle_mask),
)
}
}
_ => bx.bitcast(val, dest_ty), // for `bf16(xN)` <-> `u16(xN)`
}
}
fn intrinsic_fn<'ll, 'tcx>(
bx: &Builder<'_, 'll, 'tcx>,
name: &str,
rust_return_ty: &'ll Type,
rust_argument_tys: Vec<&'ll Type>,
instance: ty::Instance<'tcx>,
) -> &'ll Value {
let tcx = bx.tcx;
let rust_fn_ty = bx.type_func(&rust_argument_tys, rust_return_ty);
let intrinsic = llvm::Intrinsic::lookup(name.as_bytes());
if let Some(intrinsic) = intrinsic
&& intrinsic.is_target_specific()
{
let (llvm_arch, _) = name[5..].split_once('.').unwrap();
let rust_arch = &tcx.sess.target.arch;
if let Some(correct_llvm_arch) = llvm_arch_for(rust_arch)
&& llvm_arch != correct_llvm_arch
{
tcx.dcx().emit_fatal(IntrinsicWrongArch {
name,
target_arch: rust_arch.desc(),
span: tcx.def_span(instance.def_id()),
});
}
}
if let Some(intrinsic) = intrinsic
&& !intrinsic.is_overloaded()
{
// FIXME: also do this for overloaded intrinsics
let llfn = intrinsic.get_declaration(bx.llmod, &[]);
let llvm_fn_ty = bx.get_type_of_global(llfn);
let llvm_return_ty = bx.get_return_type(llvm_fn_ty);
let llvm_argument_tys = bx.func_params_types(llvm_fn_ty);
let llvm_is_variadic = bx.func_is_variadic(llvm_fn_ty);
let is_correct_signature = !llvm_is_variadic
&& rust_argument_tys.len() == llvm_argument_tys.len()
&& iter::once((rust_return_ty, llvm_return_ty))
.chain(iter::zip(rust_argument_tys, llvm_argument_tys))
.all(|(rust_ty, llvm_ty)| can_autocast(bx, rust_ty, llvm_ty));
if !is_correct_signature {
tcx.dcx().emit_fatal(IntrinsicSignatureMismatch {
name,
llvm_fn_ty: &format!("{llvm_fn_ty:?}"),
rust_fn_ty: &format!("{rust_fn_ty:?}"),
span: tcx.def_span(instance.def_id()),
});
}
return llfn;
}
// Function addresses in Rust are never significant, allowing functions to be merged.
let llfn = declare_raw_fn(
bx,
name,
llvm::CCallConv,
llvm::UnnamedAddr::Global,
llvm::Visibility::Default,
rust_fn_ty,
);
if intrinsic.is_none() {
let mut new_llfn = None;
let can_upgrade = unsafe { llvm::LLVMRustUpgradeIntrinsicFunction(llfn, &mut new_llfn) };
if !can_upgrade {
// This is either plain wrong, or this can be caused by incompatible LLVM versions
tcx.dcx().emit_fatal(UnknownIntrinsic { name, span: tcx.def_span(instance.def_id()) });
} else if let Some(def_id) = instance.def_id().as_local() {
// we can emit diagnostics only for local crates
let hir_id = tcx.local_def_id_to_hir_id(def_id);
// not all intrinsics are upgraded to some other intrinsics, most are upgraded to instruction sequences
let msg = if let Some(new_llfn) = new_llfn {
format!(
"using deprecated intrinsic `{name}`, `{}` can be used instead",
str::from_utf8(&llvm::get_value_name(new_llfn)).unwrap()
)
} else {
format!("using deprecated intrinsic `{name}`")
};
tcx.emit_node_lint(
DEPRECATED_LLVM_INTRINSIC,
hir_id,
rustc_errors::DiagDecorator(|d| {
d.primary_message(msg).span(tcx.hir_span(hir_id));
}),
);
}
}
llfn
}
fn catch_unwind_intrinsic<'ll, 'tcx>(
bx: &mut Builder<'_, 'll, 'tcx>,
try_func: &'ll Value,
@@ -73,7 +73,6 @@ pub(crate) fn LLVMRustGetFunctionCall(
pub(crate) fn LLVMDumpModule(M: &Module);
pub(crate) fn LLVMDumpValue(V: &Value);
pub(crate) fn LLVMGetFunctionCallConv(F: &Value) -> c_uint;
pub(crate) fn LLVMGetReturnType(T: &Type) -> &Type;
pub(crate) fn LLVMGetParams(Fnc: &Value, params: *mut &Value);
pub(crate) fn LLVMGetNamedFunction(M: &Module, Name: *const c_char) -> Option<&Value>;
}
@@ -921,6 +921,9 @@ pub(crate) fn LLVMGetInlineAsm<'ll>(
pub(crate) fn LLVMDoubleTypeInContext(C: &Context) -> &Type;
pub(crate) fn LLVMFP128TypeInContext(C: &Context) -> &Type;
// Operations on non-IEEE real types
pub(crate) fn LLVMBFloatTypeInContext(C: &Context) -> &Type;
// Operations on function types
pub(crate) fn LLVMFunctionType<'a>(
ReturnType: &'a Type,
@@ -930,6 +933,8 @@ pub(crate) fn LLVMFunctionType<'a>(
) -> &'a Type;
pub(crate) fn LLVMCountParamTypes(FunctionTy: &Type) -> c_uint;
pub(crate) fn LLVMGetParamTypes<'a>(FunctionTy: &'a Type, Dest: *mut &'a Type);
pub(crate) fn LLVMGetReturnType(FunctionTy: &Type) -> &Type;
pub(crate) fn LLVMIsFunctionVarArg(FunctionTy: &Type) -> Bool;
// Operations on struct types
pub(crate) fn LLVMStructTypeInContext<'a>(
@@ -1084,12 +1089,18 @@ pub(crate) fn LLVMAddFunction<'a>(
// Operations about llvm intrinsics
pub(crate) fn LLVMLookupIntrinsicID(Name: *const c_char, NameLen: size_t) -> c_uint;
pub(crate) fn LLVMIntrinsicIsOverloaded(ID: NonZero<c_uint>) -> Bool;
pub(crate) fn LLVMGetIntrinsicDeclaration<'a>(
Mod: &'a Module,
ID: NonZero<c_uint>,
ParamTypes: *const &'a Type,
ParamCount: size_t,
) -> &'a Value;
pub(crate) fn LLVMRustUpgradeIntrinsicFunction<'a>(
Fn: &'a Value,
NewFn: &mut Option<&'a Value>,
) -> bool;
pub(crate) fn LLVMRustIsTargetIntrinsic(ID: NonZero<c_uint>) -> bool;
// Operations on parameters
pub(crate) fn LLVMIsAArgument(Val: &Value) -> Option<&Value>;
@@ -1605,6 +1616,9 @@ pub(crate) fn LLVMStructSetBody<'a>(
Packed: Bool,
);
pub(crate) fn LLVMCountStructElementTypes(StructTy: &Type) -> c_uint;
pub(crate) fn LLVMGetStructElementTypes<'a>(StructTy: &'a Type, Dest: *mut &'a Type);
pub(crate) safe fn LLVMMetadataAsValue<'a>(C: &'a Context, MD: &'a Metadata) -> &'a Value;
pub(crate) safe fn LLVMSetUnnamedAddress(Global: &Value, UnnamedAddr: UnnamedAddr);
@@ -323,6 +323,14 @@ pub(crate) fn lookup(name: &[u8]) -> Option<Self> {
NonZero::new(id).map(|id| Self { id })
}
pub(crate) fn is_overloaded(self) -> bool {
unsafe { LLVMIntrinsicIsOverloaded(self.id).is_true() }
}
pub(crate) fn is_target_specific(self) -> bool {
unsafe { LLVMRustIsTargetIntrinsic(self.id) }
}
pub(crate) fn get_declaration<'ll>(
self,
llmod: &'ll Module,
@@ -77,6 +77,10 @@ pub(crate) fn add_func(&self, name: &str, ty: &'ll Type) -> &'ll Value {
unsafe { llvm::LLVMAddFunction(self.llmod(), name.as_ptr(), ty) }
}
pub(crate) fn get_return_type(&self, ty: &'ll Type) -> &'ll Type {
unsafe { llvm::LLVMGetReturnType(ty) }
}
pub(crate) fn func_params_types(&self, ty: &'ll Type) -> Vec<&'ll Type> {
unsafe {
let n_args = llvm::LLVMCountParamTypes(ty) as usize;
@@ -86,6 +90,20 @@ pub(crate) fn func_params_types(&self, ty: &'ll Type) -> Vec<&'ll Type> {
args
}
}
pub(crate) fn func_is_variadic(&self, ty: &'ll Type) -> bool {
unsafe { llvm::LLVMIsFunctionVarArg(ty).is_true() }
}
pub(crate) fn struct_element_types(&self, ty: &'ll Type) -> Vec<&'ll Type> {
unsafe {
let n_args = llvm::LLVMCountStructElementTypes(ty) as usize;
let mut args = Vec::with_capacity(n_args);
llvm::LLVMGetStructElementTypes(ty, args.as_mut_ptr());
args.set_len(n_args);
args
}
}
}
impl<'ll, 'tcx> CodegenCx<'ll, 'tcx> {
pub(crate) fn type_bool(&self) -> &'ll Type {
@@ -165,6 +183,10 @@ pub(crate) fn type_struct(&self, els: &[&'ll Type], packed: bool) -> &'ll Type {
)
}
}
pub(crate) fn type_bf16(&self) -> &'ll Type {
unsafe { llvm::LLVMBFloatTypeInContext(self.llcx()) }
}
}
impl<'ll, CX: Borrow<SCx<'ll>>> BaseTypeCodegenMethods for GenericCx<'ll, CX> {
@@ -37,6 +37,7 @@
DEPENDENCY_ON_UNIT_NEVER_TYPE_FALLBACK,
DEPRECATED,
DEPRECATED_IN_FUTURE,
DEPRECATED_LLVM_INTRINSIC,
DEPRECATED_SAFE_2024,
DEPRECATED_WHERE_CLAUSE_LOCATION,
DUPLICATE_FEATURES,
@@ -5597,3 +5598,48 @@
report_in_deps: false,
};
}
declare_lint! {
/// The `deprecated_llvm_intrinsic` lint detects usage of deprecated LLVM intrinsics.
///
/// ### Example
///
/// ```rust,ignore (requires x86)
/// #![cfg(any(target_arch = "x86", target_arch = "x86_64"))]
/// #![feature(link_llvm_intrinsics, abi_unadjusted)]
/// #![deny(deprecated_llvm_intrinsic)]
///
/// unsafe extern "unadjusted" {
/// #[link_name = "llvm.x86.addcarryx.u32"]
/// fn foo(a: u8, b: u32, c: u32, d: &mut u32) -> u8;
/// }
///
/// #[inline(never)]
/// #[target_feature(enable = "adx")]
/// pub fn bar(a: u8, b: u32, c: u32, d: &mut u32) -> u8 {
/// unsafe { foo(a, b, c, d) }
/// }
/// ```
///
/// This will produce:
///
/// ```text
/// error: using deprecated intrinsic `llvm.x86.addcarryx.u32`
/// --> example.rs:7:5
/// |
/// 7 | fn foo(a: u8, b: u32, c: u32, d: &mut u32) -> u8;
/// | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/// |
/// ```
///
/// ### Explanation
///
/// LLVM periodically updates its list of intrinsics. Deprecated intrinsics are unlikely
/// to be removed, but they may optimize less well than their new versions, so it's
/// best to use the new version. Also, some deprecated intrinsics might have buggy
/// behavior.
pub DEPRECATED_LLVM_INTRINSIC,
Allow,
"detects uses of deprecated LLVM intrinsics",
@feature_gate = link_llvm_intrinsics;
}
@@ -9,6 +9,7 @@
#include "llvm/ADT/StringRef.h"
#include "llvm/BinaryFormat/Magic.h"
#include "llvm/Bitcode/BitcodeWriter.h"
#include "llvm/IR/AutoUpgrade.h"
#include "llvm/IR/DIBuilder.h"
#include "llvm/IR/DebugInfoMetadata.h"
#include "llvm/IR/DiagnosticHandler.h"
@@ -1815,6 +1816,19 @@ extern "C" void LLVMRustSetNoSanitizeHWAddress(LLVMValueRef Global) {
GV.setSanitizerMetadata(MD);
}
extern "C" bool LLVMRustUpgradeIntrinsicFunction(LLVMValueRef Fn,
LLVMValueRef *NewFn) {
Function *F = unwrap<Function>(Fn);
Function *NewF = nullptr;
bool CanUpgrade = UpgradeIntrinsicFunction(F, NewF, false);
*NewFn = wrap(NewF);
return CanUpgrade;
}
extern "C" bool LLVMRustIsTargetIntrinsic(unsigned ID) {
return Intrinsic::isTargetIntrinsic(ID);
}
// Statically assert that the fixed metadata kind IDs declared in
// `metadata_kind.rs` match the ones actually used by LLVM.
#define FIXED_MD_KIND(VARIANT, VALUE) \
@@ -0,0 +1,93 @@
//@ compile-flags: -C opt-level=0 -C target-feature=+kl,+avx512vp2intersect,+avx512vl,+avxneconvert
//@ only-x86_64
#![feature(link_llvm_intrinsics, abi_unadjusted, simd_ffi, portable_simd)]
#![crate_type = "lib"]
use std::simd::{f32x4, i16x8, i64x2};
#[repr(C, packed)]
pub struct Bar(u32, i64x2, i64x2, i64x2, i64x2, i64x2, i64x2);
// CHECK: %Bar = type <{ i32, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64> }>
// CHECK-LABEL: @struct_autocast
#[no_mangle]
pub unsafe fn struct_autocast(key_metadata: u32, key: i64x2) -> Bar {
extern "unadjusted" {
#[link_name = "llvm.x86.encodekey128"]
fn foo(key_metadata: u32, key: i64x2) -> Bar;
}
// CHECK: [[A:%[0-9]+]] = call { i32, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64> } @llvm.x86.encodekey128(i32 {{.*}}, <2 x i64> {{.*}})
// CHECK: [[B:%[0-9]+]] = extractvalue { i32, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64> } [[A]], 0
// CHECK: [[C:%[0-9]+]] = insertvalue %Bar poison, i32 [[B]], 0
// CHECK: [[D:%[0-9]+]] = extractvalue { i32, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64> } [[A]], 1
// CHECK: [[E:%[0-9]+]] = insertvalue %Bar [[C]], <2 x i64> [[D]], 1
// CHECK: [[F:%[0-9]+]] = extractvalue { i32, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64> } [[A]], 2
// CHECK: [[G:%[0-9]+]] = insertvalue %Bar [[E]], <2 x i64> [[F]], 2
// CHECK: [[H:%[0-9]+]] = extractvalue { i32, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64> } [[A]], 3
// CHECK: [[I:%[0-9]+]] = insertvalue %Bar [[G]], <2 x i64> [[H]], 3
// CHECK: [[J:%[0-9]+]] = extractvalue { i32, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64> } [[A]], 4
// CHECK: [[K:%[0-9]+]] = insertvalue %Bar [[I]], <2 x i64> [[J]], 4
// CHECK: [[L:%[0-9]+]] = extractvalue { i32, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64> } [[A]], 5
// CHECK: [[M:%[0-9]+]] = insertvalue %Bar [[K]], <2 x i64> [[L]], 5
// CHECK: [[N:%[0-9]+]] = extractvalue { i32, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64> } [[A]], 6
// CHECK: insertvalue %Bar [[M]], <2 x i64> [[N]], 6
foo(key_metadata, key)
}
// CHECK-LABEL: @struct_with_i1_vector_autocast
#[no_mangle]
pub unsafe fn struct_with_i1_vector_autocast(a: i64x2, b: i64x2) -> (u8, u8) {
extern "unadjusted" {
#[link_name = "llvm.x86.avx512.vp2intersect.q.128"]
fn foo(a: i64x2, b: i64x2) -> (u8, u8);
}
// CHECK: [[A:%[0-9]+]] = call { <2 x i1>, <2 x i1> } @llvm.x86.avx512.vp2intersect.q.128(<2 x i64> {{.*}}, <2 x i64> {{.*}})
// CHECK: [[B:%[0-9]+]] = extractvalue { <2 x i1>, <2 x i1> } [[A]], 0
// CHECK: [[C:%[0-9]+]] = shufflevector <2 x i1> [[B]], <2 x i1> zeroinitializer, <8 x i32> <i32 0, i32 1, i32 2, i32 2, i32 2, i32 2, i32 2, i32 2>
// CHECK: [[D:%[0-9]+]] = bitcast <8 x i1> [[C]] to i8
// CHECK: [[E:%[0-9]+]] = insertvalue { i8, i8 } poison, i8 [[D]], 0
// CHECK: [[F:%[0-9]+]] = extractvalue { <2 x i1>, <2 x i1> } [[A]], 1
// CHECK: [[G:%[0-9]+]] = shufflevector <2 x i1> [[F]], <2 x i1> zeroinitializer, <8 x i32> <i32 0, i32 1, i32 2, i32 2, i32 2, i32 2, i32 2, i32 2>
// CHECK: [[H:%[0-9]+]] = bitcast <8 x i1> [[G]] to i8
// CHECK: insertvalue { i8, i8 } [[E]], i8 [[H]], 1
foo(a, b)
}
// CHECK-LABEL: @i1_vector_autocast
#[no_mangle]
pub unsafe fn i1_vector_autocast(a: u8, b: u8) -> u8 {
extern "unadjusted" {
#[link_name = "llvm.x86.avx512.kadd.b"]
fn foo(a: u8, b: u8) -> u8;
}
// CHECK: [[A:%[0-9]+]] = bitcast i8 {{.*}} to <8 x i1>
// CHECK: [[B:%[0-9]+]] = bitcast i8 {{.*}} to <8 x i1>
// CHECK: [[C:%[0-9]+]] = call <8 x i1> @llvm.x86.avx512.kadd.b(<8 x i1> [[A]], <8 x i1> [[B]])
// CHECK: bitcast <8 x i1> [[C]] to i8
foo(a, b)
}
// CHECK-LABEL: @bf16_vector_autocast
#[no_mangle]
pub unsafe fn bf16_vector_autocast(a: f32x4) -> i16x8 {
extern "unadjusted" {
#[link_name = "llvm.x86.vcvtneps2bf16128"]
fn foo(a: f32x4) -> i16x8;
}
// CHECK: [[A:%[0-9]+]] = call <8 x bfloat> @llvm.x86.vcvtneps2bf16128(<4 x float> {{.*}})
// CHECK: bitcast <8 x bfloat> [[A]] to <8 x i16>
foo(a)
}
// CHECK: declare { i32, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64>, <2 x i64> } @llvm.x86.encodekey128(i32, <2 x i64>)
// CHECK: declare { <2 x i1>, <2 x i1> } @llvm.x86.avx512.vp2intersect.q.128(<2 x i64>, <2 x i64>)
// CHECK: declare <8 x i1> @llvm.x86.avx512.kadd.b(<8 x i1>, <8 x i1>)
// CHECK: declare <8 x bfloat> @llvm.x86.vcvtneps2bf16128(<4 x float>)
@@ -35,7 +35,7 @@ pub fn foo(x: f32x4) -> f32x4 {
fn integer(a: i32x4, b: i32x4) -> i32x4;
// vmaxq_s32
#[cfg(target_arch = "aarch64")]
#[link_name = "llvm.aarch64.neon.maxs.v4i32"]
#[link_name = "llvm.aarch64.neon.smax.v4i32"]
fn integer(a: i32x4, b: i32x4) -> i32x4;
// Use a generic LLVM intrinsic to do type checking on other platforms
@@ -0,0 +1,28 @@
//@ add-minicore
//@ build-fail
//@ compile-flags: --target aarch64-unknown-linux-gnu
//@ needs-llvm-components: aarch64
//@ ignore-backends: gcc
#![feature(no_core, lang_items, link_llvm_intrinsics, abi_unadjusted, repr_simd, simd_ffi)]
#![no_std]
#![no_core]
#![allow(internal_features, non_camel_case_types, improper_ctypes)]
#![crate_type = "lib"]
extern crate minicore;
use minicore::*;
#[repr(simd)]
pub struct i8x8([i8; 8]);
extern "unadjusted" {
#[deny(deprecated_llvm_intrinsic)]
#[link_name = "llvm.aarch64.neon.rbit.v8i8"]
fn foo(a: i8x8) -> i8x8;
//~^ ERROR: using deprecated intrinsic `llvm.aarch64.neon.rbit.v8i8`, `llvm.bitreverse.v8i8` can be used instead
}
#[target_feature(enable = "neon")]
pub unsafe fn bar(a: i8x8) -> i8x8 {
foo(a)
}
@@ -0,0 +1,14 @@
error: using deprecated intrinsic `llvm.aarch64.neon.rbit.v8i8`, `llvm.bitreverse.v8i8` can be used instead
--> $DIR/deprecated-llvm-intrinsic.rs:21:5
|
LL | fn foo(a: i8x8) -> i8x8;
| ^^^^^^^^^^^^^^^^^^^^^^^^
|
note: the lint level is defined here
--> $DIR/deprecated-llvm-intrinsic.rs:19:12
|
LL | #[deny(deprecated_llvm_intrinsic)]
| ^^^^^^^^^^^^^^^^^^^^^^^^^
error: aborting due to 1 previous error
@@ -0,0 +1,18 @@
//@ build-fail
//@ ignore-s390x
//@ normalize-stderr: "target arch `(.*)`" -> "target arch `TARGET_ARCH`"
//@ ignore-backends: gcc
#![feature(link_llvm_intrinsics, abi_unadjusted)]
extern "unadjusted" {
#[link_name = "llvm.s390.sfpc"]
fn foo(a: i32);
//~^ ERROR: intrinsic `llvm.s390.sfpc` cannot be used with target arch
}
pub fn main() {
unsafe {
foo(0);
}
}
@@ -0,0 +1,8 @@
error: intrinsic `llvm.s390.sfpc` cannot be used with target arch `TARGET_ARCH`
--> $DIR/incorrect-arch-intrinsic.rs:10:5
|
LL | fn foo(a: i32);
| ^^^^^^^^^^^^^^^
error: aborting due to 1 previous error
@@ -0,0 +1,14 @@
//@ build-fail
//@ ignore-backends: gcc
#![feature(link_llvm_intrinsics, abi_unadjusted)]
extern "unadjusted" {
#[link_name = "llvm.assume"]
fn foo();
//~^ ERROR: intrinsic signature mismatch for `llvm.assume`: expected signature `void (i1)`, found `void ()`
}
pub fn main() {
unsafe { foo() }
}
@@ -0,0 +1,8 @@
error: intrinsic signature mismatch for `llvm.assume`: expected signature `void (i1)`, found `void ()`
--> $DIR/incorrect-llvm-intrinsic-signature.rs:8:5
|
LL | fn foo();
| ^^^^^^^^^
error: aborting due to 1 previous error
@@ -0,0 +1,14 @@
//@ build-fail
//@ ignore-backends: gcc
#![feature(link_llvm_intrinsics, abi_unadjusted)]
extern "unadjusted" {
#[link_name = "llvm.abcde"]
fn foo();
//~^ ERROR: unknown LLVM intrinsic `llvm.abcde`
}
pub fn main() {
unsafe { foo() }
}
@@ -0,0 +1,8 @@
error: unknown LLVM intrinsic `llvm.abcde`
--> $DIR/unknown-llvm-intrinsic.rs:8:5
|
LL | fn foo();
| ^^^^^^^^^
error: aborting due to 1 previous error