Commit Graph

1278 Commits

Author SHA1 Message Date
Nathan Bourgeois 164d01c1dc codegen: notraps on mips causes break and infinite loop on @trap() 2026-03-21 21:32:22 -04:00
Nathan Bourgeois fdf19984b8 std.Target: add psp os 2026-03-21 02:56:24 -04:00
Matthew Lugg 85837de476 llvm: solve a bunch of alignment bugs
I went over a bunch of calls to `std.zig.llvm.Builder.{load,store}` and
added correct alignments where we previously passed the default
alignment. This fixes some miscompilations, particularly when using
underaligned pointers or *overaligned* slices. I have definitely missed
some cases, but it's better to get some fixes in than for this to sit in
a `git stash` forever.

Resolves: https://codeberg.org/ziglang/zig/issues/31473
Resolves: https://codeberg.org/ziglang/zig/issues/30566
2026-03-18 01:02:29 +01:00
Matthew Lugg 0a2f663281 llvm: solve a misc TODO
The compiler has had the desired semantics for comptime evaluation of
these checks for quite a while now.
2026-03-18 01:02:29 +01:00
Matthew Lugg 6ae30662dc llvm: incremental updates for "is named enum value" functions
It was fairly straightforward to at least theoretically handle
incremental updates to the fields of an enum type correctly: we just
build a new set of instructions, and `std.zig.llvm.Builder` already
knows how to replace the old function body with the new one when we call
`WipFunction.finish`.

Also tightened a few error sets.
2026-03-18 01:02:29 +01:00
Matthew Lugg 5d215838a7 InternPool.Nav: fix race, refactor
I've realised that the cause of at least some of our weird CI flakiness
was a bug in how `Nav` values were resolved. Consider this scenario: the
frontend resolves the type of a `Nav`, and then sends a function to the
backend, which requires the backend to lower a pointer to that `Nav`.
The backend calls `InternPool.getNav` to determine the `Nav`'s type.
However, this races with the frontend resolving the *value* of that
`Nav`. This involves writing separately to two fields, `bits` and
`type_or_value`. If only one of these changes is observed, then the
backend will incorrectly interpret the type as the value or vice versa,
leading to a crash or even a miscompilation. (Of course, there's also
the straightforward issue that the racing loads were non-atomic, making
them illegal).

The only good solution to this was to make `Nav` 4 bytes bigger, giving
it separate `type` and `value` fields. In theory that's a quite small
change, but it ended up having a bunch of nice consequences which led to
this diff being a bit bulkier than expected:

* `Nav.Repr.Bits` was simplified, because it no longer has to track
  "resolution status": we can use `.none` for that. This frees up some
  bits to make things more consistent between the "type resolved" and
  "fully resolved" states.

* This consistency allowed the `Nav.status` union to be replaced with a
  simpler field `Nav.resolved`, which is a bit nicer to work with.

* Most of the "getter" functions were able to be removed from `Nav`
  because the state they were fetching had been moved to simple fields
  on `Nav.resolved`.

* There were still a handful of free bits in `Nav.Repr.Bits`, which
  could be used to represent the "const" and "threadlocal" flags rather
  than these being stored on `Key.Extern` and `Key.Variable`. This is a
  bit more convenient for linkers.

* With those bits gone, `Key.Variable` is a trivial wrapper around a
  type and an initial value, and the fact that a declaration is mutable
  can be represented solely through the "const" flag. Therefore,
  `Key.Variable` no longer served a purpose, and could be eliminated
  entirely in favour of storing the variable's initial value directly in
  the "value" field of the `Nav`.

So, I'm quite pleased with this refactor! But anyway, regarding the bug
fix which actually motivated this: if I've done my job correctly, this
should solve some crashes, such as these (which were what tipped me off
to this bug in the first place):

https://codeberg.org/ziglang/zig/actions/runs/2306/jobs/7/attempt/1
https://codeberg.org/ziglang/zig/actions/runs/2173/jobs/6/attempt/1

...and, who knows, perhaps even the random SIGSEGVs we've seen on some
targets! Probably not, but one can hope.
2026-03-15 11:47:14 +00:00
Justus Klausecker 28886ca9ec Sema: implement switch for packed structs/unions
Since packed containers are now represented by `bitpack` and can't include
pointers anymore this has become a very easy change to make. This commit
largely just reuses the logic already in place for integers.

Also fixes a small bug where captures-by-ref of errors wouldn't cause a
compile error for regular switch statements. There was already an astgen
error in place for error handling switch statements (`switch_block_err_union`)
capturing their error by reference.
2026-03-11 21:04:04 +01:00
Justus Klausecker be9b42d707 compiler: allow equality comparisons for packed unions
This was already possible in practice by just wrapping a packed union into
a packed struct. Now it's also possible without doing that.
2026-03-11 16:44:08 +01:00
Andrew Kelley a388b88ed4 Merge pull request 'std.heap.ArenaAllocator: add fuzz test + some optimizations' (#31407) from justusk/zig:fuzz-arena into master
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/31407
Reviewed-by: Andrew Kelley <andrew@ziglang.org>
2026-03-11 03:00:07 +01:00
Matthew Lugg ea7e34224a compiler: don't call getUnionLayout on packed unions 2026-03-10 10:26:15 +00:00
Matthew Lugg c2b42383eb compiler,std: various little fixes 2026-03-10 10:26:14 +00:00
Matthew Lugg 4eb8360911 compiler: various lil' fixes 2026-03-10 10:26:14 +00:00
Matthew Lugg 0a246f5e67 compiler: stop LLVM bossing the frontend around
I previously wrote some weird code in the compiler frontend solely
because the LLVM backend has some weird requirements, but the better
solution is to avoid those requirements. This commit does that by
introducing "alignment forward references" to `std.zig.llvm.Builder`.
Much like debug forward references, they allow you to reference an
alignment value which will be populated at a later time (and which can
be updated many times, which is important for incremental compilation).
Then, when we want to reference a type's ABI alignment while the type is
not necessarily resolved (required for `@"align"` attributes on function
parameters and function call arguments), we create a forward reference
and use `link.ConstPool` to populate it when ready.

This allows us to remove from the compiler frontend some extremely
arbitrary calls to `Sema.ensureLayoutResolved`, so that the language
specification is not being built around the particular needs of our
compiler implementation's LLVM code generation backend.
2026-03-10 10:26:13 +00:00
Matthew Lugg 2651b5ccdf llvm: fix some bugs 2026-03-10 10:26:13 +00:00
Matthew Lugg 5cc12da1c0 cbe: rework CType and other major refactors
The goal of these changes is to allow the C backend to support the new
lazier type resolution system implemented by the frontend. This required
a full rewrite of the `CType` abstraction, and major changes to the C
backend "linker".

The `DebugConstPool` abstraction introduced in a previous commit turns
out to be useful for the C backend to codegen types. Because this use
case is not debug information but rather general linking (albeit when
targeting an unusual object format), I have renamed the abstraction to
`ConstPool`. With it, the C linker is told when a type's layout becomes
known, and can at that point generate the corresponding C definitions,
rather than deferring this work until `flush`.

The work done in `flush` is now more-or-less *solely* focused on
collecting all of the buffers into a big array for a vectored write.
This does unfortunately involve a non-trivial graph traversal to emit
type definitions in an appropriate order, but it's still quite fast in
practice, and it operates on fairly compact dependency data. We don't
generate the actual type *definitions* in `flush`; that happens during
compilation using `ConstPool` as discussed above. (We do generate the
typedefs for underaligned types in `flush`, but that's a trivial amount
of work in most cases.)

`CType` is now an ephemeral type: it is created only when we render a
type (the logic for which has been pushed into just 2 or 3 functions in
`codegen.c`---most of the backend now operates on unmolested Zig `Type`s
instead). C types are no longer stored in a "pool", although the type
"dependencies" of generated C code (that is, the struct, unions, and
typedefs which the generated code references) are tracked (in some
simple hash sets) and given to the linker so it can codegen the types.
2026-03-10 10:26:12 +00:00
Matthew Lugg 2b8feabb8f llvm: some more improvements to debug info
Most importantly, adds support for `DW_TAG_typedef` to `llvm.Builder`,
and uses it to define error sets and optional pointers/errors.

Also deletes some random dead code I found.
2026-03-10 10:26:12 +00:00
Matthew Lugg f7a1ccfc56 compiler: fix up LLVM backend, and improve its debug info
The LLVM backend can now run the behavior tests and standard library
tests, like the x86_64 backend can. This commit required me to make a
lot of changes to how the LLVM backend lowers debug information, and
while I was doing that, I improved a few things:

* `anyerror` is now an enum type (and other error sets just wrap it), so
  error values appear by name in debuggers

* Fixed broken lowering for tagged unions with zero-width payloads

* Associate container types with source locations in all cases

* Avoid depending on the order of type resolution (using the new
  `DebugConstPool` abstraction), so debug information will contain all
  available type information rather than just the subset which happens
  to be resolved when the backend lowers that debug type
2026-03-10 10:26:12 +00:00
Matthew Lugg 187fef209f compiler: rework OPV and noreturn-like types 2026-03-10 10:26:08 +00:00
Matthew Lugg b19074d252 compiler: represent bitpacks as their backing integer
Now that https://github.com/ziglang/zig/issues/24657 has been
implemented, the compiler can simplify its internal representation of
comptime-known `packed struct` and `packed union` values. Instead of
storing them field-wise, we can simply store their backing integer
value. This simplifies many operations and improves efficiency in some
cases.
2026-03-10 10:26:08 +00:00
Matthew Lugg 3086c7977b type resolution progress 2026-03-10 10:26:07 +00:00
Matthew Lugg 510ea6f61f type resolution progress 2026-03-10 10:26:07 +00:00
Justus Klausecker 823f9039f1 llvm: use atomic rmw to increment fuzzer pc counters
This change makes the fuzzer instrumentation emitted by Zig thread-safe.
2026-03-06 10:09:06 +01:00
Kendall Condon 5d58306162 rework fuzz testing to be smith based
-- On the standard library side:

The `input: []const u8` parameter of functions passed to `testing.fuzz`
has changed to `smith: *testing.Smith`. `Smith` is used to generate
values from libfuzzer or input bytes generated by libfuzzer.

`Smith` contains the following base methods:
* `value` as a generic method for generating any type
* `eos` for generating end-of-stream markers. Provides the additional
  guarantee `true` will eventually by provided.
* `bytes` for filling a byte array.
* `slice` for filling part of a buffer and providing the length.

`Smith.Weight` is used for giving value ranges a higher probability of
being selected. By default, every value has a weight of zero (i.e. they
will not be selected). Weights can only apply to values that fit within
a u64. The above functions have corresponding ones that accept weights.
Additionally, the following functions are provided:
* `baselineWeights` which provides a set of weights containing every
  possible value of a type.
* `eosSimpleWeighted` for unique weights for `true` and `false`
* `valueRangeAtMost` and `valueRangeLessThan` for weighing only a range
  of values.

-- On the libfuzzer and abi side:

--- Uids

These are u32s which are used to classify requested values. This solves
the problem of a mutation causing a new value to be requested and
shifting all future values; for example:

1. An initial input contains the values 1, 2, 3 which are interpreted
as a, b, and c respectively by the test.

2. The 1 is mutated to a 4 which causes the test to request an extra
value interpreted as d. The input is now 4, 2, 3, 5 (new value) which
the test corresponds to a, d, b, c; however, b and c no longer
correspond to their original values.

Uids contain a hash component and type component. The hash component
is currently determined in `Smith` by taking a hash of the calling
`@returnAddress()` or via an argument in the corresponding `WithHash`
functions. The type component is used extensively in libfuzzer with its
hashmaps.

--- Mutations

At the start of a cycle (a run), a random number of values to mutate is
selected with less being exponentially more likely. The indexes of the
values are selected from a selected uid with a logarithmic bias to uids
with more values.

Mutations may change a single values, several consecutive values in a
uid, or several consecutive values in the uid-independent order they
were requested. They may generate random values, mutate from previous
ones, or copy from other values in the same uid from the same input or
spliced from another.

For integers, mutations from previous ones currently only generates
random values. For bytes, mutations from previous mix new random data
and previous bytes with a set number of mutations.

--- Passive Minimization

A different approach has been taken for minimizing inputs: instead of
trying a fixed set of mutations when a fresh input is found, the input
is instead simply added to the corpus and removed when it is no longer
valuable.

The quality of an input is measured based off how many unique pcs it
hit and how many values it needed from the fuzzer. It is tracked which
inputs hold the best qualities for each pc for hitting the minimum and
maximum unique pcs while needing the least values.

Once all an input's qualities have been superseded for the pcs it hit,
it is removed from the corpus.

-- Comparison to byte-based smith

A byte-based smith would be much more inefficient and complex than this
solution. It would be unable to solve the shifting problem that Uids
do. It is unable to provide values from the fuzzer past end-of-stream.
Even with feedback, it would be unable to act on dynamic weights which
have proven essential with the updated tests (e.g. to constrain values
to a range).

-- Test updates

All the standard library tests have been updated to use the new smith
interface. For `Deque`, an ad hoc allocator was written to improve
performance and remove reliance on heap allocation. `TokenSmith` has
been added to aid in testing Ast and help inform decisions on the smith
interface.
2026-02-13 22:12:19 -05:00
GasInfinity 87fa61bdd1 fix(codegen/llvm): teach llvm to not dllexport hidden exports 2026-02-11 10:51:26 +01:00
Mathieu Suen 36b65ab59e Air: add "unwrap" functions for loading extra data 2026-02-06 13:06:49 +00:00
Alex Rønne Petersen e9b442db5a llvm: fix C ABI integer promotion for more targets
Also stop pretending that this function handles _BitInt types correctly.

Handling 32-bit integers on MIPS properly is blocked on: https://github.com/llvm/llvm-project/issues/179088
2026-02-03 20:21:52 +01:00
Alex Rønne Petersen c6538b70f5 llvm: handle packed structs in C ABI integer promotion 2026-01-31 00:08:32 +01:00
Justus Klausecker 42dea36ce9 llvm: fix jump table gen for labeled switch with single else prong
Avoids a null unwrap if there are no cases with explicit values present
while trying to construct a jump table for a labeled switch statement.
2026-01-11 11:37:16 +00:00
Andrew Kelley a5b719e9eb compiler: fix build failures from std.Io-fs 2025-12-23 22:15:10 -08:00
Andrew Kelley f53248a409 update all std.fs.cwd() to std.Io.Dir.cwd() 2025-12-23 22:15:08 -08:00
Andrew Kelley aafddc2ea1 update all occurrences of close() to close(io) 2025-12-23 22:15:07 -08:00
Alex Rønne Petersen abc8ddcfa9 llvm: fix aliases not having linkage, visibility, etc set 2025-12-10 15:40:30 +01:00
Alex Rønne Petersen 16fc083f2b std.Target: remove Abi.code16
This functionality -- if it's actually needed -- can be reintroduced through
some other mechanism. An ABI is clearly not the right way to represent it.

closes #25918
2025-11-23 10:22:03 +01:00
Benjamin Jurk 4b5351bc0d update deprecated ArrayListUnmanaged usage (#25958) 2025-11-20 14:46:23 -08:00
Matthew Lugg a319211eee llvm: fix lowering of packed struct constants
The big-endian logic here was simply incorrect. Luckily, it was also
overcomplicated; after calling `Value.writeToPackedMemory`, there is a
method on `std.math.big.int.Mutable` which just does the correct
endianness load for us.
2025-11-19 01:42:37 +01:00
Prokop Randáček 94e98bfe80 Dedupe types when printing error messages 2025-11-16 16:20:45 +02:00
Matthew Lugg bc78d8efdb Legalize: implement soft-float legalizations
A new `Legalize.Feature` tag is introduced for each float bit width
(16/32/64/80/128). When e.g. `soft_f16` is enabled, all arithmetic and
comparison operations on `f16` are converted to calls to the appropriate
compiler_rt function using the new AIR tag `.legalize_compiler_rt_call`.
This includes casts where the source *or* target type is `f16`, or
integer<=>float conversions to or from `f16`. Occasionally, operations
are legalized to blocks because there is extra code required; for
instance, legalizing `@floatFromInt` where the integer type is larger
than 64 bits requires calling an arbitrary-width integer conversion
function which accepts a pointer to the integer, so we need to use
`alloc` to create such a pointer, and store the integer there (after
possibly zero-extending or sign-extending it).

No backend currently uses these new legalizations (and as such, no
backend currently needs to implement `.legalize_compiler_rt_call`).
However, for testing purposes, I tried modifying the self-hosted x86_64
backend to enable all of the soft-float features (and implement the AIR
instruction). This modified backend was able to pass all of the behavior
tests (except for one `@mod` test where the LLVM backend has a bug
resulting in incorrect compiler-rt behavior!), including the tests
specific to the self-hosted x86_64 backend.

`f16` and `f80` legalizations are likely of particular interest to
backend developers, because most architectures do not have instructions
to operate on these types. However, enabling *all* of these legalization
passes can be useful when developing a new backend to hit the ground
running and pass a good amount of tests more easily.
2025-11-15 09:49:01 +00:00
Alex Rønne Petersen 9ab7eec23e represent Mac Catalyst as aarch64-maccatalyst-none rather than aarch64-ios-macabi
Apple's own headers and tbd files prefer to think of Mac Catalyst as a distinct
OS target. Earlier, when DriverKit support was added to LLVM, it was represented
a distinct OS. So why Apple decided to only represent Mac Catalyst as an ABI in
the target triple is beyond me. But this isn't the first time they've ignored
established target triple norms (see: armv7k and aarch64_32) and it probably
won't be the last.

While doing this, I also audited all Darwin OS prongs throughout the codebase
and made sure they cover all the tags.
2025-11-14 11:33:35 +01:00
Matthew Lugg 181b25ce4f Merge pull request #25772 from mlugg/kill-dead-code
compiler: rewrite some legalizations, and remove a bunch of dead code
2025-11-12 23:14:02 +00:00
Alex Rønne Petersen dfd7b7f233 std.Target: remove Abi.cygnus
There is approximately zero chance of the Zig team ever spending any effort on
supporting Cygwin; the MSVC and MinGW-w64 ABIs are superior in every way that
matters, and not least because they lead to binaries that just run natively on
Windows without needing a POSIX emulation environment installed.
2025-11-12 22:39:04 +01:00
Matthew Lugg 69f39868b4 Air.Legalize: revert to loops for scalarizations
I had tried unrolling the loops to avoid requiring the
`vector_store_elem` instruction, but it's arguably a problem to generate
O(N) code for an operation on `@Vector(N, T)`. In addition, that
lowering emitted a lot of `.aggregate_init` instructions, which is
itself a quite difficult operation to codegen.

This requires reintroducing runtime vector indexing internally. However,
I've put it in a couple of instructions which are intended only for use
by `Air.Legalize`, named `legalize_vec_elem_val` (like `array_elem_val`,
but for indexing a vector with a runtime-known index) and
`legalize_vec_store_elem` (like the old `vector_store_elem`
instruction). These are explicitly documented as *not* being emitted by
Sema, so need only be implemented by backends if they actually use an
`Air.Legalize.Feature` which emits them (otherwise they can be marked as
`unreachable`).
2025-11-12 16:00:16 +00:00
Matthew Lugg c091e27aac compiler: spring cleaning
I started this diff trying to remove a little dead code from the C
backend, but ended up finding a bunch of dead code sprinkled all over
the place:

* `packed` handling in the C backend which was made dead by `Legalize`
* Representation of pointers to runtime-known vector indices
* Handling for the `vector_store_elem` AIR instruction (now removed)
* Old tuple handling from when they used the InternPool repr of structs
* Straightforward unused functions
* TODOs in the LLVM backend for features which Zig just does not support
2025-11-12 16:00:15 +00:00
Alex Rønne Petersen d182c7e3bc Merge pull request #25886 from alexrp/kvx
beginnings of KVX target support (via CBE)
2025-11-10 16:38:23 +01:00
Alex Rønne Petersen cde865e06a llvm: set sub-arch for spirv 1.6 2025-11-10 12:54:04 +01:00
Alex Rønne Petersen 2c470d24b3 std.Target: add Arch tag and info for kvx 2025-11-10 08:20:21 +01:00
David Rubin 483f9bd367 llvm: add extra clobbers to valgrind requests
This seems to work around a very puzzling miscompilation first
present in LLVM 21.x. We already unconditionally add these
clobbers to inline assembly that came from the source, the
valgrind requests should also contain them.
2025-11-06 20:27:38 -08:00
David Rubin 09e4035e79 llvm: clobber rdx instead of edx for x86-64 valgrind request 2025-11-06 20:12:35 -08:00
Andrew Kelley a072d821be Merge pull request #25592 from ziglang/init-std.Io
std: Introduce `Io` Interface
2025-10-29 13:51:37 -07:00
Alex Rønne Petersen a7119d4269 remove all IBM AIX and z/OS support
As with Solaris (dba1bf9353), we have no way to
actually audit contributions for these OSs. IBM also makes it even harder than
Oracle to actually obtain these OSs.

closes #23695
closes #23694
closes #3655
closes #23693
2025-10-29 14:25:51 +01:00
Andrew Kelley 2bcdde2985 compiler: update for introduction of std.Io
only thing remaining is using libc dns resolution when linking libc
2025-10-29 06:20:49 -07:00