`switch` statements on types >128bits are now lowered to conditionals.
This is necessary because Zig lowers integers with more than 128 bits to
bigints, which are not 'native' integers and thus cannot be used as case
values.
128-bit integers get special treatment because Zig will emit actual 128bit
ints if they are supported by the target. They still *may* be lowered to
bigints though, e.g. for 32-bit targets or MSVC. To solve this, this commit
adds a bunch of switch macros to `zig.h` which will either resolve to an
actual `switch` statement or to conditionals. The `if` statements this
approach can generate are not as optimal as they could be but I think this
is a good trade-off since the generated `switch` statements are still the
same as the ones generated for smaller integers. Also the macros result
in pretty readable code.
I went over a bunch of calls to `std.zig.llvm.Builder.{load,store}` and
added correct alignments where we previously passed the default
alignment. This fixes some miscompilations, particularly when using
underaligned pointers or *overaligned* slices. I have definitely missed
some cases, but it's better to get some fixes in than for this to sit in
a `git stash` forever.
Resolves: https://codeberg.org/ziglang/zig/issues/31473
Resolves: https://codeberg.org/ziglang/zig/issues/30566
It was fairly straightforward to at least theoretically handle
incremental updates to the fields of an enum type correctly: we just
build a new set of instructions, and `std.zig.llvm.Builder` already
knows how to replace the old function body with the new one when we call
`WipFunction.finish`.
Also tightened a few error sets.
I've realised that the cause of at least some of our weird CI flakiness
was a bug in how `Nav` values were resolved. Consider this scenario: the
frontend resolves the type of a `Nav`, and then sends a function to the
backend, which requires the backend to lower a pointer to that `Nav`.
The backend calls `InternPool.getNav` to determine the `Nav`'s type.
However, this races with the frontend resolving the *value* of that
`Nav`. This involves writing separately to two fields, `bits` and
`type_or_value`. If only one of these changes is observed, then the
backend will incorrectly interpret the type as the value or vice versa,
leading to a crash or even a miscompilation. (Of course, there's also
the straightforward issue that the racing loads were non-atomic, making
them illegal).
The only good solution to this was to make `Nav` 4 bytes bigger, giving
it separate `type` and `value` fields. In theory that's a quite small
change, but it ended up having a bunch of nice consequences which led to
this diff being a bit bulkier than expected:
* `Nav.Repr.Bits` was simplified, because it no longer has to track
"resolution status": we can use `.none` for that. This frees up some
bits to make things more consistent between the "type resolved" and
"fully resolved" states.
* This consistency allowed the `Nav.status` union to be replaced with a
simpler field `Nav.resolved`, which is a bit nicer to work with.
* Most of the "getter" functions were able to be removed from `Nav`
because the state they were fetching had been moved to simple fields
on `Nav.resolved`.
* There were still a handful of free bits in `Nav.Repr.Bits`, which
could be used to represent the "const" and "threadlocal" flags rather
than these being stored on `Key.Extern` and `Key.Variable`. This is a
bit more convenient for linkers.
* With those bits gone, `Key.Variable` is a trivial wrapper around a
type and an initial value, and the fact that a declaration is mutable
can be represented solely through the "const" flag. Therefore,
`Key.Variable` no longer served a purpose, and could be eliminated
entirely in favour of storing the variable's initial value directly in
the "value" field of the `Nav`.
So, I'm quite pleased with this refactor! But anyway, regarding the bug
fix which actually motivated this: if I've done my job correctly, this
should solve some crashes, such as these (which were what tipped me off
to this bug in the first place):
https://codeberg.org/ziglang/zig/actions/runs/2306/jobs/7/attempt/1https://codeberg.org/ziglang/zig/actions/runs/2173/jobs/6/attempt/1
...and, who knows, perhaps even the random SIGSEGVs we've seen on some
targets! Probably not, but one can hope.
This PR started as addressing the long-standing TODO above `buildOpcode`:
/// TODO: deprecated, should be split up per tag.
The code around this area was written a long time ago and has effectively
become a legacy approach. When I started doing semi-occasional work for
bringing >128-bit integer operations to the wasm backend, which is
the last big missing piece of this backend, this design became annoying.
While thinking about how to support that work, and also how vector unrolling
should be handled (in cases where we do not rely purely on the legalize pass),
I decided to do some architectural changes were needed.
The first step is removing helpers like `intBinOp`, `floatBinOp`, `UnOp`, etc.
They do not really capture all operations and resulted in a lot of small
pieces of code trying to artificially unify different ops.
Instead, the direction taken here is similar to `Sema/arith.zig`, introduce
backend-oriented helpers such as `int*Op*Scalar` and `float*Op*` that operate
purely in backend structures without referencing AIR at all.
Additionally, the idea of introducing dedicated `IntType` and `FloatType`
types was chosen. Using `Type` from Sema inside the backend is awkward,
especially when strange or temporary types are needed. Creating them through
`PerThread` is also undesirable since the backend strives to not modify
`InternPool`. This goal is not fully achieved yet, some parts still require
changes, and `InternPool` type formatting still requires `pt` for error
reporting.
This PR also enables legalize passes for some packed operations. The previous
code in this area was buggy, and given the current state of the backend,
relying on legalization is simpler.
Finally, this PR disables one behavior test: `atomicrmw` with floats. The test
seems to only run the non-concurrency path, because it would crash otherwise,
and since we do not currently run behavior tests for the self-hosted backend
with concurrency or atomics enabled, it does not provide meaningful coverage yet.
In summary, this refactor reworks instruction selection in the wasm backend
to simplify the code and make future work, especially adding big integer
support.
Since packed containers are now represented by `bitpack` and can't include
pointers anymore this has become a very easy change to make. This commit
largely just reuses the logic already in place for integers.
Also fixes a small bug where captures-by-ref of errors wouldn't cause a
compile error for regular switch statements. There was already an astgen
error in place for error handling switch statements (`switch_block_err_union`)
capturing their error by reference.
I previously wrote some weird code in the compiler frontend solely
because the LLVM backend has some weird requirements, but the better
solution is to avoid those requirements. This commit does that by
introducing "alignment forward references" to `std.zig.llvm.Builder`.
Much like debug forward references, they allow you to reference an
alignment value which will be populated at a later time (and which can
be updated many times, which is important for incremental compilation).
Then, when we want to reference a type's ABI alignment while the type is
not necessarily resolved (required for `@"align"` attributes on function
parameters and function call arguments), we create a forward reference
and use `link.ConstPool` to populate it when ready.
This allows us to remove from the compiler frontend some extremely
arbitrary calls to `Sema.ensureLayoutResolved`, so that the language
specification is not being built around the particular needs of our
compiler implementation's LLVM code generation backend.
The change in codegen/x86_64/CodeGen.zig was not strictly necessary (the
Sema change I did solves the error I was getting there), I just think
it's better style anyway.
The goal of these changes is to allow the C backend to support the new
lazier type resolution system implemented by the frontend. This required
a full rewrite of the `CType` abstraction, and major changes to the C
backend "linker".
The `DebugConstPool` abstraction introduced in a previous commit turns
out to be useful for the C backend to codegen types. Because this use
case is not debug information but rather general linking (albeit when
targeting an unusual object format), I have renamed the abstraction to
`ConstPool`. With it, the C linker is told when a type's layout becomes
known, and can at that point generate the corresponding C definitions,
rather than deferring this work until `flush`.
The work done in `flush` is now more-or-less *solely* focused on
collecting all of the buffers into a big array for a vectored write.
This does unfortunately involve a non-trivial graph traversal to emit
type definitions in an appropriate order, but it's still quite fast in
practice, and it operates on fairly compact dependency data. We don't
generate the actual type *definitions* in `flush`; that happens during
compilation using `ConstPool` as discussed above. (We do generate the
typedefs for underaligned types in `flush`, but that's a trivial amount
of work in most cases.)
`CType` is now an ephemeral type: it is created only when we render a
type (the logic for which has been pushed into just 2 or 3 functions in
`codegen.c`---most of the backend now operates on unmolested Zig `Type`s
instead). C types are no longer stored in a "pool", although the type
"dependencies" of generated C code (that is, the struct, unions, and
typedefs which the generated code references) are tracked (in some
simple hash sets) and given to the linker so it can codegen the types.
Most importantly, adds support for `DW_TAG_typedef` to `llvm.Builder`,
and uses it to define error sets and optional pointers/errors.
Also deletes some random dead code I found.
The LLVM backend can now run the behavior tests and standard library
tests, like the x86_64 backend can. This commit required me to make a
lot of changes to how the LLVM backend lowers debug information, and
while I was doing that, I improved a few things:
* `anyerror` is now an enum type (and other error sets just wrap it), so
error values appear by name in debuggers
* Fixed broken lowering for tagged unions with zero-width payloads
* Associate container types with source locations in all cases
* Avoid depending on the order of type resolution (using the new
`DebugConstPool` abstraction), so debug information will contain all
available type information rather than just the subset which happens
to be resolved when the backend lowers that debug type
This actually doesn't cause any dependency loops in std, which is pretty
much my benchmark for it being acceptable. This can be reverted if it
turns out to be problematic, but for now, let's err on the side of
language simplicity.
To be clear, this *does* regress some cases which previously worked: I
will have to remove some behavior tests as a result of this commit. To
be honest, the tests which look to be failing as a result of this are
things which I think are generally unadvisable; I actually reckon a bit
more friction to use default field values in non-trivial ways might be a
good thing to stop people from misusing them as much. Struct fields
should very rarely have default values; about the only common situation
where they make sense is "options" structs.
Now that https://github.com/ziglang/zig/issues/24657 has been
implemented, the compiler can simplify its internal representation of
comptime-known `packed struct` and `packed union` values. Instead of
storing them field-wise, we can simply store their backing integer
value. This simplifies many operations and improves efficiency in some
cases.
Quoting Vol. 2B 4-590 of [Intel SSA manual][1]:
> Bit 3 of the immediate byte controls processor behavior for a
precision exception <...>
The Direction struct is incorrectly sized to 4 bits, which pushes the
precision exception bit to the reserved bits, which helpfully crashes
under valgrind (but works on real hardware, since CPU ignores it):
``
const std = @import("std");
noinline fn ceil(x: f32) f32 {
return @ceil(x);
}
test "ceil" {
try std.testing.expectEqual(@as(f32, 2.0), ceil(1.5));
}
```
With Zig 0.15.1:
```
$ zig test -fno-llvm -mcpu x86_64_v3 -fvalgrind ceil_test.zig --test-cmd valgrind --test-cmd --quiet --test-cmd-bin
Illegal instruction at address 0x102d14d
/home/motiejus/code/ceil_test.zig:4:5: 0x102d14d in ceil (ceil_test.zig)
return @ceil(x);
^
/home/motiejus/code/ceil_test.zig:8:52: 0x102d097 in test.ceil (ceil_test.zig)
try std.testing.expectEqual(@as(f32, 2.0), ceil(1.5));
^
/nix/store/n81qdpd96d86ngfv5bqymy26b8kqm654-zig-0.15.1/lib/compiler/test_runner.zig:218:25: 0x1167e80 in mainTerminal (test_runner.zig)
if (test_fn.func()) |_| {
^
/nix/store/n81qdpd96d86ngfv5bqymy26b8kqm654-zig-0.15.1/lib/compiler/test_runner.zig:66:28: 0x11610a1 in main (test_runner.zig)
return mainTerminal();
^
/nix/store/n81qdpd96d86ngfv5bqymy26b8kqm654-zig-0.15.1/lib/std/start.zig:618:22: 0x115ae3d in posixCallMainAndExit (std.zig)
root.main();
^
/nix/store/n81qdpd96d86ngfv5bqymy26b8kqm654-zig-0.15.1/lib/std/start.zig:232:5: 0x115a6d1 in _start (std.zig)
asm volatile (switch (native_arch) {
^
???:?:?: 0x0 in ??? (???)
error: the following test command crashed:
valgrind --quiet /home/motiejus/.cache/zig/o/7cc564b5410fc3facdb6e8768643074e/test
```
Zig 0.15.1 with this patch:
```
zig/zig3 test -fno-llvm -mcpu x86_64_v3 -fvalgrind ceil_test.zig --test-cmd valgrind --test-cmd --quiet --test-cmd-bin
All 1 tests passed.
```
Looking through `CodeGen.zig` it _seems_ like the same RoundMode struct
is used for vroundss/vroundps/vcvtps2ph, so update the comment while
we're at it. But I am only 90% sure it's true.
[1]: https://cdrdv2.intel.com/v1/dl/getContent/671200
-- On the standard library side:
The `input: []const u8` parameter of functions passed to `testing.fuzz`
has changed to `smith: *testing.Smith`. `Smith` is used to generate
values from libfuzzer or input bytes generated by libfuzzer.
`Smith` contains the following base methods:
* `value` as a generic method for generating any type
* `eos` for generating end-of-stream markers. Provides the additional
guarantee `true` will eventually by provided.
* `bytes` for filling a byte array.
* `slice` for filling part of a buffer and providing the length.
`Smith.Weight` is used for giving value ranges a higher probability of
being selected. By default, every value has a weight of zero (i.e. they
will not be selected). Weights can only apply to values that fit within
a u64. The above functions have corresponding ones that accept weights.
Additionally, the following functions are provided:
* `baselineWeights` which provides a set of weights containing every
possible value of a type.
* `eosSimpleWeighted` for unique weights for `true` and `false`
* `valueRangeAtMost` and `valueRangeLessThan` for weighing only a range
of values.
-- On the libfuzzer and abi side:
--- Uids
These are u32s which are used to classify requested values. This solves
the problem of a mutation causing a new value to be requested and
shifting all future values; for example:
1. An initial input contains the values 1, 2, 3 which are interpreted
as a, b, and c respectively by the test.
2. The 1 is mutated to a 4 which causes the test to request an extra
value interpreted as d. The input is now 4, 2, 3, 5 (new value) which
the test corresponds to a, d, b, c; however, b and c no longer
correspond to their original values.
Uids contain a hash component and type component. The hash component
is currently determined in `Smith` by taking a hash of the calling
`@returnAddress()` or via an argument in the corresponding `WithHash`
functions. The type component is used extensively in libfuzzer with its
hashmaps.
--- Mutations
At the start of a cycle (a run), a random number of values to mutate is
selected with less being exponentially more likely. The indexes of the
values are selected from a selected uid with a logarithmic bias to uids
with more values.
Mutations may change a single values, several consecutive values in a
uid, or several consecutive values in the uid-independent order they
were requested. They may generate random values, mutate from previous
ones, or copy from other values in the same uid from the same input or
spliced from another.
For integers, mutations from previous ones currently only generates
random values. For bytes, mutations from previous mix new random data
and previous bytes with a set number of mutations.
--- Passive Minimization
A different approach has been taken for minimizing inputs: instead of
trying a fixed set of mutations when a fresh input is found, the input
is instead simply added to the corpus and removed when it is no longer
valuable.
The quality of an input is measured based off how many unique pcs it
hit and how many values it needed from the fuzzer. It is tracked which
inputs hold the best qualities for each pc for hitting the minimum and
maximum unique pcs while needing the least values.
Once all an input's qualities have been superseded for the pcs it hit,
it is removed from the corpus.
-- Comparison to byte-based smith
A byte-based smith would be much more inefficient and complex than this
solution. It would be unable to solve the shifting problem that Uids
do. It is unable to provide values from the fuzzer past end-of-stream.
Even with feedback, it would be unable to act on dynamic weights which
have proven essential with the updated tests (e.g. to constrain values
to a range).
-- Test updates
All the standard library tests have been updated to use the new smith
interface. For `Deque`, an ad hoc allocator was written to improve
performance and remove reliance on heap allocation. `TokenSmith` has
been added to aid in testing Ast and help inform decisions on the smith
interface.
When `NtReadFile` returns `SUCCESS`, the APC routine still runs when
next alertable, which was previously clobbering an out of scope `done`.
Instead of adding an extra syscall to the success path, avoid all APC
side effects, allowing instant completions to return immediately.
Previously, 64-bit '<<|' operations were emitting 64-bit shifts with one
64-bit operand and one 32-bit operand, which is illegal. Instead, as in
the lowering for regular shifts, we need to cast the RHS in this case.