I've realised that the cause of at least some of our weird CI flakiness
was a bug in how `Nav` values were resolved. Consider this scenario: the
frontend resolves the type of a `Nav`, and then sends a function to the
backend, which requires the backend to lower a pointer to that `Nav`.
The backend calls `InternPool.getNav` to determine the `Nav`'s type.
However, this races with the frontend resolving the *value* of that
`Nav`. This involves writing separately to two fields, `bits` and
`type_or_value`. If only one of these changes is observed, then the
backend will incorrectly interpret the type as the value or vice versa,
leading to a crash or even a miscompilation. (Of course, there's also
the straightforward issue that the racing loads were non-atomic, making
them illegal).
The only good solution to this was to make `Nav` 4 bytes bigger, giving
it separate `type` and `value` fields. In theory that's a quite small
change, but it ended up having a bunch of nice consequences which led to
this diff being a bit bulkier than expected:
* `Nav.Repr.Bits` was simplified, because it no longer has to track
"resolution status": we can use `.none` for that. This frees up some
bits to make things more consistent between the "type resolved" and
"fully resolved" states.
* This consistency allowed the `Nav.status` union to be replaced with a
simpler field `Nav.resolved`, which is a bit nicer to work with.
* Most of the "getter" functions were able to be removed from `Nav`
because the state they were fetching had been moved to simple fields
on `Nav.resolved`.
* There were still a handful of free bits in `Nav.Repr.Bits`, which
could be used to represent the "const" and "threadlocal" flags rather
than these being stored on `Key.Extern` and `Key.Variable`. This is a
bit more convenient for linkers.
* With those bits gone, `Key.Variable` is a trivial wrapper around a
type and an initial value, and the fact that a declaration is mutable
can be represented solely through the "const" flag. Therefore,
`Key.Variable` no longer served a purpose, and could be eliminated
entirely in favour of storing the variable's initial value directly in
the "value" field of the `Nav`.
So, I'm quite pleased with this refactor! But anyway, regarding the bug
fix which actually motivated this: if I've done my job correctly, this
should solve some crashes, such as these (which were what tipped me off
to this bug in the first place):
https://codeberg.org/ziglang/zig/actions/runs/2306/jobs/7/attempt/1https://codeberg.org/ziglang/zig/actions/runs/2173/jobs/6/attempt/1
...and, who knows, perhaps even the random SIGSEGVs we've seen on some
targets! Probably not, but one can hope.
My changes to how incremental compilation handles container types mean
that, at least for now, it is possible for the ZIR `.main_struct_inst`
of a source file to be lost (this happens if the number of top-level
fields in a file changes for instance). I missed a few things which
needed changing to account for this, which could lead to crashes with
certain (trivial) changes---oops!
Adds two new incremental test cases. They are currently disabled for
wasm32-wasi-selfhosted because they both trigger a crash in the WASM
backend.
I was trying out combining struct layout resolution with resolution of
default field values, but it broke a few cases which it's not clear we
want to break. The simplest such case was a struct with a field which
was a slice of itself, with a default value of `&.{}`.
So, at least for now, I'm accepting defeat and splitting this back out.
This allows a couple of behavior tests which were removed to be
re-introduced---I will do that in the commit following this one.
I have *not* made this separate phase of resolution "lazy": instead, it
is tied to layout resolution, in the sense that if a struct's layout is
referenced, then its default field values are also referenced. I chose
this approach for simplicity---not of the implementation (it's actually
slightly *more* code to do it this way!), but in terms of the language
specification. I think this behavior is easier to understand and keep in
your head. It can be easily changed in future if we decide we want to.
This partially reverts the commit titled "compiler: merge struct default
value resolution into layout resolution".
The goal of these changes is to allow the C backend to support the new
lazier type resolution system implemented by the frontend. This required
a full rewrite of the `CType` abstraction, and major changes to the C
backend "linker".
The `DebugConstPool` abstraction introduced in a previous commit turns
out to be useful for the C backend to codegen types. Because this use
case is not debug information but rather general linking (albeit when
targeting an unusual object format), I have renamed the abstraction to
`ConstPool`. With it, the C linker is told when a type's layout becomes
known, and can at that point generate the corresponding C definitions,
rather than deferring this work until `flush`.
The work done in `flush` is now more-or-less *solely* focused on
collecting all of the buffers into a big array for a vectored write.
This does unfortunately involve a non-trivial graph traversal to emit
type definitions in an appropriate order, but it's still quite fast in
practice, and it operates on fairly compact dependency data. We don't
generate the actual type *definitions* in `flush`; that happens during
compilation using `ConstPool` as discussed above. (We do generate the
typedefs for underaligned types in `flush`, but that's a trivial amount
of work in most cases.)
`CType` is now an ephemeral type: it is created only when we render a
type (the logic for which has been pushed into just 2 or 3 functions in
`codegen.c`---most of the backend now operates on unmolested Zig `Type`s
instead). C types are no longer stored in a "pool", although the type
"dependencies" of generated C code (that is, the struct, unions, and
typedefs which the generated code references) are tracked (in some
simple hash sets) and given to the linker so it can codegen the types.
This actually doesn't cause any dependency loops in std, which is pretty
much my benchmark for it being acceptable. This can be reverted if it
turns out to be problematic, but for now, let's err on the side of
language simplicity.
To be clear, this *does* regress some cases which previously worked: I
will have to remove some behavior tests as a result of this commit. To
be honest, the tests which look to be failing as a result of this are
things which I think are generally unadvisable; I actually reckon a bit
more friction to use default field values in non-trivial ways might be a
good thing to stop people from misusing them as much. Struct fields
should very rarely have default values; about the only common situation
where they make sense is "options" structs.
Now that https://github.com/ziglang/zig/issues/24657 has been
implemented, the compiler can simplify its internal representation of
comptime-known `packed struct` and `packed union` values. Instead of
storing them field-wise, we can simply store their backing integer
value. This simplifies many operations and improves efficiency in some
cases.
...and rework some of the incremental reference tracking. Almost all
kinds of AnalUnit have one property in common: they might never be
referenced in any update despite conceptually "existing", in which case
we don't want to waste time semantically analyzing them. As of the lazy
type resolution introduced in this commit, the only units to which this
does not apply are `memoized_state` and `@"comptime"`. Previously, I had
a somewhat hacky system in `Zcu` for dealing with this, but I now have a
better understanding of the design incremental compilation is converging
on, so can implement a better solution. By finding a few unused bits
lying around (...or making them), we can represent a single bit of state
indicating whether something's corresponding units have ever been
referenced. This is akin to the units being in `Zcu.outdated`, with the
key difference being that the compiler will *not* attempt to analyze
units which are in this state. Once they are first referenced or
depended on, the flag is set to true and the unit is added to `outdated`
so that it can participate in the normal dependency resolution logic.
use the application's Io implementation where possible. This correctly
makes writing to stderr cancelable, fallible, and participate in the
application's event loop. It also removes one more hard-coded
dependency on a secondary Io implementation.
Eliminate the `std.Thread.Pool` used in the compiler for concurrency and
asynchrony, in favour of the new `std.Io.async` and `std.Io.concurrent`
primitives.
This removes the last usage of `std.Thread.Pool` in the Zig repository.
The new builtins are:
* `@EnumLiteral`
* `@Int`
* `@Fn`
* `@Pointer`
* `@Tuple`
* `@Enum`
* `@Union`
* `@Struct`
Their usage is documented in the language reference.
There is no `@Array` because arrays can be created like this:
if (sentinel) |s| [n:s]T else [n]T
There is also no `@Float`. Instead, `std.meta.Float` can serve this use
case if necessary.
There is no `@ErrorSet` and intentionally no way to achieve this.
Likewise, there is intentionally no way to reify tuples with comptime
fields, or function types with comptime parameters. These decisions
simplify the Zig language specification, and moreover make Zig code more
readable by discouraging overly complex metaprogramming.
Co-authored-by: Ali Cheraghi <alichraghi@proton.me>
Resolves: #10710
I started this diff trying to remove a little dead code from the C
backend, but ended up finding a bunch of dead code sprinkled all over
the place:
* `packed` handling in the C backend which was made dead by `Legalize`
* Representation of pointers to runtime-known vector indices
* Handling for the `vector_store_elem` AIR instruction (now removed)
* Old tuple handling from when they used the InternPool repr of structs
* Straightforward unused functions
* TODOs in the LLVM backend for features which Zig just does not support
This change removes the ref_start_index from the possible enum values of
Index and OptionalIndex. It is not really a index, but a constant that
tells the offset of static Refs, so lets move it where such constant
belongs i.e. to the Ref.
`std.Io.tty.Config.detect` may be an expensive check (e.g. involving
syscalls), and doing it every time we need to print isn't really
necessary; under normal usage, we can compute the value once and cache
it for the whole program's execution. Since anyone outputting to stderr
may reasonably want this information (in fact they are very likely to),
it makes sense to cache it and return it from `lockStderrWriter`. Call
sites who do not need it will experience no significant overhead, and
can just ignore the TTY config with a `const w, _` destructure.
This commit expands on the foundations laid by https://github.com/ziglang/zig/pull/23177
and moves even more `Sema`-only functionality from `Value`
to `Sema.arith`. Specifically all shift and bitwise operations,
`@truncate`, `@bitReverse` and `@byteSwap` have been moved and
adapted to the new rules around `undefined`.
Especially the comptime shift operations have been basically
rewritten, fixing many open issues in the process.
New rules applied to operators:
* `<<`, `@shlExact`, `@shlWithOverflow`, `>>`, `@shrExact`: compile error if any operand is undef
* `<<|`, `~`, `^`, `@truncate`, `@bitReverse`, `@byteSwap`: return undef if any operand is undef
* `&`, `|`: Return undef if both operands are undef, turn undef into actual `0xAA` bytes otherwise
Additionally this commit canonicalizes the representation of
aggregates with all-undefined members in the `InternPool` by
disallowing them and enforcing the usage of a single typed
`undef` value instead. This reduces the amount of edge cases
and fixes a bunch of bugs related to partially undefined vecs.
List of operations directly affected by this patch:
* `<<`, `<<|`, `@shlExact`, `@shlWithOverflow`
* `>>`, `@shrExact`
* `&`, `|`, `~`, `^` and their atomic rmw + reduce pendants
* `@truncate`, `@bitReverse`, `@byteSwap`
Basically everything that has a direct replacement or no uses left.
Notable omissions:
- std.ArrayHashMap: Too much fallout, needs a separate cleanup.
- std.debug.runtime_safety: Too much fallout.
- std.heap.GeneralPurposeAllocator: Lots of references to it remain, not
a simple find and replace as "debug allocator" is not equivalent to
"general purpose allocator".
- std.io.Reader: Is being reworked at the moment.
- std.unicode.utf8Decode(): No replacement, needs a new API first.
- Manifest backwards compat options: Removal would break test data used
by TestFetchBuilder.
- panic handler needs to be a namespace: Many tests still rely on it
being a function, needs a separate cleanup.
added adapter to AnyWriter and GenericWriter to help bridge the gap
between old and new API
make std.testing.expectFmt work at compile-time
std.fmt no longer has a dependency on std.unicode. Formatted printing
was never properly unicode-aware. Now it no longer pretends to be.
Breakage/deprecations:
* std.fs.File.reader -> std.fs.File.deprecatedReader
* std.fs.File.writer -> std.fs.File.deprecatedWriter
* std.io.GenericReader -> std.io.Reader
* std.io.GenericWriter -> std.io.Writer
* std.io.AnyReader -> std.io.Reader
* std.io.AnyWriter -> std.io.Writer
* std.fmt.format -> std.fmt.deprecatedFormat
* std.fmt.fmtSliceEscapeLower -> std.ascii.hexEscape
* std.fmt.fmtSliceEscapeUpper -> std.ascii.hexEscape
* std.fmt.fmtSliceHexLower -> {x}
* std.fmt.fmtSliceHexUpper -> {X}
* std.fmt.fmtIntSizeDec -> {B}
* std.fmt.fmtIntSizeBin -> {Bi}
* std.fmt.fmtDuration -> {D}
* std.fmt.fmtDurationSigned -> {D}
* {} -> {f} when there is a format method
* format method signature
- anytype -> *std.io.Writer
- inferred error set -> error{WriteFailed}
- options -> (deleted)
* std.fmt.Formatted
- now takes context type explicitly
- no fmt string
preparing to rearrange std.io namespace into an interface
how to upgrade:
std.io.getStdIn() -> std.fs.File.stdin()
std.io.getStdOut() -> std.fs.File.stdout()
std.io.getStdErr() -> std.fs.File.stderr()