My changes to how incremental compilation handles container types mean
that, at least for now, it is possible for the ZIR `.main_struct_inst`
of a source file to be lost (this happens if the number of top-level
fields in a file changes for instance). I missed a few things which
needed changing to account for this, which could lead to crashes with
certain (trivial) changes---oops!
Adds two new incremental test cases. They are currently disabled for
wasm32-wasi-selfhosted because they both trigger a crash in the WASM
backend.
This PR started as addressing the long-standing TODO above `buildOpcode`:
/// TODO: deprecated, should be split up per tag.
The code around this area was written a long time ago and has effectively
become a legacy approach. When I started doing semi-occasional work for
bringing >128-bit integer operations to the wasm backend, which is
the last big missing piece of this backend, this design became annoying.
While thinking about how to support that work, and also how vector unrolling
should be handled (in cases where we do not rely purely on the legalize pass),
I decided to do some architectural changes were needed.
The first step is removing helpers like `intBinOp`, `floatBinOp`, `UnOp`, etc.
They do not really capture all operations and resulted in a lot of small
pieces of code trying to artificially unify different ops.
Instead, the direction taken here is similar to `Sema/arith.zig`, introduce
backend-oriented helpers such as `int*Op*Scalar` and `float*Op*` that operate
purely in backend structures without referencing AIR at all.
Additionally, the idea of introducing dedicated `IntType` and `FloatType`
types was chosen. Using `Type` from Sema inside the backend is awkward,
especially when strange or temporary types are needed. Creating them through
`PerThread` is also undesirable since the backend strives to not modify
`InternPool`. This goal is not fully achieved yet, some parts still require
changes, and `InternPool` type formatting still requires `pt` for error
reporting.
This PR also enables legalize passes for some packed operations. The previous
code in this area was buggy, and given the current state of the backend,
relying on legalization is simpler.
Finally, this PR disables one behavior test: `atomicrmw` with floats. The test
seems to only run the non-concurrency path, because it would crash otherwise,
and since we do not currently run behavior tests for the self-hosted backend
with concurrency or atomics enabled, it does not provide meaningful coverage yet.
In summary, this refactor reworks instruction selection in the wasm backend
to simplify the code and make future work, especially adding big integer
support.
Since packed containers are now represented by `bitpack` and can't include
pointers anymore this has become a very easy change to make. This commit
largely just reuses the logic already in place for integers.
Also fixes a small bug where captures-by-ref of errors wouldn't cause a
compile error for regular switch statements. There was already an astgen
error in place for error handling switch statements (`switch_block_err_union`)
capturing their error by reference.
This builds on the changes in PR #30053 / commit 484cc15366.
Previously, peer type resolution would always result in a conflict for
fixed-width integer and float types. Now that small integer types can
coerce to floats, peer type resolution can take that into account. This
primarily benefits arithmetic with mixed float and integer operands. If
the integer operand can coerce to the float operand's type, it will do
so without requiring an explicit cast. If the integer type can't coerce,
there will be a compiler error; no float widening will occur. Explicit
casting will still be required to make it work.
This is a follow-up to PR #30053 / commit 484cc15366.
The code previously did not handle `c_longdouble`, whose size depends on
the target. A `floatSignificandBits` helper function and a smoke test
were added.
Also added the missing max int value to the exhaustive `f16` test cases.
The bugs here actually exist on master branch too, but they are being
caught by the new static assertions which check type size and alignment.
It turns out that MSVC's struct/union "pack" pragma and its "align"
declspec interact in undocumented ways which are extremely problematic
for generated code. Solving this will require changing how the C backend
lowers various types; the disabled tests are all tagged unions, but
there are also issues with structs with underaligned fields which the
behavior tests just happen to not currently be triggering.
Now that struct default value resolution is separate from struct layout
resolution, a handful of old behavior tests are now once again valid.
This partially reverts the commit titled "behavior: update for changes
to struct field default value resolution".
This is separate from the previous commit so that these changes can be
easily reverted in the event that we decide to allow more granularity in
default value resolution in exchange for increased language complexity.
Pointers to comptime-only types (e.g. `*type`) are no longer themselves
comptime-only types. This means explicit `comptime` annotations are
required in a few more places. However, it also introduces the ability
to access pointers to (including slices of) comptime-only types at
runtime, provided only runtime fields are being accessed.
`@sizeOf` and `@bitSizeOf` are now more restricted: they are not allowed
on comptime-only or NPV (uninstantiable) types. This is because there is
no correct way to actually use the returned ABI size (e.g. you cannot
copy a comptime-only type by copying all of its runtime bits), so having
a non-zero return value had no benefit and was simply confusing.
`packed struct`s and `packed union`s can no longer contain pointer
fields. There are a few reasons for this, but in particular, binary
formats do not typically support the relocation types we would need to
lower such values into static memory. See the proposal at
https://github.com/ziglang/zig/issues/24657 for details.
Unions with no fields are now "uninstantiable" types, which work like
`noreturn` in that values of this type cannot exist. Enums with no
fields are different because they are currently considered `extern`
types, though https://github.com/ziglang/zig/issues/19855 will change
this in the future.
'comptime_int' is no longer considered a valid backing type for an enum.
In other words, 'enum(comptime_int)' is a compile error. This change is
accepted to simplify the language.
On Windows, it is sometimes problematic to depend on ws2_32.dll. Before,
users of std.Io.Threaded would have to call ioBasic() rather than io()
in order to avoid unnecessary dependencies on ws2_32.dll. Now, the
application can disable networking with std.Options.
This change is necessary due to moving networking functionality to
be based on Io.Operation, which is a tagged union.
The logic used to allow providing a path for setting the CWD of a child process in https://codeberg.org/ziglang/zig/pulls/31090 applies here as well:
- Windows must provide a path when setting the CWD, so the path of an `Io.Dir` must be resolved before actually calling RtlSetCurrentDirectory_U
- A directory handle may have multiple paths associated with it, so providing the CWD as a string retains a legitimate use case in cases where the precise path matters
-- On the standard library side:
The `input: []const u8` parameter of functions passed to `testing.fuzz`
has changed to `smith: *testing.Smith`. `Smith` is used to generate
values from libfuzzer or input bytes generated by libfuzzer.
`Smith` contains the following base methods:
* `value` as a generic method for generating any type
* `eos` for generating end-of-stream markers. Provides the additional
guarantee `true` will eventually by provided.
* `bytes` for filling a byte array.
* `slice` for filling part of a buffer and providing the length.
`Smith.Weight` is used for giving value ranges a higher probability of
being selected. By default, every value has a weight of zero (i.e. they
will not be selected). Weights can only apply to values that fit within
a u64. The above functions have corresponding ones that accept weights.
Additionally, the following functions are provided:
* `baselineWeights` which provides a set of weights containing every
possible value of a type.
* `eosSimpleWeighted` for unique weights for `true` and `false`
* `valueRangeAtMost` and `valueRangeLessThan` for weighing only a range
of values.
-- On the libfuzzer and abi side:
--- Uids
These are u32s which are used to classify requested values. This solves
the problem of a mutation causing a new value to be requested and
shifting all future values; for example:
1. An initial input contains the values 1, 2, 3 which are interpreted
as a, b, and c respectively by the test.
2. The 1 is mutated to a 4 which causes the test to request an extra
value interpreted as d. The input is now 4, 2, 3, 5 (new value) which
the test corresponds to a, d, b, c; however, b and c no longer
correspond to their original values.
Uids contain a hash component and type component. The hash component
is currently determined in `Smith` by taking a hash of the calling
`@returnAddress()` or via an argument in the corresponding `WithHash`
functions. The type component is used extensively in libfuzzer with its
hashmaps.
--- Mutations
At the start of a cycle (a run), a random number of values to mutate is
selected with less being exponentially more likely. The indexes of the
values are selected from a selected uid with a logarithmic bias to uids
with more values.
Mutations may change a single values, several consecutive values in a
uid, or several consecutive values in the uid-independent order they
were requested. They may generate random values, mutate from previous
ones, or copy from other values in the same uid from the same input or
spliced from another.
For integers, mutations from previous ones currently only generates
random values. For bytes, mutations from previous mix new random data
and previous bytes with a set number of mutations.
--- Passive Minimization
A different approach has been taken for minimizing inputs: instead of
trying a fixed set of mutations when a fresh input is found, the input
is instead simply added to the corpus and removed when it is no longer
valuable.
The quality of an input is measured based off how many unique pcs it
hit and how many values it needed from the fuzzer. It is tracked which
inputs hold the best qualities for each pc for hitting the minimum and
maximum unique pcs while needing the least values.
Once all an input's qualities have been superseded for the pcs it hit,
it is removed from the corpus.
-- Comparison to byte-based smith
A byte-based smith would be much more inefficient and complex than this
solution. It would be unable to solve the shifting problem that Uids
do. It is unable to provide values from the fuzzer past end-of-stream.
Even with feedback, it would be unable to act on dynamic weights which
have proven essential with the updated tests (e.g. to constrain values
to a range).
-- Test updates
All the standard library tests have been updated to use the new smith
interface. For `Deque`, an ad hoc allocator was written to improve
performance and remove reliance on heap allocation. `TokenSmith` has
been added to aid in testing Ast and help inform decisions on the smith
interface.
Changes an assert back into a conditional to match the behavior of `getPosix`, see https://codeberg.org/ziglang/zig/pulls/31113#issuecomment-10371698 and https://github.com/ziglang/zig/issues/23331.
Note: the conditional has been updated to also return null early on 0-length key lookups, since there's no need to iterate the block in that case.
For `Environ.Map`, validation of keys has been split into two categories: 'put' and 'fetch', each of which are tailored to the constraints that the implementation actually relies upon. Specifically:
- Hashing (fetching) requires the keys to be valid WTF-8 on Windows, but does not rely on any other properties of the keys (attempting to fetch `F\x00=` is not a problem, it just won't be found)
- `create{Posix,Windows}Block` relies on the Map to always have fully valid keys (no NUL, no `=` in an invalid location, no zero-length keys), which means that the 'put' APIs need to validate that incoming keys adhere to those properties.
The relevant assertions are now documented on each of the Map functions.
Also reinstates some test cases in the `env_vars` standalone test. Some of the reinstated tests are effectively just testing the Environ.Map implementation due to how `Environ.contains`, `Environ.getAlloc`, etc are implemented, but that is not inherent to those functions so the tests are still potentially relevant if e.g. `contains` is implemented in terms of `getPosix`/`getWindows` in the future (which is totally possible and maybe a good idea since constructing the whole map is not necessary for looking up one key).