- x86_64: Implement @cVaStart for Win64
- x86_64: Implement @cVaArg for Win64
- x86_64: Duplicate floating point register args equivalent integer registers when calling variadic functions on Win64
- tests: Enable var_args tests for the self-hosted backend on windows
- tests: Add var_args test for floating point arguments
This generates zig ASTs from `testing.Smith` and is based off the
langref's PEG.
The choice to not build the Ast while generating and instead parsing it
afterwards makes the smith more versatile by not being tied to a single
implementation at a cost of efficiency.
Additionally, a new function `boolWeighted` was added to `Smith` due to
its frequent use in `AstSmith`.
- plain old data ftw
- 177K -> 151K
- data bypasses Sema
This change is not really important but it was nice to explore best
practices for data like this.
When I measured building the compiler, I found no statistically
significant difference in compilation time.
-- On the standard library side:
The `input: []const u8` parameter of functions passed to `testing.fuzz`
has changed to `smith: *testing.Smith`. `Smith` is used to generate
values from libfuzzer or input bytes generated by libfuzzer.
`Smith` contains the following base methods:
* `value` as a generic method for generating any type
* `eos` for generating end-of-stream markers. Provides the additional
guarantee `true` will eventually by provided.
* `bytes` for filling a byte array.
* `slice` for filling part of a buffer and providing the length.
`Smith.Weight` is used for giving value ranges a higher probability of
being selected. By default, every value has a weight of zero (i.e. they
will not be selected). Weights can only apply to values that fit within
a u64. The above functions have corresponding ones that accept weights.
Additionally, the following functions are provided:
* `baselineWeights` which provides a set of weights containing every
possible value of a type.
* `eosSimpleWeighted` for unique weights for `true` and `false`
* `valueRangeAtMost` and `valueRangeLessThan` for weighing only a range
of values.
-- On the libfuzzer and abi side:
--- Uids
These are u32s which are used to classify requested values. This solves
the problem of a mutation causing a new value to be requested and
shifting all future values; for example:
1. An initial input contains the values 1, 2, 3 which are interpreted
as a, b, and c respectively by the test.
2. The 1 is mutated to a 4 which causes the test to request an extra
value interpreted as d. The input is now 4, 2, 3, 5 (new value) which
the test corresponds to a, d, b, c; however, b and c no longer
correspond to their original values.
Uids contain a hash component and type component. The hash component
is currently determined in `Smith` by taking a hash of the calling
`@returnAddress()` or via an argument in the corresponding `WithHash`
functions. The type component is used extensively in libfuzzer with its
hashmaps.
--- Mutations
At the start of a cycle (a run), a random number of values to mutate is
selected with less being exponentially more likely. The indexes of the
values are selected from a selected uid with a logarithmic bias to uids
with more values.
Mutations may change a single values, several consecutive values in a
uid, or several consecutive values in the uid-independent order they
were requested. They may generate random values, mutate from previous
ones, or copy from other values in the same uid from the same input or
spliced from another.
For integers, mutations from previous ones currently only generates
random values. For bytes, mutations from previous mix new random data
and previous bytes with a set number of mutations.
--- Passive Minimization
A different approach has been taken for minimizing inputs: instead of
trying a fixed set of mutations when a fresh input is found, the input
is instead simply added to the corpus and removed when it is no longer
valuable.
The quality of an input is measured based off how many unique pcs it
hit and how many values it needed from the fuzzer. It is tracked which
inputs hold the best qualities for each pc for hitting the minimum and
maximum unique pcs while needing the least values.
Once all an input's qualities have been superseded for the pcs it hit,
it is removed from the corpus.
-- Comparison to byte-based smith
A byte-based smith would be much more inefficient and complex than this
solution. It would be unable to solve the shifting problem that Uids
do. It is unable to provide values from the fuzzer past end-of-stream.
Even with feedback, it would be unable to act on dynamic weights which
have proven essential with the updated tests (e.g. to constrain values
to a range).
-- Test updates
All the standard library tests have been updated to use the new smith
interface. For `Deque`, an ad hoc allocator was written to improve
performance and remove reliance on heap allocation. `TokenSmith` has
been added to aid in testing Ast and help inform decisions on the smith
interface.
use the application's Io implementation where possible. This correctly
makes writing to stderr cancelable, fallible, and participate in the
application's event loop. It also removes one more hard-coded
dependency on a secondary Io implementation.
instead, allow the user to set it as a field.
this fixes a bug where leak printing and error printing would run tty
config detection for stderr, and then emit a log, which is not necessary
going to print to stderr.
however, the nice defaults are gone; the user must explicitly assign the
tty_config field during initialization or else the logging will not have
color.
related: https://github.com/ziglang/zig/issues/24510
The new builtins are:
* `@EnumLiteral`
* `@Int`
* `@Fn`
* `@Pointer`
* `@Tuple`
* `@Enum`
* `@Union`
* `@Struct`
Their usage is documented in the language reference.
There is no `@Array` because arrays can be created like this:
if (sentinel) |s| [n:s]T else [n]T
There is also no `@Float`. Instead, `std.meta.Float` can serve this use
case if necessary.
There is no `@ErrorSet` and intentionally no way to achieve this.
Likewise, there is intentionally no way to reify tuples with comptime
fields, or function types with comptime parameters. These decisions
simplify the Zig language specification, and moreover make Zig code more
readable by discouraging overly complex metaprogramming.
Co-authored-by: Ali Cheraghi <alichraghi@proton.me>
Resolves: #10710
`std.Io.tty.Config.detect` may be an expensive check (e.g. involving
syscalls), and doing it every time we need to print isn't really
necessary; under normal usage, we can compute the value once and cache
it for the whole program's execution. Since anyone outputting to stderr
may reasonably want this information (in fact they are very likely to),
it makes sense to cache it and return it from `lockStderrWriter`. Call
sites who do not need it will experience no significant overhead, and
can just ignore the TTY config with a `const w, _` destructure.
As with Solaris (dba1bf9353), we have no way to
actually audit contributions for these OSs. IBM also makes it even harder than
Oracle to actually obtain these OSs.
closes#23695closes#23694closes#3655closes#23693
The new `--error-style` option decides how build failures are printed.
The default mode "verbose" prints all context including the step graph
fragment and the failed command (if any). The alternative mode "minimal"
prints only the failed step itself, and does not print the failed
command. There are also "verbose_clear" and "minimal_clear" modes, which
have the distinction that the output is cleared (through ANSI escape
codes) between updates, preventing different updates from being confused
in the output. If `--error-style` is not specified, the environment
variable `ZIG_BUILD_ERROR_STYLE` is checked before falling back to the
default of "verbose"; this means the value can effectively be chosen
system-wide since it is generally a personal preference.
Also introduced is a `--multiline-errors` option which decides how to
print errors which span multiple lines. By default, non-initial lines
are indented to align with the first. Alternatively, a leading newline
can be printed to align everyting on the first column, or no special
treatment can be applied, resulting in misaligned output. Again, there
is an environment variable (`ZIG_BUILD_MULTILINE_ERRORS`) to specify a
preferred default if the option is not explicitly provided.
Resolves: #23472
* std.Io.Reader: appendRemaining no longer supports alignment and has
different rules about how exceeding limit. Fixed bug where it would
return success instead of error.StreamTooLong like it was supposed to.
* std.Io.Reader: simplify appendRemaining and appendRemainingUnlimited
to be implemented based on std.Io.Writer.Allocating
* std.Io.Writer: introduce unreachableRebase
* std.Io.Writer: remove minimum_unused_capacity from Allocating. maybe
that flexibility could have been handy, but let's see if anyone
actually needs it. The field is redundant with the superlinear growth
of ArrayList capacity.
* std.Io.Writer: growingRebase also ensures total capacity on the
preserve parameter, making it no longer necessary to do
ensureTotalCapacity at the usage site of decompression streams.
* std.compress.flate.Decompress: fix rebase not taking into account seek
* std.compress.zstd.Decompress: split into "direct" and "indirect" usage
patterns depending on whether a buffer is provided to init, matching
how flate works. Remove some overzealous asserts that prevented buffer
expansion from within rebase implementation.
* std.zig: fix readSourceFileToAlloc returning an overaligned slice
which was difficult to free correctly.
fixes#24608
Changes fmtId to return the FormatId type directly, and renames the
FormatId.render function to FormatId.format, so it can be used in a
format expression directly.
Why? Since `render` is private, you can't create functions that wrap
`fmtId` or `fmtIdFlags`, since you can't name the return type of those
functions outside of std itself.
The current setup _might_ be intentional? In which case I can live with
it, but I figured I'd make a small contrib to upstream zig :)