Mirrors/zig - zig - Gitea @ Femelysm.ru

mirror of https://codeberg.org/ziglang/zig.git synced 2026-05-03 16:22:55 +03:00

Author	SHA1	Message	Date
Kendall Condon	d8ba173e5e	multiprocess fuzzing - New Features -- Multiprocess Fuzzing The fuzzer is now able to utilize multiple cores. This is controllable with the `-j` build option. Limited fuzzing still uses one core. -- Fuzzing Infinite Mode When provided multiple tests, the fuzzer now switches between them and prioritizes the most effective and interesting ones. Over time, already explored tests are run far less often than tests yielding new inputs. -- Crash Dumps Crashing inputs are now saved to a file indicated by the crash message. It is recommended to use these files to reproduce the crash using `std.testing.FuzzInputOptions.corpus` and `@embedFile`. - Design Each fuzzing process is assigned an instance id which has the following uses: * In conjunction with the pc hash and running test index, it uniquely identifies input files in the case of a crash. * It is combined with the test seed for a unique rng seed. * Instance 0 is solely responsible for syncing the filesystem corpus. When new inputs are found, they are sent to the build server. It then distributes the new input to the other instances. Each instance has a concurrent poller managed by the test runner which sends received inputs to libfuzzer. (Note that this is affected by #31718 and so can (rarely) deadlock.) For fuzzing infinite mode, the test runner now receives a list of tests from the build server. The fuzzer runs tests in batches of one second, approximated in cycles by the previous batch's run speed. Tests finding new inputs or with few runs are given a higher run chance. The baseline run chance is based on the recency of the last find and the number of pcs the test has hit.	2026-04-03 12:27:34 +02:00
Kendall Condon	6e5a95bd7c	implement proper deflate flush semantics To end a flate stream, `finish` must now be called. `flush` now follows regular semantics and byte-aligns the stream. Byte-aligning the stream is done with empty fixed or store blocks. To implement flush, a variable history length was added, and whether the final bytes of history have been hashed yet is now tracked.	2026-03-11 02:28:19 +01:00
Kendall Condon	5d58306162	rework fuzz testing to be smith based -- On the standard library side: The `input: []const u8` parameter of functions passed to `testing.fuzz` has changed to `smith: testing.Smith`. `Smith` is used to generate values from libfuzzer or input bytes generated by libfuzzer. `Smith` contains the following base methods: * `value` as a generic method for generating any type * `eos` for generating end-of-stream markers. Provides the additional guarantee `true` will eventually be provided. * `bytes` for filling a byte array. * `slice` for filling part of a buffer and providing the length. `Smith.Weight` is used for giving value ranges a higher probability of being selected. By default, every value has a weight of zero (i.e. they will not be selected). Weights can only apply to values that fit within a u64. The above functions have corresponding ones that accept weights. Additionally, the following functions are provided: * `baselineWeights` which provides a set of weights containing every possible value of a type. * `eosSimpleWeighted` for unique weights for `true` and `false` * `valueRangeAtMost` and `valueRangeLessThan` for weighing only a range of values. -- On the libfuzzer and abi side: --- Uids These are u32s which are used to classify requested values. This solves the problem of a mutation causing a new value to be requested and shifting all future values; for example: 1. An initial input contains the values 1, 2, 3 which are interpreted as a, b, and c respectively by the test. 2. The 1 is mutated to a 4 which causes the test to request an extra value interpreted as d. The input is now 4, 2, 3, 5 (new value) which the test maps to a, d, b, c; however, b and c no longer correspond to their original values. Uids contain a hash component and type component. The hash component is currently determined in `Smith` by taking a hash of the calling `@returnAddress()` or via an argument in the corresponding `WithHash` functions. The type component is used extensively in libfuzzer with its hashmaps. --- Mutations At the start of a cycle (a run), a random number of values to mutate is selected with fewer being exponentially more likely. The indexes of the values are selected from a selected uid with a logarithmic bias to uids with more values. Mutations may change a single value, several consecutive values in a uid, or several consecutive values in the uid-independent order they were requested. They may generate random values, mutate from previous ones, or copy from other values in the same uid from the same input or spliced from another. For integers, mutations from previous ones currently only generate random values. For bytes, mutations from previous ones mix new random data and previous bytes with a set number of mutations. --- Passive Minimization A different approach has been taken for minimizing inputs: instead of trying a fixed set of mutations when a fresh input is found, the input is instead simply added to the corpus and removed when it is no longer valuable. The quality of an input is measured based on how many unique pcs it hit and how many values it needed from the fuzzer. It is tracked which inputs hold the best qualities for each pc for hitting the minimum and maximum unique pcs while needing the least values. Once all of an input's qualities have been superseded for the pcs it hit, it is removed from the corpus. -- Comparison to byte-based smith A byte-based smith would be much more inefficient and complex than this solution. It would be unable to solve the shifting problem that Uids do. It is unable to provide values from the fuzzer past end-of-stream. Even with feedback, it would be unable to act on dynamic weights which have proven essential with the updated tests (e.g. to constrain values to a range). -- Test updates All the standard library tests have been updated to use the new smith interface. For `Deque`, an ad hoc allocator was written to improve performance and remove reliance on heap allocation. `TokenSmith` has been added to aid in testing Ast and help inform decisions on the smith interface.	2026-02-13 22:12:19 -05:00
Andrew Kelley	ee21a1f988	fetch: implement recompression After fetching a package and applying the filter by deleting files that are not part of the hash, creates a recompressed $GLOBAL_CACHE/p/$PKG_HASH.tar.gz Checking this cache before fetching network URLs is not yet implemented.	2026-02-05 16:50:41 -08:00
Adrià Arrufat	02c5f05e2f	std: replace usages of std.mem.indexOf with std.mem.find	2025-12-05 14:31:27 +01:00
Kendall Condon	8284da2f3d	flate.Compress: simplify huffman node comparisons Instead of comparing each field, nodes are now compared as 32-bit values where `freq` is in the most significant bits.	2025-11-22 22:11:33 -08:00
Kendall Condon	f50c647977	add deflate compression, simplify decompression Implements deflate compression from scratch. A history window is kept in the writer's buffer for matching and a chained hash table is used to find matches. Tokens are accumulated until a threshold is reached and then outputted as a block. Flush is used to indicate end of stream. Additionally, two other deflate writers are provided: * `Raw` writes only in store blocks (the uncompressed bytes). It utilizes data vectors to efficiently send block headers and data. * `Huffman` only performs Huffman compression on data and no matching. The above are also able to take advantage of writer semantics since they do not need to keep a history. Literal and distance code parameters in `token` have also been reworked. Their parameters are now derived mathematically; however, the more expensive ones are still obtained through a lookup table (except on ReleaseSmall). Decompression bit reading has been greatly simplified, taking advantage of the ability to peek on the underlying reader. Additionally, a few bugs with limit handling have been fixed.	2025-09-30 18:28:47 -07:00
Andrew Kelley	824c157e0c	std.compress.flate: finish reorganizing	2025-07-31 22:10:11 -07:00
Andrew Kelley	a4f05a4588	delete flate implementation	2025-07-31 22:10:11 -07:00
Andrew Kelley	83513ade35	std.compress: rework flate to new I/O API	2025-07-31 22:10:11 -07:00
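
Commit 6e5a95bd7c separates `flush` (byte-align the stream, keep it open) from `finish` (end the stream). As a rough illustration of the same distinction, the sketch below uses Python's `zlib` module, which is a different implementation from Zig's `std.compress.flate`: `Z_SYNC_FLUSH` is assumed here to play the role of `flush` and `Z_FINISH` the role of `finish`. zlib byte-aligns with an empty store block, while the commit mentions empty fixed or store blocks.

```python
import zlib


def compress_chunks(chunks):
    """Compress each chunk as raw deflate, flushing after every chunk.

    Returns one byte string per chunk plus a final piece that ends the stream.
    """
    comp = zlib.compressobj(level=6, wbits=-15)  # wbits=-15 selects raw deflate
    pieces = []
    for chunk in chunks:
        # Z_SYNC_FLUSH byte-aligns the output so everything written so far
        # can be decoded, but the deflate stream stays open.
        pieces.append(comp.compress(chunk) + comp.flush(zlib.Z_SYNC_FLUSH))
    # Z_FINISH writes the final block; only now is the stream complete.
    pieces.append(comp.flush(zlib.Z_FINISH))
    return pieces


pieces = compress_chunks([b"hello ", b"world"])

decomp = zlib.decompressobj(wbits=-15)
first = decomp.decompress(pieces[0])    # decodable right after the flush
second = decomp.decompress(pieces[1])
open_after_flush = not decomp.eof       # flushed, but not finished
decomp.decompress(pieces[2])
closed_after_finish = decomp.eof        # finished
```

A receiver can decode every flushed piece immediately, yet it only sees end-of-stream once the finishing piece arrives.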

10 Commits
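
The value-shifting problem described under "Uids" in commit 5d58306162 can be sketched in a few lines of Python. This is only an illustration, not libfuzzer's or `Smith`'s implementation: the string labels stand in for the u32 uids, and `run_test`, `positional_source`, and `keyed_source` are hypothetical names.

```python
def run_test(request):
    """Toy test: requests a, then d only when a == 4, then b and c."""
    a = request("a")
    d = request("d") if a == 4 else None
    b = request("b")
    c = request("c")
    return {"a": a, "b": b, "c": c, "d": d}


def positional_source(values, fresh):
    """Hand out values strictly in request order, ignoring the uid."""
    stream = iter(values)

    def request(_uid):
        try:
            return next(stream)
        except StopIteration:
            return fresh()  # input exhausted: generate a new value

    return request


def keyed_source(values_by_uid, fresh):
    """Hand out values from a separate queue per uid."""
    queues = {uid: list(vals) for uid, vals in values_by_uid.items()}

    def request(uid):
        queue = queues.get(uid)
        return queue.pop(0) if queue else fresh()

    return request


# The input 1, 2, 3 after the 1 has been mutated to a 4; 5 is the new value.
shifted = run_test(positional_source([4, 2, 3], lambda: 5))
stable = run_test(keyed_source({"a": [4], "b": [2], "c": [3]}, lambda: 5))
```

With the positional source the extra request for d consumes the 2, so b and c shift to 3 and 5. With per-uid queues d receives the new value and b and c keep 2 and 3.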