Maintaining the POSIX `stat` bits for Zig is a pain. The order and
bit-length of members differ between all architectures, and int types
can be signed or unsigned. The libcs deal with this by introducing the
own version of `struct stat` and copying the kernel structure members to
it. In the case of glibc, they did it twice thanks to the largefile
transition!
In practice, the project needs to maintain three versions of `struct
stat`:
- What the kernel defines.
- What musl wants for `struct stat`.
- What glibc wants for `struct stat64`. Make sure to use `fstatat64`!
This isn't as simple as running `zig translate-c`. In #21440 I had to:
- Compile toolchains for each arch+glibc/musl combo.
- Create a test `fstat` program with/without `FILE_OFFSET_BITS=64`.
- Dump the value for `struct stat`.
- Stare at `std.os.linux`/`std.c` and cry.
- Add some missing padding.
The fact that so many target checks in the `linux` and `posix` tests
exist is most likely due to writing to padding bits and failing later.
The solution to this madness is `statx(2)`:
- It takes a single structure that is the same for all arches AND libcs.
- It uses a custom timestamp format, but it is 64-bit ready.
- It gives the same info as `fstatat(2)` and more!
- Unlike `fstatat(2)`, you can request a subset of the info required
based on passing a mask.
It's so good that modern Linux arches (e.g. riscv) don't even implement
`stat`, with the libcs using a generic `struct stat` and copying from
`struct statx`.
Therefore, this commit rips out all the `stat` bits from `std.os.linux`
and `std.c`. `std.posix.Stat` is now `void`, and calling
`std.posix.*stat` is an compile-time error. A wrapper around `statx` has
been added to `std.os.linux`, and callers have been upgraded to use it.
Tests have also been updated to use `statx` where possible.
While I was here, I converted the mask and file attributes to be packed
struct bitfields. A nice side effect is checking that you actually
recieved the members you asked for via `Statx.mask`, which I have used
by adding `assert`s at specific callsites.
The ML-KEM decapsulation was returning z directly when implicit
rejection was triggered, but FIPS 203 specifies it should return
J(z || c) = SHAKE256(z || c).
FreeBSD 15 dropped this target. It also dropped x86-freebsd-none, but since that
target remains buildable and usable with the 32-bit compat layer, we keep it for
now.
After https://codeberg.org/ziglang/zig/pulls/30103, `raw_c_allocator` is
redundant. It existed to avoid overhead when you could assert that all
of your `Allocator` usage was going to be compatible with the C `malloc`
API, but the standard `c_allocator` is now able to avoid that overhead
*in the case* that your usage is compatible (and use the less efficient
path in the rare case where it's not), so there's no need for the raw
version anymore. Leaving it in `std.heap` at this point seems like it
would just be a footgun.
https://github.com/ziglang/zig/issues/26027#issuecomment-3571227050
tracked some bad performance in `DebugAllocator` on macOS down to a
function in dyld which `std.debug.SelfInfo` was calling into. It turns
out `dladdr`'s symbol lookup logic is horrendously slow (looking at its
source code, it appears to be doing a *linear scan* over all symbols in
the image?!). However, we don't actually need the symbol, so we want to
try and avoid this logic.
Luckily, dyld has more precise APIs for what we need! Unluckily, Apple,
in their infinite wisdom, decided they should be deprecated in favour of
`dladdr`, despite the latter being several times slower (and by "several
times", I have measured a 50x slowdown on repeated calls to `dladdr`
compared to the other API). But luckily again, the deprecated APIs are
still exposed.
So, after a careful analysis of the situation (reading dyld code and
cursing Apple engineers), I think it makes sense to just use these
deprecated APIs for now. If they ever go away, we can write our own
cache for this data to bypass Apple's awfully slow code, but I suspect
these functions will stick around for the foreseeable future.
Uh, and if `_dyld_get_image_header_containing_address` goes away,
there's also `dyld_image_header_containing_address`, which is a
seemingly identical function, exported by dyld just the same, but with a
separate (functionally identical) implementation, and not documented in
the public header file. Apple work in mysterious ways, I guess.
The main goal here was to avoid allocating padding and header space if
`malloc` already guarantees the alignment we need via `max_align_t`.
Previously, the compiler was using `std.heap.raw_c_allocator` as its GPA
in some cases depending on `std.c.max_align_t`, but that's pretty
fragile (it meant we had to encode our alignment requirements into
`src/main.zig`!). Perhaps more importantly, that solution is
unnecessarily restrictive: since Zig's `Allocator` API passes the
`Alignment` not only to `alloc`, but also to `free` etc, we are able to
use a different strategy depending on its value. So `c_allocator` can
simply compare the requested align to `Alignment.of(std.c.max_align_t)`,
and use a raw `malloc` call (no header needed!) if it will guarantee a
suitable alignment (which, in practice, will be true the vast majority
of the time).
So in short, this makes `std.heap.c_allocator` more memory efficient,
and probably removes any incentive to use `std.heap.raw_c_allocator`.
I also refactored the `c_allocator` implementation while doing this,
just to neaten things up a little.
If ECANCELED occurs, it's from pthread_cancel which will *permanently*
set that thread to be in a "canceling" state, which means the cancel
cannot be ignored. That means it cannot be retried, like EINTR. It must
be acknowledged.
Unfortunately, trying again until the cancellation request is
acknowledged has been observed to incur a large amount of overhead,
and usually strong cancellation guarantees are not needed, so the
race condition is not handled here. Users who want to avoid this
have this menu of options instead:
* Use no libc, in which case Zig std lib can avoid the race (tracking
issue: https://codeberg.org/ziglang/zig/issues/30049)
* Use musl libc
* Use `std.Io.Evented`. But this is not implemented yet. Tracked by
- https://codeberg.org/ziglang/zig/issues/30050
- https://codeberg.org/ziglang/zig/issues/30051
glibc + threaded is the only problematic combination.
On a heavily loaded Linux 6.17.5, I observed a maximum of 20 attempts
not acknowledged before the timeout (including exponential backoff) was
sufficient, despite the heavy load.
The time wasted here sleeping is mitigated by the fact that, later on,
the system will likely wait for the canceled task, causing it to
indefinitely yield until the canceled task finishes, and the task must
acknowledge the cancel before it proceeds to that point.
Now, before a syscall is entered, beginSyscall is called, which may
return error.Canceled. After syscall returns, whether error or success,
endSyscall is called. If the syscall returns EINTR then checkCancel is
called.
`cancelRequested` is removed from the std.Io VTable for now, with plans
to replace it with a more powerful API that allows protection against
cancellation requests.
closes#25751
The inverse MixColumns operation is already used internally for
AES decryption, but it wasn’t exposed in the public API because
it didn’t seem necessary at the time.
Since then, several new AES-based block ciphers and permutations
(such as Vistrutah and Areion) have been developed, and they require
this operation to be implementable in Zig.
Since then, new interesting AES-based block ciphers and permutations
(Vistrutah, Areion, etc). have been invented, and require that
operation to be implementable in Zig.