Maintaining the POSIX `stat` bits for Zig is a pain. The order and
bit-length of members differ between all architectures, and int types
can be signed or unsigned. The libcs deal with this by introducing the
own version of `struct stat` and copying the kernel structure members to
it. In the case of glibc, they did it twice thanks to the largefile
transition!
In practice, the project needs to maintain three versions of `struct
stat`:
- What the kernel defines.
- What musl wants for `struct stat`.
- What glibc wants for `struct stat64`. Make sure to use `fstatat64`!
This isn't as simple as running `zig translate-c`. In #21440 I had to:
- Compile toolchains for each arch+glibc/musl combo.
- Create a test `fstat` program with/without `FILE_OFFSET_BITS=64`.
- Dump the value for `struct stat`.
- Stare at `std.os.linux`/`std.c` and cry.
- Add some missing padding.
The fact that so many target checks in the `linux` and `posix` tests
exist is most likely due to writing to padding bits and failing later.
The solution to this madness is `statx(2)`:
- It takes a single structure that is the same for all arches AND libcs.
- It uses a custom timestamp format, but it is 64-bit ready.
- It gives the same info as `fstatat(2)` and more!
- Unlike `fstatat(2)`, you can request a subset of the info required
based on passing a mask.
It's so good that modern Linux arches (e.g. riscv) don't even implement
`stat`, with the libcs using a generic `struct stat` and copying from
`struct statx`.
Therefore, this commit rips out all the `stat` bits from `std.os.linux`
and `std.c`. `std.posix.Stat` is now `void`, and calling
`std.posix.*stat` is an compile-time error. A wrapper around `statx` has
been added to `std.os.linux`, and callers have been upgraded to use it.
Tests have also been updated to use `statx` where possible.
While I was here, I converted the mask and file attributes to be packed
struct bitfields. A nice side effect is checking that you actually
recieved the members you asked for via `Statx.mask`, which I have used
by adding `assert`s at specific callsites.
ring.cmd_sock is generic socket operation. Two most common uses are
setsockopt and getsockopt. This provides same interface as posix
versions of this methods.
libring has also [sqe_set_flags](https://man7.org/linux/man-pages/man3/io_uring_sqe_set_flags.3.html)
method. Adding that in our io_uring_sqe. Adding sqe.link_next method for setting most common flag.
* io_uring: ring mapped buffers
Ring mapped buffers are newer implementation of ring provided buffers, supported
since kernel 5.19. Best described in Jens Axboe [post](https://github.com/axboe/liburing/wiki/io_uring-and-networking-in-2023#provided-buffers)
This commit implements low level io_uring_*_buf_ring_* functions as mostly
direct translation from liburing. It also adds BufferGroup abstraction over those
low level functions.
* io_uring: add multishot recv to BufferGroup
Once we have ring mapped provided buffers functionality it is possible to use
multishot recv operation. Multishot receive is submitted once, and completions
are posted whenever data arrives on the socket. Received data are placed in a
new buffer from buffer group.
Reference: [io_uring and networking in 2023](https://github.com/axboe/liburing/wiki/io_uring-and-networking-in-2023#multi-shot)
Getting NOENT for cancel completion result, meaning:
-ENOENT
The request identified by user_data could not be located.
This could be because it completed before the cancelation
request was issued, or if an invalid identifier is used.
https://man7.org/linux/man-pages/man3/io_uring_prep_cancel.3.htmlhttps://github.com/ziglang/zig/actions/runs/6801394000/job/18492139893?pr=17806
Result in cancel/recv cqes are different depending on the kernel.
on older kernel (tested with v6.0.16, v6.1.57, v6.2.12, v6.4.16)
cqe_cancel.err() == .NOENT
cqe_crecv.err() == .NOBUFS
on kernel (tested with v6.5.0, v6.5.7)
cqe_cancel.err() == .SUCCESS
cqe_crecv.err() == .CANCELED
* `linux.IO_Uring` -> `linux.IoUring` to align with naming conventions.
* All functions `io_uring_prep_foo` are now methods `prep_foo` on `io_uring_sqe`, which is in a file of its own.
* `SubmissionQueue` and `CompletionQueue` are namespaced under `IoUring`.
This is a breaking change.
The new file and namespace layouts are more idiomatic, and allow us to
eliminate one more usage of `usingnamespace` from the standard library.
2 remain.