The previous approach had a few downsides:
1. You couldn't set the alignment of the internal buffer. Many callers in
the standard library trying to use this for a small vec style
optimization worked around this by setting the alignment for
the struct itself, this ends up very verbose and also assumes a
specific layout for the struct which isn't guaranteed.
2. It was generic over the size of the buffer. This type is used a lot in
std with various sizes.
3. It has an awkward API where you had to call get which mutated the type
unlike all other allocators, and then had a runtime check to make sure
you didn't get this wrong.
The new approach resolves all of these issues by just taking the buf as
an argument.
This is particularly amenable to smallvec style optimizations: you can
just declare the buf as an array of the item you want to allocate to get
the exact minimum size.
Modifies the `Allocator` implementation provided by `ArenaAllocator` to be
threadsafe using only atomics and no synchronization primitives locked
behind an `Io` implementation.
At its core this is a lock-free singly linked list which uses CAS loops to
exchange the head node. A nice property of `ArenaAllocator` is that the
only functions that can ever remove nodes from its linked list are `reset`
and `deinit`, both of which are not part of the `Allocator` interface and
thus aren't threadsafe, so node-related ABA problems are impossible.
There *are* some trade-offs: end index tracking is now per node instead of
per allocator instance. It's not possible to publish a head node and its
end index at the same time if the latter isn't part of the former.
Another compromise had to be made in regards to resizing existing nodes.
Annoyingly, `rawResize` of an arbitrary thread-safe child allocator can
of course never be guaranteed to be an atomic operation, so only one
`alloc` call can ever resize at the same time, other threads have to
consider any resizes they attempt during that time failed. This causes
slightly less optimal behavior than what could be achieved with a mutex.
The LSB of `Node.size` is used to signal that a node is being resized.
This means that all nodes have to have an even size.
Calls to `alloc` have to allocate new nodes optimistically as they can
only know whether any CAS on a head node will succeed after attempting it,
and to attempt the CAS they of course already need to know the address of
the freshly allocated node they are trying to make the new head.
The simplest solution to this would be to just free the new node again if
a CAS fails, however this can be expensive and would mean that in practice
arenas could only really be used with a GPA as their child allocator. To
work around this, this implementation keeps its own free list of nodes
which didn't make their CAS to be reused by a later `alloc` invocation.
To keep things simple and avoid ABA problems the free list is only ever
be accessed beyond its head by 'stealing' the head node (and thus the
entire list) with an atomic swap. This makes iteration and removal trivial
since there's only ever one thread doing it at a time which also owns all
nodes it's holding. When the thread is done it can just push its list onto
the free list again.
This implementation offers comparable performance to the previous one when
only being accessed by a single thread and a slight speedup compared to
the previous implementation wrapped into a `ThreadSafeAllocator` up to ~7
threads performing operations on it concurrently.
(measured on a base model MacBook Pro M1)
The old logic was fine for targets where the stack grows up (so, literally just
hppa), but problematic on targets where it grows down, because we could hint
that we wanted an allocation to happen in an area of the address space that the
kernel expects to be able to expand the stack into. The kernel is happy to
satisfy such a hint despite the obvious problems this leads to later down the
road.
Co-authored-by: rpkak <rpkak@noreply.codeberg.org>
This reverts commit c9fa8e46df.
This commit was failing CI checks. This failure was unfortunately not
noticed before merge, due in part to the build runner bug fixed in the
last commit.
When linking libc, it should be the libc that manages the heap. The main
Wasm memory might have been configured as non-growable, which makes
`WasmAllocator` a poor default and causes the common `DebugAllocator`
use case fail with OOM errors unless the user uses `std_options` to
override the default page allocator. Additionally, on Emscripten,
growing Wasm memory without notifying the JS glue code will cause array
buffers to get detached and lead to spurious crashes.
After https://codeberg.org/ziglang/zig/pulls/30103, `raw_c_allocator` is
redundant. It existed to avoid overhead when you could assert that all
of your `Allocator` usage was going to be compatible with the C `malloc`
API, but the standard `c_allocator` is now able to avoid that overhead
*in the case* that your usage is compatible (and use the less efficient
path in the rare case where it's not), so there's no need for the raw
version anymore. Leaving it in `std.heap` at this point seems like it
would just be a footgun.
The main goal here was to avoid allocating padding and header space if
`malloc` already guarantees the alignment we need via `max_align_t`.
Previously, the compiler was using `std.heap.raw_c_allocator` as its GPA
in some cases depending on `std.c.max_align_t`, but that's pretty
fragile (it meant we had to encode our alignment requirements into
`src/main.zig`!). Perhaps more importantly, that solution is
unnecessarily restrictive: since Zig's `Allocator` API passes the
`Alignment` not only to `alloc`, but also to `free` etc, we are able to
use a different strategy depending on its value. So `c_allocator` can
simply compare the requested align to `Alignment.of(std.c.max_align_t)`,
and use a raw `malloc` call (no header needed!) if it will guarantee a
suitable alignment (which, in practice, will be true the vast majority
of the time).
So in short, this makes `std.heap.c_allocator` more memory efficient,
and probably removes any incentive to use `std.heap.raw_c_allocator`.
I also refactored the `c_allocator` implementation while doing this,
just to neaten things up a little.
Apple's own headers and tbd files prefer to think of Mac Catalyst as a distinct
OS target. Earlier, when DriverKit support was added to LLVM, it was represented
a distinct OS. So why Apple decided to only represent Mac Catalyst as an ABI in
the target triple is beyond me. But this isn't the first time they've ignored
established target triple norms (see: armv7k and aarch64_32) and it probably
won't be the last.
While doing this, I also audited all Darwin OS prongs throughout the codebase
and made sure they cover all the tags.
There is no straightforward way for the Zig team to access the Solaris system
headers; to do this, one has to create an Oracle account, accept their EULA to
download the installer ISO, and finally install it on a machine or VM. We do not
have to jump through hoops like this for any other OS that we support, and no
one on the team has expressed willingness to do it.
As a result, we cannot audit any Solaris contributions to std.c or other
similarly sensitive parts of the standard library. The best we would be able to
do is assume that Solaris and illumos are 100% compatible with no way to verify
that assumption. But at that point, the solaris and illumos OS tags would be
functionally identical anyway.
For Solaris especially, any contributions that involve APIs introduced after the
OS was made closed-source would also be inherently more risky than equivalent
contributions for other proprietary OSs due to the case of Google LLC v. Oracle
America, Inc., wherein Oracle clearly demonstrated its willingness to pursue
legal action against entities that merely copy API declarations.
Finally, Oracle laid off most of the Solaris team in 2017; the OS has been in
maintenance mode since, presumably to be retired completely sometime in the 2030s.
For these reasons, this commit removes all Oracle Solaris support.
Anyone who still wishes to use Zig on Solaris can try their luck by simply using
illumos instead of solaris in target triples - chances are it'll work. But there
will be no effort from the Zig team to support this use case; we recommend that
people move to illumos instead.
This mainly just moves stuff around.
Justifications for other changes:
* `KEVENT.FLAGS` is backed by `c_uint` because that's what the `kevent64` flags param takes (according to the 'latest' manpage from 2008)
* `MACH_RCV_NOTIFY` is a legacy name and `MACH_RCV_OVERWRITE` is deprecated (xnu/osfmk/mach/message.h), so I removed them. They were 0 anyway and thus couldn't be represented
as a packed struct field.
* `MACH.RCV` and `MACH.SEND` are technically the same 'type' because they can both be supplied at the same time to `mach_msg`. I decided to still keep them separate because
naming works out better that way and all flags except for `MACH_MSG_STRICT_REPLY` aren't shared anyway. Both are part of a packed union `mach_msg_option_t` which supplies a
helper function to combine the two types.
* `PT` is backed by `c_int` because that's what `ptrace` takes as a request arg (according to the latest manpage from 2015)
Functions like isMinGW() and isGnuLibC() have a good reason to exist: They look
at multiple components of the target. But functions like isWasm(), isDarwin(),
isGnu(), etc only exist to save 4-8 characters. I don't think this is a good
enough reason to keep them, especially given that:
* It's not immediately obvious to a reader whether target.isDarwin() means the
same thing as target.os.tag.isDarwin() precisely because isMinGW() and similar
functions *do* look at multiple components.
* It's not clear where we would draw the line. The logical conclusion before
this commit would be to also wrap Arch.isX86(), Os.Tag.isSolarish(),
Abi.isOpenHarmony(), etc... this obviously quickly gets out of hand.
* It's nice to just have a single correct way of doing something.
Windows-only, depends on kernel32 in violation of zig std lib policy,
and redundant with other cross-platform APIs that perform the same
functionality.
I don't think these belong in std, at least not in their current form.
If someone wants to add these back I'd like to review the patch before
it lands.
Reverts 629e2e7844