Commit Graph

490 Commits

Author SHA1 Message Date
Matthew Lugg 97fe49a80f Elf2: rework the symtab, and fix a bunch of stuff
Sorry for the mega-commit, this diff got a little out of control.

The main thing here is a complete rework of how Elf2 handles the symbol
table. I messed around with the design for a while and landed on
something which is fairly memory-efficient (in particular the overhead
for STB_LOCAL symbols is as low as possible) and fulfils some of the
more awkward constraints of the ELF format. The main such constraint is
that all STB_LOCAL symbols in a symbol table are required to appear
before any STB_GLOBAL/STB_WEAK symbols. This is further complicated by
the fact that when producing a DSO, symbols with STV_HIDDEN or
STV_INTERNAL visibility are required to have STB_LOCAL binding in the
symbol table, even though they are global symbols from the perspective
of the link editor. Plus, when combining multiple symbols with the same
name, the resulting visibility is the strictest of all of the inputs, so
it is possible at any point in compilation to discover an extern/export
symbol which forces an existing STB_GLOBAL symbol to become STB_LOCAL
and therefore requires it to move to an earlier symtab index. Dealing
with all of this was quite awkward.

But I got there! I also implemented a lot of features in the process. I
don't remember everything perfectly, but here's a vague list:

* Multiple definitions of and/or unresolved references to symbols are
  now combined correctly in all cases

* `.bss` sections from inputs are correctly lowered (we don't actually
  emit a `.bss` section of our own yet, but I was able to put that data
  into the `.data` section so that the functionality is correct)

* Relocations in link inputs are now always processed (previously they
  would be silently ignored in most cases)

* Linker errors are triggered if a supported input section has a
  relocation which targets an unsupported input section (previously
  the unsupported section's symbol was dropped and associated
  relocations would be silently ignored)

* When linking a static executable, an error is emitted if a required
  symbol (i.e. an undefined reference with strong linkage) was never
  defined

* Duplicate symbol errors now work correctly

* When emitting a relocatable, the offsets of relocation entries are now
  correct (previously the offsets written were relative to a symbol
  rather than a section, meaning that e.g. almost all text relocations
  were just in a single function)

The changes in all of the other linkers and codegen backends are some
added type-safety at the codegen-linker API boundary. There are now
distinct `u32`-backed types for identifying an "atom" (the thing we're
codegenning) and a "symbol" (the thing which a relocation targets).
Linker implementations can use a couple of private helper functions to
convert between this implementation-agnostic type and their specific
type; for instance, `Elf2` can convert between a `Symbol.Id` and a
`link.File.SymbolId` with `Symbol.Id.fromTypeErased` and
`Symbol.Id.toTypeErased`. I didn't implement this nicely for any other
linker, so right now there's a lot of `@enumFromInt`/`@intFromEnum`
sprinkled all over the place, particularly with the legacy ELF and
Mach-O linkers.

I tested that I could still perform incremental updates to the Zig
compiler using this commit. In terms of the new behaviors, the most
interesting stuff is symbol and relocation resolution, so I ran a few
tests involving building a "Hello World" binary in various different
ways:

* `build-exe` correctly succeeds

* `build-exe -fno-compiler-rt` correctly reports undefined symbols

* `build-obj` linked with `build-exe` correctly succeeds

* `build-obj` linked with `build-exe -fno-compiler-rt` correctly reports
  undefined symbols

* `build-obj -fcompiler-rt` linked with `build-exe -fno-compiler-rt`
  correctly succeeds

* `build-obj -fcompiler-rt` linked with `build-exe` correctly succeeds
  (the compiler-rt symbols are weak so the global symbols are
  arbitrarily resolved to one of the two implementations)

I also manually verified with `readelf` that symbol tables were always
ordered correctly (before this PR, `readelf -s` would usually emit
warnings about incorrectly-ordered symtabs!), and verified that various
visibility attributes worked as expected.

No actual test coverage is added due to the current lack of a useful
linker test harness. Once a good test harness is available I will be
willing to write some tests.
2026-05-17 18:55:26 +01:00
Matthew Lugg 4c330e053b compiler: use 'std.lang' instead of 'std.builtin' 2026-05-03 12:23:30 +01:00
Alex Rønne Petersen 0227253677 compiler: link libtsan with -whole-archive
libtsan contains a .preinit_array entry that must run for correctness.
2026-04-29 13:01:33 +02:00
andrew.kraevskii bbab366b78 Audit usages of toOwnedSlice (#32001)
Followup to #30769

I grepped for `try .*toOwnedSlice` and checked all of them by hand.

Fixes a bunch of memory leaks removes usages or `errdefer` and `vars` in some places. I also switched array_list.Managed to ArrayList where it was convenient.

Reviewed-on: https://codeberg.org/ziglang/zig/pulls/32001
Reviewed-by: Andrew Kelley <andrew@ziglang.org>
2026-04-22 19:35:46 +02:00
Matthew Lugg 5d215838a7 InternPool.Nav: fix race, refactor
I've realised that the cause of at least some of our weird CI flakiness
was a bug in how `Nav` values were resolved. Consider this scenario: the
frontend resolves the type of a `Nav`, and then sends a function to the
backend, which requires the backend to lower a pointer to that `Nav`.
The backend calls `InternPool.getNav` to determine the `Nav`'s type.
However, this races with the frontend resolving the *value* of that
`Nav`. This involves writing separately to two fields, `bits` and
`type_or_value`. If only one of these changes is observed, then the
backend will incorrectly interpret the type as the value or vice versa,
leading to a crash or even a miscompilation. (Of course, there's also
the straightforward issue that the racing loads were non-atomic, making
them illegal).

The only good solution to this was to make `Nav` 4 bytes bigger, giving
it separate `type` and `value` fields. In theory that's a quite small
change, but it ended up having a bunch of nice consequences which led to
this diff being a bit bulkier than expected:

* `Nav.Repr.Bits` was simplified, because it no longer has to track
  "resolution status": we can use `.none` for that. This frees up some
  bits to make things more consistent between the "type resolved" and
  "fully resolved" states.

* This consistency allowed the `Nav.status` union to be replaced with a
  simpler field `Nav.resolved`, which is a bit nicer to work with.

* Most of the "getter" functions were able to be removed from `Nav`
  because the state they were fetching had been moved to simple fields
  on `Nav.resolved`.

* There were still a handful of free bits in `Nav.Repr.Bits`, which
  could be used to represent the "const" and "threadlocal" flags rather
  than these being stored on `Key.Extern` and `Key.Variable`. This is a
  bit more convenient for linkers.

* With those bits gone, `Key.Variable` is a trivial wrapper around a
  type and an initial value, and the fact that a declaration is mutable
  can be represented solely through the "const" flag. Therefore,
  `Key.Variable` no longer served a purpose, and could be eliminated
  entirely in favour of storing the variable's initial value directly in
  the "value" field of the `Nav`.

So, I'm quite pleased with this refactor! But anyway, regarding the bug
fix which actually motivated this: if I've done my job correctly, this
should solve some crashes, such as these (which were what tipped me off
to this bug in the first place):

https://codeberg.org/ziglang/zig/actions/runs/2306/jobs/7/attempt/1
https://codeberg.org/ziglang/zig/actions/runs/2173/jobs/6/attempt/1

...and, who knows, perhaps even the random SIGSEGVs we've seen on some
targets! Probably not, but one can hope.
2026-03-15 11:47:14 +00:00
Kendall Condon 02e8339ca7 zig build fmt 2026-03-12 17:44:03 -04:00
Matthew Lugg c2b42383eb compiler,std: various little fixes 2026-03-10 10:26:14 +00:00
Matthew Lugg 5cc12da1c0 cbe: rework CType and other major refactors
The goal of these changes is to allow the C backend to support the new
lazier type resolution system implemented by the frontend. This required
a full rewrite of the `CType` abstraction, and major changes to the C
backend "linker".

The `DebugConstPool` abstraction introduced in a previous commit turns
out to be useful for the C backend to codegen types. Because this use
case is not debug information but rather general linking (albeit when
targeting an unusual object format), I have renamed the abstraction to
`ConstPool`. With it, the C linker is told when a type's layout becomes
known, and can at that point generate the corresponding C definitions,
rather than deferring this work until `flush`.

The work done in `flush` is now more-or-less *solely* focused on
collecting all of the buffers into a big array for a vectored write.
This does unfortunately involve a non-trivial graph traversal to emit
type definitions in an appropriate order, but it's still quite fast in
practice, and it operates on fairly compact dependency data. We don't
generate the actual type *definitions* in `flush`; that happens during
compilation using `ConstPool` as discussed above. (We do generate the
typedefs for underaligned types in `flush`, but that's a trivial amount
of work in most cases.)

`CType` is now an ephemeral type: it is created only when we render a
type (the logic for which has been pushed into just 2 or 3 functions in
`codegen.c`---most of the backend now operates on unmolested Zig `Type`s
instead). C types are no longer stored in a "pool", although the type
"dependencies" of generated C code (that is, the struct, unions, and
typedefs which the generated code references) are tracked (in some
simple hash sets) and given to the linker so it can codegen the types.
2026-03-10 10:26:12 +00:00
Matthew Lugg f7a1ccfc56 compiler: fix up LLVM backend, and improve its debug info
The LLVM backend can now run the behavior tests and standard library
tests, like the x86_64 backend can. This commit required me to make a
lot of changes to how the LLVM backend lowers debug information, and
while I was doing that, I improved a few things:

* `anyerror` is now an enum type (and other error sets just wrap it), so
  error values appear by name in debuggers

* Fixed broken lowering for tagged unions with zero-width payloads

* Associate container types with source locations in all cases

* Avoid depending on the order of type resolution (using the new
  `DebugConstPool` abstraction), so debug information will contain all
  available type information rather than just the subset which happens
  to be resolved when the backend lowers that debug type
2026-03-10 10:26:12 +00:00
Matthew Lugg bcb1a6bdf3 compiler: make Dwarf and self-hosted x86_64 happy
Introduces a small abstraction, `link.DebugConstPool`, to deal with
lowering type/value information into debug info when it may not be known
until type resolution (which in some cases will *never* happen). It is
currently only used by self-hosted DWARF logic, but it will also be of
use to the LLVM backend (which is my next focus).
2026-03-10 10:26:11 +00:00
Marcel W. Wysocki 0e3c6514a4 link: recognize thin archives in ld script detection
Needed for linking the Linux kernel.
2026-02-17 23:15:32 +01:00
Jacob Young a28d57292f IoUring: update to new Io APIs 2026-02-09 10:47:21 -05:00
Andrew Kelley 922ab8b8bc std: finish moving time to Io interface
Importantly, adds ability to get Clock resolution, which may be zero.
This allows error.Unexpected and error.ClockUnsupported to be removed
from timeout and clock reading error sets.
2026-02-02 23:02:31 -08:00
Jacob Young 90890fcb5c Io.Threaded: fix UAF-induced crashes during asynchronous operations
When `NtReadFile` returns `SUCCESS`, the APC routine still runs when
next alertable, which was previously clobbering an out of scope `done`.
Instead of adding an extra syscall to the success path, avoid all APC
side effects, allowing instant completions to return immediately.
2026-01-30 22:03:13 -08:00
Andrew Kelley 499ba5d55c compiler: use Io.MemoryMap
Also make setLength return error.OperationUnsupported when it cannot be
done atomically.
2026-01-22 21:25:53 -08:00
Andrew Kelley 1f1381a866 update API usage of std.crypto.random to io.random 2026-01-07 11:03:36 -08:00
Andrew Kelley a5b719e9eb compiler: fix build failures from std.Io-fs 2025-12-23 22:15:10 -08:00
Andrew Kelley 16bd2e137e compiler: fix most compilation errors from std.fs changes 2025-12-23 22:15:09 -08:00
Andrew Kelley 1925e0319f update lockStderrWriter sites
use the application's Io implementation where possible. This correctly
makes writing to stderr cancelable, fallible, and participate in the
application's event loop. It also removes one more hard-coded
dependency on a secondary Io implementation.
2025-12-23 22:15:09 -08:00
Andrew Kelley 54e4a3456c link: update to new file system APIs 2025-12-23 22:15:09 -08:00
Andrew Kelley 16f8af1b9a compiler: update various code to new fs API 2025-12-23 22:15:09 -08:00
Andrew Kelley 4a53e5b0b4 fix a handful of compilation errors related to std.fs migration 2025-12-23 22:15:08 -08:00
Andrew Kelley 1dcfc8787e update all readFileAlloc() to accept Io instance 2025-12-23 22:15:08 -08:00
Andrew Kelley 4be8be1d2b update all rename() to rename(io) 2025-12-23 22:15:08 -08:00
Andrew Kelley 9f4d40b1f9 update all stat() to stat(io) 2025-12-23 22:15:08 -08:00
Andrew Kelley 8328de24f1 update all occurrences of openFile to receive an io instance 2025-12-23 22:15:08 -08:00
Andrew Kelley 3204fb7569 update all occurrences of std.fs.File to std.Io.File 2025-12-23 22:15:07 -08:00
Andrew Kelley aafddc2ea1 update all occurrences of close() to close(io) 2025-12-23 22:15:07 -08:00
Matthew Lugg 18bc7e802f compiler: replace thread pool with std.Io
Eliminate the `std.Thread.Pool` used in the compiler for concurrency and
asynchrony, in favour of the new `std.Io.async` and `std.Io.concurrent`
primitives.

This removes the last usage of `std.Thread.Pool` in the Zig repository.
2025-12-22 12:55:16 +00:00
Ali Cheraghi dec1163fbb all: replace all @Type usages
Co-authored-by: Matthew Lugg <mlugg@mlugg.co.uk>
2025-11-22 22:42:38 +00:00
Benjamin Jurk 4b5351bc0d update deprecated ArrayListUnmanaged usage (#25958) 2025-11-20 14:46:23 -08:00
Andrew Kelley a9568ed296 Merge pull request #25898 from jacobly0/elfv2-progress
Elf2: more progress
2025-11-20 04:33:04 -08:00
Alex Rønne Petersen 9ab7eec23e represent Mac Catalyst as aarch64-maccatalyst-none rather than aarch64-ios-macabi
Apple's own headers and tbd files prefer to think of Mac Catalyst as a distinct
OS target. Earlier, when DriverKit support was added to LLVM, it was represented
a distinct OS. So why Apple decided to only represent Mac Catalyst as an ABI in
the target triple is beyond me. But this isn't the first time they've ignored
established target triple norms (see: armv7k and aarch64_32) and it probably
won't be the last.

While doing this, I also audited all Darwin OS prongs throughout the codebase
and made sure they cover all the tags.
2025-11-14 11:33:35 +01:00
Jacob Young 61a1cefeb3 Elf2: implement PLT 2025-11-11 14:11:32 -05:00
Ryan Liptak f587209e04 Move/coalesce CompressDebugSections enum to std.zig.CompressDebugSections 2025-11-07 19:15:55 -08:00
Carl Åstholm 54f2a7c833 Move std.Target.SubSystem to std.zig.Subsystem
Also updates the field names to conform with the rest of std.
2025-11-05 01:31:26 +01:00
Jacob Young 5b060ef9d4 Merge pull request #25558 from jacobly0/elfv2-load-obj
Elf2: start implementing input object loading
2025-10-30 12:09:13 -04:00
Matthew Lugg 74931fe25c std.debug.lockStderrWriter: also return ttyconf
`std.Io.tty.Config.detect` may be an expensive check (e.g. involving
syscalls), and doing it every time we need to print isn't really
necessary; under normal usage, we can compute the value once and cache
it for the whole program's execution. Since anyone outputting to stderr
may reasonably want this information (in fact they are very likely to),
it makes sense to cache it and return it from `lockStderrWriter`. Call
sites who do not need it will experience no significant overhead, and
can just ignore the TTY config with a `const w, _` destructure.
2025-10-30 09:31:28 +00:00
Jacob Young 6f0476e41d Elf2: start implementing input object loading 2025-10-29 18:05:49 -04:00
Andrew Kelley a072d821be Merge pull request #25592 from ziglang/init-std.Io
std: Introduce `Io` Interface
2025-10-29 13:51:37 -07:00
Alex Rønne Petersen a7119d4269 remove all IBM AIX and z/OS support
As with Solaris (dba1bf9353), we have no way to
actually audit contributions for these OSs. IBM also makes it even harder than
Oracle to actually obtain these OSs.

closes #23695
closes #23694
closes #3655
closes #23693
2025-10-29 14:25:51 +01:00
Andrew Kelley df4c30ca16 link: move the windows kernel bug workaround to Io implementation 2025-10-29 06:20:51 -07:00
Jacob Young 958faa7031 windows: workaround kernel race condition the most 2025-10-12 13:55:57 -04:00
Jacob Young 95242cc431 windows: workaround kernel race condition even more 2025-10-11 12:17:39 -04:00
Jacob Young 8efcfeaf1e windows: workaround kernel race condition better
Until I can do more testing, we bump the numbers until morale improves.
2025-10-11 10:01:17 -04:00
Jacob Young b2bc6073c8 windows: workaround kernel race condition
This was causing flaky CI failures.
2025-10-10 22:47:36 -07:00
Jacob Young 969f2cff82 Elf2: implement virtual allocation
This allows segments to be moved around in the output file without
needing to reapply relocations until virtual address space is exhaused.
2025-10-06 11:27:39 -07:00
Jacob Young 1fa11e0954 Coff: delete 2025-10-02 17:44:52 -04:00
Jacob Young e1f3fc6ce2 Coff2: create a new linker from scratch 2025-10-02 17:44:52 -04:00
Jacob Young f58200e3f2 Elf2: create a new linker from scratch
This iteration already has significantly better incremental support.

Closes #24110
2025-09-21 14:09:14 -07:00