97fe49a80f
Sorry for the mega-commit, this diff got a little out of control.

The main thing here is a complete rework of how Elf2 handles the symbol table. I messed around with the design for a while and landed on something which is fairly memory-efficient (in particular the overhead for STB_LOCAL symbols is as low as possible) and fulfils some of the more awkward constraints of the ELF format. The main such constraint is that all STB_LOCAL symbols in a symbol table are required to appear before any STB_GLOBAL/STB_WEAK symbols. This is further complicated by the fact that when producing a DSO, symbols with STV_HIDDEN or STV_INTERNAL visibility are required to have STB_LOCAL binding in the symbol table, even though they are global symbols from the perspective of the link editor. Plus, when combining multiple symbols with the same name, the resulting visibility is the strictest of all of the inputs, so it is possible at any point in compilation to discover an extern/export symbol which forces an existing STB_GLOBAL symbol to become STB_LOCAL and therefore requires it to move to an earlier symtab index.

Dealing with all of this was quite awkward. But I got there! I also implemented a lot of features in the process.
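
To make the ordering constraints above concrete, here is a rough sketch in C (not Zig, and not how Elf2 actually stores symbols). The `struct sym` record and the helpers `vis_rank`, `merge_vis`, `output_bind` and `order_symtab` are hypothetical names invented for illustration. Only the `STB_*`/`STV_*` values come from the ELF gABI. The sketch assumes the strictness order default < protected < hidden < internal.

```c
#include <stdbool.h>
#include <stddef.h>

/* Values from the ELF gABI, defined locally so the sketch needs no <elf.h>. */
enum { STB_LOCAL = 0, STB_GLOBAL = 1, STB_WEAK = 2 };
enum { STV_DEFAULT = 0, STV_INTERNAL = 1, STV_HIDDEN = 2, STV_PROTECTED = 3 };

/* Hypothetical symbol record; this is NOT Elf2's real representation. */
struct sym {
    const char *name;
    unsigned char bind; /* binding as seen by the link editor */
    unsigned char vis;  /* merged visibility */
};

/* Rank visibilities from least (0) to most (3) constraining. */
static int vis_rank(unsigned char vis) {
    switch (vis) {
    case STV_PROTECTED: return 1;
    case STV_HIDDEN:    return 2;
    case STV_INTERNAL:  return 3;
    default:            return 0; /* STV_DEFAULT */
    }
}

/* Combining two same-named symbols keeps the strictest visibility. */
static unsigned char merge_vis(unsigned char a, unsigned char b) {
    return vis_rank(a) >= vis_rank(b) ? a : b;
}

/* Binding written to the output symtab: in a DSO, hidden and internal
 * symbols are emitted as STB_LOCAL even though they are global to the linker. */
static unsigned char output_bind(const struct sym *s, bool is_dso) {
    if (is_dso && (s->vis == STV_HIDDEN || s->vis == STV_INTERNAL))
        return STB_LOCAL;
    return s->bind;
}

/* Stable partition so every STB_LOCAL entry precedes the global/weak ones.
 * Returns the number of local-bound entries. Quadratic, for clarity only. */
static size_t order_symtab(struct sym *syms, size_t n, bool is_dso) {
    size_t locals = 0;
    for (size_t i = 0; i < n; i++) {
        if (output_bind(&syms[i], is_dso) != STB_LOCAL)
            continue;
        struct sym moved = syms[i];
        for (size_t j = i; j > locals; j--)
            syms[j] = syms[j - 1]; /* shift the non-locals up by one slot */
        syms[locals++] = moved;
    }
    return locals;
}
```

For example, if a global `helper` is later merged with an `STV_HIDDEN` declaration via `merge_vis`, then `order_symtab(..., true)` moves it ahead of every remaining global. That is the "move to an earlier symtab index" case described above.
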
I don't remember everything perfectly, but here's a vague list:

* Multiple definitions of and/or unresolved references to symbols are now combined correctly in all cases
* `.bss` sections from inputs are correctly lowered (we don't actually emit a `.bss` section of our own yet, but I was able to put that data into the `.data` section so that the functionality is correct)
* Relocations in link inputs are now always processed (previously they would be silently ignored in most cases)
* Linker errors are triggered if a supported input section has a relocation which targets an unsupported input section (previously the unsupported section's symbol was dropped and associated relocations would be silently ignored)
* When linking a static executable, an error is emitted if a required symbol (i.e. an undefined reference with strong linkage) was never defined
* Duplicate symbol errors now work correctly
* When emitting a relocatable, the offsets of relocation entries are now correct (previously the offsets written were relative to a symbol rather than a section, meaning that e.g. almost all text relocations were just in a single function)

The changes in all of the other linkers and codegen backends are some added type-safety at the codegen-linker API boundary. There are now distinct `u32`-backed types for identifying an "atom" (the thing we're codegenning) and a "symbol" (the thing which a relocation targets). Linker implementations can use a couple of private helper functions to convert between this implementation-agnostic type and their specific type; for instance, `Elf2` can convert between a `Symbol.Id` and a `link.File.SymbolId` with `Symbol.Id.fromTypeErased` and `Symbol.Id.toTypeErased`. I didn't implement this nicely for any other linker, so right now there's a lot of `@enumFromInt`/`@intFromEnum` sprinkled all over the place, particularly with the legacy ELF and Mach-O linkers.
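
As a loose illustration of the type-safety idea in the previous paragraph, here is a C analogue (the real code is Zig, and this is not its API). Every name below is hypothetical: `atom_id`, `erased_symbol_id`, `elf_symbol_id`, `struct reloc`, `make_reloc` and the two conversion helpers. The sketch only shows how wrapping a 32-bit integer in distinct types stops an atom handle from being passed where a symbol handle is expected.

```c
#include <stdint.h>

/* Implementation-agnostic handles shared by codegen and every linker.
 * Both wrap a 32-bit integer, but the compiler treats them as unrelated types. */
typedef struct { uint32_t raw; } atom_id;
typedef struct { uint32_t raw; } erased_symbol_id;

/* Hypothetical linker-specific handle, private to one linker implementation. */
typedef struct { uint32_t index; } elf_symbol_id;

/* The only two places where the raw integer crosses the boundary. */
static elf_symbol_id elf_symbol_from_type_erased(erased_symbol_id id) {
    return (elf_symbol_id){ .index = id.raw };
}

static erased_symbol_id elf_symbol_to_type_erased(elf_symbol_id id) {
    return (erased_symbol_id){ .raw = id.index };
}

/* Hypothetical relocation record: the atom being patched and the symbol targeted. */
struct reloc {
    atom_id in_atom;
    erased_symbol_id target;
    uint32_t offset;
};

/* Swapping the first two arguments is a compile error, unlike with bare uint32_t. */
static struct reloc make_reloc(atom_id in_atom, erased_symbol_id target, uint32_t offset) {
    return (struct reloc){ .in_atom = in_atom, .target = target, .offset = offset };
}
```

Keeping the conversions in two small helpers means the raw integer casts live in one place per linker instead of being scattered across call sites.
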

I tested that I could still perform incremental updates to the Zig compiler using this commit. In terms of the new behaviors, the most interesting stuff is symbol and relocation resolution, so I ran a few tests involving building a "Hello World" binary in various different ways:

* `build-exe` correctly succeeds
* `build-exe -fno-compiler-rt` correctly reports undefined symbols
* `build-obj` linked with `build-exe` correctly succeeds
* `build-obj` linked with `build-exe -fno-compiler-rt` correctly reports undefined symbols
* `build-obj -fcompiler-rt` linked with `build-exe -fno-compiler-rt` correctly succeeds
* `build-obj -fcompiler-rt` linked with `build-exe` correctly succeeds (the compiler-rt symbols are weak so the global symbols are arbitrarily resolved to one of the two implementations)

I also manually verified with `readelf` that symbol tables were always ordered correctly (before this PR, `readelf -s` would usually emit warnings about incorrectly-ordered symtabs!), and verified that various visibility attributes worked as expected.

No actual test coverage is added due to the current lack of a useful linker test harness. Once a good test harness is available I will be willing to write some tests.