We can directly access this path string from the PEB (albeit with some
weirdness around addressing) and that ends up making the downstream code
simpler and more efficient. (Almost like the kernel32 API isn't very
good!)
The change in codegen/x86_64/CodeGen.zig was not strictly necessary (the
Sema change I did solves the error I was getting there), I just think
it's better style anyway.
The dependency on advapi32.dll actually silently brings along 3 other dlls at runtime (msvcrt.dll, sechost.dll, bcrypt.dll), even if no advapi32 APIs are called. So, this commit actually reduces the number of dlls loaded at runtime by 4 (but only when LLVM is not linked, since LLVM has its own dependency on advapi32.dll).
The data is not super conclusive, but the ntdll version of WindowsSdk appears to run slightly faster than the previous advapi32 version:
Benchmark 1: libc-ntdll.exe ..
Time (mean ± σ): 6.0 ms ± 0.6 ms [User: 3.9 ms, System: 7.1 ms]
Range (min … max): 4.8 ms … 7.9 ms 112 runs
Benchmark 2: libc-advapi32.exe ..
Time (mean ± σ): 7.2 ms ± 0.5 ms [User: 5.4 ms, System: 9.2 ms]
Range (min … max): 6.1 ms … 8.9 ms 103 runs
Summary
'libc-ntdll.exe ..' ran
1.21 ± 0.15 times faster than 'libc-advapi32.exe ..'
and this mostly seems to be due to changes in the implementation (the advapi32 APIs do a lot of NtQueryKey calls that the new implementation doesn't do) rather than due to the decrease in dll loading. LLVM-less zig binaries don't show the same reduction (the only difference here is the DLLs being loaded):
Benchmark 1: stage4-ntdll\bin\zig.exe version
Time (mean ± σ): 3.0 ms ± 0.6 ms [User: 5.3 ms, System: 4.8 ms]
Range (min … max): 1.3 ms … 4.2 ms 112 runs
Benchmark 2: stage4-advapi32\bin\zig.exe version
Time (mean ± σ): 3.5 ms ± 0.6 ms [User: 6.9 ms, System: 5.5 ms]
Range (min … max): 2.5 ms … 5.9 ms 111 runs
Summary
'stage4-ntdll\bin\zig.exe version' ran
1.16 ± 0.28 times faster than 'stage4-advapi32\bin\zig.exe version'
---
With the removal of the advapi32 dependency, the non-ntdll dependencies that remain in an LLVM-less Zig binary are ws2_32.dll (which brings along rpcrt4.dll at runtime), kernel32.dll (which brings along kernelbase.dll at runtime), and crypt32.dll (which brings along ucrtbase.dll at runtime).
- remove error.SharingViolation from all error sets since it has the
same meaning as FileBusy
- add error.FileBusy to CreateFileAtomicError and ReadLinkError
- update dirReadLinkWindows to use NtCreateFile and NtFsControlFile and
integrate with cancelation properly.
- move windows CTL_CODE constants to the proper namespace
- delete os.windows.ReadLink
When targeting x86-windows, this parameter referring to read-only memory can result in an ACCESS_VIOLATION error, and this has been seen when using FILE_DISPOSITION_INFORMATION_EX. It's unclear how exactly this ACCESS_VIOLATION is occurring, though, as the memory does not actually change before/after the call.
Closes https://codeberg.org/ziglang/zig/issues/30802
The most interesting thing here is the replacement of the pthread futex
implementation with an implementation based on thread park/unpark APIs.
Thread parking tends to be the primitive provided by systems which do
not have a futex primitive, such as NetBSD, so this implementation is
far more efficient than the pthread one. It is also useful on Windows,
where `RtlWaitOnAddress` is itself a userland implementation based on
thread park/unpark; we can implement it ourselves including support for
features which Windows' implementation lacks, such as cancelation and
waking a number of waiters with 1<n<infinity.
Compared to the pthread implementation, this thread-parking-based one
also supports full robust cancelation. Thread parking also turns out to
be useful for implementing `sleep`, so is now used for that on Windows
and NetBSD.
This commit also introduces proper cancelation support for most Windows
operations. The most notable omission right now is DNS lookups through
`GetAddrInfoEx`, just because they're a little more work due to having
a unique cancelation mechanism---but the machinery is all there, so I'll
finish gluing it together soon.
As of this commit, there are very few parts of `Io.Threaded` which do
not support full robust cancelation. The only ones which actually really
matter (because they could block for a prolonged period of time) are DNS
lookups on Windows (as discussed above) and futex waits on WASM.
There's a good argument to not have this in the std lib but it's more
work to remove it than to leave it in, and this branch is already
20,000+ lines changed.
This is one way of addressing/closing https://github.com/ziglang/zig/issues/16738
Previously, there was a mismatch between the default behaviors on Windows vs other platforms, where Windows was implicitly using .NON_DIRECTORY_FILE for its `openFile` implementation which caused `error.IsDir` when opening a directory, while on other platforms there is no equivalent flag for the `open` syscall. This meant that `openFile` on a path of a directory would fail on Windows but succeed on other platforms.
Adding `allow_directory` to `File.OpenFlags` serves two purposes:
1. It provides a cross-platform way to get the `.NON_DIRECTORY_FILE` behavior in the most efficient available way for the platform (on Windows, no extra syscalls are required, on other systems, an extra `fstat` is required)
2. It allows `statFile` to be implemented on top of `openFile` on Windows while still allowing `statFile` to work on directory paths. Before this commit, `statFile` on a directory path on Windows failed with `error.IsDir`
Note: The second purpose could have been addressed in different ways (bespoke call to NtCreateFile in the `statFile` implementation to avoid passing `NON_DIRECTORY_FILE`, or just never pass `NON_DIRECTORY_FILE` in the `openFile` implementation), so the first purpose is the more relevant/motivating force behind this change.
The default being `true` is intended to cut down on the number of syscalls as much as possible when using the default flags.
The number of possible errors in the theoretical error set of this function is very large, while each individual call to the function is only concerned with a (potentially small) subset of those errors that are specific to the control code being used. This commit makes the callers determine which statuses they are interested in to avoid an ever-ballooning error set and ever-growing switch cases at each call site that throw away most of those errors.