zig/lib
Koko Bhadra 4fa465fc8f compiler_rt: optimize udivmod large-divisor case with trial quotient
Replace the O(n) shift-subtract loop with a constant-time trial
quotient approach (Knuth Algorithm D, TAOCP Vol 2 Section 4.3.1).

The old code iterates clz(b_hi)-clz(a_hi)+1 times (up to 64
iterations of 128-bit arithmetic). The new code uses a single
divwide call to get a trial quotient, then verifies with two
native-width widening multiplies.
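
A minimal sketch of the two strategies described above, written in Rust
rather than taken from the commit. It assumes the large-divisor case
(divisor high word nonzero). The function names are hypothetical, the
trial-quotient variant follows the well-known normalize / estimate /
correct pattern and may differ from the actual change, the native
`u128 / u64` division stands in for a divwide primitive, and the final
check uses a plain 128-bit multiply instead of two native-width
widening multiplies.

```rust
/// Baseline: shift-subtract division. Loops clz(v) - clz(u) + 1 times.
/// Hypothetical name; covers only divisors with a nonzero high word.
fn udivmod_shift_subtract(mut u: u128, v: u128) -> (u128, u128) {
    assert!(v >> 64 != 0, "sketch covers only the large-divisor case");
    if u < v {
        return (0, u);
    }
    // u >= v here, so clz(u) <= clz(v) and the subtraction cannot underflow.
    let shift = v.leading_zeros() - u.leading_zeros();
    let mut d = v << shift; // align the divisor with the dividend
    let mut q: u128 = 0;
    for _ in 0..=shift {
        q <<= 1;
        if u >= d {
            u -= d;
            q |= 1;
        }
        d >>= 1;
    }
    (q, u)
}

/// Trial quotient: one narrowing division plus a single correction step.
/// Hypothetical name; covers only divisors with a nonzero high word.
fn udivmod_trial_quotient(u: u128, v: u128) -> (u128, u128) {
    assert!(v >> 64 != 0, "sketch covers only the large-divisor case");
    // Normalize so the top bit of the divisor's high word is set.
    let n = ((v >> 64) as u64).leading_zeros(); // 0..=63
    let v1 = ((v << n) >> 64) as u64; // top 64 bits of the normalized divisor
    // Halve the dividend so the 128-by-64 quotient fits in 64 bits.
    let u1 = u >> 1;
    let q1 = u1 / (v1 as u128); // stand-in for a divwide call
    // Undo the normalization and the halving of the dividend.
    let mut q0 = (q1 << n) >> 63;
    // The estimate may be one too large; step down so it is exact or one low.
    if q0 != 0 {
        q0 -= 1;
    }
    // Verify against the full divisor and correct at most once.
    let mut r = u - q0 * v;
    if r >= v {
        q0 += 1;
        r -= v;
    }
    (q0, r)
}

fn main() {
    let cases = [
        (u128::MAX, 1u128 << 64),
        (u128::MAX, u128::MAX),
        (1u128 << 65, (1u128 << 64) + 1),
        (5u128, 1u128 << 64),
    ];
    for (u, v) in cases {
        let slow = udivmod_shift_subtract(u, v);
        let fast = udivmod_trial_quotient(u, v);
        println!("{u:#x} / {v:#x}: agree = {}", slow == fast);
    }
}
```

Both functions return (quotient, remainder). Only the first one has a
loop whose trip count depends on the operands.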

Benchmark (Apple M1, ReleaseFast):
- Large divisor, large shift: 87ns -> 7.5ns (11.5x faster)
- Small divisor / uniform: unchanged
2026-03-05 20:22:19 +01:00