Commit Graph

719 Commits

Author SHA1 Message Date
Andrew Kelley accd5701c2 compiler: move struct types into InternPool proper
Structs were previously using `SegmentedList` to be given indexes, but
were not actually backed by the InternPool arrays.

After this, the only remaining uses of `SegmentedList` in the compiler
are `Module.Decl` and `Module.Namespace`. Once those last two are
migrated to become backed by InternPool arrays as well, we can introduce
state serialization via writing these arrays to disk all at once.

Unfortunately there are a lot of source code locations that touch the
struct type API, so this commit is still work-in-progress. Once I get it
compiling and passing the test suite, I can provide some interesting
data points such as how it affected the InternPool memory size and
performance comparison against master branch.

I also couldn't resist migrating over a bunch of alignment API over to
use the log2 Alignment type rather than a mismash of u32 and u64 byte
units with 0 meaning something implicitly different and special at every
location. Turns out you can do all the math you need directly on the
log2 representation of alignments.
2023-09-21 14:48:40 -07:00
Andrew Kelley 8592c5cdac compiler: rework capture scopes in-memory layout
* Use 32-bit integers instead of pointers for compactness and
  serialization friendliness.
* Use a separate hash map for runtime and comptime capture scopes,
  avoiding the 1-bit union tag.
* Use a compact array representation instead of a tree of hash maps.
* Eliminate the only instance of ref-counting in the compiler, instead
  relying on garbage collection (not implemented yet but is the plan for
  almost all long-lived objects related to incremental compilation).

Because a code modification may need to access capture scope data, this
makes capture scope data long-lived state. My goal is to get incremental
compilation state serialization down to a single pwritev syscall, by
unifying the on-disk representation with the in-memory representation.
This commit eliminates the last remaining pointer field of
`Module.Decl`.
2023-09-15 00:55:07 -07:00
Jacob Young e2ff486de5 Sema: cleanup coerceExtra
* remove unreachable code
 * remove already handled cases
 * avoid `InternPool.getCoerced`
 * add some undef checks
 * error when converting undef int to float

Closes #16987
2023-08-30 16:50:30 -04:00
Jacob Young 7a251c4cb8 Sema: revert reference trace changes that are no longer needed 2023-08-28 17:43:37 -04:00
Jacob Young acd35a1aa7 Sema: factor out NeededComptimeReason from comptime value resolution
This makes the call sites easier to read, reduces the number of `catch`
expressions required, and prepares for comptime reasons to appear
earlier in the list of notes.
2023-08-28 17:25:39 -04:00
Jacob Young c6024691cf Sema: implement reference trace with called from here notes
Closes #15124
2023-08-28 12:28:25 -07:00
Andrew Kelley ada0010471 compiler: move unions into InternPool
There are a couple concepts here worth understanding:

Key.UnionType - This type is available *before* resolving the union's
fields. The enum tag type, number of fields, and field names, field
types, and field alignments are not available with this.

InternPool.UnionType - This one can be obtained from the above type with
`InternPool.loadUnionType` which asserts that the union's enum tag type
has been resolved. This one has all the information available.

Additionally:

* ZIR: Turn an unused bit into `any_aligned_fields` flag to help
  semantic analysis know whether a union has explicit alignment on any
  fields (usually not).
* Sema: delete `resolveTypeRequiresComptime` which had the same type
  signature and near-duplicate logic to `typeRequiresComptime`.
  - Make opaque types not report comptime-only (this was inconsistent
    between the two implementations of this function).
* Implement accepted proposal #12556 which is a breaking change.
2023-08-22 13:54:14 -07:00
Andrew Kelley 7ef1eb1c27 InternPool: safer enum API
The key changes in this commit are:

```diff
-        names: []const NullTerminatedString,
+        names: NullTerminatedString.Slice,
-        values: []const Index,
+        values: Index.Slice,
```

Which eliminates the slices from `InternPool.Key.EnumType` and replaces
them with structs that contain `start` and `len` indexes. This makes the
lifetime of `EnumType` change from expiring with updates to InternPool,
to expiring when the InternPool is garbage-collected, which is currently
never.

This is gearing up for a larger change I started working on locally
which moves union types into InternPool.

As a bonus, I fixed some unnecessary instances of `@as`.
2023-08-17 18:16:03 -07:00
mlugg 083ee8e0e2 InternPool: preserve indices of builtin types when resolved
Some builtin types have a special InternPool index (e.g.
`.type_info_type`) so that AstGen can refer to them before semantic
analysis. Unfortunately, this previously led to a second index existing
to refer to the type once it was resolved, complicating Sema by having
the concept of an "unresolved" type index.

This change makes Sema modify these InternPool indices in-place to
contain the expanded representation when resolved. The analysis of the
corresponding decls is caught in `Module.semaDecl`, and a field is set
on Sema telling it which index to place struct/union/enum types at. This
system could break if `std.builtin` contained complex decls which
evaluate multiple struct types, but this will be caught by the
assertions in `InternPool.resolveBuiltinType`.

The AstGen result types which were disabled in 6917a8c have been
re-enabled.

Resolves: #16603
2023-08-15 11:45:23 +01:00
mlugg 5e0107fbce Sema: remove redundant addConstant functions
After ff37ccd, interned values are trivial to convert to Air refs, using
`Air.internedToRef`. This made functions like `Sema.addConstant` effectively
redundant. This commit removes `Sema.addConstant` and `Sema.addType`, replacing
them with direct usages of `Air.internedToRef`.

Additionally, a new helper `Module.undefValue` is added, and the following
functions are moved into Module:
* `Sema.addConstUndef` -> `Module.undefRef`
* `Sema.addUnsignedInt` -> `Module.intRef` (now also works for signed types)

The general pattern here is that any `Module.xyzValue` helper may also have a
corresponding `Module.xyzRef` helper, which just wraps the call in
`Air.internedToRef`.
2023-08-11 11:02:24 -07:00
mlugg f2c8fa769a Sema: refactor generic calls to interleave argument analysis and parameter type resolution
AstGen provides all function call arguments with a result location,
referenced through the call instruction index. The idea is that this
should be the parameter type, but for `anytype` parameters, we use
generic poison, which is required to be handled correctly.

Previously, generic instantiations and inline calls worked by evaluating
all args in advance, before resolving generic parameter types. This
means any generic parameter (not just `anytype` ones) had generic poison
result types. This caused missing result locations in some cases.

Additionally, the generic instantiation logic caused `zirParam` to
analyze the argument types a second time before coercion. This meant
that for nominal types (struct/enum/etc), a *new* type was created,
distinct to the result type which was previously forwarded to the
argument expression.

This commit fixes both of these issues. Generic parameter type
resolution is now interleaved with argument analysis, so that we don't
have unnecessary generic poison types, and generic instantiation logic
now handles parameters itself rather than falling through to the
standard zirParam logic, so avoids duplicating the types.

Resolves: #16566
Resolves: #16258
Resolves: #16753
2023-08-10 10:00:26 +01:00
Zachary Raineri 0461a64a93 change uses of std.builtin.Mode to OptimizeMode (#16745)
std.builtin.Mode is deprecated.
2023-08-09 14:39:34 -04:00
Jacob Young 57470e833e Module: implement span for .call_arg of a @call
Closes #16750
2023-08-09 10:09:17 -04:00
Jacob Young 2ba787e303 Sema: restrict what can appear in a naked function
* Disable runtime calls, since it is not possible to know the proper
   stack adjustment to follow the callee abi.
 * Disable runtime returns, since it is not possible to know where the
   return address is stored in general.
 * Allow implicit returns regardless of the return type, which allows
   naked functions with a non-void return type to be written.
2023-07-31 01:58:10 -04:00
Jacob Young ff8a49448c llvm: finish converting lowerValue 2023-07-19 23:38:40 -04:00
Andrew Kelley 8daa8d255b Sema: fix fn_proto_param LazySrcLoc resolution
to match source code span from merge-base.
2023-07-18 19:02:06 -07:00
Andrew Kelley 727b371bbc Sema: fix source location crash for function prototypes 2023-07-18 19:02:06 -07:00
Andrew Kelley 0153f3a8f9 Sema: fix crash: array_in_c_exported_function
Fuck it, we're storing decl indexes in LazySrcLoc now.
2023-07-18 19:02:06 -07:00
Andrew Kelley 3f2a4720b1 compiler: fix branch regressions
* getOwnedFunctionIndex no longer checks if the value is actually a
   function.
 * The callsites to `intern` that I added want to avoid the `getCoerced`
   call, so I added `intern2`.
 * Adding to inferred error sets should not happen if the destination
   error set is not the inferred error set of the current Sema instance.
 * adhoc_inferred_error_set_type can be seen by the backend. Treat it
   like anyerror.
2023-07-18 19:02:06 -07:00
Andrew Kelley d15e8f8017 Sema: resolve inferred error set with function state in_progress
This way dependency loops are reported instead of the compiler crashing.
2023-07-18 19:02:06 -07:00
Andrew Kelley c193872c81 InternPool: implement indexToKey for func_instance and func_decl
Also delete incorrect frees an arena-allocated parameters.
2023-07-18 19:02:05 -07:00
Andrew Kelley f3dc53f6b5 compiler: rework inferred error sets
* move inferred error sets into InternPool.
   - they are now represented by pointing directly at the corresponding
     function body value.
 * inferred error set working memory is now in Sema and expires after
   the Sema for the function corresponding to the inferred error set is
   finished having its body analyzed.
 * error sets use a InternPool.Index.Slice rather than an actual slice
   to avoid lifetime issues.
2023-07-18 19:02:05 -07:00
Andrew Kelley 55e89255e1 compiler: begin untangling anonymous decls from source decls
The idea here is to move towards a future where anonymous decls are
represented entirely by an `InternPool.Index`. This was needed to start
implementing `InternPool.getFuncDecl` which requires moving creation and
deletion of Decl objects into InternPool.

 * remove `Namespace.anon_decls`
 * remove the concept of cleaning up resources from anonymous decls,
   relying on InternPool instead.
 * move namespace and decl object allocation into InternPool
2023-07-18 19:02:05 -07:00
Andrew Kelley db33ee45b7 rework generic function calls
Abridged summary:

 * Move `Module.Fn` into `InternPool`.
 * Delete a lot of confusing and problematic `Sema` logic related to
   generic function calls.

This commit removes `Module.Fn` and replaces it with two new
`InternPool.Tag` values:

 * `func_decl` - corresponding to a function declared in the source
   code. This one contains line/column numbers, zir_body_inst, etc.

 * `func_instance` - one for each monomorphization of a generic
   function. Contains a reference to the `func_decl` from whence the
   instantiation came, along with the `comptime` parameter values (or
   types in the case of `anytype`)

Since `InternPool` provides deduplication on these values, these fields
are now deleted from `Module`:

 * `monomorphed_func_keys`
 * `monomorphed_funcs`
 * `align_stack_fns`

Instead of these, Sema logic for generic function instantiation now
unconditionally evaluates the function prototype expression for every
generic callsite. This is technically required in order for type
coercions to work. The previous code had some dubious, probably wrong
hacks to make things work, such as `hashUncoerced`. I'm not 100% sure
how we were able to eliminate that function and still pass all the
behavior tests, but I'm pretty sure things were still broken without
doing type coercion for every generic function call argument.

After the function prototype is evaluated, it produces a deduplicated
`func_instance` `InternPool.Index` which can then be used for the
generic function call.

Some other nice things made by this simplification are the removal of
`comptime_args_fn_inst` and `preallocated_new_func` from `Sema`, and the
messy logic associated with them.

I have not yet been able to measure the perf of this against master
branch. On one hand, it reduces memory usage and pointer chasing of the
most heavily used `InternPool` Tag - function bodies - but on the other
hand, it does evaluate function prototype expressions more than before.
We will soon find out.
2023-07-18 19:02:05 -07:00
Jacob Young 70c71935c7 cbe: fix pointers to aliases of extern values 2023-07-18 17:58:39 -07:00
r00ster91 2b8687ba2d std.math.big.int: better name for equal function
All of the std except these few functions call it "eql" instead of "eq".
This has previously tripped me up when I expected the equality check function to be called "eql"
(just like all the rest of the std) instead of "eq".

The motivation is consistency.

If search "eq" on Autodoc, these functions stick out and it looks inconsistent.

I just noticed there are also a few functions spelling it out as "equal" (such as std.mem.allEqual).
Maybe those functions should also spell it "eql" but that can be done in a future PR.
2023-07-03 11:00:13 -07:00
Jacob Young 2282c27885 Remerge pull request #15995 from mlugg/fix/union-field-ptr-align
Sema: copy pointer alignment to union field pointers

This is an unrevert of 43c98dc115.
2023-06-30 23:23:26 -04:00
mlugg 730f2e0407 Sema: copy pointer alignment to struct field pointers 2023-06-30 23:23:03 -04:00
Jacob Young 43c98dc115 Revert "Merge pull request #15995 from mlugg/fix/union-field-ptr-align"
This reverts commit 40cf3f7ae5, reversing
changes made to d98147414d.
2023-06-29 00:23:19 -04:00
Andrew Kelley 40cf3f7ae5 Merge pull request #15995 from mlugg/fix/union-field-ptr-align
Sema: copy pointer alignment to union field pointers
2023-06-25 23:06:53 -07:00
mlugg 80e493cb36 Sema: copy pointer alignment to struct field pointers 2023-06-25 14:13:52 +01:00
mlugg 569ae762e1 compiler: allow cast builtins to coerce result to error union or optional
Also improves some error messages
2023-06-25 13:28:32 +01:00
Andrew Kelley 9684947faa compiler: start moving safety-checks into backends
This actually used to be how it worked in stage1, and there was this
issue to change it: #2649

So this commit is a reversal to that idea. One motivation for that issue
was avoiding emitting the panic handler in compilations that do not have
any calls to panic. This commit only resolves the panic handler in the
event of a safety check function being emitted, so it does not have that
flaw.

The other reason given in that issue was for optimizations that elide
safety checks. It's yet to be determined whether that was a good idea or
not; this can get re-explored when we start adding optimization passes
to AIR.

This commit adds these AIR instructions, which are only emitted if
`backendSupportsFeature(.safety_checked_arithmetic)` is true:
 * add_safe
 * sub_safe
 * mul_safe

It removes these nonsensical AIR instructions:
 * addwrap_optimized
 * subwrap_optimized
 * mulwrap_optimized

The safety-checked arithmetic functions push the burden of invoking the
panic handler into the backend. This makes for a messier compiler
implementation, but it reduces the amount of AIR instructions emitted by
Sema, which reduces time spent in the secondary bottleneck of the
compiler. It also generates more compact LLVM IR, reducing time spent in
the primary bottleneck of the compiler.

Finally, it eliminates 1 stack allocation per safety-check which was
being used to store the resulting tuple. These allocations were going to
be annoying when combined with suspension points.
2023-06-25 01:41:08 -07:00
mlugg f26dda2117 all: migrate code to new cast builtin syntax
Most of this migration was performed automatically with `zig fmt`. There
were a few exceptions which I had to manually fix:

* `@alignCast` and `@addrSpaceCast` cannot be automatically rewritten
* `@truncate`'s fixup is incorrect for vectors
* Test cases are not formatted, and their error locations change
2023-06-24 16:56:39 -07:00
Andrew Kelley 5288929ffd sema.addConstant: remove type parameter
Now that InternPool migration is finished, all values have types. So
only the value parameter is required.
2023-06-23 21:59:42 -07:00
Jacob Young 96cdd51c14 Type: delete legacy allocation functions 2023-06-20 14:02:09 -04:00
Andrew Kelley a72d634b73 Merge pull request #16046 from BratishkaErik/issue-6128
Renaming `@xtoy` to `@YfromX`
2023-06-19 22:36:24 -07:00
Jacob Young 8fcc28d302 Module: add support for multiple global asm blocks per decl
Closes #16076
2023-06-19 13:12:04 -07:00
Eric Joldasov 50339f595a all: zig fmt and rename "@XToY" to "@YFromX"
Signed-off-by: Eric Joldasov <bratishkaerik@getgoogleoff.me>
2023-06-19 12:34:42 -07:00
Motiejus Jakštys d41111d7ef mem: rename align*Generic to mem.align*
Anecdote 1: The generic version is way more popular than the non-generic
one in Zig codebase:

     git grep -w alignForward | wc -l
    56
     git grep -w alignForwardGeneric | wc -l
    149

     git grep -w alignBackward | wc -l
    6
     git grep -w alignBackwardGeneric | wc -l
    15

Anecdote 2: In my project (turbonss) that does much arithmetic and
alignment I exclusively use the Generic functions.

Anecdote 3: we used only the Generic versions in the Macho Man's linker
workshop.
2023-06-17 12:49:13 -07:00
mlugg 11d0dfb882 Sema: fix @intToPtr of zero value to optional pointer
Calling into coercion logic here is a little opaque, and more to the
point wholly unnecessary. Instead, the (very short) logic is now
implemented directly in Sema.

Resolves: #16033
2023-06-15 02:43:06 -07:00
mlugg c9531eb833 Sema: rewrite peer type resolution
The existing logic for peer type resolution was quite convoluted and
buggy. This rewrite makes it much more resilient, readable, and
extensible. The algorithm works by first iterating over the types to
select a "strategy", then applying that strategy, possibly applying peer
resolution recursively.

Several new tests have been added to cover cases which the old logic did
not correctly handle.

Resolves: #15138
Resolves: #15644
Resolves: #15693
Resolves: #15709
Resolves: #15752
2023-06-13 21:48:18 +01:00
mlugg 42dc7539c5 Fix bad source locations in switch capture errors
To do this, I expanded SwitchProngSrc a bit. Several of the tags there
aren't actually used by any current errors, but they're there for
consistency and if we ever need them.

Also delete a now-redundant test and fix another.
2023-06-13 12:55:27 +01:00
mlugg 00609e7edb Eliminate switch_capture and switch_capture_ref ZIR tags
These tags are unnecessary, as this information can be more efficiently
encoded within the switch_block instruction itself. We also use a neat
little trick to avoid needing a dummy instruction (like is used for
errdefer captures): since the switch_block itself cannot otherwise be
referenced within a prong, we can repurpose its index within prongs to
refer to the captured value.
2023-06-13 12:53:18 +01:00
Jacob Young d37ebfcf23 InternPool: avoid as many slices pointing to string_bytes as possible
These are frequently invalidated whenever a string is interned, so avoid
creating pointers to `string_bytes` wherever possible.  This is an
attempt to fix random CI failures.
2023-06-11 23:45:09 -07:00
mlugg 2a6b91874a stage2: pass most test cases under InternPool
All but 2 test cases now pass (tested on x86_64 Linux, native only). The
remaining two signify an issue requiring a larger refactor, which I will
do in a separate commit.

Notable changes:
* Fix uninitialized memory when allocating objects from free lists
* Implement TypedValue printing for pointers
* Fix some TypedValue printing logic
* Work around non-existence of InternPool.remove implementation
2023-06-10 20:51:10 -07:00
Jacob Young e23b0a01e6 InternPool: fix yet more key lifetime issues 2023-06-10 20:47:59 -07:00
mlugg 0fd52cdc5e InternPool: avoid aggregate null bytes storage
This is a workaround for InternPool currently not handling
non-null-terminated strings. It avoids using the `bytes` storage for
aggregates if there are any null bytes.

In the future this should be changed so that the `bytes` storage can be
used regardless of whether there are any null bytes. This is important
for use cases such as `@embedFile`.

However, this fixes a bug for now, and after this commit, stage2
self-hosts again.

mlugg: stage5 passes all enabled behavior tests on my system.

Commit message edited by Andrew Kelley <andrew@ziglang.org>
2023-06-10 20:47:59 -07:00
Jacob Young da24ea7f36 Sema: rewrite monomorphed_funcs usage
In an effort to delete `Value.hashUncoerced`, generic instantiation has
been redesigned.  Instead of just storing instantiations in
`monomorphed_funcs`, partially instantiated generic argument types are
also cached.  This isn't quite the single `getOrPut` that it used to be,
but one `get` per generic argument plus one get for the instantiation,
with an equal number of `put`s per unique instantiation isn't bad.
2023-06-10 20:47:59 -07:00
Andrew Kelley 35550c840b Module: fix populateTestFunctions UAF 2023-06-10 20:47:59 -07:00