<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"><title>Eli Bendersky's website - Compilation</title><link href="https://eli.thegreenplace.net/" rel="alternate"></link><link href="https://eli.thegreenplace.net/feeds/compilation.atom.xml" rel="self"></link><id>https://eli.thegreenplace.net/</id><updated>2026-04-10T02:28:00-07:00</updated><entry><title>watgo - a WebAssembly Toolkit for Go</title><link href="https://eli.thegreenplace.net/2026/watgo-a-webassembly-toolkit-for-go/" rel="alternate"></link><published>2026-04-09T19:28:00-07:00</published><updated>2026-04-10T02:28:00-07:00</updated><author><name>Eli Bendersky</name></author><id>tag:eli.thegreenplace.net,2026-04-09:/2026/watgo-a-webassembly-toolkit-for-go/</id><summary type="html">&lt;p&gt;I'm happy to announce the general availability of &lt;a class="reference external" href="https://github.com/eliben/watgo"&gt;watgo&lt;/a&gt;
- the &lt;strong&gt;W&lt;/strong&gt;eb&lt;strong&gt;A&lt;/strong&gt;ssembly &lt;strong&gt;T&lt;/strong&gt;oolkit for &lt;strong&gt;G&lt;/strong&gt;o. This project is similar to
&lt;a class="reference external" href="https://github.com/webassembly/wabt"&gt;wabt&lt;/a&gt; (C++) or
&lt;a class="reference external" href="https://github.com/bytecodealliance/wasm-tools"&gt;wasm-tools&lt;/a&gt; (Rust), but in
pure, zero-dependency Go.&lt;/p&gt;
&lt;p&gt;watgo comes with a CLI and a Go API to parse WAT (WebAssembly Text), validate
it …&lt;/p&gt;</summary><content type="html">&lt;p&gt;I'm happy to announce the general availability of &lt;a class="reference external" href="https://github.com/eliben/watgo"&gt;watgo&lt;/a&gt;
- the &lt;strong&gt;W&lt;/strong&gt;eb&lt;strong&gt;A&lt;/strong&gt;ssembly &lt;strong&gt;T&lt;/strong&gt;oolkit for &lt;strong&gt;G&lt;/strong&gt;o. This project is similar to
&lt;a class="reference external" href="https://github.com/webassembly/wabt"&gt;wabt&lt;/a&gt; (C++) or
&lt;a class="reference external" href="https://github.com/bytecodealliance/wasm-tools"&gt;wasm-tools&lt;/a&gt; (Rust), but in
pure, zero-dependency Go.&lt;/p&gt;
&lt;p&gt;watgo comes with a CLI and a Go API to parse WAT (WebAssembly Text), validate
it, and encode it into WASM binaries; it also supports decoding WASM from its
binary format.&lt;/p&gt;
&lt;p&gt;At the center of it all is &lt;a class="reference external" href="https://pkg.go.dev/github.com/eliben/watgo/wasmir"&gt;wasmir&lt;/a&gt; - a semantic
representation of a WebAssembly module that users can examine (and manipulate).
This diagram shows the functionalities provided by watgo:&lt;/p&gt;
&lt;img alt="Block diagram showing the different parts of watgo; described in the next paragraph" class="align-center" src="https://eli.thegreenplace.net/images/2026/watgo-diagram.png" /&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;Parse: a parser from WAT to &lt;tt class="docutils literal"&gt;wasmir&lt;/tt&gt;&lt;/li&gt;
&lt;li&gt;Validate: uses the official WebAssembly validation semantics to check that the
module is well formed and safe&lt;/li&gt;
&lt;li&gt;Encode: emits &lt;tt class="docutils literal"&gt;wasmir&lt;/tt&gt; into WASM binary representation&lt;/li&gt;
&lt;li&gt;Decode: reads WASM binary representation into &lt;tt class="docutils literal"&gt;wasmir&lt;/tt&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="section" id="cli-use-case"&gt;
&lt;h2&gt;CLI use case&lt;/h2&gt;
&lt;p&gt;watgo comes with a CLI, which you can install by issuing this command:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;go install github.com/eliben/watgo/cmd/watgo@latest
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The CLI aims to be compatible with wasm-tools &lt;a class="footnote-reference" href="#footnote-1" id="footnote-reference-1"&gt;[1]&lt;/a&gt;, and I've already switched my
&lt;a class="reference external" href="https://github.com/eliben/wasm-wat-samples"&gt;wasm-wat-samples&lt;/a&gt; projects to
use it; e.g. a command to parse a WAT file, validate it and encode it into
binary format:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;watgo parse stack.wat -o stack.wasm
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="api-use-case"&gt;
&lt;h2&gt;API use case&lt;/h2&gt;
&lt;p&gt;&lt;tt class="docutils literal"&gt;wasmir&lt;/tt&gt; semantically represents a WASM module with an API that's easy to work
with. Here's an example of using watgo to parse a simple WAT
program and do some analysis:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;package&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;fmt&amp;quot;&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;

&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;github.com/eliben/watgo&amp;quot;&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="s"&gt;&amp;quot;github.com/eliben/watgo/wasmir&amp;quot;&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;wasmText&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;`&lt;/span&gt;
&lt;span class="s"&gt;(module&lt;/span&gt;
&lt;span class="s"&gt;  (func (export &amp;quot;add&amp;quot;) (param i32 i32) (result i32)&lt;/span&gt;
&lt;span class="s"&gt;    local.get 0&lt;/span&gt;
&lt;span class="s"&gt;    local.get 1&lt;/span&gt;
&lt;span class="s"&gt;    i32.add&lt;/span&gt;
&lt;span class="s"&gt;  )&lt;/span&gt;
&lt;span class="s"&gt;  (func (param f32 i32) (result i32)&lt;/span&gt;
&lt;span class="s"&gt;    local.get 1&lt;/span&gt;
&lt;span class="s"&gt;    i32.const 1&lt;/span&gt;
&lt;span class="s"&gt;    i32.add&lt;/span&gt;
&lt;span class="s"&gt;    drop&lt;/span&gt;
&lt;span class="s"&gt;    i32.const 0&lt;/span&gt;
&lt;span class="s"&gt;  )&lt;/span&gt;
&lt;span class="s"&gt;)`&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;

&lt;span class="kd"&gt;func&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;:=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;watgo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ParseWAT&lt;/span&gt;&lt;span class="p"&gt;([]&lt;/span&gt;&lt;span class="nb"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;wasmText&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;!=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;nil&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nb"&gt;panic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;

&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nx"&gt;i32Params&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;:=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nx"&gt;localGets&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;:=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nx"&gt;i32Adds&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;:=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;

&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;// Module-defined functions carry a type index into m.Types. The function&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;// body itself is a flat sequence of wasmir.Instruction values.&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;fn&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;:=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;range&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Funcs&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nx"&gt;sig&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;:=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Types&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;fn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;TypeIdx&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;param&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;:=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;range&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;sig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Params&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;param&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Kind&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;==&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;wasmir&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ValueKindI32&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nx"&gt;i32Params&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;

&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;instr&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;:=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;range&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;fn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Body&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="k"&gt;switch&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;instr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Kind&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="k"&gt;case&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;wasmir&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;InstrLocalGet&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nx"&gt;localGets&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="k"&gt;case&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;wasmir&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;InstrI32Add&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nx"&gt;i32Adds&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;

&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nx"&gt;fmt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;module-defined funcs: %d\n&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Funcs&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nx"&gt;fmt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;i32 params: %d\n&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;i32Params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nx"&gt;fmt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;local.get instructions: %d\n&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;localGets&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nx"&gt;fmt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;i32.add instructions: %d\n&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;i32Adds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;One important note: the WAT format supports several syntactic niceties that
are flattened / canonicalized when lowered to &lt;tt class="docutils literal"&gt;wasmir&lt;/tt&gt;. For example, all folded
instructions are lowered to unfolded ones (linear form), function &amp;amp; type
names are resolved to numeric indices, etc. This matches the validation and
execution semantics of WASM and its binary representation.&lt;/p&gt;
&lt;p&gt;These syntactic details are present in watgo in the &lt;tt class="docutils literal"&gt;textformat&lt;/tt&gt; package
(which parses WAT into an AST) and are removed when this is lowered to &lt;tt class="docutils literal"&gt;wasmir&lt;/tt&gt;.
The &lt;tt class="docutils literal"&gt;textformat&lt;/tt&gt; package is kept internal at this time, but in the future I
may consider exposing it publicly - if there's interest.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="testing-strategy"&gt;
&lt;h2&gt;Testing strategy&lt;/h2&gt;
&lt;p&gt;Even though it's still early days for watgo, I'm reasonably confident in its
correctness due to a strategy of very heavy testing right from the start.&lt;/p&gt;
&lt;p&gt;WebAssembly comes with a &lt;a class="reference external" href="https://github.com/WebAssembly/spec/"&gt;large official test suite&lt;/a&gt;,
which is perfect for end-to-end testing of new implementations.
The core test suite includes almost 200K lines of WAT files that carry several
modules with expected execution semantics and a variety of error scenarios
exercised. These live in specially designed &lt;a class="reference external" href="https://github.com/WebAssembly/spec/tree/main/interpreter#scripts"&gt;.wast files&lt;/a&gt; and
leverage a custom spec interpreter.&lt;/p&gt;
&lt;p&gt;watgo hijacks this approach by using the official test suite for its own
testing. A custom harness parses .wast files and uses watgo to convert the WAT
in them to binary WASM, which is then executed by Node.js &lt;a class="footnote-reference" href="#footnote-2" id="footnote-reference-2"&gt;[2]&lt;/a&gt;; this harness is
a significant effort in itself, but it's very much worth it - the result is
excellent testing coverage. watgo passes the entire WASM spec core test suite.&lt;/p&gt;
&lt;p&gt;Similarly, we leverage &lt;a class="reference external" href="https://github.com/WebAssembly/wabt/tree/main/test/interp"&gt;wabt's interp test suite&lt;/a&gt;, which also
includes end-to-end tests, using a simpler Node-based harness to test them
against watgo.&lt;/p&gt;
&lt;p&gt;Finally, I maintain a collection of realistic program samples written in
WAT in the &lt;a class="reference external" href="https://github.com/eliben/wasm-wat-samples"&gt;wasm-wat-samples repository&lt;/a&gt;;
these are also used by watgo to test itself.&lt;/p&gt;
&lt;hr class="docutils" /&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-1" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-1"&gt;[1]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;Though not all of wasm-tools's functionality is supported yet.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-2" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-2"&gt;[2]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;To stick to a pure-Go approach also for testing, I originally tried
using wazero for this, but had to give up because wazero doesn't support
some of the recent WASM proposals that have already made it into the
standard (most notably Garbage Collection).&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
</content><category term="misc"></category><category term="Go"></category><category term="WebAssembly"></category><category term="Compilation"></category></entry><entry><title>Rewriting pycparser with the help of an LLM</title><link href="https://eli.thegreenplace.net/2026/rewriting-pycparser-with-the-help-of-an-llm/" rel="alternate"></link><published>2026-02-04T19:35:00-08:00</published><updated>2026-02-05T03:38:39-08:00</updated><author><name>Eli Bendersky</name></author><id>tag:eli.thegreenplace.net,2026-02-04:/2026/rewriting-pycparser-with-the-help-of-an-llm/</id><summary type="html">&lt;p&gt;&lt;a class="reference external" href="https://github.com/eliben/pycparser"&gt;pycparser&lt;/a&gt; is my most widely used open
source project (with ~20M daily downloads from PyPI &lt;a class="footnote-reference" href="#footnote-1" id="footnote-reference-1"&gt;[1]&lt;/a&gt;). It's a pure-Python
parser for the C programming language, producing ASTs inspired by &lt;a class="reference external" href="https://docs.python.org/3/library/ast.html"&gt;Python's
own&lt;/a&gt;. Until very recently, it's
been using &lt;a class="reference external" href="https://www.dabeaz.com/ply/ply.html"&gt;PLY: Python Lex-Yacc&lt;/a&gt; for
the core parsing.&lt;/p&gt;
&lt;p&gt;In this post, I'll describe how …&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;a class="reference external" href="https://github.com/eliben/pycparser"&gt;pycparser&lt;/a&gt; is my most widely used open
source project (with ~20M daily downloads from PyPI &lt;a class="footnote-reference" href="#footnote-1" id="footnote-reference-1"&gt;[1]&lt;/a&gt;). It's a pure-Python
parser for the C programming language, producing ASTs inspired by &lt;a class="reference external" href="https://docs.python.org/3/library/ast.html"&gt;Python's
own&lt;/a&gt;. Until very recently, it's
been using &lt;a class="reference external" href="https://www.dabeaz.com/ply/ply.html"&gt;PLY: Python Lex-Yacc&lt;/a&gt; for
the core parsing.&lt;/p&gt;
&lt;p&gt;In this post, I'll describe how I collaborated with an LLM coding agent (Codex)
to help me rewrite pycparser to use a hand-written recursive-descent parser and
remove the dependency on PLY. This has been an interesting experience and the
post contains lots of information and is therefore quite long; if you're just
interested in the final result, check out the latest code of pycparser - the
&lt;tt class="docutils literal"&gt;main&lt;/tt&gt; branch already has the new implementation.&lt;/p&gt;
&lt;img alt="meme picture saying &amp;quot;can't come to bed because my AI agent produced something slightly wrong&amp;quot;" class="align-center" src="https://eli.thegreenplace.net/images/2026/cantcometobed.png" /&gt;
&lt;div class="section" id="the-issues-with-the-existing-parser-implementation"&gt;
&lt;h2&gt;The issues with the existing parser implementation&lt;/h2&gt;
&lt;p&gt;While pycparser has been working well overall, there were a number of nagging
issues that persisted over years.&lt;/p&gt;
&lt;div class="section" id="parsing-strategy-yacc-vs-hand-written-recursive-descent"&gt;
&lt;h3&gt;Parsing strategy: YACC vs. hand-written recursive descent&lt;/h3&gt;
&lt;p&gt;I began working on pycparser in 2008, and back then using a YACC-based approach
for parsing a whole language like C seemed like a no-brainer to me. Isn't this
what everyone does when writing a serious parser? Besides, the K&amp;amp;R2 book
famously carries the entire grammar of the ANSI C (C89) language in an appendix - so it
seemed like a simple matter of translating that to PLY-yacc syntax.&lt;/p&gt;
&lt;p&gt;And indeed, it wasn't &lt;em&gt;too&lt;/em&gt; hard, though there definitely were some complications
in building the ASTs for declarations (C's &lt;a class="reference external" href="https://eli.thegreenplace.net/2008/10/18/implementing-cdecl-with-pycparser"&gt;gnarliest part&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;Shortly after completing pycparser, I got more and more interested in compilation
and started learning about the different kinds of parsers more seriously. Over
time, I grew convinced that &lt;a class="reference external" href="https://eli.thegreenplace.net/tag/recursive-descent-parsing"&gt;recursive descent&lt;/a&gt; is the way to
go - producing parsers that are easier to understand and maintain (and are often
faster!).&lt;/p&gt;
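&lt;p&gt;For readers who haven't seen the technique, here is a minimal sketch of a
recursive-descent parser for a toy arithmetic grammar. It has nothing to do with
pycparser's actual code (which is Python and far larger); it's written in Go
only to match the other code on this page, and every name in it
(&lt;tt class="docutils literal"&gt;parser&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;eval&lt;/tt&gt;, etc.) is made up for the illustration. The point is
that each grammar rule becomes one ordinary function:&lt;/p&gt;

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parser holds whitespace-separated tokens and a cursor into them.
type parser struct {
	toks []string
	pos  int
}

// peek returns the current token, or "" at end of input.
func (p *parser) peek() string {
	if p.pos == len(p.toks) {
		return ""
	}
	return p.toks[p.pos]
}

// next returns the current token and advances past it (unless at the end).
func (p *parser) next() string {
	tok := p.peek()
	if tok != "" {
		p.pos++
	}
	return tok
}

// expr := term ('+' term)*
func (p *parser) expr() (int, error) {
	v, err := p.term()
	if err != nil {
		return 0, err
	}
	for p.peek() == "+" {
		p.next()
		rhs, err := p.term()
		if err != nil {
			return 0, err
		}
		v += rhs
	}
	return v, nil
}

// term := factor ('*' factor)*
func (p *parser) term() (int, error) {
	v, err := p.factor()
	if err != nil {
		return 0, err
	}
	for p.peek() == "*" {
		p.next()
		rhs, err := p.factor()
		if err != nil {
			return 0, err
		}
		v *= rhs
	}
	return v, nil
}

// factor := NUMBER | '(' expr ')'
func (p *parser) factor() (int, error) {
	tok := p.next()
	if tok == "(" {
		v, err := p.expr()
		if err != nil {
			return 0, err
		}
		if p.next() != ")" {
			return 0, fmt.Errorf("expected %q", ")")
		}
		return v, nil
	}
	n, err := strconv.Atoi(tok)
	if err != nil {
		return 0, fmt.Errorf("expected number, got %q", tok)
	}
	return n, nil
}

// eval parses and evaluates src, rejecting trailing tokens.
func eval(src string) (int, error) {
	p := parser{toks: strings.Fields(src)}
	v, err := p.expr()
	if err != nil {
		return 0, err
	}
	if p.peek() != "" {
		return 0, fmt.Errorf("unexpected token %q", p.peek())
	}
	return v, nil
}

func main() {
	v, err := eval("2 + 3 * ( 4 + 1 )")
	if err != nil {
		panic(err)
	}
	fmt.Println(v) // prints 17
}
```

&lt;p&gt;Operator precedence falls out of which function calls which, and a
debugger's call stack shows exactly which rule is being parsed at any moment.&lt;/p&gt;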
&lt;p&gt;It all ties in to the &lt;a class="reference external" href="https://eli.thegreenplace.net/2017/benefits-of-dependencies-in-software-projects-as-a-function-of-effort/"&gt;benefits of dependencies in software projects as a
function of effort&lt;/a&gt;.
Using parser generators is a heavy &lt;em&gt;conceptual&lt;/em&gt; dependency: it's really nice
when you have to churn out many parsers for small languages. But when you have
to maintain a single, very complex parser, as part of a large project - the
benefits quickly dissipate and you're left with a substantial dependency that
you constantly grapple with.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="the-other-issue-with-dependencies"&gt;
&lt;h3&gt;The other issue with dependencies&lt;/h3&gt;
&lt;p&gt;And then there are the usual problems with dependencies; dependencies get
abandoned, and they may also develop security issues. Sometimes, both of these
become true.&lt;/p&gt;
&lt;p&gt;Many years ago, pycparser forked and started vendoring its own version of PLY.
This was part of transitioning pycparser to a dual Python 2/3 code base when PLY
was slower to adapt. I believe this was the right decision, since PLY &amp;quot;just
worked&amp;quot; and I didn't have to deal with active (and very tedious in the Python
ecosystem, where packaging tools are replaced faster than dirty socks)
dependency management.&lt;/p&gt;
&lt;p&gt;A couple of weeks ago &lt;a class="reference external" href="https://github.com/eliben/pycparser/issues/588"&gt;this issue&lt;/a&gt;
was opened for pycparser. It turns out that some old PLY code triggers security
checks used by some Linux distributions; while this code was fixed in a later
commit of PLY, PLY itself was apparently abandoned and archived in late 2025.
And guess what? That happened in the middle of a large rewrite of the package,
so re-vendoring the pre-archiving commit seemed like a risky proposition.&lt;/p&gt;
&lt;p&gt;On the issue it was suggested that &amp;quot;hopefully the dependent packages move on to
a non-abandoned parser or implement their own&amp;quot;; I originally laughed this idea
off, but then it got me thinking... which is what this post is all about.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="growing-complexity-of-parsing-a-messy-language"&gt;
&lt;h3&gt;Growing complexity of parsing a messy language&lt;/h3&gt;
&lt;p&gt;The original K&amp;amp;R2 grammar for C had - famously - a single shift-reduce
conflict having to do with dangling &lt;tt class="docutils literal"&gt;else&lt;/tt&gt;s belonging to the most recent
&lt;tt class="docutils literal"&gt;if&lt;/tt&gt; statement. And indeed, other than the famous &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Lexer_hack"&gt;lexer hack&lt;/a&gt;
used to deal with &lt;a class="reference external" href="https://eli.thegreenplace.net/2011/05/02/the-context-sensitivity-of-cs-grammar-revisited"&gt;C's type name / ID ambiguity&lt;/a&gt;,
pycparser only had this single shift-reduce conflict.&lt;/p&gt;
&lt;p&gt;But things got more complicated. Over the years, features were added that
weren't strictly in the standard but were supported by all the industrial
compilers. The more advanced C11 and C23 standards weren't beholden to the
promises of conflict-free YACC parsing (since almost no industrial-strength
compilers use YACC at this point), so all caution went out of the window.&lt;/p&gt;
&lt;p&gt;The latest (PLY-based) release of pycparser has many reduce-reduce conflicts
&lt;a class="footnote-reference" href="#footnote-2" id="footnote-reference-2"&gt;[2]&lt;/a&gt;; these are a severe maintenance hazard because it means the parsing rules
essentially have to be tie-broken by order of appearance in the code. This is
very brittle; pycparser has only managed to maintain its stability and quality
through its comprehensive test suite. Over time, it became harder and harder to
extend, because YACC parsing rules have all kinds of spooky-action-at-a-distance
effects. The straw that broke the camel's back was &lt;a class="reference external" href="https://github.com/eliben/pycparser/pull/590"&gt;this PR&lt;/a&gt; which again proposed to
increase the number of reduce-reduce conflicts &lt;a class="footnote-reference" href="#footnote-3" id="footnote-reference-3"&gt;[3]&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This - again - prompted me to think &amp;quot;what if I just dump YACC and switch to
a hand-written recursive descent parser&amp;quot;, and here we are.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="the-mental-roadblock"&gt;
&lt;h2&gt;The mental roadblock&lt;/h2&gt;
&lt;p&gt;None of the challenges described above are new; I've been pondering them for
many years now, and yet biting the bullet and rewriting the parser didn't feel
like something I'd like to get into. By my private estimates it'd take at least
a week of deep heads-down work to port the gritty 2000 lines of YACC grammar
rules to a recursive descent parser &lt;a class="footnote-reference" href="#footnote-4" id="footnote-reference-4"&gt;[4]&lt;/a&gt;. Moreover, it wouldn't be a
particularly &lt;em&gt;fun&lt;/em&gt; project either - I didn't feel like I'd learn much new and
my interests have shifted away from this project. In short, the &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Potential_well"&gt;potential well&lt;/a&gt; was just too deep.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="why-would-this-even-work-tests"&gt;
&lt;h2&gt;Why would this even work? Tests&lt;/h2&gt;
&lt;p&gt;I've definitely noticed the improvement in capabilities of LLM coding
agents in the past few months, and many reputable people online rave about using
them for increasingly larger projects. That said, would an LLM agent really be
able to accomplish such a complex project on its own? This isn't just a toy,
it's thousands of lines of dense parsing code.&lt;/p&gt;
&lt;p&gt;What gave me hope is the concept of &lt;a class="reference external" href="https://simonwillison.net/2025/Dec/31/the-year-in-llms/#the-year-of-conformance-suites"&gt;conformance suites mentioned by
Simon Willison&lt;/a&gt;.
Agents seem to do well when there's a very clear and rigid
goal function - such as a large, high-coverage conformance test suite.&lt;/p&gt;
&lt;p&gt;And pycparser has a &lt;a class="reference external" href="https://github.com/eliben/pycparser/blob/main/tests/test_c_parser.py"&gt;very extensive one&lt;/a&gt;.
Over 2500 lines of test code parsing various C snippets to ASTs with expected
results, grown over a decade and a half of real issues and bugs reported by
users.&lt;/p&gt;
&lt;p&gt;I figured the LLM can either succeed or fail and throw its hands up in despair,
but it's quite unlikely to produce a &lt;em&gt;wrong&lt;/em&gt; port that would still pass all
the tests. So I set it to run.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="the-initial-port"&gt;
&lt;h2&gt;The initial port&lt;/h2&gt;
&lt;p&gt;I fired up Codex in pycparser's repository, and wrote this prompt just to
make sure it understands me and can run the tests:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;This repository implements a Python parser producing an AST. It has a very
extensive test suite; can you first verify you successfully run all the tests?
(with python3 -m unittest discover, from the root dir)
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Codex figured it out (I gave it the exact command, after all!); my next
prompt was the real thing &lt;a class="footnote-reference" href="#footnote-5" id="footnote-reference-5"&gt;[5]&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;pycparser uses the PLY library for its lexer and parser, and its own AST
package for the ASTs created by the parser. I want to explore replacing the
Parser component (leaving the Lexer alone for now) with a hand written
recursive descent parser that will pass all the same tests. Can you try that?
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here Codex went to work and churned &lt;em&gt;for over an hour&lt;/em&gt;. Having never
observed an agent work for nearly this long, I kind of assumed it had gone off
the rails and would fail sooner or later. So I was rather surprised and skeptical
when it eventually came back with:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;Replaced the PLY parse path with a handwritten recursive‑descent parser while
keeping the existing lexer and AST helpers intact; CParser now points to the
new RD implementation, with the old PLY version preserved as PLYCParser in
pycparser/c_parser.py. The RD parser includes a token stream, declarator/type
parsing, expression precedence handling, pragmas, and statement parsing to
match the previous AST output and coordinate behavior.
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;It took me a while to poke around the code and run it until I was convinced - it
had actually done it! It wrote a new recursive descent parser with only
ancillary dependencies on PLY, and that parser passed the test suite. After a
few more prompts, we removed the ancillary dependencies and made the
structure clearer. I hadn't looked too deeply into code quality at this point,
but at least on the functional level - it succeeded. This was very impressive!&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="a-quick-note-on-reviews-and-branches"&gt;
&lt;h2&gt;A quick note on reviews and branches&lt;/h2&gt;
&lt;p&gt;A change like the one described above is impossible to code-review as one PR in
any meaningful way; so I used a different strategy. Before embarking on this
path, I created a new branch and once Codex finished the initial rewrite, I
committed this change, knowing that I will review it in detail, piece-by-piece
later on.&lt;/p&gt;
&lt;p&gt;Even though coding agents have their own notion of history and can &amp;quot;revert&amp;quot;
certain changes, I felt much safer relying on Git. In the worst case if all of
this goes south, I can nuke the branch and it's as if nothing ever happened.
I was determined to only merge this branch onto &lt;tt class="docutils literal"&gt;main&lt;/tt&gt; once I was fully
satisfied with the code. In what follows, I had to &lt;tt class="docutils literal"&gt;git reset&lt;/tt&gt; several times
when I didn't like the direction in which Codex was going. In hindsight, doing
this work in a branch was absolutely the right choice.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="the-long-tail-of-goofs"&gt;
&lt;h2&gt;The long tail of goofs&lt;/h2&gt;
&lt;p&gt;Once I had sufficiently convinced myself that the new parser was actually working,
I used Codex to similarly rewrite the lexer and get rid of the PLY dependency
entirely, deleting it from the repository. Then, I started looking more deeply
into code quality - reading the code created by Codex and trying to wrap my head
around it.&lt;/p&gt;
&lt;p&gt;And - oh my - this was quite the journey. Much has been written about the code
produced by agents, and much of it seems to be true. Maybe it's a setting I'm
missing (I'm not using my own custom &lt;tt class="docutils literal"&gt;AGENTS.md&lt;/tt&gt; yet, for instance), but
Codex seems to be that eager programmer that wants to get from A to B whatever
the cost. Readability, minimalism and code clarity are very much secondary
goals.&lt;/p&gt;
&lt;p&gt;Using &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;raise...except&lt;/span&gt;&lt;/tt&gt; for control flow? Yep. Abusing Python's dynamic typing
(like having &lt;tt class="docutils literal"&gt;None&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;False&lt;/tt&gt; and other values all mean different things
for a given variable)? For sure. Spreading the logic of a complex function
all over the place instead of putting all the key parts in a single switch
statement? You bet.&lt;/p&gt;
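&lt;p&gt;To make the first of these concrete, here's a hypothetical sketch (made-up
names, not code from pycparser) of an optional-token helper written both ways:
once with an exception steering the ordinary &amp;quot;nothing to parse here&amp;quot; path, and
once with a plain lookahead check:&lt;/p&gt;

```python
class ParseError(Exception):
    """Raised when the token at the current position is not what we need."""


def parse_int(tokens, pos):
    # Strict parser: returns (value, next_pos) or raises ParseError.
    if pos >= len(tokens) or not tokens[pos].isdigit():
        raise ParseError(f"expected integer at position {pos}")
    return int(tokens[pos]), pos + 1


def optional_int_via_exception(tokens, pos):
    # The pattern in question: an exception steers the normal path.
    try:
        return parse_int(tokens, pos)
    except ParseError:
        return None, pos


def optional_int_via_lookahead(tokens, pos):
    # Same behavior, but the decision is visible as an ordinary condition.
    if pos >= len(tokens) or not tokens[pos].isdigit():
        return None, pos
    return parse_int(tokens, pos)
```

&lt;p&gt;Both helpers return the same results; the second one simply keeps the common
path free of &lt;tt class="docutils literal"&gt;try&lt;/tt&gt; blocks, which tends to be easier to follow in a parser
where &amp;quot;this construct is absent&amp;quot; is not an error.&lt;/p&gt;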
&lt;p&gt;Moreover, the agent is hilariously &lt;em&gt;lazy&lt;/em&gt;. More than once I had to convince it
to do something it initially said was impossible, and kept insisting on this in
follow-up messages. The anthropomorphization here is mildly concerning, to be
honest. I could never imagine I would be writing something like the following to
a computer, and yet - here we are: &amp;quot;Remember how we moved X to Y before? You
can do it again for Z, definitely. Just try&amp;quot;.&lt;/p&gt;
&lt;p&gt;My process was to see how I can instruct Codex to fix things, and intervene
myself (by rewriting code) as little as possible. I've &lt;em&gt;mostly&lt;/em&gt; succeeded in
this, and did maybe 20% of the work myself.&lt;/p&gt;
&lt;p&gt;My branch grew &lt;em&gt;dozens&lt;/em&gt; of commits, falling into roughly these categories:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;The code in X is too complex; why can't we do Y instead?&lt;/li&gt;
&lt;li&gt;The use of X is needlessly convoluted; change Y to Z, and T to V in all
instances.&lt;/li&gt;
&lt;li&gt;The code in X is unclear; please add a detailed comment - with examples - to
explain what it does.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Interestingly, after doing (3), the agent was often more effective in giving
the code a &amp;quot;fresh look&amp;quot; and succeeding in either (1) or (2).&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="the-end-result"&gt;
&lt;h2&gt;The end result&lt;/h2&gt;
&lt;p&gt;Eventually, after many hours spent in this process, I was reasonably pleased
with the code. It's far from perfect, of course, but taking the essential
complexities into account, it's something I could see myself maintaining (with
or without the help of an agent). I'm sure I'll find more ways to improve it
in the future, but I have a reasonable degree of confidence that this will be
doable.&lt;/p&gt;
&lt;p&gt;It passes all the tests, so I've been able to release a new version (3.00)
without major issues so far. The only issue I've discovered is that some of
CFFI's tests are overly precise about the phrasing of errors reported by
pycparser; this was &lt;a class="reference external" href="https://github.com/python-cffi/cffi/pull/224"&gt;an easy fix&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The new parser is also faster, by about 30% based on my benchmarks! This is
typical of recursive descent when compared with YACC-generated parsers, in my
experience. After reviewing the initial rewrite of the lexer, I've spent a while
instructing Codex on how to make it faster, and it worked reasonably well.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="followup-static-typing"&gt;
&lt;h2&gt;Followup - static typing&lt;/h2&gt;
&lt;p&gt;While working on this, it became quite obvious that static typing would make the
process easier. LLM coding agents really benefit from closed loops with strict
guardrails (e.g. a test suite to pass), and type-annotations act as such.
For example, had pycparser already been type annotated, Codex would probably not
have overloaded values to multiple types (like &lt;tt class="docutils literal"&gt;None&lt;/tt&gt; vs. &lt;tt class="docutils literal"&gt;False&lt;/tt&gt; vs.
others).&lt;/p&gt;
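&lt;p&gt;As a hypothetical sketch of such a guardrail (the names below are invented for
illustration and are not taken from pycparser): instead of a lookup that returns
a string, &lt;tt class="docutils literal"&gt;None&lt;/tt&gt; or &lt;tt class="docutils literal"&gt;False&lt;/tt&gt; depending on the situation, the three outcomes
get an explicit, annotated result type. With the annotation in place, a type
checker should flag any attempt to return a bare &lt;tt class="docutils literal"&gt;False&lt;/tt&gt; from this function:&lt;/p&gt;

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class LookupStatus(Enum):
    FOUND = "found"        # name is a typedef; type_name is set
    SHADOWED = "shadowed"  # name is hidden by a non-typedef declaration
    MISSING = "missing"    # name is not declared in any scope


@dataclass(frozen=True)
class Lookup:
    status: LookupStatus
    type_name: Optional[str] = None


def lookup_typedef(scopes: list[dict[str, Optional[str]]], name: str) -> Lookup:
    """Search scopes innermost-first.

    Assumption for this sketch: a scope maps a name to its typedef target,
    or to None when an ordinary declaration shadows an outer typedef.
    """
    for scope in reversed(scopes):
        if name in scope:
            target = scope[name]
            if target is None:
                return Lookup(LookupStatus.SHADOWED)
            return Lookup(LookupStatus.FOUND, target)
    return Lookup(LookupStatus.MISSING)
```

&lt;p&gt;Callers then branch on &lt;tt class="docutils literal"&gt;status&lt;/tt&gt; rather than on the truthiness of an overloaded
value, so the three cases can't be silently conflated.&lt;/p&gt;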
&lt;p&gt;In a followup, I asked Codex to type-annotate pycparser (running checks using
&lt;tt class="docutils literal"&gt;ty&lt;/tt&gt;), and this was also a back-and-forth because the process exposed some
issues that needed to be refactored. Time will tell, but hopefully it will make
further changes in the project simpler for the agent.&lt;/p&gt;
&lt;p&gt;Based on this experience, I'd bet that coding agents will be somewhat more
effective in statically typed languages like Go, TypeScript and especially Rust.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="conclusions"&gt;
&lt;h2&gt;Conclusions&lt;/h2&gt;
&lt;p&gt;Overall, this project has been a really good experience, and I'm impressed with
what modern LLM coding agents can do! While there's no reason to expect that
progress in this domain will stop, even if it does - these are already very
useful tools that can significantly improve programmer productivity.&lt;/p&gt;
&lt;p&gt;Could I have done this myself, without an agent's help? Sure. But it would have
taken me &lt;em&gt;much&lt;/em&gt; longer, assuming that I could even muster the will and
concentration to engage in this project. I estimate it would take me at least
a week of full-time work (so 30-40 hours) spread over who knows how long to
accomplish. With Codex, I put in an order of magnitude less work into this
(around 4-5 hours, I'd estimate) and I'm happy with the result.&lt;/p&gt;
&lt;p&gt;It was also &lt;em&gt;fun&lt;/em&gt;. At least in one sense, my professional life can be described
as the pursuit of focus, deep work and &lt;em&gt;flow&lt;/em&gt;. It's not easy for me to get into
this state, but when I do I'm highly productive and find it very enjoyable.
Agents really help me here. When I know I need to write some code and it's
hard to get started, asking an agent to write a prototype is a great catalyst
for my motivation. Hence the meme at the beginning of the post.&lt;/p&gt;
&lt;div class="section" id="does-code-quality-even-matter"&gt;
&lt;h3&gt;Does code quality even matter?&lt;/h3&gt;
&lt;p&gt;One can't avoid a nagging question - does the quality of the code produced
by agents even matter? Clearly, the agents themselves can understand it (if not
today's agent, then at least next year's). Why worry about future
maintainability if the agent can maintain it? In other words, does it make sense
to just go full vibe-coding?&lt;/p&gt;
&lt;p&gt;This is a fair question, and one I don't have an answer to. Right now, for
projects I maintain and &lt;em&gt;stand behind&lt;/em&gt;, it seems obvious to me that the code
should be fully understandable and accepted by me, and the agent is just a tool
helping me get to that state more efficiently. It's hard to say what the future
holds here; it's going to be interesting, for sure.&lt;/p&gt;
&lt;hr class="docutils" /&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-1" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-1"&gt;[1]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;pycparser has a fair number of &lt;a class="reference external" href="https://deps.dev/pypi/pycparser/3.0.0/dependents"&gt;direct dependents&lt;/a&gt;,
but the majority of downloads comes through &lt;a class="reference external" href="https://github.com/python-cffi/cffi"&gt;CFFI&lt;/a&gt;,
which itself is a major building block for much of the Python ecosystem.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-2" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-2"&gt;[2]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;The table-building report says 177, but that's certainly an
over-dramatization because it's common for a single conflict to
manifest in several ways.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-3" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-3"&gt;[3]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;It didn't help the PR's case that it was almost certainly vibe coded.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-4" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-4"&gt;[4]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;&lt;p class="first"&gt;There was also the lexer to consider, but this seemed like a much
simpler job. My impression is that in the early days of computing,
&lt;tt class="docutils literal"&gt;lex&lt;/tt&gt; gained prominence because of strong regexp support which wasn't
very common yet. These days, with excellent regexp libraries
existing for pretty much every language, the added value of &lt;tt class="docutils literal"&gt;lex&lt;/tt&gt; over
a &lt;a class="reference external" href="https://eli.thegreenplace.net/2013/06/25/regex-based-lexical-analysis-in-python-and-javascript"&gt;custom regexp-based lexer&lt;/a&gt;
isn't very high.&lt;/p&gt;
&lt;p class="last"&gt;That said, it wouldn't make much sense to embark on a journey to rewrite
&lt;em&gt;just&lt;/em&gt; the lexer; the dependency on PLY would still remain, and besides,
PLY's lexer and parser are designed to work well together. So it wouldn't
help me much without tackling the parser beast.&lt;/p&gt;
&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-5" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-5"&gt;[5]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;I've decided to ask it to port the parser first, leaving the lexer
alone. This was to split the work into reasonable chunks. Besides, I
figured that the parser is the hard job anyway - if it succeeds in that,
the lexer should be easy. That assumption turned out to be correct.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;
</content><category term="misc"></category><category term="Python"></category><category term="Machine Learning"></category><category term="Compilation"></category><category term="Recursive descent parsing"></category></entry><entry><title>Revisiting "Let's Build a Compiler"</title><link href="https://eli.thegreenplace.net/2025/revisiting-lets-build-a-compiler/" rel="alternate"></link><published>2025-12-09T20:40:00-08:00</published><updated>2026-01-17T22:40:40-08:00</updated><author><name>Eli Bendersky</name></author><id>tag:eli.thegreenplace.net,2025-12-09:/2025/revisiting-lets-build-a-compiler/</id><summary type="html">&lt;p&gt;There's an old compiler-building tutorial that has become part of the field's
lore: the &lt;a class="reference external" href="https://compilers.iecc.com/crenshaw/"&gt;Let's Build a Compiler&lt;/a&gt;
series by Jack Crenshaw (published between 1988 and 1995).&lt;/p&gt;
&lt;p&gt;I &lt;a class="reference external" href="https://eli.thegreenplace.net/2003/07/29/great-compilers-tutorial"&gt;ran into it in 2003&lt;/a&gt;
and was very impressed, but it's now 2025 and this tutorial is still being mentioned quite
often …&lt;/p&gt;</summary><content type="html">&lt;p&gt;There's an old compiler-building tutorial that has become part of the field's
lore: the &lt;a class="reference external" href="https://compilers.iecc.com/crenshaw/"&gt;Let's Build a Compiler&lt;/a&gt;
series by Jack Crenshaw (published between 1988 and 1995).&lt;/p&gt;
&lt;p&gt;I &lt;a class="reference external" href="https://eli.thegreenplace.net/2003/07/29/great-compilers-tutorial"&gt;ran into it in 2003&lt;/a&gt;
and was very impressed, but it's now 2025 and this tutorial is still being mentioned quite
often &lt;a class="reference external" href="https://hn.algolia.com/?dateRange=pastYear&amp;amp;page=0&amp;amp;prefix=true&amp;amp;query=crenshaw&amp;amp;sort=byDate&amp;amp;type=all"&gt;in Hacker News threads&lt;/a&gt;.
Why is that? Why does a tutorial from 35
years ago, built in Pascal and emitting Motorola 68000 assembly - technologies that
are virtually unknown for the new generation of programmers - hold sway over
compiler enthusiasts? I've decided to find out.&lt;/p&gt;
&lt;p&gt;The tutorial is &lt;a class="reference external" href="https://compilers.iecc.com/crenshaw/"&gt;easily available and readable online&lt;/a&gt;, but
just re-reading it seemed insufficient. So I decided to meticulously translate
the compilers built in it to Python and have them emit a more modern target -
WebAssembly. It was an enjoyable process and I want to share the outcome and
some insights gained along the way.&lt;/p&gt;
&lt;p&gt;The result is &lt;a class="reference external" href="https://github.com/eliben/letsbuildacompiler"&gt;this code repository&lt;/a&gt;.
Of particular interest is the &lt;a class="reference external" href="https://github.com/eliben/letsbuildacompiler/blob/main/TUTORIAL.md"&gt;TUTORIAL.md file&lt;/a&gt;,
which describes how each part in the original tutorial is mapped to my code. So
if you want to read the original tutorial but play with code you can actually
easily try on your own, feel free to follow my path.&lt;/p&gt;
&lt;div class="section" id="a-sample"&gt;
&lt;h2&gt;A sample&lt;/h2&gt;
&lt;p&gt;To get a taste of the input language being compiled and the output my compiler
generates, here's a sample program in the KISS language designed by Jack
Crenshaw:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;var X=0

 { sum from 0 to n-1 inclusive, and add to result }
 procedure addseq(n, ref result)
     var i, sum  { 0 initialized }
     while i &amp;lt; n
         sum = sum + i
         i = i + 1
     end
     result = result + sum
 end

 program testprog
 begin
     addseq(11, X)
 end
 .
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;It's from part 13 of the tutorial, so it showcases procedures along with control
constructs like the &lt;tt class="docutils literal"&gt;while&lt;/tt&gt; loop, and passing parameters both by value and by
reference. Here's the WASM text generated by my compiler for part 13:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;module&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;memory&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="c1"&gt;;; Linear stack pointer. Used to pass parameters by ref.&lt;/span&gt;
  &lt;span class="c1"&gt;;; Grows downwards (towards lower addresses).&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;global&lt;/span&gt; &lt;span class="nv"&gt;$__sp&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;i32.const&lt;/span&gt; &lt;span class="mf"&gt;65536&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;global&lt;/span&gt; &lt;span class="nv"&gt;$X&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;i32.const&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="nv"&gt;$ADDSEQ&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;param&lt;/span&gt; &lt;span class="nv"&gt;$N&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;param&lt;/span&gt; &lt;span class="nv"&gt;$RESULT&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;local&lt;/span&gt; &lt;span class="nv"&gt;$I&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;local&lt;/span&gt; &lt;span class="nv"&gt;$SUM&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;loop&lt;/span&gt; &lt;span class="nv"&gt;$loop1&lt;/span&gt;
      &lt;span class="k"&gt;block&lt;/span&gt; &lt;span class="nv"&gt;$breakloop1&lt;/span&gt;
        &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="nv"&gt;$I&lt;/span&gt;
        &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="nv"&gt;$N&lt;/span&gt;
        &lt;span class="nb"&gt;i32.lt_s&lt;/span&gt;
        &lt;span class="nb"&gt;i32.eqz&lt;/span&gt;
        &lt;span class="nb"&gt;br_if&lt;/span&gt; &lt;span class="nv"&gt;$breakloop1&lt;/span&gt;
        &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="nv"&gt;$SUM&lt;/span&gt;
        &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="nv"&gt;$I&lt;/span&gt;
        &lt;span class="nb"&gt;i32.add&lt;/span&gt;
        &lt;span class="nb"&gt;local.set&lt;/span&gt; &lt;span class="nv"&gt;$SUM&lt;/span&gt;
        &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="nv"&gt;$I&lt;/span&gt;
        &lt;span class="nb"&gt;i32.const&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="nb"&gt;i32.add&lt;/span&gt;
        &lt;span class="nb"&gt;local.set&lt;/span&gt; &lt;span class="nv"&gt;$I&lt;/span&gt;
        &lt;span class="nb"&gt;br&lt;/span&gt; &lt;span class="nv"&gt;$loop1&lt;/span&gt;
      &lt;span class="k"&gt;end&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
    &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="nv"&gt;$RESULT&lt;/span&gt;
    &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="nv"&gt;$RESULT&lt;/span&gt;
    &lt;span class="nb"&gt;i32.load&lt;/span&gt;
    &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="nv"&gt;$SUM&lt;/span&gt;
    &lt;span class="nb"&gt;i32.add&lt;/span&gt;
    &lt;span class="nb"&gt;i32.store&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="nv"&gt;$main&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;main&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;result&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;i32.const&lt;/span&gt; &lt;span class="mi"&gt;11&lt;/span&gt;
    &lt;span class="nb"&gt;global.get&lt;/span&gt; &lt;span class="nv"&gt;$__sp&lt;/span&gt;      &lt;span class="c1"&gt;;; make space on stack&lt;/span&gt;
    &lt;span class="nb"&gt;i32.const&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;
    &lt;span class="nb"&gt;i32.sub&lt;/span&gt;
    &lt;span class="nb"&gt;global.set&lt;/span&gt; &lt;span class="nv"&gt;$__sp&lt;/span&gt;
    &lt;span class="nb"&gt;global.get&lt;/span&gt; &lt;span class="nv"&gt;$__sp&lt;/span&gt;
    &lt;span class="nb"&gt;global.get&lt;/span&gt; &lt;span class="nv"&gt;$X&lt;/span&gt;
    &lt;span class="nb"&gt;i32.store&lt;/span&gt;
    &lt;span class="nb"&gt;global.get&lt;/span&gt; &lt;span class="nv"&gt;$__sp&lt;/span&gt;    &lt;span class="c1"&gt;;; push address as parameter&lt;/span&gt;
    &lt;span class="nb"&gt;call&lt;/span&gt; &lt;span class="nv"&gt;$ADDSEQ&lt;/span&gt;
    &lt;span class="c1"&gt;;; restore parameter X by ref&lt;/span&gt;
    &lt;span class="nb"&gt;global.get&lt;/span&gt; &lt;span class="nv"&gt;$__sp&lt;/span&gt;
    &lt;span class="nb"&gt;i32.load&lt;/span&gt; &lt;span class="k"&gt;offset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="nb"&gt;global.set&lt;/span&gt; &lt;span class="nv"&gt;$X&lt;/span&gt;
    &lt;span class="c1"&gt;;; clean up stack for ref parameters&lt;/span&gt;
    &lt;span class="nb"&gt;global.get&lt;/span&gt; &lt;span class="nv"&gt;$__sp&lt;/span&gt;
    &lt;span class="nb"&gt;i32.const&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;
    &lt;span class="nb"&gt;i32.add&lt;/span&gt;
    &lt;span class="nb"&gt;global.set&lt;/span&gt; &lt;span class="nv"&gt;$__sp&lt;/span&gt;
    &lt;span class="nb"&gt;global.get&lt;/span&gt; &lt;span class="nv"&gt;$X&lt;/span&gt;
  &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You'll notice that there is some trickiness in the emitted code w.r.t. handling
the by-reference parameter (my &lt;a class="reference external" href="https://eli.thegreenplace.net/2025/notes-on-the-wasm-basic-c-abi/"&gt;previous post&lt;/a&gt;
deals with this issue in more detail). In general, though, the emitted code is
inefficient - there is close to 0 optimization applied.&lt;/p&gt;
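&lt;p&gt;The by-reference sequence in the WASM above can be modeled in plain Python. This
is only a sketch of what the emitted instructions do, not part of the compiler:
a &lt;tt class="docutils literal"&gt;bytearray&lt;/tt&gt; stands in for linear memory, a dictionary for the WASM globals,
and the helper names are made up for illustration:&lt;/p&gt;

```python
PAGE = 65536
memory = bytearray(8 * PAGE)         # stands in for (memory 8)
wasm_globals = {"sp": PAGE, "X": 0}  # $__sp starts at 65536, $X at 0


def load_i32(addr):
    return int.from_bytes(memory[addr:addr + 4], "little", signed=True)


def store_i32(addr, value):
    memory[addr:addr + 4] = value.to_bytes(4, "little", signed=True)


def addseq(n, result_addr):
    # result_addr is an address in linear memory, not a value.
    total = 0
    for i in range(n):
        total += i
    store_i32(result_addr, load_i32(result_addr) + total)


def main():
    wasm_globals["sp"] -= 4                         # make space on the stack
    store_i32(wasm_globals["sp"], wasm_globals["X"])  # copy X into the slot
    addseq(11, wasm_globals["sp"])                  # pass the slot's address
    wasm_globals["X"] = load_i32(wasm_globals["sp"])  # copy the result back
    wasm_globals["sp"] += 4                         # release the slot
    return wasm_globals["X"]
```

&lt;p&gt;The copy-in / copy-out around the call is needed because a WASM global has no
address; only linear memory can be pointed to, so the caller temporarily spills
&lt;tt class="docutils literal"&gt;X&lt;/tt&gt; into a stack slot.&lt;/p&gt;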
&lt;p&gt;Also, if you're very diligent you'll notice something odd about the global
variable &lt;tt class="docutils literal"&gt;X&lt;/tt&gt; - it seems to be implicitly returned by the generated &lt;tt class="docutils literal"&gt;main&lt;/tt&gt;
function. This is just a testing facility that makes my compiler easy to test.
All the compilers are extensively tested - usually by running the
generated WASM code &lt;a class="footnote-reference" href="#footnote-1" id="footnote-reference-1"&gt;[1]&lt;/a&gt; and verifying expected results.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="insights-what-makes-this-tutorial-so-special"&gt;
&lt;h2&gt;Insights - what makes this tutorial so special?&lt;/h2&gt;
&lt;p&gt;While reading the original tutorial again, I had an opportunity to reminisce on
what makes it so effective. Other than the very fluent and conversational
writing style of Jack Crenshaw, I think it's a combination of two key
factors:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;The tutorial builds a recursive-descent parser step by step, rather than
giving a long preface on automata and table-based parser generators. When
I first encountered it (in 2003), it was taken for granted that if you want
to write a parser then lex + yacc are the way to go &lt;a class="footnote-reference" href="#footnote-2" id="footnote-reference-2"&gt;[2]&lt;/a&gt;. Following the
development of a simple and clean hand-written
parser was a revelation that wholly changed my approach to the subject;
subsequently, hand-written recursive-descent parsers have been my go-to approach
&lt;a class="reference external" href="https://eli.thegreenplace.net/tag/recursive-descent-parsing"&gt;for almost 20 years now&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Rather than getting stuck in front-end minutiae, the tutorial goes straight
to generating working assembly code, from very early on. This was also a
breath of fresh air for engineers who grew up with more traditional courses
where you spend 90% of the time on parsing, type checking and other semantic
analysis and often run entirely out of steam by the time code generation
is taught.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;To be honest, I don't think either of these is a big problem with modern
resources, but back in the day the tutorial clearly hit the right nerve with
many people.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="what-else-does-it-teach-us"&gt;
&lt;h2&gt;What else does it teach us?&lt;/h2&gt;
&lt;p&gt;Jack Crenshaw's tutorial takes the &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Syntax-directed_translation"&gt;syntax-directed translation&lt;/a&gt;
approach, where code is emitted &lt;em&gt;while parsing&lt;/em&gt;, without having to divide the
compiler into explicit phases with IRs. As I said above, this is a fantastic
approach for getting started, but in the latter parts of the tutorial it starts
showing its limitations. Especially once we get to types, it becomes painfully
obvious that it would be very nice if we knew the types of expressions &lt;em&gt;before&lt;/em&gt;
we generate code for them.&lt;/p&gt;
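&lt;p&gt;A minimal sketch of the syntax-directed style, assuming a toy grammar of
integers, &lt;tt class="docutils literal"&gt;+&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;-&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;*&lt;/tt&gt; and parentheses (this is not code from the
repository, and &lt;tt class="docutils literal"&gt;compile_expr&lt;/tt&gt; is a made-up name). Note that there is no AST
anywhere: each parsing function appends instructions as soon as it recognizes
a construct:&lt;/p&gt;

```python
import re


def compile_expr(src):
    """Parse an arithmetic expression, emitting WASM-like text while parsing."""
    # Simplification: characters the pattern doesn't match are silently skipped.
    tokens = re.findall(r"\d+|[-+*()]", src)
    pos = 0
    out = []

    def peek():
        return tokens[pos] if pos != len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        pos += 1
        return tok

    def factor():
        tok = advance()
        if tok == "(":
            expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
        elif tok.isdigit():
            out.append(f"i32.const {tok}")
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        factor()
        while peek() == "*":
            advance()
            factor()
            out.append("i32.mul")  # emitted right after the second operand

    def expr():
        term()
        while peek() in ("+", "-"):
            op = advance()
            term()
            out.append("i32.add" if op == "+" else "i32.sub")

    expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return out
```

&lt;p&gt;For the input &lt;tt class="docutils literal"&gt;1 + 2 * (3 - 4)&lt;/tt&gt; this produces four &lt;tt class="docutils literal"&gt;i32.const&lt;/tt&gt; instructions
followed by &lt;tt class="docutils literal"&gt;i32.sub&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;i32.mul&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;i32.add&lt;/tt&gt;. By the time an operator is
emitted, the code for both operands is already in the output, which is exactly
why type-dependent decisions are awkward in this style.&lt;/p&gt;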
&lt;p&gt;I don't know if this is implicated in Jack Crenshaw's abandoning the tutorial
at some point after part 14, but it may very well be. He keeps writing how
the emitted code is clearly sub-optimal &lt;a class="footnote-reference" href="#footnote-3" id="footnote-reference-3"&gt;[3]&lt;/a&gt; and can be improved, but IMHO it's
just not that easy to improve using the syntax-directed translation strategy.
With perfect hindsight vision, I would probably use Part 14 (types) as a turning
point - emitting some kind of AST from the parser and then doing simple type
checking and analysis on that AST prior to generating code from it.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="conclusion"&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;All in all, the original tutorial remains a wonderfully readable introduction
to building compilers. This post and the &lt;a class="reference external" href="https://github.com/eliben/letsbuildacompiler"&gt;GitHub repository&lt;/a&gt;
it describes are a modest
contribution that aims to improve the experience of folks reading the original
tutorial today and not willing to use obsolete technologies. As always, let
me know if you run into any issues or have questions!&lt;/p&gt;
&lt;hr class="docutils" /&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-1" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-1"&gt;[1]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;This is done using the &lt;a class="reference external" href="https://pypi.org/project/wasmtime/"&gt;Python bindings to wasmtime&lt;/a&gt;.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-2" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-2"&gt;[2]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;By the way, gcc switched from YACC to hand-written recursive-descent
parsing in the 2004-2006 timeframe, and Clang has been implemented with
a recursive-descent parser from the start (2007).&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-3" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-3"&gt;[3]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;&lt;p class="first"&gt;Concretely: when we compile &lt;tt class="docutils literal"&gt;subexpr1 + subexpr2&lt;/tt&gt; and the two sides have different
types, it would be mighty nice to know that &lt;em&gt;before&lt;/em&gt; we actually generate
the code for both sub-expressions. But the syntax-directed translation
approach just doesn't work that way.&lt;/p&gt;
&lt;p class="last"&gt;To be clear: it's easy to generate &lt;em&gt;working&lt;/em&gt; code; it's just not easy
to generate optimal code without some sort of type analysis that's
done before code is actually generated.&lt;/p&gt;
&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
</content><category term="misc"></category><category term="Compilation"></category><category term="WebAssembly"></category><category term="Python"></category><category term="Recursive descent parsing"></category></entry><entry><title>Notes on the WASM Basic C ABI</title><link href="https://eli.thegreenplace.net/2025/notes-on-the-wasm-basic-c-abi/" rel="alternate"></link><published>2025-11-24T19:47:00-08:00</published><updated>2025-11-25T03:49:52-08:00</updated><author><name>Eli Bendersky</name></author><id>tag:eli.thegreenplace.net,2025-11-24:/2025/notes-on-the-wasm-basic-c-abi/</id><summary type="html">&lt;p&gt;The &lt;a class="reference external" href="https://github.com/WebAssembly/tool-conventions/tree/main"&gt;WebAssembly/tool-conventions&lt;/a&gt;
repository contains &amp;quot;Conventions supporting interoperability between tools
working with WebAssembly&amp;quot;.&lt;/p&gt;
&lt;p&gt;Of special interest, it contains the &lt;a class="reference external" href="https://github.com/WebAssembly/tool-conventions/blob/main/BasicCABI.md"&gt;Basic C ABI&lt;/a&gt; - an ABI
for representing C programs in WASM. This ABI is followed by compilers like Clang
with the &lt;tt class="docutils literal"&gt;wasm32&lt;/tt&gt; target. Rust is &lt;a class="reference external" href="https://blog.rust-lang.org/2025/04/04/c-abi-changes-for-wasm32-unknown-unknown/"&gt;also switching to this ABI&lt;/a&gt;
for …&lt;/p&gt;</summary><content type="html">&lt;p&gt;The &lt;a class="reference external" href="https://github.com/WebAssembly/tool-conventions/tree/main"&gt;WebAssembly/tool-conventions&lt;/a&gt;
repository contains &amp;quot;Conventions supporting interoperability between tools
working with WebAssembly&amp;quot;.&lt;/p&gt;
&lt;p&gt;Of special interest, it contains the &lt;a class="reference external" href="https://github.com/WebAssembly/tool-conventions/blob/main/BasicCABI.md"&gt;Basic C ABI&lt;/a&gt; - an ABI
for representing C programs in WASM. This ABI is followed by compilers like Clang
with the &lt;tt class="docutils literal"&gt;wasm32&lt;/tt&gt; target. Rust is &lt;a class="reference external" href="https://blog.rust-lang.org/2025/04/04/c-abi-changes-for-wasm32-unknown-unknown/"&gt;also switching to this ABI&lt;/a&gt;
for &lt;tt class="docutils literal"&gt;extern &amp;quot;C&amp;quot;&lt;/tt&gt; code.&lt;/p&gt;
&lt;p&gt;This post contains some notes on this ABI, with annotated code samples and
diagrams to help visualize what the emitted WASM code is doing. Hereafter, &amp;quot;the
ABI&amp;quot; refers to this Basic C ABI.&lt;/p&gt;
&lt;div class="section" id="preface-the-wasm-value-stack-and-linear-memory"&gt;
&lt;h2&gt;Preface: the WASM value stack and linear memory&lt;/h2&gt;
&lt;p&gt;In these notes, annotated WASM snippets often contain descriptions of the state
of the WASM value stack at a given point in time. Unless otherwise specified,
&amp;quot;TOS&amp;quot; refers to &amp;quot;Top Of value Stack&amp;quot;, and the notation &lt;tt class="docutils literal"&gt;[ x&amp;nbsp; y ]&lt;/tt&gt; means the
stack has &lt;tt class="docutils literal"&gt;y&lt;/tt&gt; on top, with &lt;tt class="docutils literal"&gt;x&lt;/tt&gt; right under it (and possibly some other
stuff that's not relevant to the discussion under &lt;tt class="docutils literal"&gt;x&lt;/tt&gt;); in this notation,
the stack grows &amp;quot;to the right&amp;quot;.&lt;/p&gt;
&lt;p&gt;The WASM value stack has no linear memory representation and cannot be
addressed, so it's meaningless to discuss whether the stack grows towards lower
or higher addresses. The value stack is simply an abstract stack, where values
can be pushed onto or popped off its &amp;quot;top&amp;quot;.&lt;/p&gt;
&lt;p&gt;Whenever addressing is required, the ABI specifies explicitly managing a
separate stack in linear memory. This stack is very similar to how stacks are
managed in hardware assembly languages (except that in the ABI this stack
pointer is held in a global variable, and is not a special register), and it's
called the &amp;quot;linear stack&amp;quot;.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="scalar-parameters-and-returns"&gt;
&lt;h2&gt;Scalar parameters and returns&lt;/h2&gt;
&lt;p&gt;By &amp;quot;scalar&amp;quot; I mean basic C types like &lt;tt class="docutils literal"&gt;int&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;double&lt;/tt&gt; or &lt;tt class="docutils literal"&gt;char&lt;/tt&gt;. For
these, using the WASM value stack is sufficient, since WASM functions can accept
an arbitrary number of scalar parameters.&lt;/p&gt;
&lt;p&gt;This C function:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;add_three&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Will be compiled into something like:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="nv"&gt;$add_three&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;param&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;result&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;     &lt;span class="c1"&gt;;; [ y ]&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;     &lt;span class="c1"&gt;;; [ y  x ]&lt;/span&gt;
  &lt;span class="nb"&gt;i32.add&lt;/span&gt;         &lt;span class="c1"&gt;;; [ x+y ]&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;     &lt;span class="c1"&gt;;; [ x+y  z ]&lt;/span&gt;
  &lt;span class="nb"&gt;i32.add&lt;/span&gt;         &lt;span class="c1"&gt;;; [ x+y+z ]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And can be called by pushing three values onto the stack and invoking
&lt;tt class="docutils literal"&gt;call $add_three&lt;/tt&gt;.&lt;/p&gt;
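&lt;p&gt;To make the stack notation concrete, here is a minimal C sketch that replays
the body of &lt;tt class="docutils literal"&gt;$add_three&lt;/tt&gt; on a tiny array-backed stack. The
&lt;tt class="docutils literal"&gt;ValueStack&lt;/tt&gt; type and its helpers are hypothetical names invented for
this illustration; they are not part of WASM or of any toolchain. Unsigned
arithmetic is used so that overflow wraps the way &lt;tt class="docutils literal"&gt;i32.add&lt;/tt&gt; does:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;#include &amp;lt;stdint.h&amp;gt;

/* Hypothetical model of the WASM value stack; it grows &amp;quot;to the right&amp;quot;. */
typedef struct {
    uint32_t slots[16];
    int depth;
} ValueStack;

void push(ValueStack *s, uint32_t v) { s-&amp;gt;slots[s-&amp;gt;depth++] = v; }
uint32_t pop(ValueStack *s) { return s-&amp;gt;slots[--s-&amp;gt;depth]; }

/* i32.add: pop two values, push their sum (wrapping modulo 2^32). */
void i32_add(ValueStack *s) {
    uint32_t rhs = pop(s);
    uint32_t lhs = pop(s);
    push(s, lhs + rhs);
}

/* Replays the instruction sequence of $add_three. */
uint32_t add_three(uint32_t x, uint32_t y, uint32_t z) {
    ValueStack s = {.depth = 0};
    push(&amp;amp;s, y);   /* local.get 1   [ y ]      */
    push(&amp;amp;s, x);   /* local.get 0   [ y  x ]   */
    i32_add(&amp;amp;s);   /* i32.add       [ x+y ]    */
    push(&amp;amp;s, z);   /* local.get 2   [ x+y  z ] */
    i32_add(&amp;amp;s);   /* i32.add       [ x+y+z ]  */
    return pop(&amp;amp;s);
}
&lt;/pre&gt;&lt;/div&gt;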
&lt;p&gt;The ABI specifies that all integral types 32-bit and smaller will be passed
as &lt;tt class="docutils literal"&gt;i32&lt;/tt&gt;, with the smaller types appropriately sign or zero extended. For
example, consider this C function:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;add_three_chars&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;char&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;It's compiled to almost the same code as &lt;tt class="docutils literal"&gt;add_three&lt;/tt&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="nv"&gt;$add_three_chars&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;param&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;result&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
  &lt;span class="nb"&gt;i32.add&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
  &lt;span class="nb"&gt;i32.add&lt;/span&gt;
  &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="err"&gt;.ext&lt;/span&gt;&lt;span class="k"&gt;end&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="err"&gt;_s&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The only difference is the final &lt;tt class="docutils literal"&gt;i32.extend8_s&lt;/tt&gt;, which takes the lowest 8 bits of the value
on TOS and sign-extends them to the full &lt;tt class="docutils literal"&gt;i32&lt;/tt&gt; (effectively ignoring all
the higher bits). Similarly, when &lt;tt class="docutils literal"&gt;$add_three_chars&lt;/tt&gt; is called, each of its
parameters goes through &lt;tt class="docutils literal"&gt;i32.extend8_s&lt;/tt&gt;.&lt;/p&gt;
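&lt;p&gt;The effect of &lt;tt class="docutils literal"&gt;i32.extend8_s&lt;/tt&gt; can be sketched in portable C. This is
an illustration of the arithmetic only; the function name simply mirrors the
WASM mnemonic:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;#include &amp;lt;stdint.h&amp;gt;

/* Keep the low 8 bits of v and sign-extend them to 32 bits. */
int32_t extend8_s(int32_t v) {
    int32_t low = v &amp;amp; 0xFF;        /* discard bits 8..31          */
    return (low ^ 0x80) - 0x80;   /* bit 7 becomes the sign bit  */
}
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;For example, 100 + 100 + 100 is 300 (0x12C); only the low byte 0x2C survives, so
the sketch yields 44. A low byte of 0xC8 (200) comes out as -56.&lt;/p&gt;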
&lt;p&gt;There are additional oddities that we won't get deep into, like passing
&lt;tt class="docutils literal"&gt;__int128&lt;/tt&gt; values via two &lt;tt class="docutils literal"&gt;i64&lt;/tt&gt; parameters.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="pointers"&gt;
&lt;h2&gt;Pointers&lt;/h2&gt;
&lt;p&gt;C pointers are just scalars, but it's still educational to review how
they are handled in the ABI. Pointers to any type are passed in &lt;tt class="docutils literal"&gt;i32&lt;/tt&gt; values;
the compiler knows they are pointers, though, and emits the appropriate
instructions. For example:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;add_indirect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Is compiled to:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="nv"&gt;$add_indirect&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;param&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;result&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;         &lt;span class="c1"&gt;;; [ ptr_sum ]&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;         &lt;span class="c1"&gt;;; [ ptr_sum  y ]&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;         &lt;span class="c1"&gt;;; [ ptr_sum  y  x ]&lt;/span&gt;
  &lt;span class="nb"&gt;i32.add&lt;/span&gt;             &lt;span class="c1"&gt;;; [ ptr_sum  x+y ]&lt;/span&gt;
  &lt;span class="nb"&gt;local.tee&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;         &lt;span class="c1"&gt;;; x &amp;lt;- x+y, leaving stack intact&lt;/span&gt;
  &lt;span class="nb"&gt;i32.store&lt;/span&gt;           &lt;span class="c1"&gt;;; store x+y into *ptr_sum&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;         &lt;span class="c1"&gt;;; [ x+y ] for returning&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Recall that in WASM, there's no difference between an &lt;tt class="docutils literal"&gt;i32&lt;/tt&gt; representing an
address in linear memory and an &lt;tt class="docutils literal"&gt;i32&lt;/tt&gt; representing just a number.
&lt;tt class="docutils literal"&gt;i32.store&lt;/tt&gt; expects &lt;tt class="docutils literal"&gt;[ addr&amp;nbsp; value ]&lt;/tt&gt; on TOS, and does &lt;tt class="docutils literal"&gt;*addr = value&lt;/tt&gt;.&lt;/p&gt;
&lt;p&gt;Note that the &lt;tt class="docutils literal"&gt;x&lt;/tt&gt; parameter isn't needed any longer after the sum is
computed, so it's reused later on to hold the return value. WASM parameters are
treated just like other locals (as in C).&lt;/p&gt;
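&lt;p&gt;The idea that a pointer is &amp;quot;just an &lt;tt class="docutils literal"&gt;i32&lt;/tt&gt;&amp;quot; can be sketched in C by modeling
linear memory as a byte array and addresses as plain integers. The &lt;tt class="docutils literal"&gt;mem&lt;/tt&gt;
array, its size and the helper names are assumptions made for this illustration,
and unsigned integers stand in for &lt;tt class="docutils literal"&gt;int&lt;/tt&gt; to get WASM's wrapping
arithmetic:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;#include &amp;lt;stdint.h&amp;gt;

/* Hypothetical stand-in for WASM linear memory. */
uint8_t mem[1024];

/* i32.store: write 32 bits at a byte address (WASM is little-endian). */
void i32_store(uint32_t addr, uint32_t value) {
    for (int i = 0; i &amp;lt; 4; i++) {
        mem[addr + i] = (uint8_t)(value &amp;gt;&amp;gt; (8 * i));
    }
}

/* i32.load: read 32 bits from a byte address. */
uint32_t i32_load(uint32_t addr) {
    uint32_t value = 0;
    for (int i = 0; i &amp;lt; 4; i++) {
        value |= (uint32_t)mem[addr + i] &amp;lt;&amp;lt; (8 * i);
    }
    return value;
}

/* add_indirect as the WASM code sees it: sum_addr is an ordinary number. */
uint32_t add_indirect(uint32_t x, uint32_t y, uint32_t sum_addr) {
    x = x + y;               /* local.tee 0 reuses the x parameter */
    i32_store(sum_addr, x);  /* *sum = x + y                       */
    return x;                /* local.get 0                        */
}
&lt;/pre&gt;&lt;/div&gt;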
&lt;/div&gt;
&lt;div class="section" id="passing-parameters-through-linear-memory"&gt;
&lt;h2&gt;Passing parameters through linear memory&lt;/h2&gt;
&lt;p&gt;According to the ABI, while scalars and single-element structs or unions are
passed to a callee via WASM function parameters (as shown above), for larger
aggregates the compiler utilizes linear memory.&lt;/p&gt;
&lt;p&gt;Specifically, each function gets a &amp;quot;frame&amp;quot; in a region of linear memory
allocated for the linear stack. This region grows downwards from high
to low addresses &lt;a class="footnote-reference" href="#footnote-1" id="footnote-reference-1"&gt;[1]&lt;/a&gt;, and the global &lt;tt class="docutils literal"&gt;$__stack_pointer&lt;/tt&gt; points at the bottom
of the frame:&lt;/p&gt;
&lt;img alt="WASM C ABI linear stack" class="align-center" src="https://eli.thegreenplace.net/images/2025/wasm-c-abi-linear-stack.png" /&gt;
&lt;p&gt;Consider this code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;Pair&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;

&lt;span class="n"&gt;__attribute__&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;noinline&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="kt"&gt;unsigned&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;pair_calculate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;Pair&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;pair&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;pair&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;pair&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;

&lt;span class="kt"&gt;unsigned&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;do_work&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;){&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;Pair&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;pp&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;pair_calculate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pp&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;When &lt;tt class="docutils literal"&gt;do_work&lt;/tt&gt; is compiled to WASM, prior to calling &lt;tt class="docutils literal"&gt;pair_calculate&lt;/tt&gt; it
copies &lt;tt class="docutils literal"&gt;pp&lt;/tt&gt; into a location in linear memory, and passes the address of this
location to &lt;tt class="docutils literal"&gt;pair_calculate&lt;/tt&gt;. This location is on the linear stack, which
is maintained using the &lt;tt class="docutils literal"&gt;$__stack_pointer&lt;/tt&gt; global. Here's the compiled WASM
for &lt;tt class="docutils literal"&gt;do_work&lt;/tt&gt; (I also gave its local variable a meaningful name, for
readability):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="nv"&gt;$do_work&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;param&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;result&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;local&lt;/span&gt; &lt;span class="nv"&gt;$sp&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nb"&gt;global.get&lt;/span&gt; &lt;span class="nv"&gt;$__stack_pointer&lt;/span&gt;
  &lt;span class="nb"&gt;i32.const&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;
  &lt;span class="nb"&gt;i32.sub&lt;/span&gt;
  &lt;span class="nb"&gt;local.tee&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;                   &lt;span class="c1"&gt;;; sp &amp;lt;- __stack_pointer - 16&lt;/span&gt;
  &lt;span class="nb"&gt;global.set&lt;/span&gt; &lt;span class="nv"&gt;$__stack_pointer&lt;/span&gt;   &lt;span class="c1"&gt;;; update __stack_pointer as well&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
  &lt;span class="nb"&gt;i32.store&lt;/span&gt; &lt;span class="k"&gt;offset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;           &lt;span class="c1"&gt;;; mem[sp+12] = y&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
  &lt;span class="nb"&gt;i32.store&lt;/span&gt; &lt;span class="k"&gt;offset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;            &lt;span class="c1"&gt;;; mem[sp+8] = x&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;                   &lt;span class="c1"&gt;;; [ sp ]&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;                   &lt;span class="c1"&gt;;; [ sp  sp ]&lt;/span&gt;

  &lt;span class="c1"&gt;;; Do a 64-bit load from mem[sp+8], this loads the entire pair into&lt;/span&gt;
  &lt;span class="c1"&gt;;; a single i64.&lt;/span&gt;
  &lt;span class="nb"&gt;i64.load&lt;/span&gt; &lt;span class="k"&gt;offset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt; &lt;span class="k"&gt;align&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;
  &lt;span class="nb"&gt;i64.store&lt;/span&gt;                     &lt;span class="c1"&gt;;; mem[sp] = pair-as-i64&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;                   &lt;span class="c1"&gt;;; [ sp ]&lt;/span&gt;
  &lt;span class="nb"&gt;call&lt;/span&gt; &lt;span class="nv"&gt;$pair_calculate&lt;/span&gt;          &lt;span class="c1"&gt;;; call pair_calculate, passing it sp&lt;/span&gt;
  &lt;span class="nb"&gt;local.set&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;                   &lt;span class="c1"&gt;;; y &amp;lt;- result (local 1 is reused)&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;                   &lt;span class="c1"&gt;;; [ sp ]&lt;/span&gt;
  &lt;span class="nb"&gt;i32.const&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;
  &lt;span class="nb"&gt;i32.add&lt;/span&gt;                       &lt;span class="c1"&gt;;; [ sp+16 ]&lt;/span&gt;
  &lt;span class="nb"&gt;global.set&lt;/span&gt; &lt;span class="nv"&gt;$__stack_pointer&lt;/span&gt;   &lt;span class="c1"&gt;;; __stack_pointer back to its original value&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;                   &lt;span class="c1"&gt;;; [ result ] for return&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Some notes about this code:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;There are two instances of the pair &lt;tt class="docutils literal"&gt;pp&lt;/tt&gt; in linear memory prior to the call
to &lt;tt class="docutils literal"&gt;pair_calculate&lt;/tt&gt;: the original one from the initialization statement
(at offset 8), and a copy created for passing into &lt;tt class="docutils literal"&gt;pair_calculate&lt;/tt&gt; (at
offset 0). Theoretically, as &lt;tt class="docutils literal"&gt;pp&lt;/tt&gt; is unused after the call, the
compiler could do better here and keep only a single copy.&lt;/li&gt;
&lt;li&gt;The stack pointer is decremented by 16, and restored at the end of the
function.&lt;/li&gt;
&lt;li&gt;The first few instructions - where the stack pointer is adjusted - are
usually called the &lt;em&gt;prologue&lt;/em&gt; of the function. In the same vein, the last
few instructions where the stack pointer is reset back to where it was at
the entry are called the &lt;em&gt;epilogue&lt;/em&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Before &lt;tt class="docutils literal"&gt;pair_calculate&lt;/tt&gt; is called, the linear stack looks like this:&lt;/p&gt;
&lt;img alt="WASM C ABI linear stack view at entry to pair_calculate" class="align-center" src="https://eli.thegreenplace.net/images/2025/wasm-c-abi-passing-pair.png" /&gt;
&lt;p&gt;Following the ABI, the code emitted for &lt;tt class="docutils literal"&gt;pair_calculate&lt;/tt&gt; takes &lt;tt class="docutils literal"&gt;Pair*&lt;/tt&gt;
(by reference, instead of by value as in the original C code):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="nv"&gt;$pair_calculate&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;param&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;result&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;           &lt;span class="c1"&gt;;; [ addr ]&lt;/span&gt;

  &lt;span class="c1"&gt;;; Recall that what&amp;#39;s stored at mem[addr] is a pair, with &amp;#39;x&amp;#39;, followed by &amp;#39;y&amp;#39;.&lt;/span&gt;
  &lt;span class="nb"&gt;i32.load&lt;/span&gt; &lt;span class="k"&gt;offset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;     &lt;span class="c1"&gt;;; [ pair.y ]&lt;/span&gt;
  &lt;span class="nb"&gt;i32.const&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
  &lt;span class="nb"&gt;i32.mul&lt;/span&gt;               &lt;span class="c1"&gt;;; [ 3*pair.y ]&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
  &lt;span class="nb"&gt;i32.load&lt;/span&gt;              &lt;span class="c1"&gt;;; [ 3*pair.y  pair.x ]&lt;/span&gt;
  &lt;span class="nb"&gt;i32.const&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;
  &lt;span class="nb"&gt;i32.mul&lt;/span&gt;               &lt;span class="c1"&gt;;; [ 3*pair.y  7*pair.x ]&lt;/span&gt;
  &lt;span class="nb"&gt;i32.add&lt;/span&gt;               &lt;span class="c1"&gt;;; [ 3*pair.y+7*pair.x ]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
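&lt;p&gt;Putting caller and callee together, the whole exchange can be sketched in C
against a simulated linear memory. The 4096-byte &lt;tt class="docutils literal"&gt;mem&lt;/tt&gt; array, the initial
value of &lt;tt class="docutils literal"&gt;stack_pointer&lt;/tt&gt; and the helper names are assumptions made for this
illustration, and the copy of the pair is done as two 32-bit moves rather than
the single 64-bit load and store the compiler emitted:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;#include &amp;lt;stdint.h&amp;gt;

uint8_t mem[4096];              /* hypothetical linear memory              */
uint32_t stack_pointer = 4096;  /* models $__stack_pointer; grows downward */

void store32(uint32_t addr, uint32_t value) {
    for (int i = 0; i &amp;lt; 4; i++) {
        mem[addr + i] = (uint8_t)(value &amp;gt;&amp;gt; (8 * i));
    }
}

uint32_t load32(uint32_t addr) {
    uint32_t value = 0;
    for (int i = 0; i &amp;lt; 4; i++) {
        value |= (uint32_t)mem[addr + i] &amp;lt;&amp;lt; (8 * i);
    }
    return value;
}

/* Callee: receives the address of a pair, x at offset 0 and y at offset 4. */
uint32_t pair_calculate(uint32_t addr) {
    return 7 * load32(addr) + 3 * load32(addr + 4);
}

uint32_t do_work(uint32_t x, uint32_t y) {
    uint32_t sp = stack_pointer - 16;  /* prologue: reserve a 16-byte frame */
    stack_pointer = sp;

    store32(sp + 12, y);               /* pp.y                              */
    store32(sp + 8, x);                /* pp.x                              */
    store32(sp, load32(sp + 8));       /* copy of pp for the callee         */
    store32(sp + 4, load32(sp + 12));

    uint32_t result = pair_calculate(sp);

    stack_pointer = sp + 16;           /* epilogue: release the frame       */
    return result;
}
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;With these definitions, &lt;tt class="docutils literal"&gt;do_work(1, 2)&lt;/tt&gt; evaluates to 7*1 + 3*2 = 13, and
&lt;tt class="docutils literal"&gt;stack_pointer&lt;/tt&gt; is back at its starting value afterwards.&lt;/p&gt;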
&lt;p&gt;Each function that needs linear stack space is responsible for adjusting the
stack pointer and restoring it to its original place at the end. This naturally
enables nested function calls; suppose we have some function &lt;tt class="docutils literal"&gt;a&lt;/tt&gt; calling
function &lt;tt class="docutils literal"&gt;b&lt;/tt&gt; which, in turn, calls function &lt;tt class="docutils literal"&gt;c&lt;/tt&gt;, and let's assume all of
these need to allocate space on the linear stack. This is how the linear
stack looks after &lt;tt class="docutils literal"&gt;c&lt;/tt&gt;'s prologue:&lt;/p&gt;
&lt;img alt="WASM C ABI linear stack showing nested frames" class="align-center" src="https://eli.thegreenplace.net/images/2025/wasm-c-abi-nested-frames.png" /&gt;
&lt;p&gt;Since each function knows how much stack space it has allocated, it's able to
properly restore &lt;tt class="docutils literal"&gt;$__stack_pointer&lt;/tt&gt; to the bottom of its caller's frame
before returning.&lt;/p&gt;
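&lt;p&gt;The nesting discipline can be sketched in a few lines of C. The frame sizes
(48, 32 and 16 bytes), the starting address and the &lt;tt class="docutils literal"&gt;deepest&lt;/tt&gt; bookkeeping
variable are all made up for this illustration:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;#include &amp;lt;stdint.h&amp;gt;

uint32_t stack_pointer = 65536;  /* models $__stack_pointer           */
uint32_t deepest = 65536;        /* lowest address seen, for checking */

void c(void) {
    stack_pointer -= 16;                                    /* prologue */
    if (stack_pointer &amp;lt; deepest) deepest = stack_pointer;
    stack_pointer += 16;                                    /* epilogue */
}

void b(void) {
    stack_pointer -= 32;
    c();
    stack_pointer += 32;
}

void a(void) {
    stack_pointer -= 48;
    b();
    stack_pointer += 48;
}
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;After &lt;tt class="docutils literal"&gt;a()&lt;/tt&gt; returns, &lt;tt class="docutils literal"&gt;stack_pointer&lt;/tt&gt; is 65536 again, while &lt;tt class="docutils literal"&gt;deepest&lt;/tt&gt;
records 65536 - 96 = 65440, the bottom of &lt;tt class="docutils literal"&gt;c&lt;/tt&gt;'s frame.&lt;/p&gt;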
&lt;/div&gt;
&lt;div class="section" id="returning-values-through-linear-memory"&gt;
&lt;h2&gt;Returning values through linear memory&lt;/h2&gt;
&lt;p&gt;What about returning values of aggregate types? According to the ABI, these
are also handled indirectly; a pointer parameter is &lt;em&gt;prepended&lt;/em&gt; to the
parameter list of the function. The function writes its return value into
this address.&lt;/p&gt;
&lt;p&gt;The following function:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;Pair&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;make_pair&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;Pair&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;pp&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;};&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;pp&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Is compiled to:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;;; Note that the WASM function has three parameters and no return values.&lt;/span&gt;
&lt;span class="c1"&gt;;; The first parameter (local 0) is the address where it should store its&lt;/span&gt;
&lt;span class="c1"&gt;;; return value.&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="nv"&gt;$make_pair&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;param&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
  &lt;span class="nb"&gt;i32.store&lt;/span&gt; &lt;span class="k"&gt;offset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;      &lt;span class="c1"&gt;;; ret_addr[4] = y&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
  &lt;span class="nb"&gt;i32.store&lt;/span&gt;               &lt;span class="c1"&gt;;; ret_addr[0] = x&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here's a function that calls it:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;do_work&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;unsigned&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;Pair&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;pp&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;make_pair&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;pp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;pp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And the corresponding WASM:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="nv"&gt;$do_work&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;param&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;result&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="c1"&gt;;; local 2 to hold the address for make_pair&amp;#39;s return value.&lt;/span&gt;
  &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;local&lt;/span&gt; &lt;span class="kt"&gt;i32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nb"&gt;global.get&lt;/span&gt; &lt;span class="nv"&gt;$__stack_pointer&lt;/span&gt;   &lt;span class="c1"&gt;;; sp &amp;lt;- __stack_pointer - 16&lt;/span&gt;
  &lt;span class="nb"&gt;i32.const&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;
  &lt;span class="nb"&gt;i32.sub&lt;/span&gt;
  &lt;span class="nb"&gt;local.tee&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;                   &lt;span class="c1"&gt;;; save sp in local 2&lt;/span&gt;
  &lt;span class="nb"&gt;global.set&lt;/span&gt; &lt;span class="nv"&gt;$__stack_pointer&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
  &lt;span class="nb"&gt;i32.const&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;
  &lt;span class="nb"&gt;i32.add&lt;/span&gt;                       &lt;span class="c1"&gt;;; [ sp+8 ]&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;                   &lt;span class="c1"&gt;;; [ sp+8  x ]&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;                   &lt;span class="c1"&gt;;; [ sp+8  x  y ]&lt;/span&gt;

  &lt;span class="c1"&gt;;; make_pair is called with three parameters: the address for where to&lt;/span&gt;
  &lt;span class="c1"&gt;;; store its return pair, x and y.&lt;/span&gt;
  &lt;span class="nb"&gt;call&lt;/span&gt; &lt;span class="nv"&gt;$make_pair&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
  &lt;span class="nb"&gt;i32.load&lt;/span&gt; &lt;span class="k"&gt;offset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;
  &lt;span class="nb"&gt;local.set&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;                   &lt;span class="c1"&gt;;; local 1 = pp.x&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
  &lt;span class="nb"&gt;i32.load&lt;/span&gt; &lt;span class="k"&gt;offset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;
  &lt;span class="nb"&gt;local.set&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;                   &lt;span class="c1"&gt;;; local 0 = pp.y&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
  &lt;span class="nb"&gt;i32.const&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;
  &lt;span class="nb"&gt;i32.add&lt;/span&gt;
  &lt;span class="nb"&gt;global.set&lt;/span&gt; &lt;span class="nv"&gt;$__stack_pointer&lt;/span&gt;   &lt;span class="c1"&gt;;; __stack_pointer back to its original value&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
  &lt;span class="nb"&gt;i32.const&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;
  &lt;span class="nb"&gt;i32.mul&lt;/span&gt;                       &lt;span class="c1"&gt;;; [ 3*pp.y ]&lt;/span&gt;
  &lt;span class="nb"&gt;local.get&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
  &lt;span class="nb"&gt;i32.const&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;
  &lt;span class="nb"&gt;i32.mul&lt;/span&gt;                       &lt;span class="c1"&gt;;; [ 3*pp.y  7*pp.x ]&lt;/span&gt;
  &lt;span class="nb"&gt;i32.add&lt;/span&gt;                       &lt;span class="c1"&gt;;; [ 3*pp.y+7*pp.x ]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Note that this function only uses 8 bytes of its stack frame, but allocates 16;
this is because the ABI dictates 16-byte alignment for the stack pointer.&lt;/p&gt;
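&lt;p&gt;To make the convention concrete, here is a minimal Go sketch that models it. It assumes a byte slice standing in for linear memory and a package-level variable standing in for &lt;tt class="docutils literal"&gt;$__stack_pointer&lt;/tt&gt;; the names &lt;tt class="docutils literal"&gt;mem&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;sp&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;alignUp&lt;/tt&gt; and the 64-byte memory size are made up for illustration. This is a model of the generated code, not something the compiler emits.&lt;/p&gt;

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// mem is a stand-in for WASM linear memory, and sp for the $__stack_pointer
// global. Both the size and the initial value are arbitrary assumptions.
var (
	mem = make([]byte, 64)
	sp  = uint32(64) // the stack grows down from the end of mem
)

// alignUp rounds n up to the next multiple of align.
func alignUp(n, align uint32) uint32 {
	return (n + align - 1) / align * align
}

// makePair models $make_pair: no result, and a hidden first parameter
// holding the address where the struct should be written.
func makePair(ret, x, y uint32) {
	binary.LittleEndian.PutUint32(mem[ret+4:], y) // ret_addr[4] = y
	binary.LittleEndian.PutUint32(mem[ret:], x)   // ret_addr[0] = x
}

// doWork models $do_work: reserve an aligned frame, pass a slot inside it
// to makePair, read the fields back and release the frame.
func doWork(x, y uint32) uint32 {
	frame := alignUp(8, 16) // Pair needs 8 bytes; the frame is rounded to 16
	sp -= frame
	base := sp
	makePair(base+8, x, y)
	px := binary.LittleEndian.Uint32(mem[base+8:])
	py := binary.LittleEndian.Uint32(mem[base+12:])
	sp = base + frame // restore the stack pointer
	return 7*px + 3*py
}

func main() {
	fmt.Println(doWork(2, 5)) // 7*2 + 3*5 = 29
}
```

&lt;p&gt;The caller owns the storage for the returned struct, which is why &lt;tt class="docutils literal"&gt;makePair&lt;/tt&gt; itself never touches the stack pointer.&lt;/p&gt;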
&lt;/div&gt;
&lt;div class="section" id="advanced-topics"&gt;
&lt;h2&gt;Advanced topics&lt;/h2&gt;
&lt;p&gt;There are some advanced topics mentioned in the ABI that these notes don't cover
(at least for now), but I'll mention them here for completeness:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&amp;quot;Red zone&amp;quot; - leaf functions have access to 128 bytes of &lt;a class="reference external" href="https://eli.thegreenplace.net/2011/09/06/stack-frame-layout-on-x86-64"&gt;red zone&lt;/a&gt;
below the
stack pointer. I found this difficult to observe in practice &lt;a class="footnote-reference" href="#footnote-2" id="footnote-reference-2"&gt;[2]&lt;/a&gt;.
Since we don't issue system calls directly
in WASM, it's tricky to conjure a realistic leaf function that requires
the linear stack (instead of just using WASM locals).&lt;/li&gt;
&lt;li&gt;A separate frame pointer (global value) to be used for functions that require
dynamic stack allocation (such as using &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Variable-length_array"&gt;C's VLAs&lt;/a&gt;).&lt;/li&gt;
&lt;li&gt;A separate base pointer to be used for functions that require
alignment &amp;gt; 16 bytes on the stack.&lt;/li&gt;
&lt;/ul&gt;
&lt;hr class="docutils" /&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-1" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-1"&gt;[1]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;&lt;p class="first"&gt;This is similar &lt;a class="reference external" href="https://eli.thegreenplace.net/2011/02/04/where-the-top-of-the-stack-is-on-x86/"&gt;to x86&lt;/a&gt;.
For the WASM C ABI, a good reason is provided for the direction: WASM
load and store instructions have an &lt;em&gt;unsigned&lt;/em&gt; constant called
&lt;tt class="docutils literal"&gt;offset&lt;/tt&gt; that can be used to add a positive offset to the address
parameter without extra instructions.&lt;/p&gt;
&lt;p class="last"&gt;Since &lt;tt class="docutils literal"&gt;$__stack_pointer&lt;/tt&gt; points to the lowest address in the frame,
these offsets can be used to efficiently access any value on the stack.&lt;/p&gt;
&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-2" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-2"&gt;[2]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;If you have a nice example showing it using Clang/LLVM, please drop me
a note!&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
</content><category term="misc"></category><category term="WebAssembly"></category><category term="C &amp; C++"></category><category term="Compilation"></category></entry><entry><title>Implementing Forth in Go and C</title><link href="https://eli.thegreenplace.net/2025/implementing-forth-in-go-and-c/" rel="alternate"></link><published>2025-08-26T20:38:00-07:00</published><updated>2025-08-27T03:39:03-07:00</updated><author><name>Eli Bendersky</name></author><id>tag:eli.thegreenplace.net,2025-08-26:/2025/implementing-forth-in-go-and-c/</id><summary type="html">&lt;p&gt;I first ran into Forth about 20 years ago when reading a book about
&lt;a class="reference external" href="https://www.oreilly.com/library/view/designing-embedded-hardware/0596007558/"&gt;designing embedded hardware&lt;/a&gt;.
The reason I got the book back then was to actually learn more about the HW
aspects, so having skimmed the Forth chapter I just registered an &amp;quot;oh, this is neat&amp;quot;
mental note …&lt;/p&gt;</summary><content type="html">&lt;p&gt;I first ran into Forth about 20 years ago when reading a book about
&lt;a class="reference external" href="https://www.oreilly.com/library/view/designing-embedded-hardware/0596007558/"&gt;designing embedded hardware&lt;/a&gt;.
The reason I got the book back then was to actually learn more about the HW
aspects, so having skimmed the Forth chapter I just registered an &amp;quot;oh, this is neat&amp;quot;
mental note and moved on with my life. Over the last two decades I
heard about Forth a few more times here and there, such as that time when
&lt;a class="reference external" href="https://factorcode.org/"&gt;Factor&lt;/a&gt; was talked about for a brief period, maybe
10-12 years ago or so.&lt;/p&gt;
&lt;p&gt;It always occupied a slot in the &amp;quot;weird language&amp;quot; category inside my brain, and
I never paid it much attention. Until June this year, when a couple of factors
combined fortuitously:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;After spending much of the &lt;a class="reference external" href="https://eli.thegreenplace.net/archives/2025"&gt;earlier part of 2025&lt;/a&gt;
exploring the inner workings
of LLMs and digging in random mathy and algorithmic topics, I had an itch
to just write some code.&lt;/li&gt;
&lt;li&gt;I somehow found &lt;a class="reference external" href="https://ratfactor.com/forth/the_programming_language_that_writes_itself.html"&gt;Dave Gauer's page about Forth&lt;/a&gt;
and also the one on &lt;a class="reference external" href="https://ratfactor.com/forth/implementing"&gt;Implementing a Forth&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;And something clicked. I'm going to implement a Forth, because... why not?&lt;/p&gt;
&lt;p&gt;So I spent much of my free hacking time over the past two months learning
about Forth and implementing &lt;em&gt;two&lt;/em&gt; of them.&lt;/p&gt;
&lt;div class="section" id="forth-the-user-level-and-the-hacker-level"&gt;
&lt;h2&gt;Forth: the user level and the hacker level&lt;/h2&gt;
&lt;p&gt;It's useful to think of Forth (at least &lt;a class="reference external" href="https://forth-standard.org/"&gt;standard Forth&lt;/a&gt;,
not offshoots like Factor) as having two different &amp;quot;levels&amp;quot;:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;&lt;strong&gt;User&lt;/strong&gt; level: you just want to use the language to write programs. Maybe
you're indeed bringing up new hardware, and find Forth a useful
calculator + REPL + script language. You don't care about Forth's
implementation or its soul, you just want to complete your task.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hacker&lt;/strong&gt; level: you're interested in the deeper soul of Forth. Isn't it
amazing that even control flow constructs like &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;IF...THEN&lt;/span&gt;&lt;/tt&gt; or loops like
&lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;BEGIN...UNTIL&lt;/span&gt;&lt;/tt&gt; are just Forth words, and if you wanted, you could implement
your own control flow constructs and have them be first-class citizens, as
seamless and efficient as the standard ones?&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Another way to look at it (useful if you belong to a certain crowd) is that
user-level Forth is like Lisp without macros, and hacker-level Forth has macros
enabled. Lisp can still be great and useful without macros, but macros take
it to an entirely new level and also unlock the deeper soul of the language.&lt;/p&gt;
&lt;p&gt;This distinction will be important when discussing my Forth implementations
below.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="goforth-and-ctil"&gt;
&lt;h2&gt;goforth and ctil&lt;/h2&gt;
&lt;img alt="Logo of goforth" class="align-center" src="https://eli.thegreenplace.net/images/pages/goforth-logo-sm.png" /&gt;
&lt;p&gt;There's a certain way Forth is supposed to be implemented; this is how it was
originally designed, and if you get closer to the hacker level, it
becomes apparent that you're pretty much required to implement it this way -
otherwise supporting all of the language's standard words will be very
difficult. I'm talking about the classical approach of a linked dictionary,
where a word is represented as a &amp;quot;threaded&amp;quot; list &lt;a class="footnote-reference" href="#footnote-1" id="footnote-reference-1"&gt;[1]&lt;/a&gt;, and this dictionary is
available for user code to augment and modify. Thus, much of the Forth
implementation can be written in Forth itself.&lt;/p&gt;
&lt;p&gt;The first implementation I tried is stubbornly different. Can we just make a
pure interpreter? This is what &lt;a class="reference external" href="https://github.com/eliben/goforth"&gt;goforth&lt;/a&gt;
is trying to explore (the Go implementation located in the root directory of
that repository). Many built-in words are supported - definitely enough to
write useful programs - and compilation
(the definition of new Forth words using &lt;tt class="docutils literal"&gt;: word ... ;&lt;/tt&gt;) is implemented by
storing the actual string following the word name in the dictionary, so it can
be interpreted when the word is invoked.&lt;/p&gt;
&lt;p&gt;This was an interesting approach and in some sense, it &amp;quot;works&amp;quot;. For the user
level of Forth, this is perfectly usable (albeit slow). However, it's
insufficient for the hacker level, because the host language interpreter (the
one in Go) has all the control, so it's impossible to implement &lt;tt class="docutils literal"&gt;&lt;span class="pre"&gt;IF...THEN&lt;/span&gt;&lt;/tt&gt; in
Forth, for example (it has to be implemented in the host language).&lt;/p&gt;
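&lt;p&gt;The string-storing approach described above can be sketched in a few dozen lines of Go. This is a toy written for this explanation, not goforth's actual code: the &lt;tt class="docutils literal"&gt;Interp&lt;/tt&gt; type, its methods and the tiny set of built-in words are all assumptions chosen to keep the example short.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"slices"
	"strconv"
	"strings"
)

// Interp is a toy Forth-like interpreter. User-defined words are stored as
// the source text of their bodies, not as compiled code.
type Interp struct {
	stack []int
	defs  map[string]string // maps a word name to its body text
}

func NewInterp() *Interp {
	in := new(Interp)
	in.defs = make(map[string]string)
	return in
}

func (in *Interp) push(v int) { in.stack = append(in.stack, v) }

// pop has no underflow check, to keep the sketch short.
func (in *Interp) pop() int {
	v := in.stack[len(in.stack)-1]
	in.stack = in.stack[:len(in.stack)-1]
	return v
}

// Eval interprets src one whitespace-separated token at a time.
func (in *Interp) Eval(src string) error {
	toks := strings.Fields(src)
	for len(toks) != 0 {
		tok := toks[0]
		toks = toks[1:]
		switch tok {
		case ":":
			// ": name body ;" stores the body text under name.
			end := slices.Index(toks, ";")
			if end == -1 || end == 0 {
				return fmt.Errorf("malformed definition")
			}
			in.defs[toks[0]] = strings.Join(toks[1:end], " ")
			toks = toks[end+1:]
		case "+":
			b, a := in.pop(), in.pop()
			in.push(a + b)
		case "*":
			b, a := in.pop(), in.pop()
			in.push(a * b)
		case "dup":
			v := in.pop()
			in.push(v)
			in.push(v)
		default:
			if body, ok := in.defs[tok]; ok {
				// Invoking a user word re-interprets its stored text.
				if err := in.Eval(body); err != nil {
					return err
				}
				continue
			}
			n, err := strconv.Atoi(tok)
			if err != nil {
				return fmt.Errorf("unknown word %q", tok)
			}
			in.push(n)
		}
	}
	return nil
}

func main() {
	in := NewInterp()
	if err := in.Eval(": square dup * ; 3 square 4 +"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(in.stack) // [13]
}
```

&lt;p&gt;Every built-in lives in the host-language &lt;tt class="docutils literal"&gt;switch&lt;/tt&gt;, and a user-defined word can only replay text through it. There is no compiled representation for Forth code to manipulate, which is the limitation discussed above.&lt;/p&gt;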
&lt;p&gt;That was a fun way to get a deeper sense of what Forth is about, but I did want
to implement the hacker level as well, so the second implementation -
&lt;a class="reference external" href="https://github.com/eliben/goforth/tree/main/ctil"&gt;ctil&lt;/a&gt; - does just that.
It's inspired by the &lt;a class="reference external" href="http://git.annexia.org/?p=jonesforth.git"&gt;jonesforth&lt;/a&gt;
assembly implementation, but done in C instead &lt;a class="footnote-reference" href="#footnote-2" id="footnote-reference-2"&gt;[2]&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;ctil actually lets us implement major parts of Forth in Forth itself. For
example, &lt;tt class="docutils literal"&gt;variable&lt;/tt&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;variable&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;create&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;cells&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;allot&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;;&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Conditionals:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;\ IF, ELSE, THEN work together to compile to lower-level branches.&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="c1"&gt;\&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="c1"&gt;\ IF ... THEN compiles to:&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="c1"&gt;\   0BRANCH OFFSET true-part rest&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="c1"&gt;\ where OFFSET is the offset of rest&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="c1"&gt;\&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="c1"&gt;\ IF ... ELSE ... THEN compiles to:&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="c1"&gt;\   0BRANCH OFFSET true-part BRANCH OFFSET2 false-part rest&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="c1"&gt;\ where OFFSET is the offset of false-part and OFFSET2 is the offset of rest&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="kn"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;if&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;immediate&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nf"&gt;&amp;#39;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="nf"&gt;branch&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;,&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;here&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;,&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;;&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;

&lt;span class="kn"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;then&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;immediate&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;dup&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;here&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;swap&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;swap&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;!&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;;&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;

&lt;span class="kn"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;else&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;immediate&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nf"&gt;&amp;#39;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;branch&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;,&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;here&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;,&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;swap&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;dup&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;here&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;swap&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;-&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;swap&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;!&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;;&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;These are actual examples of ctil's &amp;quot;prelude&amp;quot; - a Forth file loaded before any
user code. If you understand Forth, this code is actually rather mind-blowing.
We compile &lt;tt class="docutils literal"&gt;IF&lt;/tt&gt; and the other words by directly laying out their low-level
representation in memory, and different words communicate with each other
using the data stack &lt;em&gt;during compilation&lt;/em&gt;.&lt;/p&gt;
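&lt;p&gt;For readers less fluent in Forth, the same backpatching idea can be sketched in Go. The opcodes, the &lt;tt class="docutils literal"&gt;compiler&lt;/tt&gt; type and the &lt;tt class="docutils literal"&gt;run&lt;/tt&gt; loop below are hypothetical and much simpler than ctil's real cell layout; the only thing carried over from the prelude is the scheme itself, where offsets are relative to the cell that holds them and the placeholder address travels on a stack during compilation.&lt;/p&gt;

```go
package main

import "fmt"

// Opcodes of a tiny hypothetical VM.
const (
	opLit     = iota // push the next cell
	opZBranch        // pop a flag; if it is zero, jump by the next cell
	opBranch         // jump unconditionally by the next cell
	opHalt           // stop and return the data stack
)

type compiler struct {
	code  []int // compiled cells; "here" is len(code)
	stack []int // data stack used at compile time
}

func (c *compiler) emit(cells ...int) { c.code = append(c.code, cells...) }

// compileIf lays down 0BRANCH plus a placeholder offset and leaves the
// placeholder's address on the compile-time stack.
func (c *compiler) compileIf() {
	c.emit(opZBranch)
	c.stack = append(c.stack, len(c.code))
	c.emit(0)
}

// compileThen pops a placeholder address and patches in (here - address).
func (c *compiler) compileThen() {
	addr := c.stack[len(c.stack)-1]
	c.stack = c.stack[:len(c.stack)-1]
	c.code[addr] = len(c.code) - addr
}

// compileElse lays down BRANCH plus its own placeholder, patches the pending
// IF placeholder to land on the false part, and leaves the new placeholder
// for THEN to patch.
func (c *compiler) compileElse() {
	c.emit(opBranch)
	elseAddr := len(c.code)
	c.emit(0)
	c.compileThen()
	c.stack = append(c.stack, elseAddr)
}

// run executes code starting with the given data stack.
func run(code []int, stack []int) []int {
	pc := 0
	for {
		op := code[pc]
		pc++
		switch op {
		case opLit:
			stack = append(stack, code[pc])
			pc++
		case opZBranch:
			flag := stack[len(stack)-1]
			stack = stack[:len(stack)-1]
			if flag == 0 {
				pc += code[pc]
			} else {
				pc++
			}
		case opBranch:
			pc += code[pc]
		case opHalt:
			return stack
		}
	}
}

func main() {
	c := new(compiler)
	// Compile the equivalent of: IF 10 ELSE 20 THEN
	c.compileIf()
	c.emit(opLit, 10)
	c.compileElse()
	c.emit(opLit, 20)
	c.compileThen()
	c.emit(opHalt)

	fmt.Println(run(c.code, []int{1})) // flag is true:  [10]
	fmt.Println(run(c.code, []int{0})) // flag is false: [20]
}
```

&lt;p&gt;In the Forth version there is no separate &lt;tt class="docutils literal"&gt;compiler&lt;/tt&gt; object: the ordinary data stack plays the role of &lt;tt class="docutils literal"&gt;c.stack&lt;/tt&gt; while a definition is being compiled.&lt;/p&gt;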
&lt;/div&gt;
&lt;div class="section" id="thoughts-on-forth-itself"&gt;
&lt;h2&gt;Thoughts on Forth itself&lt;/h2&gt;
&lt;p&gt;Forth made perfect sense in the historic context in which it was created in
the early 1970s. Imagine having some HW connected to your computer (a telescope
in the case of Forth's creator), and you have to interact with it. In terms
of languages at your disposal - you don't have much; BASIC was only a few years old
and far from ubiquitous. Perhaps your machine still didn't have a C compiler ported to it; C
compilers aren't simple, and C isn't very great for exploratory scripting
anyway. So you mostly just have your assembly language and whatever you build
on top.&lt;/p&gt;
&lt;p&gt;Forth is easy to implement in assembly and it gives you a much higher-level
language; you can use it as a calculator, as a REPL, and as a DSL for pretty
much anything due to its composable nature.&lt;/p&gt;
&lt;p&gt;Forth certainly has interesting aspects; it's a &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Concatenative_programming_language"&gt;concatenative language&lt;/a&gt;,
and thus inherently &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Tacit_programming"&gt;point-free&lt;/a&gt;.
A classical example is that instead of writing the following in a more
traditional syntax:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;eat(bake(prove(mix(ingredients))))
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;You just write this:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;ingredients mix prove bake eat
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;There is no need to explicitly pass parameters, or to explicitly return results.
Everything happens implicitly on the stack.&lt;/p&gt;
&lt;p&gt;This is useful for REPL-style programming where you use your language not
necessarily for writing large programs, but more for interactive instructions to
various HW devices. This dearth of syntax is also what makes Forth simple
to implement.&lt;/p&gt;
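&lt;p&gt;One way to see why juxtaposition is enough is to model it as function composition. The Go sketch below is only an analogy under a simplifying assumption: each hypothetical &lt;tt class="docutils literal"&gt;word&lt;/tt&gt; consumes exactly one value and leaves exactly one, whereas real Forth words can consume and produce any number of stack items.&lt;/p&gt;

```go
package main

import "fmt"

// word is a hypothetical stand-in for a Forth word that consumes one value
// and leaves one value.
type word func(string) string

// step builds a word that wraps its input in its own name, so the final
// string shows the order in which the words ran.
func step(name string) word {
	return func(s string) string { return name + "(" + s + ")" }
}

// concat runs words left to right, which is what writing words next to each
// other means in a concatenative language.
func concat(words ...word) word {
	return func(s string) string {
		for _, w := range words {
			s = w(s)
		}
		return s
	}
}

func main() {
	program := concat(step("mix"), step("prove"), step("bake"), step("eat"))
	fmt.Println(program("ingredients")) // eat(bake(prove(mix(ingredients))))
}
```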
&lt;p&gt;All that said, in my mind Forth is firmly in the &amp;quot;weird language&amp;quot; category;
it's instructive to learn and to implement, but I wouldn't actually use it
for anything real these days. The stack-based programming model is cool for
very terse point-free programs, but it's not particularly readable, and it's hard
to reason about without extensive comments, in my experience.&lt;/p&gt;
&lt;p&gt;Consider the implementation of a pretty standard Forth word: &lt;tt class="docutils literal"&gt;+!&lt;/tt&gt;. It expects
an address at the top of the stack, and an addend below it. It adds the addend to
the value stored at that address. Here's a Forth implementation from
ctil's prelude:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;+!&lt;/span&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="c1"&gt;( addend addr -- )&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;tuck&lt;/span&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="c1"&gt;( addr addend addr )&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;@&lt;/span&gt;&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="c1"&gt;( addr addend value-at-addr )&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;+&lt;/span&gt;&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="c1"&gt;( addr updated-value )&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;swap&lt;/span&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="c1"&gt;( updated-value addr )&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;!&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;;&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Look at that stack wrangling! It's really hard to follow what goes where without
the detailed comments showing the stack layout on the right of each instruction
(a common practice for Forth programs). Sure, we can create additional words
that would make this simpler, but that just increases the lexicon of words to
know.&lt;/p&gt;
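&lt;p&gt;To check the stack comments mechanically, here is a small Go model that runs the same five words and prints the stack after each one. The &lt;tt class="docutils literal"&gt;machine&lt;/tt&gt; type, the map used as memory and the address 100 are illustrative assumptions, unrelated to ctil's internals.&lt;/p&gt;

```go
package main

import "fmt"

// machine is a toy model with a data stack and cell-addressed memory.
type machine struct {
	stack []int
	mem   map[int]int
}

func (m *machine) push(v int) { m.stack = append(m.stack, v) }

// pop has no underflow check, to keep the sketch short.
func (m *machine) pop() int {
	v := m.stack[len(m.stack)-1]
	m.stack = m.stack[:len(m.stack)-1]
	return v
}

func (m *machine) tuck()  { b, a := m.pop(), m.pop(); m.push(b); m.push(a); m.push(b) }
func (m *machine) fetch() { m.push(m.mem[m.pop()]) }
func (m *machine) add()   { b, a := m.pop(), m.pop(); m.push(a + b) }
func (m *machine) swap()  { b, a := m.pop(), m.pop(); m.push(b); m.push(a) }
func (m *machine) store() { addr, v := m.pop(), m.pop(); m.mem[addr] = v }

// plusStore runs the same five words as the +! definition, printing the
// stack after each one.
func (m *machine) plusStore() {
	steps := []struct {
		name string
		op   func()
	}{
		{"tuck", m.tuck}, {"@", m.fetch}, {"+", m.add}, {"swap", m.swap}, {"!", m.store},
	}
	for _, s := range steps {
		s.op()
		fmt.Printf("%-5s %v\n", s.name, m.stack)
	}
}

func main() {
	m := new(machine)
	m.mem = map[int]int{100: 40}
	m.push(2)   // addend
	m.push(100) // addr
	m.plusStore()
	fmt.Println(m.mem[100]) // 42
}
```

&lt;p&gt;Each printed line corresponds to one stack comment in the Forth definition, with the top of the stack on the right.&lt;/p&gt;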
&lt;p&gt;My point is, there's fundamental difficulty here. When you see this C code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;bar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Even without any documentation, you can immediately know several important
things:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;tt class="docutils literal"&gt;bar&lt;/tt&gt; has one parameter and one return value&lt;/li&gt;
&lt;li&gt;&lt;tt class="docutils literal"&gt;foo&lt;/tt&gt; has two parameters and one return value&lt;/li&gt;
&lt;li&gt;&lt;tt class="docutils literal"&gt;func&lt;/tt&gt; also has two parameters and one return value&lt;/li&gt;
&lt;li&gt;It's immediately obvious how the various values flow from one function call
to the next.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Written in Forth &lt;a class="footnote-reference" href="#footnote-3" id="footnote-reference-3"&gt;[3]&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nc"&gt;func&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;bar&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;foo&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;;&lt;/span&gt;&lt;span class="w"&gt;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;How can you know the arity of the functions without adding explicit comments?
Sure, if you have a handful of words like &lt;tt class="docutils literal"&gt;bar&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;foo&lt;/tt&gt; you know like the
back of your hand, this is easy. But imagine reading a large, unfamiliar code
base full of code like this and trying to comprehend it.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="summary-and-links"&gt;
&lt;h2&gt;Summary and links&lt;/h2&gt;
&lt;p&gt;The source code of my &lt;a class="reference external" href="https://github.com/eliben/goforth"&gt;goforth project is on GitHub&lt;/a&gt;; both
implementations are there, with a comprehensive test harness that tests both.&lt;/p&gt;
&lt;p&gt;To learn Forth itself, I found these resources very useful:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;a class="reference external" href="https://ratfactor.com/forth/the_programming_language_that_writes_itself.html"&gt;Dave Gauer's Forth page&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="reference external" href="https://www.forth.com/starting-forth/"&gt;Starting Forth&lt;/a&gt; - a free online
book / tutorial on Forth for beginners&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To learn how to implement Forth:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;a class="reference external" href="https://ratfactor.com/forth/implementing"&gt;Dave Gauer's page on Implementing a Forth&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="reference external" href="http://git.annexia.org/?p=jonesforth.git"&gt;jonesforth&lt;/a&gt; implementation&lt;/li&gt;
&lt;li&gt;&lt;a class="reference external" href="https://archive.org/details/R.G.LoeligerThreadedInterpretiveLanguagesTheirDesignAndImplementationByteBooks1981"&gt;Threaded Interpretive Languages&lt;/a&gt; - an
old but nice book that explains how Forth implementations typically work&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Implementing Forth is a great self-improvement project for a coder; there's a
pleasantly challenging hump of understanding to overcome, and you gain valuable
insights into stack machines, interpretation vs. compilation and mixing these
levels of abstraction in cool ways.&lt;/p&gt;
&lt;p&gt;Also, implementing programming languages
from scratch is fun! It's hard to beat the feeling of getting to interact with
your implementation for the first time, and then iterating on improving it
and making it more featureful. &lt;a class="reference external" href="https://www.urbandictionary.com/define.php?term=One+More+Turn+Syndrome"&gt;Just one more word&lt;/a&gt;!&lt;/p&gt;
&lt;hr class="docutils" /&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-1" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-1"&gt;[1]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;This has nothing to do with threads in the sense of concurrency.
Rather, it's thread as in sewing, where the elements of the list
are all connected to each other as if with a thread. See
&lt;a class="reference external" href="https://wiki.c2.com/?ThreadedInterpretiveLanguage"&gt;this page&lt;/a&gt; for
more details.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-2" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-2"&gt;[2]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;&lt;p class="first"&gt;Which is another deviation from the norm. Forth is really supposed to
be implemented in assembly - this is what it was designed for, and it's
very clear from its structure that it must be so in order to achieve
peak performance.&lt;/p&gt;
&lt;p&gt;But where's the fun in doing things the way they were supposed to be
done? Besides, jonesforth is already a perfectly fine Forth implementation
in assembly, so I wouldn't have learned much by just copying it.&lt;/p&gt;
&lt;p class="last"&gt;I had a lot of fun coding in C for this one; it's been a while since I
last wrote non-trivial amounts of C, and I found it very enjoyable.&lt;/p&gt;
&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;table class="docutils footnote" frame="void" id="footnote-3" rules="none"&gt;
&lt;colgroup&gt;&lt;col class="label" /&gt;&lt;col /&gt;&lt;/colgroup&gt;
&lt;tbody valign="top"&gt;
&lt;tr&gt;&lt;td class="label"&gt;&lt;a class="fn-backref" href="#footnote-reference-3"&gt;[3]&lt;/a&gt;&lt;/td&gt;&lt;td&gt;Assuming the convention that multi-parameter functions have their
parameters pushed to the stack from left to right.&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
</content><category term="misc"></category><category term="C &amp; C++"></category><category term="Compilation"></category><category term="Go"></category></entry></feed>