Blackpool is a bytecode virtual machine and programming language implementation written in Rust. It started as a Lox implementation from Robert Nystrom's excellent Crafting Interpreters, but has diverged to explore my own design preferences and serve as a platform for learning low-level language implementation techniques.
This is a complete, working implementation of:
- A bytecode compiler — single-pass compilation with a Pratt parser for operator precedence
- A stack-based virtual machine — executes bytecode with call frames, local/global variables, and control flow
- A mark-and-sweep garbage collector — tricolor abstraction with weak references for string interning
- First-class functions and closures — including upvalue capture for variables from enclosing scopes
- Classes with single inheritance — methods, initializers,
this, andsuper
The codebase is roughly 4,700 lines of Rust with 247 integration test files covering language semantics.
$ cargo test
...
test result: ok. 237 passed; 0 failed; 0 ignored; 0 measured; 0 filtered outThe most interesting part of this project (to me) is the garbage collector. Implementing GC in Rust is a delightful puzzle because Rust's ownership model actively fights you — which is exactly the point.
The collector uses a tricolor mark-and-sweep algorithm:
- White — Object hasn't been visited. At the end of collection, white objects are garbage.
- Gray — Object is reachable but we haven't traced its references yet. These live in a work queue.
- Black — Object is reachable and we've traced all its references.
The invariant is simple: black objects never point to white objects. The wavefront of gray objects advances until nothing is gray, then we sweep the remaining white objects.
Strings get special treatment. The VM maintains a hash table of all strings for deduplication (so "hello" == "hello" is a pointer comparison, not a string comparison). But if we treated this table as a GC root, strings would never be collected.
The solution is weak references — the string table can reference strings without preventing their collection. During the sweep phase, we first remove any entries pointing to unmarked strings, then free the unmarked objects. This is a real-world example of the weak reference pattern you see in production garbage collectors.
The GC implementation borrows heavily from Manuel Ceron's Loxido, who solved the Rust ownership puzzle elegantly. The core insight is using indices into a vector rather than raw pointers, with a Reference<T> type that carries a PhantomData marker for type safety without actual pointer semantics. This lets us have a single Collector that owns all heap objects while still maintaining type information at the reference site.
Closures capture variables from enclosing scopes. The tricky part is that captured variables might outlive their stack frame — if you return a closure from a function, the closure needs those variables to still exist.
The solution is upvalues. When a closure captures a variable:
- If the variable is still on the stack, the upvalue points to that stack slot
- When the variable goes out of scope, we "close" the upvalue — moving the value from the stack into the upvalue itself
This lets closures work correctly even when the enclosing function has returned. The VM maintains a list of open upvalues and closes them as stack frames are popped.
Building this project taught me:
- How bytecode VMs actually work — the fetch-decode-execute loop, stack manipulation, call frames
- Garbage collection algorithms — not just the theory, but the practical concerns around roots, weak references, and collection triggers
- Rust's ownership model from the other side — implementing a GC in a language designed to eliminate the need for GC is a great way to understand both
- Single-pass compilation — generating bytecode while parsing, without building an AST first
- Closure implementation — upvalues are a clever solution to a real problem
# Run tests
cargo test
# REPL
cargo run
# Run a script
cargo run -- script.lox$ cargo run
> print 3 + 4 * 2;
11
> fun counter() { var n = 0; fun inc() { n = n + 1; return n; } return inc; }
> var c = counter();
> print c();
1
> print c();
2- Robert Nystrom for Crafting Interpreters — the best programming book I've read
- Manuel Ceron for Loxido — his Rust implementation and blog post were invaluable for understanding how to structure the GC