Releases · stoolap/stoolap

What's New in v0.4.0

Immutable Volume Storage Engine

Stoolap now splits each table into a hot MVCC buffer (recent writes, WAL-backed) and cold frozen volumes (historical data, column-major). The query engine merges both sources transparently.

Cold volume format (STV4):

Per-column per-row-group LZ4 compression with streaming CRC32 verification
Zone maps (min/max per column per 64K row group) for scan pruning
Bloom filters for point-lookup acceleration
Dictionary encoding for text columns
Deferred column loading with tiered eviction (hot/warm/cold memory tiers)
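
To make the zone-map idea concrete, here is a minimal Python sketch of min/max pruning over row groups. It is an illustration only, not Stoolap's implementation: the names `ZoneMap`, `build_zone_maps`, and `groups_to_scan` are hypothetical, and the row-group size is shrunk from 64K to 4 so the example stays readable.

```python
from dataclasses import dataclass

ROW_GROUP_SIZE = 4  # tiny for illustration; the notes above cite 64K-row groups


@dataclass
class ZoneMap:
    min_value: int
    max_value: int


def build_zone_maps(column, group_size=ROW_GROUP_SIZE):
    """Compute one (min, max) summary per row group of a column."""
    zones = []
    for start in range(0, len(column), group_size):
        group = column[start:start + group_size]
        zones.append(ZoneMap(min(group), max(group)))
    return zones


def groups_to_scan(zones, lo, hi):
    """Return indices of row groups that may hold a value in [lo, hi]."""
    return [i for i, z in enumerate(zones)
            if z.max_value >= lo and z.min_value <= hi]


column = [1, 2, 3, 4, 10, 11, 12, 13, 20, 21, 22, 23]
zones = build_zone_maps(column)
survivors = groups_to_scan(zones, 11, 13)  # only the middle group overlaps
```

A row group is skipped whenever its [min, max] range cannot overlap the predicate range, so the scan never decompresses it.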

Lifecycle:

PRAGMA CHECKPOINT seals hot rows into immutable .vol files, persists manifests, truncates WAL
Bounded compaction: sub-target volumes merge, oversized volumes split, dirty volumes rewrite
Background compaction thread (non-blocking checkpoint cycles)
Cutoff-filtered seal and compaction during snapshot isolation transactions

Crash safety:

Fsync-before-rename on all atomic writes (volumes, manifests, catalog)
Two-phase WAL: only committed transactions applied during recovery
Manifests loaded before WAL replay for idempotent recovery
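
The fsync-before-rename pattern can be sketched in a few lines of Python. This is a generic illustration of the technique, not Stoolap's code; `atomic_write` is a hypothetical helper name.

```python
import os
import tempfile


def atomic_write(path, data: bytes):
    """Write data so readers see either the old file or the complete new one."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # contents reach disk before the rename
        os.replace(tmp_path, path)  # rename over the target in one step
    except BaseException:
        # On failure, drop the temp file and leave the old target untouched.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Because the temporary file is synced before it is renamed, a crash leaves either the previous file or the fully written replacement, never a half-written one.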

Columnar Aggregate Pushdown

Filtered and grouped aggregates are now computed directly on raw column arrays, without constructing Row objects:

Filtered aggregates: SUM/COUNT/MIN/MAX/AVG with typed predicates on i64/f64/dictionary columns
Grouped aggregates: Single-column GROUP BY with dictionary-indexed accumulators (zero hashing) or FxHashMap for numeric columns
Dictionary DISTINCT: Extracts unique values from dictionary metadata without scanning rows
Zone-map pruning: Volumes and row groups skipped when predicates prove no match
IN list pruning: WHERE id IN (...) generates min/max bounds for zone-map elimination
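
The dictionary-indexed accumulator idea can be illustrated with a short Python sketch. It assumes a simple in-memory layout (a list of unique strings plus one integer id per row); the function names are hypothetical and the real column format is not shown in these notes.

```python
def dictionary_encode(values):
    """Return (dictionary, ids): unique values plus one small integer per row."""
    dictionary, index, ids = [], {}, []
    for v in values:
        if v not in index:
            index[v] = len(dictionary)
            dictionary.append(v)
        ids.append(index[v])
    return dictionary, ids


def grouped_sum(dictionary, ids, amounts, min_amount):
    """SUM(amount) WHERE amount >= min_amount GROUP BY a dictionary column."""
    sums = [0] * len(dictionary)      # one accumulator slot per dictionary id
    seen = [False] * len(dictionary)
    for gid, amount in zip(ids, amounts):
        if amount >= min_amount:
            sums[gid] += amount       # plain array index, no hashing of the key
            seen[gid] = True
    return {dictionary[g]: sums[g] for g in range(len(dictionary)) if seen[g]}


region = ["eu", "us", "eu", "ap", "us"]
amount = [10, 20, 30, 5, 40]
dictionary, ids = dictionary_encode(region)
result = grouped_sum(dictionary, ids, amount, min_amount=10)
distinct_regions = list(dictionary)  # DISTINCT read from metadata, no row scan
```

Hashing happens once, when the column is encoded; the aggregate loop itself only indexes into arrays.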

Query Performance

ORDER BY PK + LIMIT: K-way merge across sorted volume row_ids, stopping after limit rows
MIN/MAX typed scan: Direct i64/f64/timestamp access with zone-map volume pruning
OFFSET skip: Cold scan skips row materialization for offset rows
Parallel cold scanning: Rayon-based parallel volume processing (4+ volumes, 100K+ rows threshold)
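
As a rough illustration of the ORDER BY PK + LIMIT path, the sketch below merges several already-sorted row_id lists lazily and stops once the limit is reached. It uses Python's standard `heapq.merge`; `order_by_pk_limit` is a hypothetical name, and the real engine also merges the hot buffer, which is omitted here.

```python
import heapq
from itertools import islice


def order_by_pk_limit(volumes, limit):
    """Merge sorted per-volume row_id lists and stop after `limit` rows."""
    merged = heapq.merge(*volumes)      # lazy k-way merge over sorted inputs
    return list(islice(merged, limit))  # pulls only `limit` items from the merge


volumes = [[1, 4, 9], [2, 3, 10], [5, 6, 7]]
top = order_by_pk_limit(volumes, 4)
```

Since the merge is lazy, rows beyond the limit are never pulled from the volumes.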

Foreign Key Improvements

Recursive ON UPDATE CASCADE: Cascades through the full FK chain including grandchild tables, with depth limiting (16 levels)
Referenced UNIQUE column cascade: FK cascade generalized from PK-only to any referenced UNIQUE column
Pre-check RESTRICT: RESTRICT constraints checked before writing parent rows, preserving statement-level atomicity in explicit transactions
SET NULL recursion: SET NULL arm recurses through descendants so deeper RESTRICT checks are enforced
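
A depth-limited recursive cascade might look roughly like the following Python sketch. The table graph, the function name, and the exact point at which the limit triggers are assumptions made for illustration; only the 16-level figure comes from the notes above.

```python
MAX_CASCADE_DEPTH = 16  # limit quoted in the release notes


def cascade_update(children, table, depth=0):
    """Return the tables touched by a cascading update, parent first."""
    if depth > MAX_CASCADE_DEPTH:
        raise RecursionError(f"FK cascade exceeded {MAX_CASCADE_DEPTH} levels")
    touched = [table]
    for child in children.get(table, []):
        # Recurse so grandchild tables are reached through their parent.
        touched.extend(cascade_update(children, child, depth + 1))
    return touched


fk_children = {"customers": ["orders"], "orders": ["order_items"]}
order = cascade_update(fk_children, "customers")
```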

Primary Key Update Protection

UPDATE on primary key columns is now rejected with a clear error message. The engine relies on row_id == pk_value as a core invariant across ~50 code paths, the same aliasing SQLite uses between rowid and an INTEGER PRIMARY KEY in rowid tables. Use DELETE + INSERT to change a row's primary key value.

Calendar-Aware INTERVAL Arithmetic

INTERVAL '1 month' and INTERVAL '1 year' now use proper calendar logic instead of 30-day/365-day approximations. The arithmetic handles leap years and variable month lengths, preserves nanosecond precision, and matches DATE_ADD behavior.
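
The difference from a fixed 30-day approximation is easiest to see in a small example. The Python sketch below adds calendar months and clamps the day to the length of the target month. The clamping rule is an assumption (it is the common SQL convention; these notes do not spell out Stoolap's exact rule), and Python's `datetime` carries microseconds rather than nanoseconds.

```python
import calendar
from datetime import datetime


def add_months(ts, months):
    """Add calendar months, clamping the day to the target month's length."""
    total = ts.year * 12 + (ts.month - 1) + months
    year, month0 = divmod(total, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    # The time-of-day fields are carried over unchanged by replace().
    return ts.replace(year=year, month=month, day=min(ts.day, last_day))


jan31 = datetime(2024, 1, 31, 12, 0, 0)
one_month_later = add_months(jan31, 1)                      # leap-year February
one_year_later = add_months(datetime(2024, 2, 29), 12)      # clamps to Feb 28
```

A 30-day approximation would instead land on March 1, 2024 for the first case.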

Schema Evolution Fixes

CREATE INDEX after DROP+ADD COLUMN: validate_cold_unique and populate_index_from_cold now use column mapping to correctly translate schema indices to physical volume indices
AS OF PK dedup: Resolves PK column through mapping for correct cold row dedup after schema evolution
Partition grouping: Uses snapshotted cs.mapping instead of live lookup (immune to compaction races)

ALTER TABLE MODIFY COLUMN Validation

MODIFY COLUMN ... NOT NULL now validates existing data with a streaming IS NULL scan before applying the constraint. Returns a clear error if any NULL values exist.

Data Integrity

Compaction TOCTOU fix: Snapshot sequence limit captured per-table (matches seal's per-table pattern)
Manifest truncation errors: Tombstones, column renames, and dropped columns return errors instead of silent data loss on truncated manifests
Volume corruption guards: Dictionary IDs and bytes offsets validated at deserialization, preventing panics on corrupted volume files
Aggregation pushdown correctness: Bail on partial WHERE pushdown (prevents wrong results when memory filter is needed)
Scanner column pruning: Materialize all columns when filter column indices can't be determined
Seal race fix: collect_rows_with_limit uses collect_hot_row_ids_into instead of has_row_id point lookups

Configuration

New DSN parameters for volume storage tuning:

| Parameter | Default | Description |
| --- | --- | --- |
| `checkpoint_interval` | 60 | Seconds between checkpoint cycles |
| `compact_threshold` | 4 | Sub-target volumes per table before merging |
| `target_volume_rows` | 1048576 | Target rows per cold volume (min 65536) |
| `checkpoint_on_close` | on | Seal all hot rows on clean shutdown |
| `volume_compression` | on | LZ4 compression for cold volume files |
| `sync_mode` | normal | none/off, normal, full (or 0, 1, 2) |

Invalid DSN parameter values now return errors instead of silently using defaults.
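
The strict validation can be pictured with the following Python sketch, which parses the query string of a DSN and raises on any bad value instead of falling back to a default. The `file:///...?key=value` DSN shape, the acceptance of only `on`/`off` for boolean options, and the rejection of unknown keys are assumptions for illustration; the defaults and allowed `sync_mode` spellings come from the table above.

```python
from urllib.parse import urlsplit, parse_qsl

SYNC_MODES = {"none": 0, "off": 0, "normal": 1, "full": 2, "0": 0, "1": 1, "2": 2}
INT_PARAMS = ("checkpoint_interval", "compact_threshold", "target_volume_rows")
BOOL_PARAMS = ("checkpoint_on_close", "volume_compression")
DEFAULTS = {
    "checkpoint_interval": 60,
    "compact_threshold": 4,
    "target_volume_rows": 1048576,
    "checkpoint_on_close": True,
    "volume_compression": True,
    "sync_mode": 1,
}


def parse_dsn_params(dsn):
    """Parse DSN query parameters, rejecting bad values rather than defaulting."""
    config = dict(DEFAULTS)
    for name, raw in parse_qsl(urlsplit(dsn).query, keep_blank_values=True):
        if name in INT_PARAMS:
            if not (raw.isascii() and raw.isdigit()):
                raise ValueError(f"invalid integer for {name}: {raw!r}")
            value = int(raw)
            if name == "target_volume_rows" and value < 65536:
                raise ValueError("target_volume_rows must be at least 65536")
            config[name] = value
        elif name in BOOL_PARAMS:
            if raw not in ("on", "off"):  # assumed spellings for this sketch
                raise ValueError(f"invalid value for {name}: {raw!r}")
            config[name] = raw == "on"
        elif name == "sync_mode":
            if raw not in SYNC_MODES:
                raise ValueError(f"invalid sync_mode: {raw!r}")
            config[name] = SYNC_MODES[raw]
        else:
            raise ValueError(f"unknown DSN parameter: {name}")
    return config


config = parse_dsn_params("file:///tmp/db?sync_mode=full&target_volume_rows=131072")
```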

Migration from v0.3.7

Existing v0.3.7 databases are automatically migrated on first open:

Legacy snapshot .bin files loaded into hot buffer
WAL entries replayed
Hot data sealed into immutable .vol files
snapshots/ directory removed

Legacy DSN parameter names (snapshot_interval, snapshot_compression) are accepted for backward compatibility.

Other Changes

Build timestamp embedded in version_info() output
CLI error paths ensure database cleanup before exit
Stale group cache cleared in volume scanner (prevents panic at row-group boundaries)
Eviction epoch off-by-one corrected
WASM binary rebuilt with warning-free compilation

Full Changelog: v0.3.7...v0.4.0

What's New in v0.3.7

PostgreSQL-Inspired DISTINCT ON

New DISTINCT ON (expr, ...) syntax for per-group deduplication, returning the first row for each unique combination of the specified expressions.

Hash-based dedup with O(groups) memory that correctly handles arbitrary ORDER BY patterns, including non-leading sort orders
Pipeline order: ORDER BY, DISTINCT ON, column removal, LIMIT/OFFSET
Works across all query paths: single-table scans, JOINs, CTEs, subqueries, and complex ORDER BY
Supports aliased keys, computed expressions, keys not in SELECT, qualified identifiers, and NULL equality
Guards distinct index pushdown to prevent bypassing key-based dedup

-- First (highest) order per customer
SELECT DISTINCT ON (customer) customer, amount, order_date
FROM orders
ORDER BY customer, amount DESC;

-- Per-group dedup on joins with qualified keys
SELECT DISTINCT ON (c.name) c.name, p.amount
FROM customers c JOIN purchases p ON c.id = p.customer_id
ORDER BY c.name, p.amount DESC;
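
The pipeline order described above (sort first, then keep the first row per key) can be mimicked in a few lines of Python. This is a conceptual sketch with hypothetical names, not the engine's executor.

```python
def distinct_on(rows, key, order_key):
    """Apply ORDER BY, then keep the first row seen for each DISTINCT ON key."""
    seen = set()
    out = []
    for row in sorted(rows, key=order_key):
        k = key(row)
        if k not in seen:  # one set entry per group, hence O(groups) memory
            seen.add(k)
            out.append(row)
    return out


orders = [
    {"customer": "bob", "amount": 5},
    {"customer": "alice", "amount": 10},
    {"customer": "alice", "amount": 30},
    {"customer": "bob", "amount": 7},
]
top_orders = distinct_on(
    orders,
    key=lambda r: r["customer"],
    order_key=lambda r: (r["customer"], -r["amount"]),
)
```

Because the dedup step tracks keys in a hash set rather than relying on adjacency, it also works when the DISTINCT ON key is not the leading sort column.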

ON CONFLICT Upsert (PostgreSQL-Style)

ON CONFLICT (cols) DO UPDATE SET and DO NOTHING syntax
EXCLUDED pseudo-table to reference attempted insert values
Conflict target matching against PK and composite unique constraints
CHECK constraint validation during upsert updates
RETURNING clause and INSERT ... SELECT support for conflict handling

INSERT INTO users (id, name, email) VALUES (1, 'Alice', 'alice@example.com')
ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name, email = EXCLUDED.email;

Constant Folding and Non-Deterministic Function Tracking

Compile-time constant folding for deterministic column-free expressions
FunctionInfo.deterministic flag with registry-based lookup
NOW, CURRENT_DATE, CURRENT_TIMESTAMP, RANDOM, SLEEP, EMBED marked non-deterministic
Semantic cache bypass for queries with non-deterministic functions
Pushdown evaluator for stable expressions like NOW() - INTERVAL '24h'
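
A toy version of deterministic-aware constant folding is sketched below in Python. The expression classes and the `REGISTRY` table are hypothetical stand-ins for the engine's AST and function registry; only the idea of a per-function deterministic flag comes from the notes above.

```python
import operator
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    value: object


@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Call:
    func: str
    args: tuple


# Hypothetical registry: function name -> (implementation, deterministic flag)
REGISTRY = {
    "add": (operator.add, True),
    "mul": (operator.mul, True),
    "random": (random.random, False),
}


def fold(expr):
    """Fold deterministic, column-free calls into constants, bottom-up."""
    if not isinstance(expr, Call):
        return expr
    args = tuple(fold(a) for a in expr.args)
    impl, deterministic = REGISTRY[expr.func]
    if deterministic and all(isinstance(a, Const) for a in args):
        return Const(impl(*(a.value for a in args)))
    return Call(expr.func, args)


def has_non_deterministic(expr):
    """True if the expression calls any function flagged non-deterministic."""
    if isinstance(expr, Call):
        return (not REGISTRY[expr.func][1]) or any(
            has_non_deterministic(a) for a in expr.args)
    return False


expr = Call("add", (Column("price"), Call("mul", (Const(2), Const(3)))))
folded = fold(expr)
```

A result cache could consult a check like `has_non_deterministic` to decide whether a query must bypass it.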

GROUP BY and Aggregation Optimizations

3-column GROUP BY uses tuple keys (no Vec heap allocation)
4+ columns use direct AHashMap (replaces hash-collision approach)
Early termination extended to 1, 2, and 3-column paths
FIRST/LAST aggregates support ORDER BY with O(1) sort-key tracking

Snapshot and Persistence

Persist default_value in both WAL and snapshot serialization
Re-record index and view DDL to WAL during snapshots so they survive truncation
Skip snapshot when WAL has not grown since last snapshot
Auto-snapshot loop sleeps min(cleanup, snapshot) interval
HNSW m and ef_construction exposed on Index trait for persistence

Performance

Restore panic = "abort" in release profile, recovering 5-15% across all benchmarks
Dev profile optimizations and thin LTO for faster builds

Parser and SQL Compatibility

CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP parsed as niladic functions (SQL standard, no parentheses required)
Keyword identifiers fold to lowercase (PostgreSQL compatibility)
Parse ISO 8601 timestamps with fractional seconds and UTC Z suffix
Keywords accepted as column names in SET assignments

Bug Fixes

Fix transaction INSERT with partial column lists: delegate to full executor pipeline for correct column mapping, default values, type coercion, and FK validation
Fix qualified column ambiguity in joins: ORDER BY and DISTINCT ON with qualified identifiers (e.g., c.name vs p.name) now correctly resolve when both joined tables have the same column name
Fix DISTINCT ON key resolution after projection: qualified keys resolve correctly when projected as bare names or under aliases
Fix classification cache: include DISTINCT ON expressions in hash key to prevent stale cache hits
Fix TimeTruncFunction duration cache miss on zero-duration values

Documentation

Go driver docs synced with stoolap-go README
Driver icons with official brand colors in docs sidebar and page headers

Full Changelog: v0.3.5...v0.3.7

What's New in v0.3.5

FFI Panic Safety

panic = "unwind" in release profile so that catch_unwind boundaries in the C FFI layer work correctly. Previously panic = "abort" made catch_unwind a no-op, meaning any Rust panic would abort the host process (MCP server, Node.js, Python, PHP, Go).
Removed unused release-ffi profile. All drivers already build with --release --features ffi.
Added staticlib crate type for the Go driver's bundled static libraries.

MCP Server Improvements

stoolap://sql-reference resource added for discoverability. Delivers the same live schema and complete SQL reference as the sql-assistant prompt, but as an MCP resource that clients can attach without prompt support.

Go Driver Documentation

New comprehensive Go driver documentation covering both the Direct API and the database/sql driver, with examples for transactions, prepared statements, vector search, bulk fetch, JSON, NULL handling, and concurrency patterns.

Documentation

Updated all FFI build instructions from --profile release-ffi to --release --features ffi across C API docs, header file, benchmark example, building guide, and testing guide
Updated release profile description from panic = "abort" to panic = "unwind" in building docs
Reordered driver pages: Node.js, Python, PHP, Go, WASM, C API, MCP Server

Bug Fixes

Fix view column aliasing: strip table alias prefix from QualifiedIdentifier output column names in post-aggregation expressions (u.username -> username)
Fix window functions on views: materialize view rows and delegate to execute_select_with_window_functions
Fix panic in projection compilation: replace .expect() panics in ExprMappedResult::with_defaults and FilteredResult::with_defaults with proper Result propagation

Full Changelog: v0.3.4...v0.3.5

What's New in v0.3.4

C FFI Layer

Complete C API for embedding Stoolap in any language that can call C functions. Feature-gated with --features ffi.

Opaque handle API with step-based iteration, per-handle error storage, and panic-safe catch_unwind boundaries
Bulk fetch API (stoolap_rows_fetch_all) transfers entire result sets in a single packed binary buffer, eliminating per-row FFI overhead
Prepared statements with pre-compiled plans that bypass cache lookup on every execution
Transaction support with isolation level control (begin, commit, rollback)
Parameter binding for both positional ($1, $2) and named (:key) parameters
Full C header (include/stoolap.h) with documented binary format spec

StoolapDb *db = stoolap_open(":memory:");
stoolap_exec(db, "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)");

StoolapStmt *stmt = stoolap_prepare(db, "INSERT INTO users VALUES ($1, $2)");
StoolapValue params[] = { stoolap_int(1), stoolap_text("Alice") };
stoolap_stmt_execute(stmt, params, 2);

Table-Valued Functions

GENERATE_SERIES for integer, float, and timestamp types
LIMIT pushdown, WHERE range clamping, JOIN support, and EXPLAIN integration
TVF infrastructure (parser, AST, executor, registry) for future functions

SELECT * FROM GENERATE_SERIES(1, 10);
SELECT * FROM GENERATE_SERIES('2026-01-01', '2026-12-31', '1 month');
SELECT g.value, t.name FROM GENERATE_SERIES(1, 5) g JOIN tasks t ON t.id = g.value;

Hash and String Functions

Hash/checksum: MD5, SHA1, SHA256, SHA384, SHA512, CRC32
String: STARTS_WITH, ENDS_WITH, CONTAINS

Query Optimizer Improvements

Mixed OR hybrid optimization for queries with both indexed and non-indexed OR branches (index lookup + filtered scan with dedup)
Relaxed multi-column index prefix rule to allow single-column prefix matching
Trailing range scan on composite indexes after equality prefix
BETWEEN decomposition to range comparisons for index use
Safety guard that disables the subquery index probe when the subquery contains outer references or nested subqueries
Aggregation guard to prevent fast-path optimizations on aggregate queries

EXPLAIN Improvements

Partition WHERE predicates to correct join sides
Show residual filters on index scans
Handle ROLLUP/CUBE/GROUPING SETS display

Stoolap Studio

Documentation page with features overview, installation guide, quick tour, and keyboard shortcuts
Responsive dark/light mode screenshots on docs site and README

Bug Fixes

Fix undefined behavior in FFI bulk fetch buffer deallocation: into_boxed_slice() guarantees len == capacity for safe Vec::from_raw_parts reconstruction
Fix CowHashMap Stacked Borrows violation in drop and backward-shift deletion
Fix I64Map Stacked Borrows violation in backward-shift deletion
Fix CowBTree Stacked Borrows violations across all node pointer accesses
Fix CompactArc Stacked Borrows violation: derive data pointer via raw arithmetic instead of Deref
Restore Expression::Case match arm in process_where_subqueries accidentally deleted by mutation testing
Fix qualified outer column resolution in expression compiler to prevent incorrect binding when inner row shadows outer column names
Fix parser handling of consecutive semicolons (SELECT 1;;)
Reject bare expression statements in prepare() to catch typos like SELECTX at prepare time

Testing & CI

14 FFI aggregate function tests (COUNT, SUM, AVG, MIN, MAX, GROUP BY, HAVING, JOINs, subqueries)
67 unit tests for outer reference detection covering all match arms
13 integration tests for correlated subquery outer reference paths
MVCC TransactionRegistry unit tests
sqllogictest suite with comprehensive .slt test files
Failpoint infrastructure for I/O error injection testing
Nightly CI: mutation testing (24 shards), Miri rotation (5 module groups), stress tests, sanitizers
TSAN inline suppression for CompactArc false positive

Documentation

C driver page with full API reference and bulk fetch binary format spec
Node.js driver updated for N-API C addon architecture
Development category: testing, limitations, building from source, contributing
Changelog page fetching releases from GitHub API
Unified header styles and grid overlay across docs site

Full Changelog: v0.3.3...v0.3.4

What's New in v0.3.3

Vector Search Engine

VECTOR(N) column type with packed binary storage for embedding vectors of any dimension
HNSW index for approximate nearest neighbor search with configurable parameters (m, ef_construction, ef_search, distance metric)
Distance functions: VEC_DISTANCE_L2, VEC_DISTANCE_COSINE, VEC_DISTANCE_IP, and <=> operator
Vector search optimizer with HNSW index path (O(log N)) and WHERE post-filtering, or parallel brute-force k-NN with fused distance+topK
Utility functions: VEC_DIMS, VEC_NORM, VEC_TO_TEXT

CREATE TABLE documents (id INTEGER PRIMARY KEY, embedding VECTOR(384));

CREATE INDEX idx_emb ON documents(embedding)
  USING HNSW WITH (m = 16, ef_construction = 200, metric = 'cosine');

SELECT id, VEC_DISTANCE_COSINE(embedding, '[0.1, 0.2, ...]') AS dist
FROM documents ORDER BY dist LIMIT 10;
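
For intuition, the brute-force path of such a query can be written as a short Python sketch: compute a cosine distance per row and keep the k smallest. It assumes cosine distance is defined as 1 minus cosine similarity, which is the usual convention but is not stated in these notes, and it does not handle zero-length vectors.

```python
import heapq
import math


def cosine_distance(a, b):
    """1 - cosine similarity; both vectors must be non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def knn(rows, query, k):
    """rows: iterable of (id, vector). Returns the k closest (distance, id) pairs."""
    scored = ((cosine_distance(vec, query), rid) for rid, vec in rows)
    return heapq.nsmallest(k, scored)  # distance and top-k in a single pass


docs = [(1, [1.0, 0.0]), (2, [0.0, 1.0]), (3, [1.0, 1.0])]
nearest = knn(docs, [1.0, 0.1], k=2)
nearest_ids = [rid for _, rid in nearest]
```

An HNSW index avoids scoring every row, trading exactness for speed.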

Built-in Semantic Search

EMBED() function generates 384-dimensional embeddings using a built-in sentence-transformer model (all-MiniLM-L6-v2); no external API calls are needed
Enable with --features semantic at build time
Combine with HNSW indexing for end-to-end semantic search in pure SQL

-- Generate embeddings at insert time
INSERT INTO documents (title, embedding)
VALUES ('Password Reset Guide', EMBED('How to recover your account'));

-- Semantic search with a natural language query
SELECT title, VEC_DISTANCE_COSINE(embedding, EMBED('forgot my login')) AS dist
FROM documents ORDER BY dist LIMIT 5;

ANN Benchmark

Self-contained benchmark binary on the public Fashion-MNIST dataset (60,000 vectors, 784 dimensions, 10,000 queries, single-core, full SQL path):

| Recall | QPS | p95 Latency | Speedup vs brute-force |
| --- | --- | --- | --- |
| 95.0% | 10,410 | 0.12 ms | 733x |
| 99.0% | 6,700 | 0.19 ms | 472x |
| 99.9% | 4,159 | 0.33 ms | 293x |
| 100% | 913 | 1.59 ms | 64x |

See ANN Benchmarks for the full report.

# Run the benchmark yourself
RAYON_NUM_THREADS=1 cargo run --release --example ann_benchmark \
  --features ann-benchmark -- --sweep --runs 5 --max-queries 10000

Other Changes

VACUUM statement and PRAGMA VACUUM for manual cleanup of deleted rows and old versions
Value::Extension refactoring: Value::Json replaced by tag-in-data pattern keeping Value at 16 bytes with room for future types
Lexer fast path: zero-alloc string literal parsing for non-escaped strings

Bug Fixes

Fix #[inline(always)] + #[target_feature] build error on x86_64
Stabilize flaky CI tests (HNSW recall with deterministic PRNG, table name collision, perf threshold)
Fix 38 broken relative links across docs (converted to Jekyll link tags)

Docs & Website

Full website redesign with vector search spotlight, search modal (Cmd/Ctrl+K), accessibility improvements
Blog post: Vector and Semantic Search in SQL
Playground: vector search tables and query chips
Python driver documentation

Full Changelog: v0.3.2...v0.3.3

What's New in v0.3.2

Rust API

Transaction named parameters — execute_named() / query_named() now available on Transaction, matching the Database API

Bug Fixes

UPDATE SET parameter resolution — Fixed UPDATE t SET col = col + $1 WHERE id = $2 failing because positional and named parameters were not passed to SET expression evaluation (resolved to NULL)

Internal

Zero-copy parameter passing in UPDATE — Arc refcount bump instead of deep-cloning all parameter keys/values into the setter closure
Unified execution path — Eliminated internal code duplication via execute_sql_with_ctx
WASM binary updated

Full Changelog: v0.3.1...v0.3.2

What's New in v0.3.1

SQL Engine

Cross-type Timestamp/Text comparison — Value::compare() now supports Timestamp ↔ Text/Json comparison via parse_timestamp fallback
CURRENT_TIME function — Returns HH:MM:SS format, alongside existing CURRENT_DATE and CURRENT_TIMESTAMP
RELEASE SAVEPOINT — Full parser, AST, and executor support
SET/BEGIN isolation level — SET isolation_level and BEGIN with isolation level now work correctly (SNAPSHOT, REPEATABLE READ, SERIALIZABLE, READ COMMITTED, READ UNCOMMITTED)
Double-quoted pattern strings — LIKE, ILIKE, GLOB, REGEXP now accept double-quoted identifiers as pattern strings (SQLite compatibility)
SQLite double-quote fallback — Double-quoted identifiers fall back to string literals when column resolution fails
Improved parser errors — Context-aware messages showing actual tokens and clause context (e.g., "expected expression after WHERE")
Implicit type coercions — Integer↔Float and Integer→Boolean coercions at the storage layer
SHOW CREATE TABLE — Now includes FOREIGN KEY constraints in output
DESCRIBE improvements — Shows UNI key type for single-column unique indexes
EXTRACT fields — Added MILLISECOND, MICROSECOND, ISODOW, EPOCH

Rust API

Cached Plan API — Database::cached_plan(), execute_plan(), query_plan() for pre-parsed SQL reuse without cache lookup overhead
Prepared statement execution in transactions — Transaction::execute_prepared() for batch operations with pre-parsed SQL
Zero-clone row cursor — Rows::advance() / current_row() for zero-clone row iteration (bulk serialization)
ParamVec passthrough — Params impl for ParamVec (identity passthrough), re-exported from lib.rs

Website & Playground

Node.js driver documentation — New "Drivers" category with complete @stoolap/node API reference
WASM playground — Browser-based SQL sandbox with WebAssembly compilation support
Immersive terminal hero — Auto-scrolling terminal with 16 SQL demo scenes
Website redesign — Consolidated CSS, new homepage, blog, and layout templates

Documentation Fixes

Fix row.get("name") → row.get_by_name("name") across 7 doc files
Fix PRAGMA create_snapshot → snapshot (matches implementation)
Add 8 missing connection string options to persistence docs
Fix RowVersion struct in MVCC docs (remove non-existent fields)
Fix DATEDIFF signature, CAST(NULL) behavior, CTE+INSERT limitation
Add STRING_AGG native ORDER BY syntax, recursive CTE iteration limit
Complete rewrite of sql-functions-reference covering all 110 functions
Fix CLI flag -q → -e for executing queries

Internal

Feature-gate rayon parallelism behind parallel feature flag with sequential fallbacks for WASM
Gate thread::spawn/sleep for WASM targets
Add time_compat shim (std::time vs web_time) for WASM Instant/SystemTime
Fix NaN/Infinity panic in WASM value serialization
Fix auto-increment to follow schema flag, not implicit INTEGER PK

Full Changelog: v0.3.0...v0.3.1

What's New in v0.3.0

This release brings foreign key constraint enforcement, a crash-safe WAL/snapshot system, and significant MVCC performance improvements with reduced memory footprint and better concurrency.

Foreign Key Constraints

Full referential integrity enforcement with three referential actions:

RESTRICT (default): Block parent deletion/update when children exist
CASCADE: Propagate deletes/updates to child rows (recursive, depth limit 16)
SET NULL: Set FK columns to NULL when parent is deleted/updated

Key features:

Column-level REFERENCES and table-level FOREIGN KEY syntax with ON DELETE/ON UPDATE
DDL validation: parent table must exist, referenced column must be PK/UNIQUE
Enforcement on INSERT, UPDATE, DELETE, TRUNCATE, and DROP TABLE
Pre-validation of constant FK values in explicit transactions to prevent dirty state
Cached reverse FK mapping with schema epoch invalidation
Auto-created indexes on FK columns for efficient cascade operations
WAL + snapshot persistence for FK metadata
DROP TABLE strips orphaned FK constraints from child schemas

Crash-Safe WAL/Snapshot System

Safe WAL truncation: only truncate to 2nd-to-last CRC-verified snapshot
Snapshot fallback loading: try older snapshots when latest is corrupted
Remove stale checkpoint.meta when all snapshots fail to load
Use min(header_lsn) across tables for crash-safe replay
Capture commit_seq AFTER checkpoint to prevent data loss window
Clean up orphaned snapshot directories from dropped tables
CRC-aware snapshot cleanup: corrupt files don't count toward keep_count
WAL rotation after DDL/DML commits to prevent unbounded growth
Sort WAL files by embedded LSN instead of lexicographic order

Transaction Safety

Write WAL COMMIT marker before making changes visible in registry
Abort transaction on Phase 3 WAL write failure to prevent registry leak
Restore transaction on API commit failure so rollback remains possible
Fix file lock race: acquire lock before truncating PID file
Fix registry override_count underflow from mismatched fetch_sub
Fix race condition where transaction was briefly absent from both maps during commit
Fix abort_transaction to not resurrect already-committed transactions

ALTER TABLE

Implement RENAME COLUMN and MODIFY COLUMN with dual schema updates (version store + cached schema)
Column existence checks in pushdown rules to prevent invalid storage-level predicates
Full WAL and snapshot durability for ALTER TABLE operations

Performance Improvements

CompactArc Header (24 → 16 bytes)

Compile-time drop dispatch via CompactArcDrop trait with monomorphization
Eliminates stored function pointer, replaces indirect call with direct call

CowHashMap for Transaction Registry

O(1) snapshot cloning for lock-free iteration
Replace DashMap<i64, TxnState> with Mutex<CowHashMap<TxnState>>
Thread-local caching in is_directly_visible() minimizes lock contention

MVCC Memory Reduction

Remove create_time from ArenaRowMeta (8 bytes saved per row)
Committed transactions removed from map (implicit state)
Pack TxnState into 16 bytes with bit manipulation
Separate snapshot_seqs map for snapshot isolation commit sequences

Batch Index Operations

add_batch_slice and remove_batch_slice for single-lock operations (O(1) locks instead of O(N))
Two-phase commit (validate-then-modify) with rollback support
Peak memory usage reduced by ~10% (verified with DHAT)

PkIndex: Hybrid Primary Key Index

Hybrid bitset + I64Set with O(1) lookups
Speculative arena probe replaces row_arena_index HashMap (saves ~40 bytes/row)
CowBTree reverse iterators for O(limit) descending ORDER BY

Bug Fixes

Fix transaction-local visibility: replace txn_versions.get() with get_local_version() in 11 lookup paths
Fix hash collision bug in HashIndex where entries with equal hashes were treated as having equal values
Fix conflict detection to properly catch UPDATE conflicts via get_latest_version_id()
Fix historical version arena_idx (must be None, slot reused by HEAD)
Fix next_txn_id recovery to not skip transaction IDs
Fix partial commit handling: commit_all_tables returns (bool, Option<Error>)
Fix record_commit error propagation (was silently swallowed)
Consolidate duplicate error variants for consistent error messages across 27 files
CAST(NULL AS type) now returns typed NULL per SQL standard
Add overflow guards: checked_neg/checked_abs for i64::MIN, checked_add in SUM
Fix CAST text→float→integer to reject inf/NaN/out-of-range
Fix ILIKE pattern matching to prevent panics on multi-byte UTF-8
Fix AM/PM format_timestamp sequential replacement interference
Fix expressions_equivalent for In, Like, FunctionCall, Window, Between

Other Changes

TRUNCATE TABLE with WAL persistence and transaction safety checks
Savepoint DDL rollback support (CREATE/DROP TABLE)
Named parameter support (:name) in PK fast path and DML fast path
EXPLAIN plan colorization in CLI with ANSI colors
Snapshot isolation guards on arena fast paths
i128 accumulator for SUM/AVG to prevent integer overflow
Comprehensive durability test suite
Expanded documentation for SQL features

Full Changelog: v0.2.4...v0.3.0

What's New in v0.2.4

This release focuses on memory optimization and MVCC performance, reducing memory footprint by 33% for core data structures and introducing a copy-on-write B+ tree for O(1) MVCC snapshots.

Performance Improvements

CowBTree for O(1) MVCC Snapshots

Replace RwLock<BTreeMap> with copy-on-write B+ tree enabling structural sharing
Readers clone tree root (atomic increment) then iterate without holding locks
Dual refcount system for correct concurrent drop coordination
Rightmost split optimization for sequential inserts

Value Size Reduction (24 → 16 bytes, 33% smaller)

Redesigned SmartString with packed struct and tag byte for niche optimization
Extended CompactArc to support DSTs (str, [T]) with thin pointers (8 bytes)
Option<Value> also 16 bytes with no discriminant overhead

MVCC Memory Footprint Reduction

Remove row_id from RowVersion (8 bytes saved per version)
Remove chain_depth from VersionChainEntry (8 bytes saved)
Use NonZeroU64/NonZeroUsize for arena indices (RowIndex: 24 → 16 bytes)
V2 persistence format without redundant row_id field (backward compatible)

Subquery Caching (5.3x faster)

Cross-query subquery caching with table-based invalidation
Cache entries track referenced tables for selective invalidation on DML

Bug Fixes

Fix double-free in CowBTree internal node merge
Fix dirty read vulnerability: Database::clone() now creates independent executor
Fix join projection column ordering with ColumnSource enum
Fix ORDER BY + LIMIT with Hash index fallback (thanks to @nhansiromeo)
Fix FULL OUTER JOIN null row handling in nested loop join

Other Changes

Migrate Value-keyed maps to AHash for HashDoS resistance
Simplify Row storage from 3 variants to 2 (Shared/Owned)
Add LRU bounds to thread-local caches (scalar/IN subquery: 128, semi-join: 256)
Add comprehensive SAFETY comments to unsafe blocks
Panic safety fixes in CompactVec (clone, extend, from_iter)

Full Changelog: v0.2.3...v0.2.4

What's New in v0.2.3

SmartString - Custom SSO String

Inline storage for strings ≤22 bytes (no heap allocation)
Owned (Box<str>) for computed values, Shared (Arc<str>) for cloned values
24-byte size with O(1) clone for shared variant

I64Map - High-Performance Hashmap for i64 Keys

Uses i64::MIN as empty sentinel (row/txn IDs are always ≥0)
FxHash-based with linear probing and backward-shift deletion
~45% faster lookups than FxHashMap<i64, V>
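
Linear probing with backward-shift deletion is compact enough to sketch in Python. This is an illustration of the general technique, not the actual I64Map: the class name is hypothetical, the hash is a simple multiplicative stand-in rather than FxHash, and there is no resizing.

```python
EMPTY = -(2 ** 63)  # i64::MIN marks an empty slot, as described above
M64 = (1 << 64) - 1


class I64MapSketch:
    """Fixed-capacity open-addressing map; capacity must be a power of two."""

    def __init__(self, capacity=16):
        self.mask = capacity - 1
        self.keys = [EMPTY] * capacity
        self.vals = [None] * capacity
        self.len = 0

    def _home(self, key):
        # Stand-in multiplicative hash (not FxHash).
        return (((key * 0x9E3779B97F4A7C15) & M64) >> 32) & self.mask

    def _find(self, key):
        i = self._home(key)
        while self.keys[i] != EMPTY:
            if self.keys[i] == key:
                return i
            i = (i + 1) & self.mask
        return -1

    def insert(self, key, value):
        if key == EMPTY or self.len >= self.mask:
            raise ValueError("sentinel key, or map full (sketch does not resize)")
        i = self._home(key)
        while self.keys[i] != EMPTY and self.keys[i] != key:
            i = (i + 1) & self.mask
        if self.keys[i] == EMPTY:
            self.len += 1
        self.keys[i], self.vals[i] = key, value

    def get(self, key):
        i = self._find(key)
        return self.vals[i] if i >= 0 else None

    def remove(self, key):
        i = self._find(key)
        if i < 0:
            return False
        j = i
        while True:
            j = (j + 1) & self.mask
            if self.keys[j] == EMPTY:
                break
            home = self._home(self.keys[j])
            # Move entry j into hole i unless its home lies cyclically in (i, j].
            if ((j - home) & self.mask) >= ((j - i) & self.mask):
                self.keys[i], self.vals[i] = self.keys[j], self.vals[j]
                i = j
        self.keys[i], self.vals[i] = EMPTY, None
        self.len -= 1
        return True
```

Shifting later entries back into the hole keeps every probe chain contiguous, so lookups never need tombstones.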

Value Interning

Interned: NULL (all 7 DataTypes), booleans, integers 0-1000
Reduces allocations for frequently used values in rows

Aggregation Fixes & Optimizations

Fixed hash collision bug in single-column GROUP BY (was using hash as key)
Added 2-column tuple optimization (30% faster than Vec<Value>)
Single-column primitive GROUP BY uses I64Map directly

SIMD Pattern Matching

memchr::memmem for LIKE '%pattern%' substring search
Pre-compiled Finder stored in CompiledPattern

O(1) COUNT(*) and COUNT(DISTINCT) Fast Paths

Committed row count atomic counter for O(1) COUNT(*)
get_distinct_count_excluding_null() in Index trait for O(1) COUNT(DISTINCT)
Compiled query cache for COUNT queries with schema epoch validation
Performance: COUNT(*) ~22.5µs → ~2.9µs (7.7x faster)

Background Cleanup and Memory Management

Background cleanup thread for periodic garbage collection
Configurable via CleanupConfig (interval, retention periods)
Arena slot reuse to prevent unbounded memory growth
Memory at program end: 665 KB → 153.6 KB (77% reduction)

Global Pool and Cache Cleanup

clear_version_map_pools() for TransactionVersions pools
clear_program_cache() for expression bytecode LRU cache
clear_classification_cache() for query classification LRU cache

Other Improvements

CompactVec: new insert(), retain(), drain() methods
All indexes migrated to I64Map and CompactVec
Index trait _into methods to avoid allocations
RowIdVec pooled vector for index lookups (256K max cached capacity)
BTree index optimization using Borrow trait for lookups
Thorough SAFETY documentation on unsafe code

Full Changelog: v0.2.2...v0.2.3
