Skip to content

explore: indexFile OOM paths — per-parse temp Explorer panics the process; failure-restore reads freed prior_content #594

@justrach

Description

@justrach

Problem

Two defects on indexFile's failure path, found by sweeping a fail-once allocator across a re-index:

  1. Every indexFile builds a throwaway full Explorer (parseContentForIndexing, src/explore.zig:1307) just to carry an allocator into the line parsers — allocating and memsetting a 4096-slot ContentCache (~330KB) per parsed file. Worse, ContentCache.init panics on allocation failure, so an OOM during routine indexing aborts the daemon instead of returning error.OutOfMemory. The failing test below dies with panic: ContentCache.init: OOM allocating 4096 slots before it can even exercise defect 2.

  2. commitParsedFileOwnedOutline reads prior_content out of the content cache and then calls contents.put (src/explore.zig:862-863), which frees that value in place. The trigram-failure errdefer (src/explore.zig:876) later "restores" the word index by re-indexing prior_content — freed memory. Under the DebugAllocator poison the tokenizer finds no words, so a failed re-index silently wipes the file's postings instead of restoring them; release builds tokenize whatever the allocator reused the bytes for.

Failing Test

test_explore.zig — on current release tip the suite binary aborts (defect 1); with the panic fixed it fails error.PriorContentLostOnFailedReindex (defect 2):

test "issue-591: failed re-index restores the word index from valid prior content" {
    const FailOnce = struct {
        backing: std.mem.Allocator,
        next_index: usize = 0,
        fail_index: usize,
        // fail exactly the Nth allocation, then recover — std.testing's
        // FailingAllocator fails forever from fail_index on, which would also
        // starve the errdefer restore we are trying to verify
        ... (vtable forwarding to backing, alloc fails once at fail_index) ...
    };

    // Pass 1: count the allocation window of the re-index on a successful run.
    // Pass 2: fail each allocation in that window exactly once; whenever the
    // re-index errors, the OLD content must still be word-searchable:
    //     if (explorer.indexFile("src/a.zig", "pub fn betaTok() void {}\n")) |_| {
    //         try testing.expect(explorer.word_index.search("betatok").len > 0);
    //     } else |_| {
    //         if (explorer.word_index.search("alphatok").len == 0)
    //             return error.PriorContentLostOnFailedReindex;
    //     }
}

Expected

OOM during indexing is an error, not a process abort; a failed re-index leaves the word index reflecting the old content (the restore reads live bytes).

Fix

Split Explorer.init into a fallible initFallible (used by init, which keeps its panic for real explorers) and give parseContentForIndexing a 1-slot parser shell via try initFallible(allocator, 1) — the parsers only use self.allocator. Move contents.put after the word/trigram updates in commitParsedFileOwnedOutline so prior_content stays valid through the entire fallible region the errdefer protects.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingpriority:p0Highest priority

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions