fix(extract): normalize slugs to lowercase via pathToSlug() (T-OBS-1) by Freddy-Cach · Pull Request #736 · garrytan/gbrain

Freddy-Cach · 2026-05-08T07:14:16Z

The extractor was generating from_slug and the allSlugs lookup set from
relPath.replace('.md', '') in 5 places, producing CAPS slugs for files
named ETHOS.md, AGENTS.md, ROADMAP.md, etc.

Pages persist in the DB with lowercase slugs (pathToSlug() in core/sync.ts
applies .toLowerCase()). The CAPS extractor output mismatched the DB rows,
so INSERT ... JOIN pages ON pages.slug = v.from_slug silently dropped
links from CAPS-named source files. The link batch returned 'inserted'
counts that were lower than the wikilinks actually present, with no error.
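The failure mode described above can be pictured with a small simulation. This is only a sketch: `naiveSlug`, `pages`, `LinkRow`, and `insertLinks` are hypothetical stand-ins, and the in-memory `Set` merely imitates the effect of the `JOIN pages ON pages.slug = v.from_slug` clause rather than reproducing the real SQL.

```typescript
// Hypothetical stand-in for the old derivation: strips '.md' but keeps case.
const naiveSlug = (relPath: string): string => relPath.replace('.md', '');

// Slugs as the sync layer stores them (lowercased). Assumed sample data.
const pages = new Set<string>(['ethos', 'agents']);

interface LinkRow {
  from_slug: string;
  to_slug: string;
}

// Imitates an inner join against the pages table: rows whose from_slug has
// no matching page are dropped, and no error is raised.
function insertLinks(rows: LinkRow[]): LinkRow[] {
  return rows.filter((row) => pages.has(row.from_slug));
}

const candidate: LinkRow = { from_slug: naiveSlug('ETHOS.md'), to_slug: 'agents' };
const inserted = insertLinks([candidate]);

console.log(candidate.from_slug, inserted.length); // prints "ETHOS 0"
```

The link is well-formed, yet the inserted count is zero because 'ETHOS' never matches the stored 'ethos'.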

Reproduction (in a brain with CAPS-named canonical docs):

echo 'See [agents](agents.md).' > ETHOS.md
gbrain put ethos < ETHOS.md # page row: slug='ethos'
gbrain extract links --source fs
gbrain backlinks agents → [] (expected: contains 'ethos')

Fix: import pathToSlug from core/sync.ts and use it in all 5 sites:

- extractLinksFromFile (line 200): from_slug derivation
- runIncrementalExtractInternal (line 456): allSlugs set
- extractLinksFromDir (line 552): allSlugs set
- timeline loop (line 643): from_slug for timeline entries
- extractLinksForSlugs (line 673): allSlugs set used by sync hook

This single-line-per-site change keeps the extractor consistent with the
sync layer's slug normalization and doesn't introduce any new behavior
for already-lowercase paths (idempotent).

Tests: added 'extractLinksFromFile — slug normalization (T-OBS-1
regression)' suite with 4 cases covering CAPS, mixed-case, idempotent
lowercase, and nested path. Full extract suite (54 → 58 tests) passes.
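The four regression cases might look roughly like the table-driven check below. `pathToSlugSketch` is a hypothetical stand-in for `pathToSlug()` from core/sync.ts: the PR text only confirms that the real function applies `.toLowerCase()`, so the extension stripping and the sample paths here are assumptions.

```typescript
// Hypothetical stand-in for pathToSlug(). Assumption: a trailing '.md' is
// removed (case-insensitively) before lowercasing.
function pathToSlugSketch(relPath: string): string {
  return relPath.replace(/\.md$/i, '').toLowerCase();
}

// [input path, expected slug] pairs, one per case named in the PR.
const cases: Array<[string, string]> = [
  ['ETHOS.md', 'ethos'],             // CAPS
  ['RoadMap.md', 'roadmap'],         // mixed case
  ['agents.md', 'agents'],           // already lowercase: unchanged
  ['docs/AGENTS.md', 'docs/agents'], // nested path
];

for (const [input, expected] of cases) {
  const actual = pathToSlugSketch(input);
  if (actual !== expected) {
    throw new Error(`${input}: expected ${expected}, got ${actual}`);
  }
}
```

Because the third case passes unchanged, applying the function to paths that are already lowercase has no effect, which is the idempotence the description relies on.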


… SDK and Voyage's actual contract

The @ai-sdk/openai-compatible package treats Voyage as if it were OpenAI-shaped, but Voyage's /v1/embeddings endpoint diverges in three places that combine into a hard-blocking incompatibility:

OUTBOUND request:
- 'encoding_format=float' (SDK default) is rejected; Voyage only accepts 'base64'
- 'dimensions' parameter (OpenAI name) is rejected; Voyage uses 'output_dimension'

INBOUND response:
- With encoding_format=base64, 'embedding' is returned as a base64 string, but the SDK's Zod schema (openaiTextEmbeddingResponseSchema) expects an 'array of number'. The schema fails with 'Invalid JSON response' even though the JSON is well-formed.
- 'usage' lacks 'prompt_tokens'; the schema requires it when usage is present.

Without this patch, ALL embedding requests to Voyage fail. Reproducible by running 'gbrain put <slug> < text' with embedding_model=voyage:voyage-* and any current voyage model (voyage-3-large, voyage-3, voyage-4-large).

Solution: pass a custom 'fetch' to createOpenAICompatible only when recipe.id === 'voyage'. The fetch wrapper:
1. Forces encoding_format='base64' on outbound (Voyage's only accepted value)
2. Translates dimensions -> output_dimension on outbound
3. Drops Content-Length so the runtime recomputes from the mutated body
4. Decodes base64 embeddings to Float32 arrays on inbound (so the Zod schema sees what it expects)
5. Synthesizes prompt_tokens from total_tokens when missing

This is a minimal, targeted fix. It only activates for Voyage and falls through cleanly for all other providers. No public API changes.
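The five wrapper steps could be sketched along the following lines. This is not the code from the commit: every name here (`rewriteRequest`, `decodeEmbedding`, `rewriteResponse`, `makeVoyageFetch`, `FetchLike`) is hypothetical, and the decoder assumes the base64 payload is a packed array of little-endian float32 values, which the commit message does not state.

```typescript
type Json = Record<string, unknown>;
type FetchLike = (input: string | URL | Request, init?: RequestInit) => Promise<Response>;

// Steps 1-2: force base64 and rename dimensions -> output_dimension.
function rewriteRequest(body: string): string {
  const payload = JSON.parse(body) as Json;
  payload.encoding_format = 'base64';
  if ('dimensions' in payload) {
    payload.output_dimension = payload.dimensions;
    delete payload.dimensions;
  }
  return JSON.stringify(payload);
}

// Step 4: decode base64 bytes into numbers (assumed little-endian float32).
function decodeEmbedding(b64: string): number[] {
  const bytes = Uint8Array.from(atob(b64), (ch) => ch.charCodeAt(0));
  const view = new DataView(bytes.buffer);
  const values: number[] = [];
  for (let i = 0; i + 4 <= bytes.length; i += 4) {
    values.push(view.getFloat32(i, true));
  }
  return values;
}

// Steps 4-5: make the response match the array-of-number schema and
// fill in prompt_tokens from total_tokens when it is missing.
function rewriteResponse(json: Json): Json {
  const items: Json[] = Array.isArray(json.data) ? (json.data as Json[]) : [];
  for (const item of items) {
    if (typeof item.embedding === 'string') {
      item.embedding = decodeEmbedding(item.embedding);
    }
  }
  const usage = json.usage as Json | undefined;
  if (usage && usage.prompt_tokens === undefined && usage.total_tokens !== undefined) {
    usage.prompt_tokens = usage.total_tokens;
  }
  return json;
}

// Wraps a base fetch; the result would be passed as the custom 'fetch'
// option only for the Voyage recipe.
function makeVoyageFetch(baseFetch: FetchLike = fetch): FetchLike {
  return async (input, init) => {
    let nextInit = init;
    const body = init?.body;
    if (init && typeof body === 'string') {
      const headers = new Headers(init.headers);
      headers.delete('content-length'); // step 3: let the runtime recompute it
      nextInit = { ...init, body: rewriteRequest(body), headers };
    }
    const res = await baseFetch(input, nextInit);
    if (!res.ok) return res; // leave error responses untouched
    const json = rewriteResponse((await res.json()) as Json);
    return new Response(JSON.stringify(json), {
      status: res.status,
      headers: { 'content-type': 'application/json' },
    });
  };
}
```

Keeping the request and response rewrites as pure functions makes them testable without a network call; `makeVoyageFetch` only composes them around whichever fetch implementation is supplied.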

Reported by Claude Code (Opus 4.7) during Obsidian PKM integration on the gstack-plan Living Repo, where ~111 wikilinks pointing to ETHOS, AGENTS, ROADMAP, etc. failed to count toward brain_score (54/100 vs expected 75+/100). Documented as T-OBS-1 in the consumer's blocked.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

garrytan · 2026-05-10T03:51:01Z

Closing — your fix landed in master via the v0.30.3 fix-wave PR #776 (merged at ff53a4c9): "normalize slugs to lowercase via pathToSlug()".

Thank you for the contribution — credit is preserved in the v0.30.3 CHANGELOG entry. 🙏

Freddy-Cach and others added 2 commits May 7, 2026 16:14

garrytan mentioned this pull request May 9, 2026

v0.31.1.1-fixwave fix-wave: 22 community fixes (auth-code P0, upgrade-path, sync, multi-source, privacy) #776

Merged

5 tasks

garrytan closed this May 10, 2026

garrytan mentioned this pull request May 10, 2026

Bug: extract links --source fs creates 0 links due to slug normalization mismatch #742

Closed

Freddy-Cach wants to merge 2 commits into garrytan:master from Freddy-Cach:claude/gbrain-extract-lowercase-slugs
