fix: preserve source id during sync imports by 100yenadmin · Pull Request #639 · garrytan/gbrain

100yenadmin · 2026-05-05T10:08:42Z

Fixes #540.
Supersedes #541, which was based on the old garrytan/embedding-providers branch and now conflicts with current master.

Plain-English Summary

GBrain now supports multiple named sources. That only works if every page, chunk, tag, and version is written back to the same source it came from.

Before this fix, gbrain sync --source docs could correctly remember that it was syncing docs, but the import/write path could still reconcile content by slug alone. If two sources both had README.md, the named source could accidentally touch default. That is data bleed, and it makes multi-source brains unsafe for real teams.

This PR makes source identity part of the write contract.

flowchart LR
  A["source: docs"] --> B["sync/import"]
  C["source: support-kb"] --> B
  B --> D["page key: source_id + slug"]
  D --> E["pages"]
  D --> F["chunks"]
  D --> G["tags"]
  D --> H["versions"]
  E --> I["search stays scoped to the right source"]

Maintainer Notes

Threads optional sourceId from full sync and incremental sync into runImport() / importFile().
Threads optional sourceId through importFromContent() and code-file import.
Scopes page writes, deletes, chunk writes, tag reconciliation, and page-version creation by (source_id, slug) in both PGLite and Postgres engines.
Preserves back-compat: callers that omit sourceId still target source_id='default'.
Keeps empty-string source ids explicit by checking sourceId !== undefined, matching the existing regression expectations.
Extends version APIs so getVersions() and revertToVersion() can target the same source-aware identity as writes.

Human Walkthrough

If your company has two knowledge folders with the same file names, GBrain should not guess which one you meant. This change makes the folder/source name travel with the document all the way into storage, history, tags, and search chunks.

Review-Fix Pass

Latest review-fix commit: 415ee45.

Fixed Copilot review findings around:

empty-string sourceId handling;
source-scoped version history and revert;
over-broad GetPageOpts docs;
source-scoped import sync anchors.

Validation

git diff --check
bun run verify
bun test test/sync.test.ts test/import-file.test.ts test/multi-source-integration.test.ts
focused review-fix rerun: bun test test/sync.test.ts test/pglite-engine.test.ts -> 128 pass, 0 fail
prior full isolated run: HOME=$(mktemp -d) bun run test -> 3831 pass, 0 fail

Notes

The first full local test run without HOME isolation hit this machine's real ~/.gbrain, causing an unrelated BrainRegistry test to resolve the host brain instead of failing as the test expects. Re-running with a clean HOME matches CI/test isolation and passes.

^{Need help on this PR? Tag @codesmith with what you need.}

Let Codesmith autofix CI failures and bot reviews

Thread optional sourceId through sync/import/import-file and scope page, chunk, tag, and version writes by source identity instead of bare slug. Default callers still target source_id='default'. Adds regression coverage for named-source full sync and import. Validation: bun run verify; HOME=/var/folders/l7/tyvn8qfs50v57w__75s3k0f00000gp/T/tmp.UhjtVbjvxm bun run test (3831 pass, 0 fail); focused sync/import/multi-source tests.

Copilot

Pull request overview

This PR fixes multi-source sync/import behavior by threading an optional sourceId through the import/write pipeline and scoping page/chunk/tag/version writes and lookups by (source_id, slug) so named-source syncs don’t accidentally write into default.

Changes:

Thread sourceId through performSync → runImport() / importFile() and through importFromContent() / importCodeFile().
Update both Postgres and PGLite engines to scope page CRUD, chunk CRUD, tag reconciliation, and version creation by source_id.
Add sync/import regression tests to ensure named-source runs write pages (and associated chunks/tags/versions) into the requested source.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
test/sync.test.ts	Adds regression tests asserting named-source sync/import writes into the requested source, not `default`.
src/core/types.ts	Extends core types with `PageInput.source_id` and updates `GetPageOpts` documentation.
src/core/engine.ts	Extends the engine interface to accept optional `sourceId` on several write paths (notably `createVersion`).
src/core/import-file.ts	Threads `sourceId` through import logic for page/chunk/tag/version operations.
src/commands/sync.ts	Passes `sourceId` through sync deletes and import calls so sync is source-scoped.
src/commands/import.ts	Adds `sourceId` support to `runImport()` and forwards it into `importFile()`.
src/core/postgres-engine.ts	Scopes page/chunk/tag/version operations by `source_id` in the Postgres engine.
src/core/pglite-engine.ts	Scopes page/chunk/tag/version operations by `source_id` in the PGLite engine.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

100yenadmin · 2026-05-05T11:35:52Z

@copilot please re-review this PR with the latest changes. @codesmith please re-check this PR against the latest commits and refreshed PR description.

blacksmith-sh · 2026-05-05T11:35:54Z

Hi @100yenadmin! Codesmith requires write access to this repository. You currently have read-only access to garrytan/gbrain.

Removed PR garrytan#438 write-through patch. Using upstream put_page (DB-only) + dream cycle for link/timeline materialization per Garry's intended architecture. The write-through created double-write conflicts with dream's reverseWriteSlugs(). Fork now carries only PR garrytan#639 (source_id routing for sync/import). Updated PATCHES.md and validate-patches.sh accordingly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

garrytan · 2026-05-09T05:39:45Z

Thanks for the contribution! This is being closed because it was superseded by #757 in #776 (full superset including the performFullSync gap). Real fix shipped; your name will appear in the v0.30.3 release notes attribution where applicable. Thank you again.

Copilot AI review requested due to automatic review settings May 5, 2026 10:08

100yenadmin mentioned this pull request May 5, 2026

fix: preserve source id during sync imports #541

Closed

1 task

Copilot started reviewing on behalf of 100yenadmin May 5, 2026 10:09 View session

Copilot AI reviewed May 5, 2026

View reviewed changes

Comment thread src/commands/sync.ts Outdated

Comment thread src/commands/sync.ts Outdated

Comment thread src/core/import-file.ts Outdated

Comment thread src/core/import-file.ts Outdated

Comment thread src/core/engine.ts Outdated

Comment thread src/core/types.ts Outdated

fix: address source-aware review comments

415ee45

garrytan mentioned this pull request May 9, 2026

v0.31.1.1-fixwave fix-wave: 22 community fixes (auth-code P0, upgrade-path, sync, multi-source, privacy) #776

Merged

5 tasks

garrytan closed this May 9, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: preserve source id during sync imports#639

fix: preserve source id during sync imports#639
100yenadmin wants to merge 2 commits intogarrytan:masterfrom
electricsheephq:codex/upstream-source-aware-sync

100yenadmin commented May 5, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

100yenadmin commented May 5, 2026

Uh oh!

blacksmith-sh Bot commented May 5, 2026

Uh oh!

garrytan commented May 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

100yenadmin commented May 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Plain-English Summary

Maintainer Notes

Human Walkthrough

Review-Fix Pass

Validation

Notes

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

100yenadmin commented May 5, 2026

Uh oh!

blacksmith-sh Bot commented May 5, 2026

Uh oh!

garrytan commented May 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

100yenadmin commented May 5, 2026 •

edited

Loading