fix(pi): add attribution to all insertEvent calls + session_meta fallback#605
Merged
Merged
Conversation
…back
Three bugs fixed:
1. P0 — pi/extension.ts: All 4 insertEvent() calls omitted the 4th
attribution parameter, causing project_dir to default to empty string
for every event. This leaks data across projects in shared SessionDB
instances because searchEvents uses WHERE project_dir = ?.
Fix: create a _attribution object from resolvePiWorkspaceDir() and
pass it to all insertEvent calls (PostToolUse × 3, UserPromptSubmit × 1).
2. P1 — session/db.ts: insertEvent fallback chain ended at empty string
when no attribution was provided. Added _getSessionProjectDir(sessionId)
that queries session_meta for the session's project_dir as a
last-resort fallback, preventing orphan events with project_dir=''.
Applied to both insertEvent and bulkInsertEvents.
3. P1 — session/db.ts: searchEvents SQL used WHERE project_dir = ?
which excludes historical orphan events (project_dir='') from
search results. Changed to WHERE (project_dir = ? OR project_dir = '')
so searches include both properly-attributed and legacy orphan events.
Impact: Without these fixes, searchEvents('query', 50, '/my-project')
would match ALL projects' orphan events via project_dir='', causing
cross-project data leakage.
mksglu
added a commit
that referenced
this pull request
May 31, 2026
…tore Adds the cross-DB plumbing required by the ctx_search `project:` filter (#737) without changing any existing call site. Three layers gain an opt-in scope hook: - `ContentStore.searchWithFallback` accepts `sessionIdAllowSet?: Set<string>`. When supplied, the RRF candidate pool is fetched at 8x the requested limit and post-filtered by `chunks.session_id` membership. Legacy unattributed chunks (`session_id=''`) stay visible — they predate the attribution wiring landed in 2d4f7c1 (#605) and represent shared knowledge surface that must remain reachable in shared-DB mode. - `SessionDB.getSessionIdsForProject(projectDir)` returns the distinct session ids whose events match a `project_dir`. Backed by the composite index `idx_session_events_project(session_id, project_dir)` introduced alongside the project_dir column in 270a56f (#325), so 1000-session lookups stay sub-50ms. - `searchAllSources` gains `projectScope?: string | null`. When a string is passed AND a `sessionDB` is available, the resolver looks up the allow-set once and threads it into `store.searchWithFallback`. The three-state contract (undefined / null / string) matches the resolver surfaced in the next commit so the handler and the library agree on semantics. `SearchResult.sessionId` is added to the public type so the post-filter has the attribution column it needs; the new field is `?: string` and defaults to `""` for legacy chunks. The eight FTS5 prepared statements gain the `chunks.session_id` / `chunks_trigram.session_id` column so `#mapSearchRows` can populate it. ATTACH DATABASE is intentionally NOT used — the SQLite docs warn that WAL mode plus ATTACH carries durability trade-offs that the unified storage layer should not inherit. The two-step IN-clause keeps SessionDB and ContentStore in their own connections, which also keeps the search-only path read-only against the events DB. Refs ead9177 (#367 — searchAllSources unification), 270a56f (#325 — session_events.project_dir column + idx_session_events_project index).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fix: Cross-project data leakage via missing attribution in pi adapter
Root Cause
The pi adapter (
src/adapters/pi/extension.ts) callsinsertEvent()4 times but omits the 4th parameterattributionon every call. This means all 4 call sites fall through to the default chain:Result: every event gets
project_dir = "". WhensearchEventsrunsWHERE project_dir = ?with a real project path, it excludes those orphans. But when it runs withproject_dir = ""(e.g. from a session that somehow resolves to empty), it matches all projects' orphan events — a cross-project data leak.Three Fixes
P0: extension.ts — Pass attribution to all insertEvent calls
Added
_attributionobject withprojectDir,source: "workspace_root",confidence: 0.98and passed it as the 4th argument to all 4insertEvent()calls. This is the source-level fix: new events will always carry the correct project directory.P1: db.ts — insertEvent fallback via _getSessionProjectDir
Added
_getSessionProjectDir(sessionId)method that queriessession_metafor the session'sproject_dir. Inserted into the fallback chain of bothinsertEvent()andbulkInsertEvents():