feat(guardrails): vendor-neutral content guardrail seams by garrytan-agents · Pull Request #1652 · garrytan/gbrain

garrytan-agents · 2026-05-30T14:15:21Z

What

Exposes vendor-neutral content guardrail seams at the five boundaries where external content enters GBrain's retrieval layer and where queries/tool-inputs enter the LLM gateway. Lets a content firewall (prompt-injection / RAG-poison detector, PII scrubber, etc.) be hooked in without binding GBrain to any specific vendor.

OSS ships inert: zero guardrails are registered by default, and every seam is a no-op until an operator registers a provider.

Why

Content poisoning (a malicious page, a booby-trapped tweet) only becomes dangerous at the moment it's ingested into the retrieval layer and made searchable. That import boundary — plus the gateway's own LLM calls — is the right place to let an external classifier observe. Rather than wire one vendor into core, this adds a generic seam that any guardrail backend implements against.

The seam

New module src/core/guardrails.ts:

runGuardrails({ hook, content, metadata }): Promise<void>
registerGuardrailProvider / unregisterGuardrailProvider
hasGuardrails() fast-path guard for hot paths
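
A minimal sketch of what such a module might look like. Only the four function names and the five hook values come from this PR; the `GuardrailEvent` and `GuardrailProvider` shapes, the `inspect` method name, and typing `content` as a string are assumptions made for illustration.

```typescript
// Hypothetical shapes: only the function names and hook values come from the PR.
type GuardrailHook =
  | "file_storage.markdown"
  | "file_storage.code"
  | "ai_gateway.chat"
  | "ai_gateway.expand"
  | "ai_gateway.tool_input";

interface GuardrailEvent {
  hook: GuardrailHook;
  content: string; // assumption: payload is already text
  metadata?: Record<string, unknown>;
}

interface GuardrailProvider {
  id: string; // assumed registry key
  inspect(event: GuardrailEvent): unknown; // return value is ignored
}

const providers = new Map<string, GuardrailProvider>();

function registerGuardrailProvider(provider: GuardrailProvider): void {
  providers.set(provider.id, provider); // same id twice replaces, never duplicates
}

function unregisterGuardrailProvider(id: string): void {
  providers.delete(id);
}

function hasGuardrails(): boolean {
  return providers.size > 0;
}

async function runGuardrails(event: GuardrailEvent): Promise<void> {
  // Inert with zero providers; blank content short-circuits.
  if (!hasGuardrails() || event.content.trim() === "") return;
  await Promise.all(
    Array.from(providers.values()).map(async (provider) => {
      try {
        await provider.inspect(event);
      } catch {
        // Fail open: a broken provider must not break the caller.
      }
    }),
  );
}

// Usage: a provider package would register once at init.
const seen: string[] = [];
registerGuardrailProvider({
  id: "example-logger",
  inspect: (event) => {
    seen.push(`${event.hook}:${event.content.length}`);
  },
});
```

Keying the registry by `id` is one simple way to get the idempotent registration the tests describe; the real module may do it differently.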

Hard invariants (enforced by `test/guardrails.test.ts`)

Observe-only — returns void; callers never branch on a verdict. Cannot block/rewrite/drop/retry. Enforcement, if ever added, gets its own RFC-gated seam.
Fail open — provider throw/reject/timeout/network-error is swallowed; a broken guardrail never breaks an ingest, query, or tool call.
Inline await — provider sees content at the exact pre-persist / pre-inference moment.
No verdict persistence — providers own their own audit trail.
Content boundaries — passes only the ingest/user payload; never system prompts, full history, tool output, LLM output, embeddings, or multimodal/OCR/rerank.
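
The observe-only and fail-open invariants can be sketched for a single provider as follows. `observeFailOpen`, the `Inspect` type, and the `timeoutMs` default are hypothetical; the PR says timeouts are swallowed but does not say how they are implemented.

```typescript
type Inspect = (content: string) => unknown;

// Hypothetical helper: observes with one provider, never rejects,
// and resolves to void whatever the provider returns.
function observeFailOpen(
  inspect: Inspect,
  content: string,
  timeoutMs = 1000, // assumed default, not from the PR
): Promise<void> {
  return new Promise<void>((resolve) => {
    // Timeout path: stop waiting, but do not fail the caller.
    const timer = setTimeout(() => resolve(), timeoutMs);
    Promise.resolve()
      .then(() => inspect(content)) // a synchronous throw becomes a rejection
      .catch(() => undefined) // throw / reject / network error: swallowed
      .then(() => {
        clearTimeout(timer);
        resolve();
      });
  });
}
```

Because the returned promise has no reject path and carries no value, a caller that awaits it inline cannot branch on a verdict or be broken by a faulty provider.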

Hooks

| `hook` | Location | Fires |
| --- | --- | --- |
| `file_storage.markdown` | `importFromContent` | after parse + size guard, before sanity/hash/chunk/embed/write |
| `file_storage.code` | `importCodeFile` | after size guard, before hash/chunk/embed/write |
| `ai_gateway.chat` | `chat` | latest user message only, before inference |
| `ai_gateway.expand` | `expand` | query, before expansion model call |
| `ai_gateway.tool_input` | `toolLoop` | `{toolName, input}`, before pending-persist + execution |
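
As an illustration of one row, here is a sketch of how the `ai_gateway.chat` seam might sit in front of inference. The `ChatMessage` and `SeamDeps` types, the injected `infer` function, and `latestUserMessage` are hypothetical stand-ins so the example runs without GBrain; the actual `gateway.ts` code is not shown in this PR page.

```typescript
type Role = "system" | "user" | "assistant" | "tool";

interface ChatMessage {
  role: Role;
  content: string;
}

// Hypothetical dependency bundle so the sketch is self-contained.
interface SeamDeps {
  hasGuardrails(): boolean;
  runGuardrails(event: { hook: "ai_gateway.chat"; content: string }): Promise<void>;
  infer(messages: readonly ChatMessage[]): Promise<string>;
}

function latestUserMessage(messages: readonly ChatMessage[]): string | undefined {
  for (let i = messages.length - 1; i >= 0; i--) {
    const message = messages[i];
    if (message !== undefined && message.role === "user") return message.content;
  }
  return undefined;
}

async function chat(messages: readonly ChatMessage[], deps: SeamDeps): Promise<string> {
  // Fast path: skip the message-array walk when nothing is registered.
  if (deps.hasGuardrails()) {
    const content = latestUserMessage(messages);
    if (content !== undefined) {
      // Awaited inline and discarded: nothing here branches on a verdict.
      await deps.runGuardrails({ hook: "ai_gateway.chat", content });
    }
  }
  return deps.infer(messages);
}
```

Only the latest user message reaches the provider; the system prompt, earlier turns, and the model's reply stay outside the seam.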

Tests

test/guardrails.test.ts — 14 tests: inert-by-default, register/unregister/idempotent, fail-open isolation, inline-await, verdict-ignored, empty/blank short-circuit, content+metadata pass-through.
Existing import-file.test.ts + import-file-content-sanity.test.ts — 40 tests still green (hot path undisturbed).
tsc --noEmit clean across the repo.

Docs

docs/guardrails.md — contract, seam table, provider-authoring guide.

Scope / non-goals

No enforcement. This is the observe seam only.
No bundled vendor. A guardrail provider lives in its own package and registers at init.

Expose observe-only guardrail seams at the five boundaries where external content enters the retrieval layer and the LLM gateway, so a content firewall (prompt-injection / RAG-poison detector, PII scrubber, etc.) can be hooked in without binding GBrain to any specific vendor.

New module src/core/guardrails.ts:
- runGuardrails({ hook, content, metadata }) -> void
- registerGuardrailProvider / unregisterGuardrailProvider
- hasGuardrails() fast-path guard for hot paths

Seams (all observe-only, fail-open, inline-await, inert by default):
- file_storage.markdown (import-file.ts importFromContent)
- file_storage.code (import-file.ts importCodeFile)
- ai_gateway.chat (gateway.ts chat, last user message only)
- ai_gateway.expand (gateway.ts expand)
- ai_gateway.tool_input (gateway.ts toolLoop, before pending-persist)

Invariants enforced by test/guardrails.test.ts (14 tests):
- returns void; callers never branch on a verdict
- provider throw/reject is swallowed (fail-open isolation)
- slow async provider is awaited before resolving (inline)
- zero providers => no-op; empty/blank content short-circuits
- content + metadata passed through unmutated; idempotent by id

Hooks pass only the ingest/user-facing payload (md/code body, last user message, expansion query, tool input). Never system prompts, full history, tool output, LLM output, embeddings, or multimodal payloads.

Docs: docs/guardrails.md (contract, seam table, provider authoring guide).

OSS ships inert; vendors register a provider in their own package.

garrytan-agents · 2026-05-30T14:49:59Z

✅ Pre-merge review gate

Ran the GStack /review critical pass + tests before marking ready.

Tests: test/guardrails.test.ts — 14/14 pass. Existing import-file.test.ts + import-file-content-sanity.test.ts — 40/40 pass (hot path undisturbed). tsc --noEmit clean across the repo.

/review critical pass (5 categories):

SQL & Data Safety — ✅ clean. No SQL in diff; seam writes nothing to DB/vector store (verdicts explicitly not persisted).
Race Conditions & Concurrency — ✅ clean. providers Map is snapshotted via Array.from() before Promise.all iteration, so mid-flight register/unregister can't mutate the iteration set. No find-or-create / check-then-write.
LLM Output Trust Boundary — ✅ clean, and on-point: this seam observes inbound content pre-persist; it never consumes LLM output, and the provider return value is typed unknown and ignored. No path where a verdict influences a DB write, mailer, or fetch. This is the observation point for the stored-prompt-injection class, not a new instance of it.
Shell Injection — ✅ clean. No exec/eval/subprocess/shell.
Enum & Value Completeness — ✅ clean. New GuardrailHook union (5 values); traced every consumer — each value emitted by exactly one seam caller, no switch/case consumes it (providers get it as opaque metadata), so no exhaustiveness gap.
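
The snapshotting noted under Race Conditions & Concurrency can be sketched as follows. The `Provider` shape, the `registry` map, and `runSnapshotted` are hypothetical names; the point is only that copying the map's values with `Array.from()` fixes the set of providers for one run.

```typescript
interface Provider {
  id: string;
  inspect(content: string): unknown;
}

const registry = new Map<string, Provider>();
const calls: string[] = [];

async function runSnapshotted(content: string): Promise<void> {
  // Copy the current providers first; later registry changes cannot
  // add to or remove from this run's iteration set.
  const snapshot = Array.from(registry.values());
  await Promise.all(
    snapshot.map(async (provider) => {
      try {
        await provider.inspect(content);
      } catch {
        // Fail open.
      }
    }),
  );
}

// Provider "a" mutates the registry while a run is in flight.
registry.set("a", {
  id: "a",
  inspect: () => {
    calls.push("a");
    registry.delete("b");
    registry.set("late", {
      id: "late",
      inspect: () => {
        calls.push("late");
      },
    });
  },
});
registry.set("b", {
  id: "b",
  inspect: () => {
    calls.push("b");
  },
});
```

On the first `runSnapshotted` call, "b" still runs even though "a" unregisters it mid-flight, and "late" does not run until the next call, because both decisions were fixed when the snapshot was taken.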

Informational: redundant hasGuardrails() check in the gateway wrapper is intentional (skips the message-array walk on every chat call in the common zero-guardrail case). Import is used. No slop.

Codex review: attempted codex exec review; hung in this container's read-only sandbox (no bubblewrap) without emitting findings — environment limitation, not a code signal. /review is the authoritative gate here.

Verdict: ready to pick up.

garrytan · 2026-05-30T16:38:54Z

Superseded by #1660. Rebased into a base-repo branch (garrytan/guardrails-seam) so CI gets secret access per the garrytan-agents workflow in CLAUDE.md, and folded in the v0.41.34.0 release bookkeeping (VERSION + CHANGELOG). Your feature commit was cherry-picked verbatim — authorship preserved, code byte-identical. Verified on the new branch: bun run verify 29/29 green, 14/14 guardrail tests, import-file hot path undisturbed. Thank you for this — the five-seam shape is exactly right, and it closes the injection-via-filesystem gap. Closing this one.

…supersedes #1652) (#1660)

* feat(guardrails): vendor-neutral content guardrail seams

Expose observe-only guardrail seams at the five boundaries where external content enters the retrieval layer and the LLM gateway, so a content firewall (prompt-injection / RAG-poison detector, PII scrubber, etc.) can be hooked in without binding GBrain to any specific vendor.

New module src/core/guardrails.ts:
- runGuardrails({ hook, content, metadata }) -> void
- registerGuardrailProvider / unregisterGuardrailProvider
- hasGuardrails() fast-path guard for hot paths

Seams (all observe-only, fail-open, inline-await, inert by default):
- file_storage.markdown (import-file.ts importFromContent)
- file_storage.code (import-file.ts importCodeFile)
- ai_gateway.chat (gateway.ts chat, last user message only)
- ai_gateway.expand (gateway.ts expand)
- ai_gateway.tool_input (gateway.ts toolLoop, before pending-persist)

Invariants enforced by test/guardrails.test.ts (14 tests):
- returns void; callers never branch on a verdict
- provider throw/reject is swallowed (fail-open isolation)
- slow async provider is awaited before resolving (inline)
- zero providers => no-op; empty/blank content short-circuits
- content + metadata passed through unmutated; idempotent by id

Hooks pass only the ingest/user-facing payload (md/code body, last user message, expansion query, tool input). Never system prompts, full history, tool output, LLM output, embeddings, or multimodal payloads.

Docs: docs/guardrails.md (contract, seam table, provider authoring guide).

OSS ships inert; vendors register a provider in their own package.

* chore: bump version and changelog (v0.41.35.0)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: garrytan-agents <agent@garrytan.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* upstream/master:
  v0.41.36.0 feat(mcp): publish agent skills (list_skills / get_skill) for thin clients (garrytan#1661)
  v0.41.35.0 feat(guardrails): vendor-neutral content guardrail seams (supersedes garrytan#1652) (garrytan#1660)
  v0.41.34.0 feat(search): retrieval cathedral — max-pool + title + alias + evidence (garrytan#1657)
  v0.41.33.0 feat(search): intent-aware adaptive return-sizing + agent-facing query param (garrytan#1640)
  v0.41.32.0 fix(staleness): commit-relative sync staleness (supersedes garrytan#1623) (garrytan#1656)
  v0.41.31.0 feat(embed): delta-aware sync --all cost gate + real stale-embedding semantics (garrytan#1632)
  v0.41.30.0 fix(brainstorm/lsd): --save writes the advertised .md file via canonical ingestion path (garrytan#1655)

# Conflicts:
# src/core/operations.ts

garrytan mentioned this pull request May 30, 2026: v0.41.35.0 feat(guardrails): vendor-neutral content guardrail seams (supersedes #1652) #1660 (Merged)

garrytan closed this May 30, 2026

garrytan-agents wants to merge 1 commit into garrytan:master from garrytan-agents:feat/guardrail-seams
