feat: add guardrails system with system prompt injection/override/decorator (#64)
Conversation
Implement a pluggable guardrails pipeline that intercepts requests before they reach providers. Guardrails can be executed sequentially (chained) or in parallel (concurrent with ordered application). The pipeline wraps the router as a RoutableProvider decorator, keeping handler code unchanged.

The first guardrail is a system prompt guardrail with three modes:

- inject: adds a system message only if none exists
- override: replaces all existing system messages
- decorator: prepends configured content to the first system message

Both ChatCompletion and Responses API endpoints are supported. All configuration is driven from config.yaml and environment variables.

https://claude.ai/code/session_011U6sHbthHcwY68FZhAEFVH
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough

Adds a pluggable guardrails subsystem: config schema and types, a Guardrail interface, an ordered/parallel Pipeline executor, a GuardedProvider wrapper that applies guardrails before routing, a SystemPromptGuardrail implementation, tests, and app wiring to enable guardrails when configured.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant GuardedProvider
    participant Pipeline
    participant SystemPromptGuardrail
    participant InnerProvider
    Client->>GuardedProvider: ChatCompletion(req)
    GuardedProvider->>Pipeline: Process(ctx, messages)
    Pipeline->>SystemPromptGuardrail: Process(ctx, messages)
    SystemPromptGuardrail-->>Pipeline: modified messages
    Pipeline-->>GuardedProvider: processed messages
    GuardedProvider->>InnerProvider: ChatCompletion(processed req)
    InnerProvider-->>GuardedProvider: response
    GuardedProvider-->>Client: response
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks: ✅ 3 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In `@config/config.go`:
- Around line 41-70: Add validation during configuration loading to reject
invalid values early: in the config Load() path validate
GuardrailsConfig.Execution accepts only "sequential" or "parallel" and validate
SystemPromptGuardrailConfig.Mode accepts only "inject", "override", or
"decorator"; on invalid values return a clear error describing the field and
allowed options so operators fail fast instead of relying on downstream checks
in pipeline construction or NewSystemPromptGuardrail.
In `@internal/guardrails/pipeline.go`:
- Around line 97-121: The processChatParallel function currently runs
Guardrail.ProcessChat concurrently but then overwrites the request by setting
current = r.req in the final loop, causing "last guardrail wins" and discarding
earlier modifications; also the original req pointer is shared across goroutines
(risk of future data races) and chatResult.index is unused. Fix by either (A)
documenting that Pipeline.processChatParallel is validation-only and will not
merge modifications and explicitly copying req before dispatch to guardrails to
avoid races, or (B) implement merge semantics: have each goroutine operate on a
deep copy of the input (copy the core.ChatRequest before calling
Guardrail.ProcessChat), then in the aggregation loop sequentially apply/merge
modifications into current in registration order (use a well-defined merge
function or call a method on core.ChatRequest to apply diffs) and remove the
unused chatResult.index field; ensure Guardrail.ProcessChat is invoked with the
copied request to prevent in-place mutations from racing.
In `@internal/guardrails/provider.go`:
- Around line 19-24: Update the NewGuardedProvider constructor to validate its
inputs: check if the incoming inner (core.RoutableProvider) or pipeline
(*Pipeline) is nil and fail fast with a clear error message (e.g., panic or
log.Fatalf) instead of returning a GuardedProvider that will later panic on
method calls; modify the NewGuardedProvider function to perform these nil checks
and produce a descriptive error referencing NewGuardedProvider, inner and
pipeline so wiring bugs fail at construction time.
In `@internal/guardrails/system_prompt.go`:
- Around line 55-67: The ProcessChat implementation in SystemPromptGuardrail
currently constructs a new core.ChatRequest by manually copying fields which
will drop future fields; instead, make a value copy of the incoming req (e.g.,
copied := *req) and then set copied.Messages = g.applyToMessages(req.Messages)
and return &copied, so only Messages is replaced while all existing and future
core.ChatRequest fields are preserved (update SystemPromptGuardrail.ProcessChat
accordingly).
- Around line 71-85: ProcessResponses currently constructs a new
core.ResponsesRequest by hand and omits any future fields, risking silent drops;
change SystemPromptGuardrail.ProcessResponses to clone the incoming req (e.g.,
make a copy via copy := *req) and then set copy.Instructions =
g.applyToInstructions(req.Instructions) before returning &copy so all existing
and future fields on core.ResponsesRequest are preserved; reference
SystemPromptGuardrail.ProcessResponses, core.ResponsesRequest, and
applyToInstructions when locating the change.
```go
// GuardrailsConfig holds configuration for the request guardrails pipeline.
type GuardrailsConfig struct {
	// Enabled controls whether guardrails are active
	// Default: false
	Enabled bool `yaml:"enabled" env:"GUARDRAILS_ENABLED"`

	// Execution controls how guardrails are executed: "sequential" or "parallel"
	// Default: "sequential"
	Execution string `yaml:"execution" env:"GUARDRAILS_EXECUTION"`

	// SystemPrompt configures the system prompt guardrail
	SystemPrompt SystemPromptGuardrailConfig `yaml:"system_prompt"`
}

// SystemPromptGuardrailConfig holds configuration for the system prompt guardrail.
type SystemPromptGuardrailConfig struct {
	// Enabled controls whether the system prompt guardrail is active
	// Default: false
	Enabled bool `yaml:"enabled" env:"GUARDRAILS_SYSTEM_PROMPT_ENABLED"`

	// Mode controls how the system prompt is applied: "inject", "override", or "decorator"
	// - inject: adds a system message only if none exists
	// - override: replaces all existing system messages
	// - decorator: prepends to the first existing system message
	// Default: "inject"
	Mode string `yaml:"mode" env:"GUARDRAILS_SYSTEM_PROMPT_MODE"`

	// Content is the system prompt text to apply
	Content string `yaml:"content" env:"GUARDRAILS_SYSTEM_PROMPT_CONTENT"`
}
```
🧹 Nitpick | 🔵 Trivial
Consider validating Execution and Mode at config load time.
Both Execution (line 49) and Mode (line 66) are free-form strings with constrained valid values ("sequential"/"parallel" and "inject"/"override"/"decorator" respectively). While downstream code (the pipeline and NewSystemPromptGuardrail) validates these, catching invalid values early in Load() would give operators a clearer error at startup rather than a failure during pipeline construction.
This is optional — the current approach works since construction-time validation exists.
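The suggested load-time check could look like the sketch below. `GuardrailsConfig` is flattened here for brevity (in the PR, `Mode` lives on `SystemPromptGuardrailConfig`), and `validateGuardrails` is a hypothetical helper name, not the PR's code.

```go
package main

import "fmt"

// Flattened stand-in for the PR's config types.
type GuardrailsConfig struct {
	Enabled   bool
	Execution string
	Mode      string
}

// validateGuardrails rejects invalid enum values at load time so operators
// fail fast with a clear message instead of hitting a pipeline-construction
// error later. Empty strings are allowed so defaults can be applied.
func validateGuardrails(c GuardrailsConfig) error {
	switch c.Execution {
	case "", "sequential", "parallel":
	default:
		return fmt.Errorf("guardrails.execution: invalid value %q (allowed: sequential, parallel)", c.Execution)
	}
	switch c.Mode {
	case "", "inject", "override", "decorator":
	default:
		return fmt.Errorf("guardrails.system_prompt.mode: invalid value %q (allowed: inject, override, decorator)", c.Mode)
	}
	return nil
}

func main() {
	fmt.Println(validateGuardrails(GuardrailsConfig{Execution: "parallel", Mode: "inject"}))
	fmt.Println(validateGuardrails(GuardrailsConfig{Execution: "sideways"}))
}
```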
internal/guardrails/pipeline.go (outdated)
```go
func (p *Pipeline) processChatParallel(ctx context.Context, req *core.ChatRequest) (*core.ChatRequest, error) {
	results := make([]chatResult, len(p.guardrails))
	var wg sync.WaitGroup

	for i, g := range p.guardrails {
		wg.Add(1)
		go func(idx int, guardrail Guardrail) {
			defer wg.Done()
			modified, err := guardrail.ProcessChat(ctx, req)
			results[idx] = chatResult{index: idx, req: modified, err: err}
		}(i, g)
	}

	wg.Wait()

	// Check for errors and apply modifications in order
	current := req
	for i, r := range results {
		if r.err != nil {
			return nil, fmt.Errorf("guardrail %q: %w", p.guardrails[i].Name(), r.err)
		}
		current = r.req
	}
	return current, nil
}
```
Parallel mode: "last guardrail wins" rather than merging modifications in order.
The loop on Lines 114-119 simply replaces current with each successive result, meaning only the last guardrail's output survives. The doc comment says "Modifications are applied sequentially in registration order" but in practice, each guardrail runs on the original request and the final loop just picks the last result — earlier guardrails' modifications are silently discarded.
This is fine when parallel mode is used purely for validation (pass/reject with no modifications), but will silently lose changes if multiple guardrails modify the request. Consider either:
- Documenting this limitation clearly (parallel mode is for validation-only guardrails), or
- Implementing actual merge semantics.
Additionally, the shared req pointer passed to concurrent goroutines (Line 105) is safe today because the system prompt guardrail creates a new request rather than mutating in-place, but future guardrails that mutate the input would cause a data race. Consider documenting this contract or defensively copying req before dispatching.
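A minimal sketch of the defensive copy suggested above: hand each goroutine its own shallow copy of the request so an in-place mutation by one guardrail can never race with another. `ChatRequest` and `dispatchWithCopies` are reduced stand-ins for illustration, not the PR's code.

```go
package main

import (
	"fmt"
	"sync"
)

// Stand-in for core.ChatRequest.
type ChatRequest struct{ Model string }

// dispatchWithCopies gives each worker its own shallow copy of req, so
// concurrent in-place mutations cannot race on the shared pointer and
// the caller's request is never modified.
func dispatchWithCopies(req *ChatRequest, workers []func(*ChatRequest)) {
	var wg sync.WaitGroup
	for _, w := range workers {
		wg.Add(1)
		copied := *req // per-goroutine value copy
		go func(w func(*ChatRequest), r *ChatRequest) {
			defer wg.Done()
			w(r)
		}(w, &copied)
	}
	wg.Wait()
}

func main() {
	req := &ChatRequest{Model: "gpt-4"}
	dispatchWithCopies(req, []func(*ChatRequest){
		func(r *ChatRequest) { r.Model = "mutated-a" },
		func(r *ChatRequest) { r.Model = "mutated-b" },
	})
	fmt.Println(req.Model) // original stays "gpt-4"
}
```

Note the copy is shallow: slice and map fields would still be shared, so a true deep copy would be needed if guardrails mutate nested data.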
Minor: unused `index` field

The `index` field in `chatResult` (Line 89) and `responsesResult` (Line 125) is set but never read — the loop uses the slice index `i` directly.

```diff
 type chatResult struct {
-	index int
 	req *core.ChatRequest
 	err error
 }

 type responsesResult struct {
-	index int
 	req *core.ResponsesRequest
 	err error
 }
```

And in the goroutines:

```diff
-	results[idx] = chatResult{index: idx, req: modified, err: err}
+	results[idx] = chatResult{req: modified, err: err}
```
```go
func NewGuardedProvider(inner core.RoutableProvider, pipeline *Pipeline) *GuardedProvider {
	return &GuardedProvider{
		inner:    inner,
		pipeline: pipeline,
	}
}
```
Add nil guards in the constructor.
If inner or pipeline is nil (e.g., due to a wiring bug in app setup), every method will panic with a nil-pointer dereference at runtime. A defensive check here provides a clear error at construction time.
🛡️ Proposed fix

```diff
 func NewGuardedProvider(inner core.RoutableProvider, pipeline *Pipeline) *GuardedProvider {
+	if inner == nil {
+		panic("guardrails: inner provider must not be nil")
+	}
+	if pipeline == nil {
+		panic("guardrails: pipeline must not be nil")
+	}
 	return &GuardedProvider{
 		inner:    inner,
 		pipeline: pipeline,
 	}
 }
```
internal/guardrails/system_prompt.go (outdated)
```go
func (g *SystemPromptGuardrail) ProcessResponses(ctx context.Context, req *core.ResponsesRequest) (*core.ResponsesRequest, error) {
	instructions := g.applyToInstructions(req.Instructions)
	return &core.ResponsesRequest{
		Model:           req.Model,
		Input:           req.Input,
		Instructions:    instructions,
		Tools:           req.Tools,
		Temperature:     req.Temperature,
		MaxOutputTokens: req.MaxOutputTokens,
		Stream:          req.Stream,
		StreamOptions:   req.StreamOptions,
		Metadata:        req.Metadata,
		Reasoning:       req.Reasoning,
	}, nil
}
```
Same fragile copy pattern for ResponsesRequest — apply the same fix.
Identical concern as ProcessChat: any new field on core.ResponsesRequest will be silently dropped.
♻️ Proposed fix

```diff
 func (g *SystemPromptGuardrail) ProcessResponses(ctx context.Context, req *core.ResponsesRequest) (*core.ResponsesRequest, error) {
 	instructions := g.applyToInstructions(req.Instructions)
-	return &core.ResponsesRequest{
-		Model:           req.Model,
-		Input:           req.Input,
-		Instructions:    instructions,
-		Tools:           req.Tools,
-		Temperature:     req.Temperature,
-		MaxOutputTokens: req.MaxOutputTokens,
-		Stream:          req.Stream,
-		StreamOptions:   req.StreamOptions,
-		Metadata:        req.Metadata,
-		Reasoning:       req.Reasoning,
-	}, nil
+	modified := *req
+	modified.Instructions = instructions
+	return &modified, nil
 }
```
…d grouping

Instead of a global "sequential" or "parallel" execution mode, each guardrail now has an "order" field. Guardrails with the same order value run in parallel (concurrently), and groups execute sequentially in ascending order. Single-entry groups skip goroutine overhead.

This enables mixed execution patterns like:

- order 0: [guardrail A, guardrail B] → run in parallel
- order 1: [guardrail C] → runs after group 0
- order 2: [guardrail D, guardrail E] → run in parallel after C

Adds a concurrency test proving same-order guardrails truly run in parallel using atomic counters and barriers.

https://claude.ai/code/session_011U6sHbthHcwY68FZhAEFVH
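The grouping step described above can be sketched as follows. `entry` here is a reduced stand-in for the pipeline's internal type, and `groupByOrder` is a hypothetical helper name for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// Reduced stand-in for the pipeline's internal entry type.
type entry struct {
	name  string
	order int
}

// groupByOrder buckets entries by their order value, preserving
// registration order within each bucket, and returns the buckets in
// ascending order. Each returned group runs in parallel; groups run
// sequentially.
func groupByOrder(entries []entry) [][]entry {
	buckets := map[int][]entry{}
	for _, e := range entries {
		buckets[e.order] = append(buckets[e.order], e)
	}
	orders := make([]int, 0, len(buckets))
	for o := range buckets {
		orders = append(orders, o)
	}
	sort.Ints(orders)
	groups := make([][]entry, 0, len(orders))
	for _, o := range orders {
		groups = append(groups, buckets[o])
	}
	return groups
}

func main() {
	groups := groupByOrder([]entry{
		{"A", 0}, {"C", 1}, {"B", 0}, {"D", 2}, {"E", 2},
	})
	for _, g := range groups {
		fmt.Println(len(g)) // group sizes: 2, 1, 2
	}
}
```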
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/app/app.go (1)
142-148: ⚠️ Potential issue | 🟡 Minor

`Router()` returns the unwrapped provider, bypassing guardrails when enabled. When guardrails are configured, `server.New()` (line 137) receives the guarded provider from the chain built at lines 111-125. However, `Router()` returns `a.providers.Router` directly, unguarded. Any caller using `Router()` will skip the guardrails pipeline. If this is intentional for internal/admin use, add a doc comment. Otherwise, store the guarded provider and return it here instead.
🤖 Fix all issues with AI agents
In `@internal/guardrails/pipeline_test.go`:
- Around line 151-197: The test TestPipeline_MixedOrders_GroupsExecuteCorrectly
uses a shared slice variable trace that is appended to from a guardrail running
in a parallel group (mockGuardrail named "A"), which is race-prone; change the
test to avoid unsynchronized concurrent writes by protecting trace with a
sync.Mutex (lock/unlock around each append) or by switching to a channel-based
collector (have guardrails send their name into a buffered channel and drain it
after p.ProcessChat), and update the mockGuardrail closures (A and any future
parallel guardrails) to use the chosen synchronized mechanism; alternatively,
add a clear comment on the trace variable that concurrent writes must be
synchronized to prevent future races.
- Around line 109-149: The test TestPipeline_SameOrder_RunInParallel currently
spin-waits on started.Load() with no timeout and can hang; modify the wait after
launching ProcessChat to use a bounded wait (e.g., a select polling loop or
select on a time.After) so that if started does not reach 2 within a short
timeout the test fails instead of blocking indefinitely; update the waiting
logic that references started and barrier (and related done) to include a
timeout branch (use time.After) and call t.Fatal or break the test when the
timeout elapses so the outer select on done can still run.
In `@internal/guardrails/pipeline.go`:
- Around line 130-189: Both runChatGroupParallel and runResponsesGroupParallel
duplicate the same concurrency pattern; consolidate into a single generic helper
that launches goroutines, waits, checks errors, and returns the last successful
modification. Create a generic function (e.g., runGroupParallel[T any]) that
accepts the group []entry, initial request *T, and a processor func(ctx
context.Context, g Guardrail, req *T) (*T, error) (used to call
Guardrail.ProcessChat or Guardrail.ProcessResponses), allocate a results slice
of the generic result type, run the goroutines filling results, wait, then
iterate results to return the first error or last successful modified *T;
replace runChatGroupParallel and runResponsesGroupParallel to call this helper
and remove chatResult/responsesResult duplication.
In `@internal/guardrails/provider_test.go`:
- Around line 62-63: Tests call NewSystemPromptGuardrail and discard its error
return (e.g., g, _ := NewSystemPromptGuardrail(SystemPromptInject, "guardrail
system")); update each test to capture the error (g, err :=
NewSystemPromptGuardrail(...)) and assert it’s nil before proceeding — use the
test helper available in the package (e.g., require.NoError(t, err)) or a simple
if err != nil { t.Fatalf("NewSystemPromptGuardrail failed: %v", err) } so test
setup fails loudly if the constructor starts returning errors; apply the same
change for every occurrence (lines referencing NewSystemPromptGuardrail,
SystemPromptInject, pipeline.Add).
```go
func TestPipeline_SameOrder_RunInParallel(t *testing.T) {
	p := NewPipeline()

	var started atomic.Int32
	barrier := make(chan struct{})

	// Both at order 0 — they should run concurrently
	for i := range 2 {
		name := fmt.Sprintf("parallel_%d", i)
		p.Add(&mockGuardrail{
			name: name,
			chatFn: func(_ context.Context, req *core.ChatRequest) (*core.ChatRequest, error) {
				started.Add(1)
				<-barrier // wait until both have started
				return req, nil
			},
		}, 0)
	}

	req := &core.ChatRequest{Model: "gpt-4"}

	done := make(chan struct{})
	go func() {
		_, _ = p.ProcessChat(context.Background(), req)
		close(done)
	}()

	// Wait for both goroutines to start
	for started.Load() < 2 {
		time.Sleep(time.Millisecond)
	}
	// Both started concurrently — release them
	close(barrier)

	select {
	case <-done:
		// success
	case <-time.After(2 * time.Second):
		t.Fatal("timed out — guardrails did not run in parallel")
	}
}
```
Unbounded spin-wait can hang the test if parallelism regresses.
The for started.Load() < 2 loop at Line 137 has no timeout. If the implementation changes to run the group sequentially, the first goroutine blocks on barrier, the second never starts, and this loop spins forever — the 2-second select on Line 143 is never reached because the main goroutine is stuck.
Consider adding a timeout to the spin-wait:
Suggested fix

```diff
-	// Wait for both goroutines to start
-	for started.Load() < 2 {
-		time.Sleep(time.Millisecond)
-	}
+	// Wait for both goroutines to start (with timeout)
+	deadline := time.After(2 * time.Second)
+	for started.Load() < 2 {
+		select {
+		case <-deadline:
+			t.Fatal("timed out waiting for both guardrails to start — not running in parallel")
+		default:
+			time.Sleep(time.Millisecond)
+		}
+	}
```
```go
func TestPipeline_MixedOrders_GroupsExecuteCorrectly(t *testing.T) {
	p := NewPipeline()

	var trace []string

	// Order 0, guardrail A
	p.Add(&mockGuardrail{
		name: "A",
		chatFn: func(_ context.Context, req *core.ChatRequest) (*core.ChatRequest, error) {
			trace = append(trace, "A")
			return req, nil
		},
	}, 0)

	// Order 1, guardrail B (runs after group 0 completes)
	p.Add(&mockGuardrail{
		name: "B",
		chatFn: func(_ context.Context, req *core.ChatRequest) (*core.ChatRequest, error) {
			trace = append(trace, "B")
			return req, nil
		},
	}, 1)

	// Order 0, guardrail C (parallel with A)
	p.Add(&mockGuardrail{
		name: "C",
		chatFn: func(_ context.Context, req *core.ChatRequest) (*core.ChatRequest, error) {
			// Note: this writes to trace inside a parallel group, so we can't assert
			// exact ordering of A and C. But B must come after both.
			return req, nil
		},
	}, 0)

	req := &core.ChatRequest{Model: "gpt-4"}
	_, err := p.ProcessChat(context.Background(), req)
	if err != nil {
		t.Fatal(err)
	}

	// B must be last because it's order 1
	if len(trace) < 2 {
		t.Fatalf("expected at least 2 trace entries, got %d", len(trace))
	}
	if trace[len(trace)-1] != "B" {
		t.Errorf("expected B to run last (order 1), got trace: %v", trace)
	}
}
```
🧹 Nitpick | 🔵 Trivial
Race-prone pattern: trace slice shared across a parallel group.
Guardrail C intentionally avoids appending to trace (good!), but guardrail A does append at Line 160 inside the parallel group. This works today because only one writer exists, but it's fragile — anyone adding a trace write to C would introduce a data race not caught by the compiler. Consider using an atomic or channel-based approach, or adding a comment warning future maintainers.
internal/guardrails/pipeline.go (outdated)
```go
// runChatGroupParallel runs all guardrails in a group concurrently on the same input.
// If any returns an error, the group fails. Modifications are applied
// in registration order (slice order) after all complete.
func runChatGroupParallel(ctx context.Context, group []entry, req *core.ChatRequest) (*core.ChatRequest, error) {
	results := make([]chatResult, len(group))
	var wg sync.WaitGroup

	for i, e := range group {
		wg.Add(1)
		go func(idx int, g Guardrail) {
			defer wg.Done()
			modified, err := g.ProcessChat(ctx, req)
			results[idx] = chatResult{req: modified, err: err}
		}(i, e.guardrail)
	}

	wg.Wait()

	// Check for errors and take last successful modification (registration order)
	current := req
	for i, r := range results {
		if r.err != nil {
			return nil, fmt.Errorf("guardrail %q: %w", group[i].guardrail.Name(), r.err)
		}
		current = r.req
	}
	return current, nil
}

// responsesResult holds the result of a parallel guardrail execution for responses.
type responsesResult struct {
	req *core.ResponsesRequest
	err error
}

// runResponsesGroupParallel runs all guardrails in a group concurrently on the same input.
func runResponsesGroupParallel(ctx context.Context, group []entry, req *core.ResponsesRequest) (*core.ResponsesRequest, error) {
	results := make([]responsesResult, len(group))
	var wg sync.WaitGroup

	for i, e := range group {
		wg.Add(1)
		go func(idx int, g Guardrail) {
			defer wg.Done()
			modified, err := g.ProcessResponses(ctx, req)
			results[idx] = responsesResult{req: modified, err: err}
		}(i, e.guardrail)
	}

	wg.Wait()

	current := req
	for i, r := range results {
		if r.err != nil {
			return nil, fmt.Errorf("guardrail %q: %w", group[i].guardrail.Name(), r.err)
		}
		current = r.req
	}
	return current, nil
}
```
🧹 Nitpick | 🔵 Trivial
Duplication between runChatGroupParallel and runResponsesGroupParallel.
These two functions follow an identical pattern (launch goroutines → wait → iterate results → return last or first error). With Go generics, this could be consolidated into a single helper, reducing the maintenance surface as more request types are added.
This is optional — the current approach is readable and there are only two variants.
internal/guardrails/provider_test.go
Outdated
g, _ := NewSystemPromptGuardrail(SystemPromptInject, "guardrail system")
pipeline.Add(g, 0)
🧹 Nitpick | 🔵 Trivial
Consider checking errors from NewSystemPromptGuardrail in tests.
The error return from NewSystemPromptGuardrail is discarded with _ in multiple tests (lines 62, 93, 121, 142). While the inputs are known-valid, checking the error avoids silently broken test setups if the constructor contract changes.
Example fix
- g, _ := NewSystemPromptGuardrail(SystemPromptInject, "guardrail system")
+ g, err := NewSystemPromptGuardrail(SystemPromptInject, "guardrail system")
+ if err != nil {
+ t.Fatal(err)
+ }

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
g, err := NewSystemPromptGuardrail(SystemPromptInject, "guardrail system")
if err != nil {
	t.Fatal(err)
}
pipeline.Add(g, 0)
🤖 Prompt for AI Agents
In `@internal/guardrails/provider_test.go` around lines 62 - 63, Tests call
NewSystemPromptGuardrail and discard its error return (e.g., g, _ :=
NewSystemPromptGuardrail(SystemPromptInject, "guardrail system")); update each
test to capture the error (g, err := NewSystemPromptGuardrail(...)) and assert
it’s nil before proceeding — use the test helper available in the package (e.g.,
require.NoError(t, err)) or a simple if err != nil {
t.Fatalf("NewSystemPromptGuardrail failed: %v", err) } so test setup fails
loudly if the constructor starts returning errors; apply the same change for
every occurrence (lines referencing NewSystemPromptGuardrail,
SystemPromptInject, pipeline.Add).
Replace the single-struct guardrail config with a list of named rule
instances. Each rule has a name, type, order, and type-specific settings.
Multiple instances of the same type (e.g. two system_prompt guardrails
with different content) are fully supported.
SystemPromptGuardrail now takes a name parameter so each instance is
independently identifiable in logs and error messages.
Config example:

guardrails:
  enabled: true
  rules:
    - name: "safety-prompt"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "decorator"
        content: "Always be safe."
    - name: "compliance-prompt"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "inject"
        content: "Follow compliance rules."
https://claude.ai/code/session_011U6sHbthHcwY68FZhAEFVH
Go strings natively support Unicode, so guardrail names already accept spaces, Cyrillic, CJK, accented Latin, emoji, etc. Added explicit test coverage and config examples demonstrating this. https://claude.ai/code/session_011U6sHbthHcwY68FZhAEFVH
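The round-trip property being claimed can be shown in a few lines. The `guardrail` struct and helper names below are invented for this sketch; the real constructor is `NewSystemPromptGuardrail` in `internal/guardrails`:

```go
package main

import "unicode/utf8"

// guardrail is a hypothetical minimal stand-in for SystemPromptGuardrail.
type guardrail struct{ name string }

func (g guardrail) Name() string { return g.name }

// roundTrips reports whether a name survives storage unchanged. Go strings
// are immutable byte sequences, so any UTF-8 name passes through as-is.
func roundTrips(name string) bool {
	return guardrail{name: name}.Name() == name
}

// byteAndRuneLen returns byte length and rune count, which differ for
// non-ASCII names; only code that indexes bytes needs to care.
func byteAndRuneLen(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}
```

The only caveat is length: `len("安全")` is 6 (bytes) while its rune count is 2, so any name-length limit in config validation should state which unit it means.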
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@internal/guardrails/system_prompt_test.go`:
- Around line 46-48: Replace discarded constructor errors for
NewSystemPromptGuardrail in the tests with a helper that fails fast: add a
mustNewSystemPromptGuardrail(t, name, mode, content) test helper which calls
NewSystemPromptGuardrail, checks err, calls t.Fatalf with a descriptive message
on error, and returns the guardrail; then replace all usages that currently do
"g, _ := NewSystemPromptGuardrail(...)" (e.g., in
TestSystemPrompt_Inject_NoExistingSystem and the other listed tests) with calls
to mustNewSystemPromptGuardrail so tests fail with a clear error instead of
panicking on nil.
- Around line 24-34: Refactor the table-driven tests
TestNewSystemPromptGuardrail_ValidModes and
TestNewSystemPromptGuardrail_UnicodeNames to use t.Run for each case: iterate
modes/names and call t.Run with a descriptive subtest name (e.g.,
fmt.Sprintf("%v", mode) or the Unicode string), then inside the subtest call
NewSystemPromptGuardrail and assert results (use t.Fatalf/t.Fatalf-variant for
setup errors and t.Errorf for assertions) referencing the existing symbols
SystemPromptMode, NewSystemPromptGuardrail and g.Name() so each case runs as an
isolated subtest and can be filtered/run individually.
func TestNewSystemPromptGuardrail_ValidModes(t *testing.T) {
	for _, mode := range []SystemPromptMode{SystemPromptInject, SystemPromptOverride, SystemPromptDecorator} {
		g, err := NewSystemPromptGuardrail("my-guardrail", mode, "test")
		if err != nil {
			t.Fatalf("unexpected error for mode %q: %v", mode, err)
		}
		if g.Name() != "my-guardrail" {
			t.Errorf("expected name 'my-guardrail', got %q", g.Name())
		}
	}
}
🧹 Nitpick | 🔵 Trivial
Use t.Run for table-driven subtests.
Both TestNewSystemPromptGuardrail_ValidModes and TestNewSystemPromptGuardrail_UnicodeNames iterate over cases but don't use t.Run. Without subtests, a failure only shows the context you manually include in the message; you can't run or filter individual cases with -run; and the first t.Fatalf in ValidModes aborts the remaining modes.
Example for UnicodeNames
for _, tc := range tests {
+ t.Run(tc.name, func(t *testing.T) {
g, err := NewSystemPromptGuardrail(tc.name, SystemPromptInject, "content")
if err != nil {
- t.Errorf("unexpected error for name %q: %v", tc.name, err)
- continue
+ t.Fatalf("unexpected error: %v", err)
}
if g.Name() != tc.name {
t.Errorf("expected name %q, got %q", tc.name, g.Name())
}
+ })
}

Also applies to: 322-343
🤖 Prompt for AI Agents
In `@internal/guardrails/system_prompt_test.go` around lines 24 - 34, Refactor the
table-driven tests TestNewSystemPromptGuardrail_ValidModes and
TestNewSystemPromptGuardrail_UnicodeNames to use t.Run for each case: iterate
modes/names and call t.Run with a descriptive subtest name (e.g.,
fmt.Sprintf("%v", mode) or the Unicode string), then inside the subtest call
NewSystemPromptGuardrail and assert results (use t.Fatalf/t.Fatalf-variant for
setup errors and t.Errorf for assertions) referencing the existing symbols
SystemPromptMode, NewSystemPromptGuardrail and g.Name() so each case runs as an
isolated subtest and can be filtered/run individually.
func TestSystemPrompt_Inject_NoExistingSystem(t *testing.T) {
	g, _ := NewSystemPromptGuardrail("test", SystemPromptInject, "injected system prompt")
	req := &core.ChatRequest{
🧹 Nitpick | 🔵 Trivial
Discarded constructor errors will cause nil-pointer panics instead of clear failures.
Throughout the file, NewSystemPromptGuardrail errors are silently discarded with _. If a regression causes the constructor to return an error, g will be nil and the test panics with an opaque nil-dereference instead of a clear diagnostic. Consider a small helper:
Proposed fix
Add a test helper at the top of the file and use it in place of the bare _ discard:
func mustNewSystemPromptGuardrail(t *testing.T, name string, mode SystemPromptMode, content string) *SystemPromptGuardrail {
t.Helper()
g, err := NewSystemPromptGuardrail(name, mode, content)
if err != nil {
t.Fatalf("NewSystemPromptGuardrail(%q, %q, %q): unexpected error: %v", name, mode, content, err)
}
return g
}

Then replace e.g.:
- g, _ := NewSystemPromptGuardrail("test", SystemPromptInject, "injected system prompt")
+ g := mustNewSystemPromptGuardrail(t, "test", SystemPromptInject, "injected system prompt")

Also applies to: 71-72, 94-95, 116-117, 144-145, 166-167, 190-191, 214-215, 240-241, 253-254, 266-267, 279-280, 293-294, 306-307, 345-346
🤖 Prompt for AI Agents
In `@internal/guardrails/system_prompt_test.go` around lines 46 - 48, Replace
discarded constructor errors for NewSystemPromptGuardrail in the tests with a
helper that fails fast: add a mustNewSystemPromptGuardrail(t, name, mode,
content) test helper which calls NewSystemPromptGuardrail, checks err, calls
t.Fatalf with a descriptive message on error, and returns the guardrail; then
replace all usages that currently do "g, _ := NewSystemPromptGuardrail(...)"
(e.g., in TestSystemPrompt_Inject_NoExistingSystem and the other listed tests)
with calls to mustNewSystemPromptGuardrail so tests fail with a clear error
instead of panicking on nil.
Guardrails now operate on a normalized []Message instead of concrete ChatRequest/ResponsesRequest types. Adapters in GuardedProvider handle conversion between API-specific requests and the message DTO. This eliminates the N×M coupling (N guardrails × M endpoint types): - Guardrail interface: single Process(ctx, []Message) method - Pipeline: single Process method (was duplicated for Chat/Responses) - SystemPromptGuardrail: one implementation (was two parallel paths) - New endpoint types only need an adapter pair, no guardrail changes https://claude.ai/code/session_011U6sHbthHcwY68FZhAEFVH
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@internal/guardrails/provider.go`:
- Around line 132-147: The current applyMessagesToResponses function
concatenates all system Message entries into a single Instructions string which
can be surprising given Instructions is a single-field semantic; change
applyMessagesToResponses (working on *core.ResponsesRequest) to take only the
first system message instead of joining multiple ones: iterate msgs, when you
find the first Message with Role == "system" set result.Instructions to that
Message.Content and break out of the loop (do not append others), and update any
related comment/tests to reflect the single-message behavior; ensure you
reference applyMessagesToResponses, core.ResponsesRequest, and the Instructions
field when making the change.
// applyMessagesToResponses returns a shallow copy of req with system messages
// applied back to the Instructions field.
func applyMessagesToResponses(req *core.ResponsesRequest, msgs []Message) *core.ResponsesRequest {
	result := *req
	var instructions string
	for _, m := range msgs {
		if m.Role == "system" {
			if instructions != "" {
				instructions += "\n"
			}
			instructions += m.Content
		}
	}
	result.Instructions = instructions
	return &result
}
Multiple system messages are concatenated into a single Instructions string.
applyMessagesToResponses joins all system messages with \n into the Instructions field (Lines 137-144). If a guardrail produces multiple system messages (e.g., inject adds one while an existing one is preserved), they get concatenated. This may be surprising — consider documenting this behavior or taking only the first system message to match the single-field semantics of Instructions.
🤖 Prompt for AI Agents
In `@internal/guardrails/provider.go` around lines 132 - 147, The current
applyMessagesToResponses function concatenates all system Message entries into a
single Instructions string which can be surprising given Instructions is a
single-field semantic; change applyMessagesToResponses (working on
*core.ResponsesRequest) to take only the first system message instead of joining
multiple ones: iterate msgs, when you find the first Message with Role ==
"system" set result.Instructions to that Message.Content and break out of the
loop (do not append others), and update any related comment/tests to reflect the
single-message behavior; ensure you reference applyMessagesToResponses,
core.ResponsesRequest, and the Instructions field when making the change.
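The first-system-message behavior the prompt asks for could look like the sketch below. The `ResponsesRequest` stand-in is trimmed to the one field involved, so this is an illustration under that assumption rather than a drop-in patch:

```go
package main

// Message mirrors the normalized guardrail DTO.
type Message struct {
	Role    string
	Content string
}

// ResponsesRequest stands in for core.ResponsesRequest; only the
// Instructions field matters for this function.
type ResponsesRequest struct {
	Instructions string
}

// applyMessagesToResponses returns a shallow copy of req with only the FIRST
// system message written back to Instructions, matching the field's
// single-value semantics; any later system messages are ignored.
func applyMessagesToResponses(req *ResponsesRequest, msgs []Message) *ResponsesRequest {
	result := *req
	result.Instructions = "" // no system message means no instructions
	for _, m := range msgs {
		if m.Role == "system" {
			result.Instructions = m.Content
			break
		}
	}
	return &result
}
```

Whichever behavior is chosen, documenting it on the function keeps guardrail authors from relying on concatenation by accident.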
Implement a pluggable guardrails pipeline that intercepts requests before
they reach providers. Guardrails can be executed sequentially (chained) or
in parallel (concurrent with ordered application). The pipeline wraps the
router as a RoutableProvider decorator, keeping handler code unchanged.
The first guardrail is a system prompt guardrail with three modes:
- inject: adds a system message only if none exists
- override: replaces all existing system messages
- decorator: prepends configured content to the first system message
Both ChatCompletion and Responses API endpoints are supported. All
configuration is driven from config.yaml and environment variables.
https://claude.ai/code/session_011U6sHbthHcwY68FZhAEFVH
Summary by CodeRabbit
New Features
Configuration
Tests