Conversation
…orator

Implement a pluggable guardrails pipeline that intercepts requests before they reach providers. Guardrails can be executed sequentially (chained) or in parallel (concurrent with ordered application). The pipeline wraps the router as a RoutableProvider decorator, keeping handler code unchanged.

The first guardrail is a system prompt guardrail with three modes:

- inject: adds a system message only if none exists
- override: replaces all existing system messages
- decorator: prepends configured content to the first system message

Both ChatCompletion and Responses API endpoints are supported. All configuration is driven from config.yaml and environment variables.

https://claude.ai/code/session_011U6sHbthHcwY68FZhAEFVH
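The three modes can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation; the `Message` type and `applySystemPrompt` function are assumed names, and the fallback behavior of `decorator` when no system message exists is an assumption.

```go
package main

import "fmt"

// Message is a minimal normalized message DTO; the name and shape are
// illustrative, not necessarily the project's actual types.
type Message struct {
	Role    string
	Content string
}

// applySystemPrompt sketches the three modes described above. The
// "decorator" fallback (inject when no system message exists) is an
// assumption, not confirmed behavior.
func applySystemPrompt(mode, content string, msgs []Message) []Message {
	sysIdx := -1
	for i, m := range msgs {
		if m.Role == "system" {
			sysIdx = i
			break
		}
	}
	switch mode {
	case "inject": // add a system message only if none exists
		if sysIdx == -1 {
			return append([]Message{{Role: "system", Content: content}}, msgs...)
		}
		return msgs
	case "override": // replace all existing system messages with one
		out := []Message{{Role: "system", Content: content}}
		for _, m := range msgs {
			if m.Role != "system" {
				out = append(out, m)
			}
		}
		return out
	case "decorator": // prepend configured content to the first system message
		if sysIdx == -1 {
			return append([]Message{{Role: "system", Content: content}}, msgs...)
		}
		out := append([]Message(nil), msgs...)
		out[sysIdx].Content = content + "\n" + out[sysIdx].Content
		return out
	}
	return msgs
}

func main() {
	msgs := []Message{{Role: "user", Content: "hello"}}
	out := applySystemPrompt("inject", "Always be safe.", msgs)
	fmt.Println(out[0].Role, "/", out[0].Content)
}
```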
…d grouping

Instead of a global "sequential" or "parallel" execution mode, each guardrail now has an "order" field. Guardrails with the same order value run in parallel (concurrently), and groups execute sequentially in ascending order. Single-entry groups skip goroutine overhead.

This enables mixed execution patterns like:

- order 0: [guardrail A, guardrail B] → run in parallel
- order 1: [guardrail C] → runs after group 0
- order 2: [guardrail D, guardrail E] → run in parallel after C

Adds a concurrency test proving same-order guardrails truly run in parallel, using atomic counters and barriers.

https://claude.ai/code/session_011U6sHbthHcwY68FZhAEFVH
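The grouping step described above can be sketched as a small standalone function. Names here (`entry`, `groupByOrder`) are illustrative, not the project's actual identifiers:

```go
package main

import (
	"fmt"
	"sort"
)

// entry pairs a guardrail name with its order (illustrative type).
type entry struct {
	name  string
	order int
}

// groupByOrder buckets entries by their order value and returns the
// buckets in ascending order. Entries in the same bucket would run
// concurrently; buckets execute one after another.
func groupByOrder(entries []entry) [][]entry {
	byOrder := map[int][]entry{}
	for _, e := range entries {
		byOrder[e.order] = append(byOrder[e.order], e)
	}
	orders := make([]int, 0, len(byOrder))
	for o := range byOrder {
		orders = append(orders, o)
	}
	sort.Ints(orders)
	groups := make([][]entry, 0, len(orders))
	for _, o := range orders {
		groups = append(groups, byOrder[o])
	}
	return groups
}

func main() {
	groups := groupByOrder([]entry{
		{"A", 0}, {"C", 1}, {"B", 0}, {"D", 2}, {"E", 2},
	})
	for i, g := range groups {
		fmt.Printf("group %d: %d guardrail(s)\n", i, len(g))
	}
}
```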
Replace the single-struct guardrail config with a list of named rule
instances. Each rule has a name, type, order, and type-specific settings.
Multiple instances of the same type (e.g. two system_prompt guardrails
with different content) are fully supported.
SystemPromptGuardrail now takes a name parameter so each instance is
independently identifiable in logs and error messages.
Config example:

```yaml
guardrails:
  enabled: true
  rules:
    - name: "safety-prompt"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "decorator"
        content: "Always be safe."
    - name: "compliance-prompt"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "inject"
        content: "Follow compliance rules."
```

https://claude.ai/code/session_011U6sHbthHcwY68FZhAEFVH
Go strings natively support unicode, so guardrail names already accept spaces, Cyrillic, CJK, accented Latin, emoji, etc. Added explicit test coverage and config examples demonstrating this. https://claude.ai/code/session_011U6sHbthHcwY68FZhAEFVH
Guardrails now operate on a normalized []Message instead of concrete ChatRequest/ResponsesRequest types. Adapters in GuardedProvider handle conversion between API-specific requests and the message DTO. This eliminates the N×M coupling (N guardrails × M endpoint types): - Guardrail interface: single Process(ctx, []Message) method - Pipeline: single Process method (was duplicated for Chat/Responses) - SystemPromptGuardrail: one implementation (was two parallel paths) - New endpoint types only need an adapter pair, no guardrail changes https://claude.ai/code/session_011U6sHbthHcwY68FZhAEFVH
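The adapter pair can be sketched like this. Type and function names (`ChatRequest`, `chatToMessages`, `applyMessagesToChat`) mirror the summary above but are illustrative stand-ins, not the project's real definitions:

```go
package main

import "fmt"

// Message is the normalized DTO guardrails operate on (illustrative).
type Message struct {
	Role, Content string
}

// ChatRequest stands in for the API-specific chat type (illustrative).
type ChatRequest struct {
	Model    string
	Messages []Message
}

// chatToMessages copies the request's messages into the normalized DTO
// slice so guardrails never touch the concrete request type.
func chatToMessages(req *ChatRequest) []Message {
	return append([]Message(nil), req.Messages...)
}

// applyMessagesToChat writes guardrail output back into a copy of the
// request, leaving the original untouched.
func applyMessagesToChat(req *ChatRequest, msgs []Message) *ChatRequest {
	out := *req
	out.Messages = msgs
	return &out
}

func main() {
	req := &ChatRequest{Model: "gpt-4", Messages: []Message{{Role: "user", Content: "hi"}}}
	msgs := chatToMessages(req)
	// A guardrail mutates only the normalized slice.
	msgs = append([]Message{{Role: "system", Content: "Be safe."}}, msgs...)
	modified := applyMessagesToChat(req, msgs)
	fmt.Println(len(modified.Messages), len(req.Messages))
}
```

A new endpoint type then only needs its own `xToMessages`/`applyMessagesToX` pair; existing guardrails work unchanged.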
Mintlify-based docs covering guardrails overview, configuration, execution order, system_prompt modes (inject/override/decorator), and examples for parallel and sequential pipelines. https://claude.ai/code/session_011U6sHbthHcwY68FZhAEFVH
📝 Walkthrough

Introduces a pluggable guardrails system enabling request validation and modification via a configurable pipeline. Includes message normalization, a parallel/sequential execution engine, a system-prompt guardrail implementation, request adapters, and full application integration with comprehensive test coverage.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor Client
    participant GuardedProvider
    participant Pipeline
    participant Guardrails
    participant RoutableProvider as RoutableProvider<br/>(LLM)

    Client->>GuardedProvider: ChatCompletion(request)
    activate GuardedProvider
    GuardedProvider->>GuardedProvider: chatToMessages(request)
    GuardedProvider->>Pipeline: Process(ctx, messages)
    activate Pipeline
    Pipeline->>Pipeline: Group guardrails by order
    loop Sequential Groups
        Pipeline->>Guardrails: Execute parallel guardrails<br/>for current group
        activate Guardrails
        Guardrails->>Guardrails: Process messages<br/>(inject/override/decorate)
        Guardrails-->>Pipeline: Modified messages
        deactivate Guardrails
        Pipeline->>Pipeline: Use modified messages<br/>for next group
    end
    Pipeline-->>GuardedProvider: Final messages or error
    deactivate Pipeline
    alt Guardrail Error
        GuardedProvider-->>Client: Reject request (400/403)
    else Success
        GuardedProvider->>GuardedProvider: applyMessagesToChat(messages)
        GuardedProvider->>RoutableProvider: ChatCompletion(modified_request)
        activate RoutableProvider
        RoutableProvider-->>GuardedProvider: ChatResponse
        deactivate RoutableProvider
        GuardedProvider-->>Client: ChatResponse
    end
    deactivate GuardedProvider
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks: ✅ 1 passed | ❌ 3 failed (3 warnings)
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/app/app.go (1)

Lines 138-184: ⚠️ Potential issue | 🟡 Minor

Race condition in TestPipeline_MixedOrders_GroupsExecuteCorrectly trace slice.

In pipeline_test.go lines 138-184, guardrails A and C both run at order 0 (parallel). Guardrail A appends to the shared trace slice (line 147) without synchronization. While C avoids writing, if the pipeline implementation runs both concurrently, even A's append alongside any concurrent read of trace is a data race. The test passes by luck because the assertion only checks after Process returns.

However, if you run tests with -race, this could flag depending on pipeline internals. The comment on lines 165-166 acknowledges this, but the test still appends from A inside a parallel group.
🤖 Fix all issues with AI agents
In `@config/config.go`:
- Around line 54-69: The GuardrailRuleConfig currently embeds
SystemPromptSettings directly (SystemPrompt field) which is a flat-union pattern
that will accumulate dead fields as new guardrail types are added; update the
comment above the GuardrailRuleConfig struct to explicitly document this
trade-off and intended extensibility (mention that SystemPrompt is only used
when Type == "system_prompt", that other types may add their own settings or a
true tagged union could be introduced later) so future maintainers understand
why the flat pattern was chosen and how to evolve it.
In `@docs/advanced/guardrails.mdx`:
- Around line 194-215: Update the "Multiple Guardrails in Parallel" section
(header "Multiple Guardrails in Parallel") to state the merge strategy: clarify
that when multiple guardrails share the same order they run concurrently on the
same input but only the last guardrail's output (by registration/order in the
rules list, e.g. "safety-prompt" vs "compliance-prompt") is forwarded to the
next pipeline stage; earlier guardrail modifications are discarded. Add a short
note next to the YAML example explaining this behavior and, if helpful,
recommend reordering or using different orders to compose cumulative
modifications.
In `@internal/guardrails/pipeline.go`:
- Around line 30-34: Pipeline.Add is not safe for concurrent use; update the API
contract by adding a clear doc comment to the Pipeline type and/or the Add
method stating that Add must be called only during setup (before calling
Pipeline.Process) and is not goroutine-safe because it appends to the entries
slice. Reference the Pipeline struct, the Add(g Guardrail, order int) method,
and the entries field in the comment so callers know to build the pipeline
single-threaded (or alternately protect entries with a sync.Mutex if you want
concurrent adds).
- Around line 110-117: The loop launching parallel guardrails should use a
derived cancellable context so other goroutines stop when one fails: before the
for loop create newCtx, cancel := context.WithCancel(ctx) and defer cancel();
pass newCtx (not the outer ctx) into g.Process; after writing results[idx] if
err != nil call cancel() to propagate cancellation; keep wg and result storage
as-is but ensure you reference the loop variables (group, g.Process, results,
wg, ctx/result) correctly to avoid closure capture issues.
- Around line 103-129: runGroupParallel currently runs guardrails in parallel
and then overwrites outputs so only the last guardrail's modifications survive;
to fix, make runGroupParallel apply guardrails sequentially in registration
order by calling each entry.guardrail.Process(ctx, currentMsgs) one after
another (feeding the previous output into the next) instead of launching
goroutines, check and immediately return on any error from Guardrail.Process,
and update uses of results/result to reflect the sequential flow (use a single
currentMsgs variable rather than collecting concurrent results). Ensure you
reference runGroupParallel, entry.guardrail (or Guardrail), Guardrail.Process,
the msgs/current variable, and the result handling when implementing this
change.
In `@internal/guardrails/provider_test.go`:
- Around line 60-89: In TestGuardedProvider_ChatCompletion_AppliesGuardrails
(and the other test sites calling NewSystemPromptGuardrail), stop discarding the
returned error: capture the two returns (g, err) from NewSystemPromptGuardrail
and call t.Fatal(err) (or t.Fatalf with context) if err != nil so the test fails
fast on unexpected validation errors; update each occurrence that currently uses
g, _ to use g, err and handle the error accordingly.
In `@internal/guardrails/provider.go`:
- Around line 134-147: The current applyMessagesToResponses function merges
multiple system-role Messages into a single ResponsesRequest.Instructions string
with "\n", which is lossy; update applyMessagesToResponses to join system
messages using an explicit, easily-parsable separator (e.g. "\n\n---\n\n")
instead of a single newline and add a short code comment above
applyMessagesToResponses documenting this concatenation semantic so callers know
multiple guardrail outputs are preserved and distinguishable; reference the
applyMessagesToResponses function and the ResponsesRequest.Instructions field
when making the change.
```go
// GuardrailRuleConfig defines a single guardrail instance.
type GuardrailRuleConfig struct {
	// Name is a unique identifier for this guardrail instance (used in logs and errors)
	Name string `yaml:"name"`

	// Type selects the guardrail implementation: "system_prompt"
	Type string `yaml:"type"`

	// Order controls execution ordering relative to other guardrails.
	// Guardrails with the same order run in parallel; different orders run sequentially.
	// Default: 0
	Order int `yaml:"order"`

	// SystemPrompt holds settings when Type is "system_prompt"
	SystemPrompt SystemPromptSettings `yaml:"system_prompt"`
}
```
🧹 Nitpick | 🔵 Trivial
Flat union config pattern is acceptable for now but consider documenting extensibility.
GuardrailRuleConfig embeds SystemPrompt settings directly regardless of Type. This works well with a single type but will accumulate dead fields as more guardrail types are added. Consider adding a brief comment noting this trade-off.
```go
// Add appends a guardrail with the given execution order.
// Guardrails with the same order run in parallel; different orders run sequentially.
func (p *Pipeline) Add(g Guardrail, order int) {
	p.entries = append(p.entries, entry{guardrail: g, order: order})
}
```
🧹 Nitpick | 🔵 Trivial
Pipeline.Add is not safe for concurrent use.
If Add could be called from multiple goroutines during setup (e.g., dynamically loaded config), the slice append is racy. This is likely fine if the pipeline is built once at startup, but a brief doc comment clarifying that Add must be called before Process would prevent misuse.
Clarify concurrency contract:

```diff
 // Add appends a guardrail with the given execution order.
 // Guardrails with the same order run in parallel; different orders run sequentially.
+// Add must not be called concurrently or after Process has been called.
 func (p *Pipeline) Add(g Guardrail, order int) {
```
```go
// runGroupParallel runs all guardrails in a group concurrently on the same input.
// If any returns an error, the group fails. Modifications are applied
// in registration order (slice order) after all complete.
func runGroupParallel(ctx context.Context, group []entry, msgs []Message) ([]Message, error) {
	results := make([]result, len(group))
	var wg sync.WaitGroup

	for i, e := range group {
		wg.Add(1)
		go func(idx int, g Guardrail) {
			defer wg.Done()
			modified, err := g.Process(ctx, msgs)
			results[idx] = result{msgs: modified, err: err}
		}(i, e.guardrail)
	}

	wg.Wait()

	// Check for errors and take last successful modification (registration order)
	current := msgs
	for i, r := range results {
		if r.err != nil {
			return nil, fmt.Errorf("guardrail %q: %w", group[i].guardrail.Name(), r.err)
		}
		current = r.msgs
	}
	return current, nil
}
```
Parallel group execution silently discards all modifications except the last guardrail's.
All guardrails in a parallel group receive the same input msgs, run concurrently, and then at lines 122-128 the results are iterated in order — but each current = r.msgs simply overwrites the previous result. Only the last guardrail's output survives. Earlier guardrails' modifications are silently discarded.
The doc comment at lines 104-105 says "Modifications are applied in registration order" which implies a merge/chain, but the actual behavior is "last writer wins." This is a meaningful distinction that could cause silent loss of guardrail effects when users configure multiple guardrails at the same order.
Consider one of:

- Chain sequentially within the group — feed each guardrail's output as the next one's input (true "apply in order").
- Document the "last wins" semantic explicitly and warn users that only validation/error guardrails are useful for non-last entries in a group.
- Implement a merge strategy that combines modifications from all parallel guardrails.
Option 1: Chain results sequentially within the group (preserves all modifications)

```diff
 func runGroupParallel(ctx context.Context, group []entry, msgs []Message) ([]Message, error) {
 	results := make([]result, len(group))
 	var wg sync.WaitGroup
 	for i, e := range group {
 		wg.Add(1)
 		go func(idx int, g Guardrail) {
 			defer wg.Done()
 			modified, err := g.Process(ctx, msgs)
 			results[idx] = result{msgs: modified, err: err}
 		}(i, e.guardrail)
 	}
 	wg.Wait()
-	// Check for errors and take last successful modification (registration order)
-	current := msgs
+	// Check for errors first
 	for i, r := range results {
 		if r.err != nil {
 			return nil, fmt.Errorf("guardrail %q: %w", group[i].guardrail.Name(), r.err)
 		}
-		current = r.msgs
 	}
+
+	// Chain results: feed each guardrail's output as next input
+	current := msgs
+	for _, r := range results {
+		current = r.msgs
+	}
 	return current, nil
 }
```

Note: even Option 1 still has the problem that each guardrail ran on the original msgs, not on the previous guardrail's output. If true chaining is desired, the goroutines should run sequentially, or a different merge strategy is needed. The current parallel design fundamentally means guardrails don't see each other's changes.
```go
	for i, e := range group {
		wg.Add(1)
		go func(idx int, g Guardrail) {
			defer wg.Done()
			modified, err := g.Process(ctx, msgs)
			results[idx] = result{msgs: modified, err: err}
		}(i, e.guardrail)
	}
```
🧹 Nitpick | 🔵 Trivial
No cancellation propagation on parallel guardrail failure.
When one goroutine's guardrail returns an error, the other goroutines continue running to completion (they all share the same ctx but nothing cancels it). For long-running or external guardrails, consider deriving a cancellable context and cancelling it on the first error to avoid wasted work.
Proposed fix using a derived context:

```diff
 func runGroupParallel(ctx context.Context, group []entry, msgs []Message) ([]Message, error) {
+	ctx, cancel := context.WithCancel(ctx)
+	defer cancel()
+
 	results := make([]result, len(group))
 	var wg sync.WaitGroup
 	for i, e := range group {
 		wg.Add(1)
 		go func(idx int, g Guardrail) {
 			defer wg.Done()
 			modified, err := g.Process(ctx, msgs)
+			if err != nil {
+				cancel()
+			}
 			results[idx] = result{msgs: modified, err: err}
 		}(i, e.guardrail)
 	}
```
```go
func TestGuardedProvider_ChatCompletion_AppliesGuardrails(t *testing.T) {
	inner := &mockRoutableProvider{}
	pipeline := NewPipeline()

	g, _ := NewSystemPromptGuardrail("test", SystemPromptInject, "guardrail system")
	pipeline.Add(g, 0)

	guarded := NewGuardedProvider(inner, pipeline)

	req := &core.ChatRequest{
		Model:    "gpt-4",
		Messages: []core.Message{{Role: "user", Content: "hello"}},
	}

	_, err := guarded.ChatCompletion(context.Background(), req)
	if err != nil {
		t.Fatal(err)
	}

	// Verify the inner provider received the modified request
	if inner.chatReq == nil {
		t.Fatal("inner provider was not called")
	}
	if len(inner.chatReq.Messages) != 2 {
		t.Fatalf("expected 2 messages, got %d", len(inner.chatReq.Messages))
	}
	if inner.chatReq.Messages[0].Role != "system" || inner.chatReq.Messages[0].Content != "guardrail system" {
		t.Errorf("expected injected system message, got %+v", inner.chatReq.Messages[0])
	}
}
```
🧹 Nitpick | 🔵 Trivial
Minor: Consider checking the error from NewSystemPromptGuardrail.
Throughout the test file (lines 64, 95, 123, 168, 189, 215, 265, 290, 315), the error return from NewSystemPromptGuardrail is silently discarded with _. While the inputs are known-valid constants, a t.Fatal on error would make tests more robust against future refactors that might change validation rules.
Example fix (apply similarly to other call sites):

```diff
-	g, _ := NewSystemPromptGuardrail("test", SystemPromptInject, "guardrail system")
+	g, err := NewSystemPromptGuardrail("test", SystemPromptInject, "guardrail system")
+	if err != nil {
+		t.Fatal(err)
+	}
```
```go
func applyMessagesToResponses(req *core.ResponsesRequest, msgs []Message) *core.ResponsesRequest {
	result := *req
	var instructions string
	for _, m := range msgs {
		if m.Role == "system" {
			if instructions != "" {
				instructions += "\n"
			}
			instructions += m.Content
		}
	}
	result.Instructions = instructions
	return &result
}
```
🧹 Nitpick | 🔵 Trivial
Note: multiple system messages in guardrail output are concatenated with newlines for Responses.
applyMessagesToResponses joins all system-role messages with \n into a single Instructions string. This works but is a lossy operation — if a guardrail pipeline produces multiple system messages (e.g., from multiple guardrails in different groups), they're merged with no delimiter distinction. Ensure this concatenation semantic is intentional and documented.
Keep ASCII as HTML comments for readability when charts don't render. https://claude.ai/code/session_011U6sHbthHcwY68FZhAEFVH