Docs: Add guardrails docs #70

Merged
SantiagoDePolonia merged 10 commits into main from
claude/add-guardrails-docs-nu1RR
Feb 15, 2026

Conversation

@SantiagoDePolonia
Contributor

@SantiagoDePolonia SantiagoDePolonia commented Feb 15, 2026

Summary by CodeRabbit

  • New Features

    • Introduced a guardrails system to inspect and modify requests before processing through a configurable pipeline with parallel and sequential execution options.
    • Added system_prompt guardrail type with three modes: inject (add system message if missing), override (replace system messages), and decorator (prepend content to existing system message).
  • Documentation

    • Added comprehensive guardrails documentation with configuration examples, Quick Start guide, and detailed usage patterns.

…orator

Implement a pluggable guardrails pipeline that intercepts requests before
they reach providers. Guardrails can be executed sequentially (chained) or
in parallel (concurrent with ordered application). The pipeline wraps the
router as a RoutableProvider decorator, keeping handler code unchanged.

The first guardrail is a system prompt guardrail with three modes:
- inject: adds a system message only if none exists
- override: replaces all existing system messages
- decorator: prepends configured content to the first system message

Both ChatCompletion and Responses API endpoints are supported. All
configuration is driven from config.yaml and environment variables.

https://claude.ai/code/session_011U6sHbthHcwY68FZhAEFVH
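The three modes described above can be sketched as follows. This is a minimal illustration only — `Message` and `applySystemPrompt` are invented names for this sketch, not the actual implementation in internal/guardrails/system_prompt.go:

```go
package main

import "fmt"

// Message is an illustrative message DTO (role + content).
type Message struct {
	Role    string
	Content string
}

// applySystemPrompt sketches the three modes from the commit message:
// inject, override, and decorator.
func applySystemPrompt(mode, content string, msgs []Message) []Message {
	idx := -1
	for i, m := range msgs {
		if m.Role == "system" {
			idx = i
			break
		}
	}
	switch mode {
	case "inject": // add a system message only if none exists
		if idx == -1 {
			return append([]Message{{Role: "system", Content: content}}, msgs...)
		}
		return msgs
	case "override": // replace all existing system messages
		out := []Message{{Role: "system", Content: content}}
		for _, m := range msgs {
			if m.Role != "system" {
				out = append(out, m)
			}
		}
		return out
	case "decorator": // prepend configured content to the first system message
		if idx == -1 {
			return msgs
		}
		out := append([]Message(nil), msgs...)
		out[idx].Content = content + "\n" + out[idx].Content
		return out
	}
	return msgs
}

func main() {
	msgs := []Message{{Role: "user", Content: "hi"}}
	fmt.Println(applySystemPrompt("inject", "Be safe.", msgs)[0].Role) // system
}
```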
…d grouping

Instead of a global "sequential" or "parallel" execution mode, each
guardrail now has an "order" field. Guardrails with the same order
value run in parallel (concurrently), and groups execute sequentially
in ascending order. Single-entry groups skip goroutine overhead.

This enables mixed execution patterns like:
  order 0: [guardrail A, guardrail B] → run in parallel
  order 1: [guardrail C]              → runs after group 0
  order 2: [guardrail D, guardrail E] → run in parallel after C

Adds a concurrency test proving that same-order guardrails truly run in
parallel, using atomic counters and barriers.

https://claude.ai/code/session_011U6sHbthHcwY68FZhAEFVH
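The order-based grouping described above can be sketched like this. `entry` and `groupByOrder` are illustrative names, and this sketch only covers the grouping step — the real pipeline additionally runs each group's entries concurrently:

```go
package main

import (
	"fmt"
	"sort"
)

type entry struct {
	name  string
	order int
}

// groupByOrder collects entries sharing an order value into one group,
// and returns the groups in ascending order. Within a group, entries
// keep their registration (input) order.
func groupByOrder(entries []entry) [][]entry {
	byOrder := map[int][]entry{}
	for _, e := range entries {
		byOrder[e.order] = append(byOrder[e.order], e)
	}
	orders := make([]int, 0, len(byOrder))
	for o := range byOrder {
		orders = append(orders, o)
	}
	sort.Ints(orders)
	groups := make([][]entry, 0, len(orders))
	for _, o := range orders {
		groups = append(groups, byOrder[o])
	}
	return groups
}

func main() {
	groups := groupByOrder([]entry{
		{"A", 0}, {"B", 0}, {"C", 1}, {"D", 2}, {"E", 2},
	})
	for _, g := range groups {
		fmt.Println(len(g))
	}
	// prints 2, 1, 2 — groups [A B], [C], [D E]
}
```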
Replace the single-struct guardrail config with a list of named rule
instances. Each rule has a name, type, order, and type-specific settings.
Multiple instances of the same type (e.g. two system_prompt guardrails
with different content) are fully supported.

SystemPromptGuardrail now takes a name parameter so each instance is
independently identifiable in logs and error messages.

Config example:
  guardrails:
    enabled: true
    rules:
      - name: "safety-prompt"
        type: "system_prompt"
        order: 0
        system_prompt:
          mode: "decorator"
          content: "Always be safe."
      - name: "compliance-prompt"
        type: "system_prompt"
        order: 0
        system_prompt:
          mode: "inject"
          content: "Follow compliance rules."

https://claude.ai/code/session_011U6sHbthHcwY68FZhAEFVH
Go strings natively support unicode, so guardrail names already accept
spaces, Cyrillic, CJK, accented Latin, emoji, etc. Added explicit test
coverage and config examples demonstrating this.

https://claude.ai/code/session_011U6sHbthHcwY68FZhAEFVH
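For illustration, a config fragment with non-ASCII rule names in the style of the earlier example (the names below are invented; any valid YAML string works):

```yaml
guardrails:
  enabled: true
  rules:
    - name: "защита 🛡️"
      type: "system_prompt"
      order: 0
      system_prompt:
        mode: "inject"
        content: "Always be safe."
    - name: "合规提示"
      type: "system_prompt"
      order: 1
      system_prompt:
        mode: "decorator"
        content: "Follow compliance rules."
```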
Guardrails now operate on a normalized []Message instead of concrete
ChatRequest/ResponsesRequest types. Adapters in GuardedProvider handle
conversion between API-specific requests and the message DTO.

This eliminates the N×M coupling (N guardrails × M endpoint types):
- Guardrail interface: single Process(ctx, []Message) method
- Pipeline: single Process method (was duplicated for Chat/Responses)
- SystemPromptGuardrail: one implementation (was two parallel paths)
- New endpoint types only need an adapter pair, no guardrail changes

https://claude.ai/code/session_011U6sHbthHcwY68FZhAEFVH
Mintlify-based docs covering guardrails overview, configuration,
execution order, system_prompt modes (inject/override/decorator),
and examples for parallel and sequential pipelines.

https://claude.ai/code/session_011U6sHbthHcwY68FZhAEFVH
@SantiagoDePolonia SantiagoDePolonia self-assigned this Feb 15, 2026
@coderabbitai
Contributor

coderabbitai bot commented Feb 15, 2026

Warning

Rate limit exceeded

@SantiagoDePolonia has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 14 minutes and 1 second before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📝 Walkthrough

Introduces a pluggable guardrails system enabling request validation and modification via a configurable pipeline. Includes message normalization, a parallel/sequential execution engine, system-prompt guardrail implementation, request adapters, and full application integration with comprehensive test coverage.

Changes

Configuration — config/config.example.yaml, config/config.go
Added GuardrailsConfig, GuardrailRuleConfig, and SystemPromptSettings types; enabled declarative guardrails configuration with a rules array supporting name, type, order, and mode fields.

Documentation — docs/advanced/guardrails.mdx, docs/docs.json
Added a comprehensive guardrails guide covering quick start, pipeline architecture, execution order semantics, configuration reference, type definitions, and practical examples; updated the navigation index.

Core Guardrails Framework — internal/guardrails/guardrails.go, internal/guardrails/pipeline.go, internal/guardrails/system_prompt.go
Introduced the Guardrail interface and Message DTO for normalized request representation; implemented Pipeline with sequential group processing and intra-group parallelism; added SystemPromptGuardrail with inject, override, and decorator modes.

Request Adapter Layer — internal/guardrails/provider.go
Created the GuardedProvider wrapper implementing the RoutableProvider interface; converts ChatCompletion and Responses request types to/from the normalized Message format; applies the pipeline before delegating to the inner provider.

Application Integration — internal/app/app.go
Wired guardrails initialization into the startup flow; added buildGuardrailsPipeline and buildGuardrail helpers to construct the pipeline from configuration; conditionally wraps the provider with GuardedProvider when enabled.

Test Coverage — internal/guardrails/pipeline_test.go, internal/guardrails/provider_test.go, internal/guardrails/system_prompt_test.go
Added 374 + 485 + 245 lines of tests covering pipeline sequencing, parallel execution, error handling, message transformation, guardrail processing modes, and request adapter conversions.

Sequence Diagram(s)

sequenceDiagram
    actor Client
    participant GuardedProvider
    participant Pipeline
    participant Guardrails
    participant RoutableProvider as RoutableProvider<br/>(LLM)
    
    Client->>GuardedProvider: ChatCompletion(request)
    activate GuardedProvider
    GuardedProvider->>GuardedProvider: chatToMessages(request)
    GuardedProvider->>Pipeline: Process(ctx, messages)
    activate Pipeline
    Pipeline->>Pipeline: Group guardrails by order
    
    loop Sequential Groups
        Pipeline->>Guardrails: Execute parallel guardrails<br/>for current group
        activate Guardrails
        Guardrails->>Guardrails: Process messages<br/>(inject/override/decorate)
        Guardrails-->>Pipeline: Modified messages
        deactivate Guardrails
        Pipeline->>Pipeline: Use modified messages<br/>for next group
    end
    
    Pipeline-->>GuardedProvider: Final messages or error
    deactivate Pipeline
    
    alt Guardrail Error
        GuardedProvider-->>Client: Reject request (400/403)
    else Success
        GuardedProvider->>GuardedProvider: applyMessagesToChat(messages)
        GuardedProvider->>RoutableProvider: ChatCompletion(modified_request)
        activate RoutableProvider
        RoutableProvider-->>GuardedProvider: ChatResponse
        deactivate RoutableProvider
        GuardedProvider-->>Client: ChatResponse
    end
    deactivate GuardedProvider

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 Whiskers twitch with guardrail glee,
A pipeline flows so sequentially—
Parallel hops through orders bright,
Messages dance left and right,
Safety prompts with decorator's art,
Guard the requests from the very start! ✨

🚥 Pre-merge checks | ✅ 1 | ❌ 3

❌ Failed checks (3 warnings)

  • Title check ⚠️ Warning — The title 'Docs: Add guardrails docs' is misleading; the PR includes substantial guardrails implementation code (config, app integration, pipeline, provider, system_prompt) alongside documentation. Resolution: update the title to reflect the full scope, e.g. 'feat: Implement guardrails system with config, pipeline, provider, and docs' or 'feat: Add guardrails feature with documentation'.

  • Docstring Coverage ⚠️ Warning — Docstring coverage is 23.08%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

  • Merge Conflict Detection ⚠️ Warning — Merge conflicts detected (3 files):
    ⚔️ .github/workflows/release.yml (content)
    ⚔️ .github/workflows/test.yml (content)
    ⚔️ docs/docs.json (content)
    These conflicts must be resolved before merging into main. Resolution: resolve conflicts locally and push changes to this branch.

✅ Passed checks (1 passed)

  • Description Check ✅ Passed — Check skipped; CodeRabbit's high-level summary is enabled.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/app/app.go (1)

138-184: ⚠️ Potential issue | 🟡 Minor

Race condition in TestPipeline_MixedOrders_GroupsExecuteCorrectly trace slice.

In pipeline_test.go lines 138-184, guardrails A and C both run at order 0 (parallel). Guardrail A appends to the shared trace slice (line 147) without synchronization. While C avoids writing, if the pipeline implementation runs both concurrently, even A's append alongside any concurrent read of trace is a data race. The test passes by luck because the assertion only checks after Process returns.

However, running the tests with -race could flag this, depending on pipeline internals. The comment on lines 165-166 acknowledges this, but the test still appends from A inside a parallel group.

🤖 Fix all issues with AI agents
In `@config/config.go`:
- Around line 54-69: The GuardrailRuleConfig currently embeds
SystemPromptSettings directly (SystemPrompt field) which is a flat-union pattern
that will accumulate dead fields as new guardrail types are added; update the
comment above the GuardrailRuleConfig struct to explicitly document this
trade-off and intended extensibility (mention that SystemPrompt is only used
when Type == "system_prompt", that other types may add their own settings or a
true tagged union could be introduced later) so future maintainers understand
why the flat pattern was chosen and how to evolve it.

In `@docs/advanced/guardrails.mdx`:
- Around line 194-215: Update the "Multiple Guardrails in Parallel" section
(header "Multiple Guardrails in Parallel") to state the merge strategy: clarify
that when multiple guardrails share the same order they run concurrently on the
same input but only the last guardrail's output (by registration/order in the
rules list, e.g. "safety-prompt" vs "compliance-prompt") is forwarded to the
next pipeline stage; earlier guardrail modifications are discarded. Add a short
note next to the YAML example explaining this behavior and, if helpful,
recommend reordering or using different orders to compose cumulative
modifications.

In `@internal/guardrails/pipeline.go`:
- Around line 30-34: Pipeline.Add is not safe for concurrent use; update the API
contract by adding a clear doc comment to the Pipeline type and/or the Add
method stating that Add must be called only during setup (before calling
Pipeline.Process) and is not goroutine-safe because it appends to the entries
slice. Reference the Pipeline struct, the Add(g Guardrail, order int) method,
and the entries field in the comment so callers know to build the pipeline
single-threaded (or alternately protect entries with a sync.Mutex if you want
concurrent adds).
- Around line 110-117: The loop launching parallel guardrails should use a
derived cancellable context so other goroutines stop when one fails: before the
for loop create newCtx, cancel := context.WithCancel(ctx) and defer cancel();
pass newCtx (not the outer ctx) into g.Process; after writing results[idx] if
err != nil call cancel() to propagate cancellation; keep wg and result storage
as-is but ensure you reference the loop variables (group, g.Process, results,
wg, ctx/result) correctly to avoid closure capture issues.
- Around line 103-129: runGroupParallel currently runs guardrails in parallel
and then overwrites outputs so only the last guardrail's modifications survive;
to fix, make runGroupParallel apply guardrails sequentially in registration
order by calling each entry.guardrail.Process(ctx, currentMsgs) one after
another (feeding the previous output into the next) instead of launching
goroutines, check and immediately return on any error from Guardrail.Process,
and update uses of results/result to reflect the sequential flow (use a single
currentMsgs variable rather than collecting concurrent results). Ensure you
reference runGroupParallel, entry.guardrail (or Guardrail), Guardrail.Process,
the msgs/current variable, and the result handling when implementing this
change.

In `@internal/guardrails/provider_test.go`:
- Around line 60-89: In TestGuardedProvider_ChatCompletion_AppliesGuardrails
(and the other test sites calling NewSystemPromptGuardrail), stop discarding the
returned error: capture the two returns (g, err) from NewSystemPromptGuardrail
and call t.Fatal(err) (or t.Fatalf with context) if err != nil so the test fails
fast on unexpected validation errors; update each occurrence that currently uses
g, _ to use g, err and handle the error accordingly.

In `@internal/guardrails/provider.go`:
- Around line 134-147: The current applyMessagesToResponses function merges
multiple system-role Messages into a single ResponsesRequest.Instructions string
with "\n", which is lossy; update applyMessagesToResponses to join system
messages using an explicit, easily-parsable separator (e.g. "\n\n---\n\n")
instead of a single newline and add a short code comment above
applyMessagesToResponses documenting this concatenation semantic so callers know
multiple guardrail outputs are preserved and distinguishable; reference the
applyMessagesToResponses function and the ResponsesRequest.Instructions field
when making the change.

Comment on lines +54 to +69
// GuardrailRuleConfig defines a single guardrail instance.
type GuardrailRuleConfig struct {
	// Name is a unique identifier for this guardrail instance (used in logs and errors)
	Name string `yaml:"name"`

	// Type selects the guardrail implementation: "system_prompt"
	Type string `yaml:"type"`

	// Order controls execution ordering relative to other guardrails.
	// Guardrails with the same order run in parallel; different orders run sequentially.
	// Default: 0
	Order int `yaml:"order"`

	// SystemPrompt holds settings when Type is "system_prompt"
	SystemPrompt SystemPromptSettings `yaml:"system_prompt"`
}

🧹 Nitpick | 🔵 Trivial

Flat union config pattern is acceptable for now but consider documenting extensibility.

GuardrailRuleConfig embeds SystemPrompt settings directly regardless of Type. This works well with a single type but will accumulate dead fields as more guardrail types are added. Consider adding a brief comment noting this trade-off.


Comment on lines +30 to +34
// Add appends a guardrail with the given execution order.
// Guardrails with the same order run in parallel; different orders run sequentially.
func (p *Pipeline) Add(g Guardrail, order int) {
	p.entries = append(p.entries, entry{guardrail: g, order: order})
}
🧹 Nitpick | 🔵 Trivial

Pipeline.Add is not safe for concurrent use.

If Add could be called from multiple goroutines during setup (e.g., dynamically loaded config), the slice append is racy. This is likely fine if the pipeline is built once at startup, but a brief doc comment clarifying that Add must be called before Process would prevent misuse.

Clarify concurrency contract
 // Add appends a guardrail with the given execution order.
 // Guardrails with the same order run in parallel; different orders run sequentially.
+// Add must not be called concurrently or after Process has been called.
 func (p *Pipeline) Add(g Guardrail, order int) {

Comment on lines +103 to +129
// runGroupParallel runs all guardrails in a group concurrently on the same input.
// If any returns an error, the group fails. Modifications are applied
// in registration order (slice order) after all complete.
func runGroupParallel(ctx context.Context, group []entry, msgs []Message) ([]Message, error) {
	results := make([]result, len(group))
	var wg sync.WaitGroup

	for i, e := range group {
		wg.Add(1)
		go func(idx int, g Guardrail) {
			defer wg.Done()
			modified, err := g.Process(ctx, msgs)
			results[idx] = result{msgs: modified, err: err}
		}(i, e.guardrail)
	}

	wg.Wait()

	// Check for errors and take last successful modification (registration order)
	current := msgs
	for i, r := range results {
		if r.err != nil {
			return nil, fmt.Errorf("guardrail %q: %w", group[i].guardrail.Name(), r.err)
		}
		current = r.msgs
	}
	return current, nil

⚠️ Potential issue | 🟠 Major

Parallel group execution silently discards all modifications except the last guardrail's.

All guardrails in a parallel group receive the same input msgs, run concurrently, and then at lines 122-128 the results are iterated in order — but each current = r.msgs simply overwrites the previous result. Only the last guardrail's output survives. Earlier guardrails' modifications are silently discarded.

The doc comment at lines 104-105 says "Modifications are applied in registration order" which implies a merge/chain, but the actual behavior is "last writer wins." This is a meaningful distinction that could cause silent loss of guardrail effects when users configure multiple guardrails at the same order.

Consider one of:

  1. Chain sequentially within the group — feed each guardrail's output as the next one's input (true "apply in order").
  2. Document the "last wins" semantic explicitly and warn users that only validation/error guardrails are useful for non-last entries in a group.
  3. Implement a merge strategy that combines modifications from all parallel guardrails.
Option 1: Chain results sequentially within the group (preserves all modifications)
 func runGroupParallel(ctx context.Context, group []entry, msgs []Message) ([]Message, error) {
 	results := make([]result, len(group))
 	var wg sync.WaitGroup
 
 	for i, e := range group {
 		wg.Add(1)
 		go func(idx int, g Guardrail) {
 			defer wg.Done()
 			modified, err := g.Process(ctx, msgs)
 			results[idx] = result{msgs: modified, err: err}
 		}(i, e.guardrail)
 	}
 
 	wg.Wait()
 
-	// Check for errors and take last successful modification (registration order)
-	current := msgs
+	// Check for errors first
 	for i, r := range results {
 		if r.err != nil {
 			return nil, fmt.Errorf("guardrail %q: %w", group[i].guardrail.Name(), r.err)
 		}
-		current = r.msgs
 	}
+
+	// Chain results: feed each guardrail's output as next input
+	current := msgs
+	for _, r := range results {
+		current = r.msgs
+	}
 	return current, nil
 }

Note: Even Option 1 still has the problem that each guardrail ran on the original msgs, not on the previous guardrail's output. If true chaining is desired, the goroutines should run sequentially or a different merge strategy is needed. The current parallel design fundamentally means guardrails don't see each other's changes.


Comment on lines +110 to +117
for i, e := range group {
	wg.Add(1)
	go func(idx int, g Guardrail) {
		defer wg.Done()
		modified, err := g.Process(ctx, msgs)
		results[idx] = result{msgs: modified, err: err}
	}(i, e.guardrail)
}

🧹 Nitpick | 🔵 Trivial

No cancellation propagation on parallel guardrail failure.

When one goroutine's guardrail returns an error, the other goroutines continue running to completion (they all share the same ctx but nothing cancels it). For long-running or external guardrails, consider deriving a cancellable context and cancelling it on the first error to avoid wasted work.

Proposed fix using a derived context
 func runGroupParallel(ctx context.Context, group []entry, msgs []Message) ([]Message, error) {
+	ctx, cancel := context.WithCancel(ctx)
+	defer cancel()
+
 	results := make([]result, len(group))
 	var wg sync.WaitGroup
 
 	for i, e := range group {
 		wg.Add(1)
 		go func(idx int, g Guardrail) {
 			defer wg.Done()
 			modified, err := g.Process(ctx, msgs)
+			if err != nil {
+				cancel()
+			}
 			results[idx] = result{msgs: modified, err: err}
 		}(i, e.guardrail)
 	}

Comment on lines +60 to +89
func TestGuardedProvider_ChatCompletion_AppliesGuardrails(t *testing.T) {
	inner := &mockRoutableProvider{}
	pipeline := NewPipeline()

	g, _ := NewSystemPromptGuardrail("test", SystemPromptInject, "guardrail system")
	pipeline.Add(g, 0)

	guarded := NewGuardedProvider(inner, pipeline)

	req := &core.ChatRequest{
		Model:    "gpt-4",
		Messages: []core.Message{{Role: "user", Content: "hello"}},
	}

	_, err := guarded.ChatCompletion(context.Background(), req)
	if err != nil {
		t.Fatal(err)
	}

	// Verify the inner provider received the modified request
	if inner.chatReq == nil {
		t.Fatal("inner provider was not called")
	}
	if len(inner.chatReq.Messages) != 2 {
		t.Fatalf("expected 2 messages, got %d", len(inner.chatReq.Messages))
	}
	if inner.chatReq.Messages[0].Role != "system" || inner.chatReq.Messages[0].Content != "guardrail system" {
		t.Errorf("expected injected system message, got %+v", inner.chatReq.Messages[0])
	}
}

🧹 Nitpick | 🔵 Trivial

Minor: Consider checking the error from NewSystemPromptGuardrail.

Throughout the test file (lines 64, 95, 123, 168, 189, 215, 265, 290, 315), the error return from NewSystemPromptGuardrail is silently discarded with _. While the inputs are known-valid constants, a t.Fatal on error would make tests more robust against future refactors that might change validation rules.

Example fix (apply similarly to other call sites)
-	g, _ := NewSystemPromptGuardrail("test", SystemPromptInject, "guardrail system")
+	g, err := NewSystemPromptGuardrail("test", SystemPromptInject, "guardrail system")
+	if err != nil {
+		t.Fatal(err)
+	}

Comment on lines +134 to +147
func applyMessagesToResponses(req *core.ResponsesRequest, msgs []Message) *core.ResponsesRequest {
	result := *req
	var instructions string
	for _, m := range msgs {
		if m.Role == "system" {
			if instructions != "" {
				instructions += "\n"
			}
			instructions += m.Content
		}
	}
	result.Instructions = instructions
	return &result
}

🧹 Nitpick | 🔵 Trivial

Note: multiple system messages in guardrail output are concatenated with newlines for Responses.

applyMessagesToResponses joins all system-role messages with \n into a single Instructions string. This works but is a lossy operation — if a guardrail pipeline produces multiple system messages (e.g., from multiple guardrails in different groups), they're merged with no delimiter distinction. Ensure this concatenation semantic is intentional and documented.


Keep ASCII as HTML comments for readability when charts don't render.

https://claude.ai/code/session_011U6sHbthHcwY68FZhAEFVH
@SantiagoDePolonia SantiagoDePolonia merged commit 3cc2952 into main Feb 15, 2026
11 checks passed
@SantiagoDePolonia SantiagoDePolonia deleted the claude/add-guardrails-docs-nu1RR branch March 22, 2026 14:25