
fix(release): adapt gpt-5 chat params and batch aliases#199

Merged
SantiagoDePolonia merged 5 commits into main from fix/gpt5-max-tokens-batch-aliases
Apr 1, 2026

Conversation

Contributor

@SantiagoDePolonia SantiagoDePolonia commented Apr 1, 2026

Summary by CodeRabbit

  • New Features

    • Exact-match response caching and expanded GPT-5 reasoning-chat handling.
  • Documentation

    • Added cache feature docs, navigation entry, and quickstart link; clarified cache behavior, headers, analytics, config, keying, and bypass rules.
  • Bug Fixes

    • Preserve explicit provider selectors and reject unsupported provider/model combinations.
  • Tests

    • Added unit, handler, and expanded end-to-end scenarios covering auth, caching, and workflow/key behaviors.

@SantiagoDePolonia SantiagoDePolonia self-assigned this Apr 1, 2026
Contributor

coderabbitai bot commented Apr 1, 2026

Warning

Rate limit exceeded

@SantiagoDePolonia has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 6 minutes and 52 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 6 minutes and 52 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 92785516-a89d-4cdf-b51e-e074d49612b9

📥 Commits

Reviewing files that changed from the base of the PR and between 0ff9cb3 and 888e3fd.

📒 Files selected for processing (10)
  • internal/admin/dashboard/static/css/dashboard.css
  • internal/admin/dashboard/static/js/modules/audit-list.js
  • internal/admin/dashboard/static/js/modules/audit-list.test.js
  • internal/admin/dashboard/static/js/modules/auth-keys.js
  • internal/admin/dashboard/static/js/modules/auth-keys.test.js
  • internal/admin/dashboard/static/js/modules/clipboard.js
  • internal/admin/dashboard/static/js/modules/dashboard-layout.test.js
  • internal/admin/dashboard/templates/audit-pane.html
  • internal/admin/dashboard/templates/index.html
  • internal/admin/dashboard/templates/layout.html
📝 Walkthrough

Adds documentation for exact-match response caching (new feature page, nav entry, quickstart link), tweaks alias resolution to normalize and respect explicit providers, extends OpenAI provider handling to detect GPT‑5 and unify reasoning-chat parameter adaptation, and adds related tests and expanded E2E scenarios.

Changes

Cohort / File(s) Summary
Cache docs & navigation
docs/advanced/configuration.mdx, docs/docs.json, docs/features/cache.mdx, docs/getting-started/quickstart.mdx
Add a Cache feature page and nav entry, add a config-table tip linking to cache docs. Documents exact-cache scope, key computation, Redis/env config, Cache-Control bypass, and cache analytics/usage logging.
Alias resolution logic
internal/aliases/service.go
Normalize requested selector first; if selector has an explicit provider, skip alias lookup and return the normalized selector unchanged; otherwise attempt alias resolution.
Alias-related tests & batch handling
internal/aliases/batch_preparer_test.go, internal/aliases/service_test.go, internal/server/handlers_test.go
New tests asserting rejection when an explicit provider conflicts with an alias target and that ResolveModel preserves requested selector when concrete selector unavailable.
OpenAI provider updates & tests
internal/providers/openai/openai.go, internal/providers/openai/openai_test.go
Add GPT‑5 detection, broaden the reasoning-chat predicate, replace the o‑series adapter with a unified reasoning adapter that remaps max_tokens → max_completion_tokens, drops temperature, and preserves unknown top-level fields; add unit/streaming tests for GPT‑5 behavior.
E2E scenarios
tests/e2e/release-e2e-scenarios.md
Expand release E2E scenarios (S63–S79) with an auth-enabled runtime block exercising managed API keys, scoped workflows, cache warm/read and analytics validation, negatives, and cleanup.
Misc tests
internal/aliases/..., internal/server/...
Multiple new unit/integration tests across internal packages covering alias/provider interactions and batch input validation; check mocks/expectations align with updated resolve behavior.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • feat(usage): add cache analytics support #196 — Implements response-cache recording, cache analytics hooks, and cache-overview endpoints; closely related to the cache docs, analytics mentions, and cache-related tests added here.

Poem

🐇 I hopped through docs and tests with glee,

Cache keys hidden under a tree,
GPT‑5 learned reasoning's song,
Aliases stayed tidy all along,
Tiny paws, big CI spree — hooray! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): docstring coverage is 19.05%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
  • Description Check (✅ Passed): check skipped because CodeRabbit’s high-level summary is enabled.
  • Title Check (✅ Passed): the title accurately describes the main changes, GPT-5 parameter adaptation and batch alias handling improvements.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/gpt5-max-tokens-batch-aliases

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Comment on lines -125 to -130
if !requested.ExplicitProvider {
    if resolution, ok := s.resolveAlias(requested.Model); ok {
        return resolution, true, nil
    }
}

Contributor Author


@coderabbitai what are the consequences of removing it?

Contributor


🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find and display the Normalize method on RequestedModelSelector
ast-grep --pattern 'func ($_ RequestedModelSelector) Normalize() $$$'

# Find normalizeName function in the aliases package
ast-grep --pattern 'func normalizeName($_) string {
  $$$
}'

# Also look at how normalizeAlias is implemented
ast-grep --pattern 'func normalizeAlias($_) $$$'

Length of output: 1538


🏁 Script executed:

#!/bin/bash
# Find ParseModelSelector implementation
ast-grep --pattern 'func ParseModelSelector($$$) $$$'

Length of output: 1515


@SantiagoDePolonia Here's a breakdown of the consequences of removing that block:


What was removed

// OLD — ran BEFORE Normalize()
if !requested.ExplicitProvider {
    if resolution, ok := s.resolveAlias(requested.Model); ok {
        return resolution, true, nil   // early return, skipped normalization entirely
    }
}

Consequences

1. Redundant double alias lookup eliminated (main effect)

In the old code, for a non-explicit-provider request that does not match an alias, resolveAlias was called twice: once in the removed early block, and again at the bottom of the function (which still exists). Removing the early block means the lookup happens exactly once. This is purely a performance and clarity improvement.

2. Normalization now always runs before alias resolution

In the old code, if an alias was found early, Normalize() was never called. In the new code, Normalize() (which calls ParseModelSelector) runs first. Theoretically this could surface errors that were previously swallowed, but in practice it's a no-op: the only error ParseModelSelector can return is "model is empty", and resolveAlias already guards against that (via normalizeName → strings.TrimSpace, returning false for empty names). So there's no reachable input where the old code succeeded but the new code returns an error.

3. Resolution.Requested field is unaffected

Neither old nor new code overrides resolution.Requested in the non-explicit-provider path (unlike the ExplicitProvider path which does resolution.Requested = selector). Both paths rely on resolveAlias to populate it as core.ModelSelector{Model: normalizedName}, so the returned value is identical.

4. Provider prefix stripping is unaffected

ParseModelSelector auto-extracts a provider prefix (e.g., "openai/gpt-4" → model="gpt-4", provider="openai"), but resolveAlias is still called with requested.Model (the original string) in both old and new code, not with selector.Model, so alias lookup behavior is unchanged.


Summary

The removal is functionally safe. The only observable change is eliminating the redundant duplicate alias lookup for non-explicit-provider requests, making the code path consistent and easier to follow.


🧠 Learnings used
Learnt from: SantiagoDePolonia
Repo: ENTERPILOT/GOModel PR: 33
File: internal/auditlog/auditlog_test.go:272-279
Timestamp: 2026-01-15T23:07:26.646Z
Learning: Go code guideline: Avoid using interface{} and map[string]interface{} for API request/response payload types. This should be restricted to public API surfaces and data payloads. It is acceptable to use interface{} in internal infrastructure packages (e.g., storage abstractions) only to break import cycles, and only when type assertions occur at initialization (not on every request) to minimize runtime type checks and improve readability. Apply this rule broadly to files under internal/ that deal with payload-like types, but not to internal components that are pure infrastructure or have runtime-per-request type assertions.

Learnt from: SantiagoDePolonia
Repo: ENTERPILOT/GOModel PR: 33
File: internal/auditlog/factory.go:112-143
Timestamp: 2026-01-15T23:07:37.652Z
Learning: Guideline: Do not use interface{} or map[string]interface{} for API request/response payload types. Prefer strongly-typed structs for API payload definitions to improve type safety, serialization, and documentation. Allow interface{} only in internal infrastructure code paths where pragmatic flexibility is necessary (e.g., to avoid import cycles or to handle highly dynamic internal contracts). In internal/auditlog/factory.go and similar non-API implementation files, applying this restriction is optional and should be evaluated on a case-by-case basis based on whether the type remains internal and does not define API boundary shapes.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/aliases/service_test.go`:
- Around line 238-265: The test
TestServiceResolveAliasWithExplicitProviderFallbackWhenConcreteSelectorMissing
assumes ResolveModel will silently override an explicit provider; instead,
change the assertions so the resolver leaves an explicit provider selector
unchanged: call
service.ResolveModel(core.NewRequestedModelSelector("smart","openai")) and
assert that changed is false and that selector.QualifiedModel() still reflects
the original explicit provider/name (e.g., "openai/smart"), rather than
expecting "anthropic/claude-3-7-sonnet". Update expectations in the test (not
resolver code) and keep NewService, Refresh, and ResolveModel calls as-is.

In `@internal/aliases/service.go`:
- Around line 130-137: When requested.ExplicitProvider is true and
s.catalog.Supports(selector.QualifiedModel()) is false, do NOT call
s.resolveAlias(requested.Model); instead keep and return the normalized selector
unchanged so higher validation can reject provider/model mismatches. Update the
explicit-provider branch in the ResolveModel path: remove the
s.resolveAlias(requested.Model) fallback and return Resolution{Requested:
selector, Resolved: selector}, false, nil (preserving selector and not resolving
aliases) so explicit provider requests are honored.

In `@internal/providers/openai/openai_test.go`:
- Around line 1835-1894: The test
TestStreamChatCompletion_GPT5Model_AdaptsParameters is missing an assertion that
streaming GPT‑5 requests do not include a temperature field; update the server
handler in that test (the httptest http.HandlerFunc that inspects the parsed
request `raw`) to assert that `temperature` is not present (similar to the
existing `max_tokens` check) so the test fails if provider.StreamChatCompletion
or the core.ChatRequest accidentally forwards Temperature for streaming gpt‑5
models.

In `@tests/e2e/release-e2e-scenarios.md`:
- Line 674: The markdown headings (e.g., "S63 Auth-enabled dashboard runtime
config" and the other flagged headings listed) are missing a required blank line
underneath which triggers markdownlint MD022; to fix, insert a single blank line
immediately after each heading line so the heading is separated from the
following content, and apply this same blank-line pattern to all other flagged
headings (lines referenced in the comment) throughout the document to satisfy
MD022.
- Around line 693-695: The jq command is currently outputting the secret field
`value`, risking secret leakage; update the pipeline that prints
`"$AUTH_KEY_JSON"` so the jq filter omits `value` and only emits non-secret
fields (e.g., `id`, `name`, `user_path`, `active`, `redacted_value`), ensuring
any logging or CI output never includes the raw managed API key value.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9ce04750-9aff-46b6-b7d2-157d8bdc2316

📥 Commits

Reviewing files that changed from the base of the PR and between ca63f7e and bc04271.

📒 Files selected for processing (11)
  • docs/advanced/configuration.mdx
  • docs/docs.json
  • docs/features/cache.mdx
  • docs/getting-started/quickstart.mdx
  • internal/aliases/batch_preparer_test.go
  • internal/aliases/service.go
  • internal/aliases/service_test.go
  • internal/providers/openai/openai.go
  • internal/providers/openai/openai_test.go
  • internal/server/handlers_test.go
  • tests/e2e/release-e2e-scenarios.md


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

♻️ Duplicate comments (1)
internal/providers/openai/openai_test.go (1)

1878-1882: ⚠️ Potential issue | 🟡 Minor

Request is missing Temperature field, making the temperature assertion ineffective.

The server handler correctly asserts that temperature should not be present (lines 1852-1854), but the request never sets Temperature. This means the assertion is dead code that cannot catch a regression where streaming GPT-5 requests accidentally forward temperature.

Compare with TestChatCompletion_GPT5Model_AdaptsParameters (lines 1608-1613) which correctly sets both fields.

🔧 Proposed fix
 func TestStreamChatCompletion_GPT5Model_AdaptsParameters(t *testing.T) {
 	maxTokens := 2000
+	temp := 0.7
 
 	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
 ...
 	req := &core.ChatRequest{
-		Model:     "gpt-5-nano",
-		Messages:  []core.Message{{Role: "user", Content: "Hello"}},
-		MaxTokens: &maxTokens,
+		Model:       "gpt-5-nano",
+		Messages:    []core.Message{{Role: "user", Content: "Hello"}},
+		MaxTokens:   &maxTokens,
+		Temperature: &temp,
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/providers/openai/openai_test.go` around lines 1878 - 1882, The test
builds a core.ChatRequest (req) without setting the Temperature field so the
server-side assertion that temperature should be absent is never exercised;
update the test that constructs req in internal/providers/openai/openai_test.go
to set the Temperature pointer (e.g., a local temperature variable and assign
&temperature to req.Temperature) just like in
TestChatCompletion_GPT5Model_AdaptsParameters so the handler's temperature
assertion becomes effective and will catch regressions where GPT-5 streaming
requests forward temperature.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/e2e/release-e2e-scenarios.md`:
- Around line 755-756: The script writes sensitive managed API key files
($AUTH_KEY_JSON -> $QA_AUTH_KEY_JSON and extracted value ->
$QA_AUTH_KEY_VALUE_FILE) to /tmp without restricting permissions or guaranteed
cleanup; change the write flow to create the files with secure permissions
(e.g., use a temp file or set umask/chmod to owner-only such as 600 immediately
after creation) and ensure both $QA_AUTH_KEY_JSON and $QA_AUTH_KEY_VALUE_FILE
are removed in the cleanup path; update any teardown/trap logic to always unlink
these two variables to avoid leaving secrets on disk.
- Around line 35-37: Add fail-fast shell guards before executing the
auth-enabled block: enable strict mode (set -euo pipefail) and verify sourcing
.env succeeded (or that required env vars are set) so a failed `source .env` or
subsequent API call cannot silently cascade; update the snippet containing `set
-a` / `source .env` / `set +a` and replicate the same guards for the other
blocks referenced (the blocks around lines 751-783 and 917-945) to ensure early
exit on errors and missing required variables.
- Around line 734-946: The PR adds new auth-enabled runtime and cache behaviors
(managed API keys, scoped execution plans, cache analytics, and new audit
fields/headers) but CLAUDE.md was not updated; update CLAUDE.md to document the
new admin endpoints and payloads (POST /admin/api/v1/auth-keys, POST
/admin/api/v1/execution-plans, deactivate endpoints), the managed API key
lifecycle and one-time secret handling, the scoped workflow semantics
(scope_user_path, effective_features including cache:false), the
X-GoModel-User-Path header handling and precedence when using managed keys,
expected audit log fields (auth_method, auth_key_id, user_path,
execution_plan_version_id, cache_type), and the cache analytics/usage endpoints
and query params shown in the tests so readers can reproduce S63–S79 flows.

---

Duplicate comments:
In `@internal/providers/openai/openai_test.go`:
- Around line 1878-1882: The test builds a core.ChatRequest (req) without
setting the Temperature field so the server-side assertion that temperature
should be absent is never exercised; update the test that constructs req in
internal/providers/openai/openai_test.go to set the Temperature pointer (e.g., a
local temperature variable and assign &temperature to req.Temperature) just like
in TestChatCompletion_GPT5Model_AdaptsParameters so the handler's temperature
assertion becomes effective and will catch regressions where GPT-5 streaming
requests forward temperature.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: c1e60900-3f2d-4472-a122-c80af70cb3ef

📥 Commits

Reviewing files that changed from the base of the PR and between bc04271 and 1922333.

📒 Files selected for processing (6)
  • internal/aliases/batch_preparer_test.go
  • internal/aliases/service.go
  • internal/aliases/service_test.go
  • internal/providers/openai/openai_test.go
  • internal/server/handlers_test.go
  • tests/e2e/release-e2e-scenarios.md

Comment on lines +734 to +946
## 12. Authenticated runtime features

### S63 Auth-enabled dashboard runtime config

Reads the allowlisted runtime flags for the live auth-enabled gateway.

```bash
curl -sS "$AUTH_BASE_URL/admin/api/v1/dashboard/config" \
-H "$ADMIN_AUTH_HEADER" \
| jq '.'
```

### S64 Create managed API key

Creates one managed API key scoped to a release-specific user path and stores the one-time secret under `/tmp`.

```bash
AUTH_KEY_JSON=$(curl -sS -X POST "$AUTH_BASE_URL/admin/api/v1/auth-keys" \
-H "$ADMIN_AUTH_HEADER" \
-H 'Content-Type: application/json' \
-d "{\"name\":\"$QA_AUTH_KEY_NAME\",\"description\":\"Release e2e managed key\",\"user_path\":\"$QA_USER_PATH\"}")
printf '%s\n' "$AUTH_KEY_JSON" > "$QA_AUTH_KEY_JSON"
printf '%s\n' "$AUTH_KEY_JSON" | jq -r '.value' > "$QA_AUTH_KEY_VALUE_FILE"
printf '%s\n' "$AUTH_KEY_JSON" \
| jq '{id,name,user_path,active,redacted_value}'
```

### S65 Verify managed API key list

Checks that the newly issued managed API key is visible and active.

```bash
curl -sS "$AUTH_BASE_URL/admin/api/v1/auth-keys" \
-H "$ADMIN_AUTH_HEADER" \
| jq ".[] | select(.name==\"$QA_AUTH_KEY_NAME\") | {id,name,user_path,active,expires_at,redacted_value}"
```

### S66 Create user-path-scoped workflow with cache disabled

Creates a scoped workflow for `openai/gpt-4.1-nano` that disables cache for the managed-key user path.

```bash
WORKFLOW_JSON=$(curl -sS -X POST "$AUTH_BASE_URL/admin/api/v1/execution-plans" \
-H "$ADMIN_AUTH_HEADER" \
-H 'Content-Type: application/json' \
-d "{\"scope_provider\":\"openai\",\"scope_model\":\"gpt-4.1-nano\",\"scope_user_path\":\"$QA_USER_PATH\",\"name\":\"$QA_WORKFLOW_NAME\",\"description\":\"Disable cache for managed-key release e2e scope\",\"plan_payload\":{\"schema_version\":1,\"features\":{\"cache\":false,\"audit\":true,\"usage\":true,\"guardrails\":false,\"fallback\":false},\"guardrails\":[]}}")
printf '%s\n' "$WORKFLOW_JSON" > "$QA_WORKFLOW_JSON"
printf '%s\n' "$WORKFLOW_JSON" | jq -r '.id' > "$QA_WORKFLOW_ID_FILE"
printf '%s\n' "$WORKFLOW_JSON" \
| jq '{id,name,scope,plan_payload}'
```

### S67 Verify scoped workflow detail

Reads the created workflow back and confirms the normalized scope and effective feature projection.

```bash
WORKFLOW_ID=$(cat "$QA_WORKFLOW_ID_FILE")
curl -sS "$AUTH_BASE_URL/admin/api/v1/execution-plans/$WORKFLOW_ID" \
-H "$ADMIN_AUTH_HEADER" \
| jq '{id,name,scope,plan_payload,effective_features}'
```

### S68 Managed-key request through scoped workflow

Sends a request with the managed API key while also sending a conflicting `X-GoModel-User-Path` header.

```bash
API_KEY=$(cat "$QA_AUTH_KEY_VALUE_FILE")
curl -sS -D - "$AUTH_BASE_URL/v1/chat/completions" \
-H "Authorization: Bearer $API_KEY" \
-H 'Content-Type: application/json' \
-H "X-Request-ID: $QA_AUTH_REQ1" \
-H 'X-GoModel-User-Path: /team/should-be-overridden' \
-d '{"model":"openai/gpt-4.1-nano","messages":[{"role":"user","content":"Reply with exactly QA_AUTH_CACHE_OFF_OK"}],"max_tokens":16}' \
| sed -n '1,20p'
```

### S69 Repeated managed-key request should still bypass cache

Repeats the same request and expects another live provider response rather than `X-Cache: HIT`.

```bash
API_KEY=$(cat "$QA_AUTH_KEY_VALUE_FILE")
curl -sS -D - "$AUTH_BASE_URL/v1/chat/completions" \
-H "Authorization: Bearer $API_KEY" \
-H 'Content-Type: application/json' \
-H "X-Request-ID: $QA_AUTH_REQ2" \
-H 'X-GoModel-User-Path: /team/should-be-overridden' \
-d '{"model":"openai/gpt-4.1-nano","messages":[{"role":"user","content":"Reply with exactly QA_AUTH_CACHE_OFF_OK"}],"max_tokens":16}' \
| sed -n '1,20p'
```

### S70 Audit evidence for managed-key scoped workflow

Confirms that auth method, managed auth key ID, normalized user path, workflow ID, and no cache hit are all recorded together.

```bash
curl -sS "$AUTH_BASE_URL/admin/api/v1/audit/log?request_id=$QA_AUTH_REQ2&limit=5" \
-H "$ADMIN_AUTH_HEADER" \
| jq '{total,entries:(.entries|map({request_id,status_code,auth_method,auth_key_id,user_path,execution_plan_version_id,cache_type,answer:.data.response_body.choices[0].message.content}))}'
```

### S71 Global cache warm request with explicit user path

Warms the global cache-enabled workflow using the master key and a cache-specific user path.

```bash
curl -sS -D - "$AUTH_BASE_URL/v1/chat/completions" \
-H "$ADMIN_AUTH_HEADER" \
-H 'Content-Type: application/json' \
-H "X-Request-ID: $QA_CACHE_REQ1" \
-H "X-GoModel-User-Path: $QA_CACHE_USER_PATH" \
-d "{\"model\":\"openai/gpt-4.1-nano\",\"messages\":[{\"role\":\"user\",\"content\":\"Reply with exactly $QA_CACHE_REPLY\"}],\"max_tokens\":16}" \
| sed -n '1,20p'
```

### S72 Repeated global cache request should hit exact cache

Repeats the same request and expects `X-Cache: HIT (exact)`.

```bash
curl -sS -D - "$AUTH_BASE_URL/v1/chat/completions" \
-H "$ADMIN_AUTH_HEADER" \
-H 'Content-Type: application/json' \
-H "X-Request-ID: $QA_CACHE_REQ2" \
-H "X-GoModel-User-Path: $QA_CACHE_USER_PATH" \
-d "{\"model\":\"openai/gpt-4.1-nano\",\"messages\":[{\"role\":\"user\",\"content\":\"Reply with exactly $QA_CACHE_REPLY\"}],\"max_tokens\":16}" \
| sed -n '1,20p'
```

### S73 Cache overview filtered by user path

Checks cache analytics after the exact-cache hit using the same tracked user path.

```bash
sleep 6
curl -sS "$AUTH_BASE_URL/admin/api/v1/cache/overview?days=1&user_path=$QA_CACHE_USER_PATH" \
-H "$ADMIN_AUTH_HEADER" \
| jq '.'
```

### S74 Cached usage log filtered by user path

Reads cached-only usage entries for the same exact-hit request path.

```bash
curl -sS "$AUTH_BASE_URL/admin/api/v1/usage/log?days=1&user_path=$QA_CACHE_USER_PATH&cache_mode=cached&limit=5" \
-H "$ADMIN_AUTH_HEADER" \
| jq '{total,entries:(.entries|map({request_id,cache_type,model,provider,endpoint,user_path,total_tokens}))}'
```

### S75 Invalid managed API key user path (negative)

Verifies user-path validation for managed API key creation.

```bash
curl -sS -i -X POST "$AUTH_BASE_URL/admin/api/v1/auth-keys" \
-H "$ADMIN_AUTH_HEADER" \
-H 'Content-Type: application/json' \
-d '{"name":"qa-invalid-user-path","user_path":"/team/../alpha"}' \
| sed -n '1,20p'
```

### S76 Invalid workflow scope user path (negative)

Verifies user-path validation for workflow creation.

```bash
curl -sS -i -X POST "$AUTH_BASE_URL/admin/api/v1/execution-plans" \
-H "$ADMIN_AUTH_HEADER" \
-H 'Content-Type: application/json' \
-d '{"scope_provider":"openai","scope_model":"gpt-4.1-nano","scope_user_path":"/team/../alpha","name":"qa-invalid-workflow-path","plan_payload":{"schema_version":1,"features":{"cache":true,"audit":true,"usage":true,"guardrails":false},"guardrails":[]}}' \
| sed -n '1,24p'
```

## 13. Authenticated cleanup

### S77 Deactivate managed API key

Deactivates the managed key created for the auth-enabled release run.

```bash
AUTH_KEY_ID=$(curl -sS "$AUTH_BASE_URL/admin/api/v1/auth-keys" \
-H "$ADMIN_AUTH_HEADER" \
| jq -r ".[] | select(.name==\"$QA_AUTH_KEY_NAME\") | .id")
curl -sS -i -X POST "$AUTH_BASE_URL/admin/api/v1/auth-keys/$AUTH_KEY_ID/deactivate" \
-H "$ADMIN_AUTH_HEADER"
```

### S78 Deactivated managed API key is rejected

Confirms that the same managed key can no longer authenticate requests.

```bash
API_KEY=$(cat "$QA_AUTH_KEY_VALUE_FILE")
curl -sS -i "$AUTH_BASE_URL/v1/chat/completions" \
-H "Authorization: Bearer $API_KEY" \
-H 'Content-Type: application/json' \
-H "X-Request-ID: $QA_DEACTIVATED_REQ" \
-d '{"model":"openai/gpt-4.1-nano","messages":[{"role":"user","content":"Reply with exactly QA_AUTH_DEACTIVATED"}],"max_tokens":16}' \
| sed -n '1,20p'
```

### S79 Deactivate scoped workflow

Deactivates the workflow created for the scoped managed-key release run.

```bash
WORKFLOW_ID=$(cat "$QA_WORKFLOW_ID_FILE")
curl -sS -i -X POST "$AUTH_BASE_URL/admin/api/v1/execution-plans/$WORKFLOW_ID/deactivate" \
-H "$ADMIN_AUTH_HEADER"
```


🧹 Nitpick | 🔵 Trivial

Verify CLAUDE.md was updated for these new auth/cache/runtime behaviors.

This section adds/changes documented release validation behavior around auth-enabled runtime config, managed keys, scoped workflows, and cache analytics. Please confirm corresponding CLAUDE.md updates are included in this PR (or a linked follow-up).

Based on learnings: Update CLAUDE.md documentation whenever behavior, configuration, providers, supported commands, or public API contracts change.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/release-e2e-scenarios.md` around lines 734 - 946, The PR adds new
auth-enabled runtime and cache behaviors (managed API keys, scoped execution
plans, cache analytics, and new audit fields/headers) but CLAUDE.md was not
updated; update CLAUDE.md to document the new admin endpoints and payloads (POST
/admin/api/v1/auth-keys, POST /admin/api/v1/execution-plans, deactivate
endpoints), the managed API key lifecycle and one-time secret handling, the
scoped workflow semantics (scope_user_path, effective_features including
cache:false), the X-GoModel-User-Path header handling and precedence when using
managed keys, expected audit log fields (auth_method, auth_key_id, user_path,
execution_plan_version_id, cache_type), and the cache analytics/usage endpoints
and query params shown in the tests so readers can reproduce S63–S79 flows.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (1)
tests/e2e/release-e2e-scenarios.md (1)

35-37: ⚠️ Potential issue | 🟠 Major

Add strict shell guards before running auth-enabled scenarios.

Without fail-fast mode, a bad .env load or failed command can silently cascade through later scenarios.

Suggested fix
 set -a
 source .env
 set +a
+set -euo pipefail
+: "${GOMODEL_MASTER_KEY:?GOMODEL_MASTER_KEY is required}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/release-e2e-scenarios.md` around lines 35 - 37, Enable strict shell
failure handling before sourcing the environment and ensure .env is present: add
fail-fast and strict-unset/pipefail options before the existing set -a / source
.env / set +a sequence, and add an explicit existence/readability check for
".env" that prints a clear error and exits nonzero if the file is missing or
unreadable; keep the existing "set -a", "source .env", and "set +a" semantics
but surround them with the strict options and the pre-check.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/e2e/release-e2e-scenarios.md`:
- Around line 60-62: The cleanup_release_auth_artifacts function currently
removes only $QA_AUTH_KEY_JSON and $QA_AUTH_KEY_VALUE_FILE; update it to also
remove the workflow temp files by adding $QA_WORKFLOW_JSON and
$QA_WORKFLOW_ID_FILE to the rm -f list so those /tmp files are not left stale
(also apply the same addition wherever similar cleanup logic appears, e.g., the
other occurrences noted around the same tests).
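Under the review's assumption about the four temp-file names, the extended cleanup function could be sketched like this (the paths are hypothetical stand-ins; the real names come from the scenario script):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical temp-file paths, mirroring the names from the review.
tmpdir=$(mktemp -d)
QA_AUTH_KEY_JSON="$tmpdir/qa_auth_key.json"
QA_AUTH_KEY_VALUE_FILE="$tmpdir/qa_auth_key_value"
QA_WORKFLOW_JSON="$tmpdir/qa_workflow.json"
QA_WORKFLOW_ID_FILE="$tmpdir/qa_workflow_id"

cleanup_release_auth_artifacts() {
  # Remove the auth-key artifacts AND the workflow temp files,
  # so no stale state leaks into later scenarios.
  rm -f "$QA_AUTH_KEY_JSON" "$QA_AUTH_KEY_VALUE_FILE" \
        "$QA_WORKFLOW_JSON" "$QA_WORKFLOW_ID_FILE"
}

# Demo: create all four files, then verify cleanup removes them.
touch "$QA_AUTH_KEY_JSON" "$QA_AUTH_KEY_VALUE_FILE" \
      "$QA_WORKFLOW_JSON" "$QA_WORKFLOW_ID_FILE"
cleanup_release_auth_artifacts
```

The same four-file list would be repeated wherever the equivalent cleanup logic appears in the other scenarios.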
- Around line 758-766: The script writes AUTH_KEY_JSON and then unconditionally
extracts .value into QA_AUTH_KEY_VALUE_FILE which can become "null" if creation
failed; update the logic around AUTH_KEY_JSON / QA_AUTH_KEY_VALUE_FILE / jq
usage to validate the response before persisting the key value: after obtaining
AUTH_KEY_JSON, verify the JSON indicates success (e.g., .value exists and is not
null/empty or check HTTP success) and only then write AUTH_KEY_JSON to
QA_AUTH_KEY_JSON and the extracted .value to QA_AUTH_KEY_VALUE_FILE; if
validation fails, emit an error and abort so downstream checks aren’t misled.
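One way to implement that validation is sketched below, assuming `jq` is available. The JSON literal stands in for the real `curl` response, and the file paths are hypothetical:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Stand-in for the create-key curl response; paths are hypothetical.
AUTH_KEY_JSON='{"id":"k1","value":"sk-test-123"}'
tmpdir=$(mktemp -d)
QA_AUTH_KEY_JSON="$tmpdir/qa_auth_key.json"
QA_AUTH_KEY_VALUE_FILE="$tmpdir/qa_auth_key_value"

# jq -e exits non-zero when .value is missing or null, so a failed
# creation aborts here instead of persisting the literal string "null".
if ! key_value=$(jq -er '.value // empty' <<<"$AUTH_KEY_JSON") \
   || [ -z "$key_value" ]; then
  echo "error: auth key creation failed: $AUTH_KEY_JSON" >&2
  exit 1
fi

# Only persist after the response has been validated.
printf '%s\n' "$AUTH_KEY_JSON" > "$QA_AUTH_KEY_JSON"
printf '%s\n' "$key_value" > "$QA_AUTH_KEY_VALUE_FILE"
```

The `// empty` alternative turns a `null` value into no output at all, which `jq -e` reports as a failure.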

---

Duplicate comments:
In `@tests/e2e/release-e2e-scenarios.md`:
- Around line 35-37: Enable strict shell failure handling before sourcing the
environment and ensure .env is present: add fail-fast and strict-unset/pipefail
options before the existing set -a / source .env / set +a sequence, and add an
explicit existence/readability check for ".env" that prints a clear error and
exits nonzero if the file is missing or unreadable; keep the existing "set -a",
"source .env", and "set +a" semantics but surround them with the strict options
and the pre-check.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 3524a43c-ebf8-42a2-8eb6-3ff80e9adfa0

📥 Commits

Reviewing files that changed from the base of the PR and between 1922333 and 96c7a89.

📒 Files selected for processing (1)
  • tests/e2e/release-e2e-scenarios.md

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/e2e/release-e2e-scenarios.md`:
- Around line 941-945: The script currently may call the deactivate endpoint
with an empty AUTH_KEY_ID; update the command that extracts AUTH_KEY_ID (the jq
invocation used to populate AUTH_KEY_ID) to use strict mode (jq -er) so it fails
when no match is found, then add an explicit guard immediately after setting
AUTH_KEY_ID that checks if the variable is empty (e.g., test -z "$AUTH_KEY_ID")
and, if so, emit a clear error message and exit non-zero before reaching the
curl POST to /admin/api/v1/auth-keys/$AUTH_KEY_ID/deactivate; reference
AUTH_KEY_ID and the deactivate curl call when locating where to add the jq flag
and the early-fail check.
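A sketch of that guard, assuming `jq` is available; the list JSON and the key name `release-qa` are illustrative stand-ins for the real `/admin/api/v1/auth-keys` response:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Stand-in for the auth-keys list response; names are hypothetical.
KEYS_JSON='[{"id":"key-42","name":"release-qa"},{"id":"key-43","name":"other"}]'

# jq -e makes the extraction fail when nothing matches, instead of
# yielding an empty string that would hit .../auth-keys//deactivate.
AUTH_KEY_ID=$(jq -er '.[] | select(.name == "release-qa") | .id' \
  <<<"$KEYS_JSON") || AUTH_KEY_ID=""

if [ -z "$AUTH_KEY_ID" ]; then
  echo "error: no matching auth key found; refusing to call deactivate" >&2
  exit 1
fi

# Safe to build the deactivate URL now (curl call elided in this sketch):
echo "POST /admin/api/v1/auth-keys/$AUTH_KEY_ID/deactivate"
```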
- Around line 800-806: The script writes WORKFLOW_JSON to QA_WORKFLOW_JSON and
then extracts the ID into QA_WORKFLOW_ID_FILE without validating the response;
update the sequence around WORKFLOW_JSON / jq -r '.id' so you parse and validate
the id (from WORKFLOW_JSON), ensure it is non-empty and not "null" (and
optionally check the HTTP response/status), and if validation fails emit a clear
error to stderr and exit non-zero instead of writing a bogus ID; locate and
change the block that sets WORKFLOW_JSON, the jq extraction to
QA_WORKFLOW_ID_FILE, and the follow-up logic that uses QA_WORKFLOW_ID_FILE to
implement this guard and fail-fast behavior.
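The workflow case follows the same shape; here the guard explicitly rejects the literal `"null"` that `jq -r` emits for a missing field. The JSON literal and paths are stand-ins:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Stand-in for the workflow-create curl response; paths are hypothetical.
WORKFLOW_JSON='{"id":"wf-123","name":"release-qa-workflow"}'
tmpdir=$(mktemp -d)
QA_WORKFLOW_JSON="$tmpdir/qa_workflow.json"
QA_WORKFLOW_ID_FILE="$tmpdir/qa_workflow_id"

# Parse first, then guard against both an empty result and the
# literal "null" that jq -r produces when .id is absent.
workflow_id=$(jq -r '.id' <<<"$WORKFLOW_JSON")
if [ -z "$workflow_id" ] || [ "$workflow_id" = "null" ]; then
  echo "error: workflow creation failed: $WORKFLOW_JSON" >&2
  exit 1
fi

# Only cache the ID once it has been validated.
printf '%s\n' "$WORKFLOW_JSON" > "$QA_WORKFLOW_JSON"
printf '%s\n' "$workflow_id" > "$QA_WORKFLOW_ID_FILE"
```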
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: de14fdf9-ded5-438c-9753-69c4e796e1c9

📥 Commits

Reviewing files that changed from the base of the PR and between 96c7a89 and 0ff9cb3.

📒 Files selected for processing (1)
  • tests/e2e/release-e2e-scenarios.md

@SantiagoDePolonia SantiagoDePolonia merged commit 0ebac81 into main Apr 1, 2026
15 checks passed
@SantiagoDePolonia SantiagoDePolonia deleted the fix/gpt5-max-tokens-batch-aliases branch April 1, 2026 21:29