Skip to content

fix(ollama-proxy): require Bearer token on /api/tags#3353

Merged
ericksoa merged 2 commits into
mainfrom
fix/3338-ollama-proxy-require-auth-on-tags
May 12, 2026
Merged

fix(ollama-proxy): require Bearer token on /api/tags#3353
ericksoa merged 2 commits into
mainfrom
fix/3338-ollama-proxy-require-auth-on-tags

Conversation

@laitingsheng

@laitingsheng laitingsheng commented May 11, 2026

Copy link
Copy Markdown
Contributor

Summary

Remove the unauthenticated bypass for /api/tags on the Ollama auth proxy: every endpoint (both the OpenAI-compat /v1/* namespace and the Ollama-native /api/* namespace) now requires the Bearer token, matching DevTest T5987914.

Related Issue

Fixes #3338

Changes

  • scripts/ollama-auth-proxy.js: drop the /api/tags exemption — every request is gated by the same Bearer-token check. Compare buffers (not JS string length) before crypto.timingSafeEqual so a non-ASCII Authorization header can't crash the proxy.
  • Retarget all internal liveness/reachability probes that used to do curl -sf .../api/tags: since an alive proxy now answers 401 to unauthenticated probes, they now accept any 3-digit HTTP code as proof of life (only 000 means dead).
    • src/lib/inference/local.ts container reachability probe (+ test fixture).
    • test/e2e-ollama-proxy.sh and test/e2e/test-ollama-auth-proxy-e2e.sh liveness probes + the post-kill "is it dead" check (flipped because 401 no longer distinguishes alive-vs-dead under -sf).
    • test/e2e/test-gpu-e2e.sh and test/e2e/test-gpu-double-onboard.sh proxy-port probes.
  • docs/inference/use-local-inference.md: document that every proxy endpoint requires the Bearer token and that internal reachability checks treat any HTTP response — including 401 — as alive.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • make docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Tinson Lai tinsonl@nvidia.com

Summary by CodeRabbit

  • Documentation

    • Updated authentication requirements: Bearer token now required for all Ollama proxy endpoints.
  • Bug Fixes

    • Enhanced token validation with constant-time comparison for improved security.
    • Improved health check resilience to properly handle authentication-protected endpoints.
  • Tests

    • Updated test suites to reflect stricter authentication enforcement across all proxy endpoints.

Review Change Stack

Fixes #3338

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>
@copy-pr-bot

copy-pr-bot Bot commented May 11, 2026

Copy link
Copy Markdown

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@coderabbitai

coderabbitai Bot commented May 11, 2026

Copy link
Copy Markdown
Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 97de6e0a-119e-4148-9e60-2fa9765e138d

📥 Commits

Reviewing files that changed from the base of the PR and between 81fc920 and a926f68.

📒 Files selected for processing (8)
  • docs/inference/use-local-inference.md
  • scripts/ollama-auth-proxy.js
  • src/lib/inference/local.test.ts
  • src/lib/inference/local.ts
  • test/e2e-ollama-proxy.sh
  • test/e2e/test-gpu-double-onboard.sh
  • test/e2e/test-gpu-e2e.sh
  • test/e2e/test-ollama-auth-proxy-e2e.sh

📝 Walkthrough

Walkthrough

This PR enforces Bearer token authentication on all Ollama auth proxy endpoints, including /api/tags which was previously unauthenticated. It updates reachability checks to capture HTTP status codes, allowing 401 responses to count as proof the proxy is alive, and aligns all E2E tests with the new security requirements.

Changes

Auth Proxy Token Enforcement and Test Alignment

Layer / File(s) Summary
Auth Proxy Request Authentication
scripts/ollama-auth-proxy.js
All requests now require valid Bearer token authentication. Token validation uses constant-time byte-comparison with length checking before calling crypto.timingSafeEqual. Health-check bypass removed; all endpoints enforce auth uniformly.
Container Reachability Check with HTTP Status Codes
src/lib/inference/local.ts, src/lib/inference/local.test.ts
Reachability validation for ollama-local now captures HTTP status codes (-o /dev/null -w %{http_code}) instead of curl exit status, allowing 401 responses to count as "proxy is running." Test expectations updated to match new curl argument signature.
Documentation of Authentication and Health Checks
docs/inference/use-local-inference.md
Updated to state /api/tags requires Bearer token and that proxy liveness checks treat any HTTP response (including 401) as proof the proxy is alive.
E2E Test Suite Updates for Bearer Token Requirement
test/e2e-ollama-proxy.sh, test/e2e/test-gpu-double-onboard.sh, test/e2e/test-gpu-e2e.sh, test/e2e/test-ollama-auth-proxy-e2e.sh
All tests updated to expect 401 for unauthenticated /api/tags requests, use HTTP status code capture for liveness/reachability checks, and accept any 3-digit HTTP response as proof the proxy is responding. Proxy recovery and reachability validation now relies on observable HTTP responses rather than curl exit codes.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 The auth proxy now stands guard at the gate,
No endpoint shall pass without Bearer state!
401 blooms where health checks once roamed,
And status codes guide the travelers back home.
The tests applaud this security stride—
Protection and proof, side by side. 🔐

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title concisely and accurately describes the main change: requiring Bearer token authentication on the /api/tags endpoint, which directly addresses the security vulnerability documented in issue #3338.
Linked Issues check ✅ Passed The PR fully addresses the primary objective from #3338: all proxy endpoints now require Bearer token authentication by rejecting unauthenticated requests, with buffer-based token validation hardening and liveness probes updated to accept any HTTP response as proof of connectivity.
Out of Scope Changes check ✅ Passed All changes directly support the core security fix: authentication enforcement in the proxy, hardened token comparison, updated health-check logic in internal probes, E2E tests reflecting new auth behavior, and documentation updates. No unrelated changes detected.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/3338-ollama-proxy-require-auth-on-tags

Comment @coderabbitai help to get the list of available commands and usage tips.

@wscurran wscurran added security Potential vulnerability, unsafe behavior, or access risk priority: high provider: ollama Ollama local model provider behavior labels May 11, 2026
@laitingsheng laitingsheng marked this pull request as ready for review May 12, 2026 00:07
@github-actions

Copy link
Copy Markdown
Contributor

E2E Advisor Recommendation

Required E2E: None
Optional E2E: None

Workflow run

Full advisor summary

Pi Semantic E2E Advisor

Skipped: No Pi provider credential was available in this workflow environment

@laitingsheng laitingsheng added v0.0.40 platform: brev Affects Brev hosted development environments and removed v0.0.40 platform: brev Affects Brev hosted development environments provider: ollama Ollama local model provider behavior labels May 12, 2026

@ericksoa ericksoa left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed current head a926f68. This closes the Ollama auth-proxy /api/* bypass by requiring the Bearer token uniformly, including GET /api/tags, while updating internal liveness probes to treat any HTTP response as reachability. The byte-length guarded timingSafeEqual path also avoids malformed/non-ASCII auth header crashes.\n\nValidated locally with shellcheck on the changed E2E scripts, npm run build:cli, npm run format:check, npm run checks, npx vitest run src/lib/inference/local.test.ts --testTimeout 60000, bash test/e2e-ollama-proxy.sh, and an ad hoc non-ASCII auth-header probe. Required GitHub checks are green.

@ericksoa ericksoa merged commit 6c4d359 into main May 12, 2026
32 of 33 checks passed
@wscurran wscurran added bug-fix PR fixes a bug or regression and removed priority: high labels Jun 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug-fix PR fixes a bug or regression security Potential vulnerability, unsafe behavior, or access risk

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Brev][Security] Ollama auth proxy on port 11435 leaves Ollama-native /api/* endpoints unauthenticated

3 participants