Skip to content

fix(gateway): reject no-auth tailscale exposure#87286

Merged
steipete merged 1 commit into
mainfrom
codex/fix-tailscale-serve-none-auth
May 27, 2026
Merged

fix(gateway): reject no-auth tailscale exposure#87286
steipete merged 1 commit into
mainfrom
codex/fix-tailscale-serve-none-auth

Conversation

@steipete

Copy link
Copy Markdown
Contributor

Summary

Fixes #50630.
Replaces the stale/conflicting runtime-only fix in #50631.

This rejects explicit gateway.auth.mode=none whenever the gateway is exposed through Tailscale Serve or Funnel. The policy is shared across config validation, install-token preflight, and runtime startup so setup paths fail before writing/accepting a service plan, while loopback-only no-auth remains valid.

Verification

Behavior addressed: gateway.auth.mode=none plus gateway.tailscale.mode=serve no longer reaches a running gateway exposed to the Tailnet; config and install preflight reject it before daemon startup, and runtime startup also fails closed.

Real environment tested: No live Tailnet device run. This patch is preflight/runtime validation; the verified behavior is config rejection and startup/install blocking.

Exact steps or command run after this patch:

  • node scripts/run-vitest.mjs src/config/config.gateway-tailscale-bind.test.ts src/gateway/server-runtime-config.test.ts src/commands/doctor-gateway-auth-token.test.ts
  • pnpm check:changed via Blacksmith Testbox tbx_01ksmpwc2fqfv5s9zpw3syvzzd, GitHub Actions run 26511438060
  • .agents/skills/autoreview/scripts/autoreview --mode local

Evidence after fix:

  • Config validation rejects Serve/Funnel plus explicit no-auth at gateway.auth.mode.
  • Gateway install token resolution returns an unavailable reason for Serve plus explicit no-auth before token generation/service planning.
  • Runtime config rejects Serve plus explicit no-auth even when invoked through overrides.

Observed result after fix: focused Vitest passed, Testbox check:changed exited 0, autoreview returned clean with no accepted/actionable findings.

What was not tested: real Tailscale Serve/Funnel network access from another Tailnet device.

Risk

Did user-visible behavior change? Yes.
Did config, environment, or migration behavior change? Yes.
Did security, auth, secrets, network, or tool execution behavior change? Yes.

Existing explicit no-auth Tailscale exposure now fails closed. Loopback-only gateway.auth.mode=none remains accepted.

@clawsweeper

clawsweeper Bot commented May 27, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs real behavior proof before merge. Reviewed May 27, 2026, 9:06 AM ET / 13:06 UTC.

Summary
The branch adds a shared gateway Tailscale auth policy and applies it to config validation, service install token preflight, runtime startup, and focused regression tests.

PR surface: Source +61, Tests +81. Total +142 across 7 files.

Reproducibility: yes. source-reproducible: current main lets Tailscale Serve keep the gateway on loopback while auth.mode=none authorizes immediately, so the non-loopback auth guard never protects the Serve URL. I did not run a live Tailnet reproduction in this read-only review.

Review metrics: 1 noteworthy metric.

  • Config validation impact: 1 new rejection rule across 3 enforcement paths. The patch changes existing operator behavior by failing config validation, service install preflight, and runtime startup for explicit no-auth Tailscale exposure.

Merge readiness
Overall: 🦪 silver shellfish
Proof: 🦪 silver shellfish
Patch quality: 🐚 platinum hermit
Result: blocked until real behavior proof from a real setup is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • Add redacted terminal/live output showing the changed config validation, install preflight, or runtime startup rejection on the PR head.
  • Document in the PR body that private details such as IPs, tokens, phone numbers, and non-public endpoints were redacted from any proof.

Proof guidance:
Needs real behavior proof before merge: The PR body reports focused tests, Testbox, and autoreview, but no redacted live startup/install rejection output or real Tailnet proof; a contributor should add proof and update the PR body for automatic re-review.

Risk before merge

  • Existing explicit gateway.auth.mode=none plus Tailscale Serve/Funnel setups will now fail config validation, service install preflight, or runtime startup; this is likely the right security posture, but maintainers need to accept the fail-closed upgrade behavior.
  • Real behavior proof is currently test/CI-only: the PR body reports no live Tailnet device run and does not attach redacted startup/install rejection output for the changed path.
  • If this PR lands, the older open runtime-only branch at fix: Tailscale serve + auth.mode=none exposes gateway to full... #50631 should be closed or superseded so the linked security issue has one canonical fix path.

Maintainer options:

  1. Add real rejection proof before merge (recommended)
    Ask for redacted terminal/live output showing config validation, install preflight, or runtime startup rejecting explicit no-auth Tailscale exposure on the PR head.
  2. Accept the fail-closed upgrade break
    Maintainers can intentionally merge the compatibility break if the security fix outweighs existing no-auth Serve/Funnel setups and the release note calls out the startup failure mode.
  3. Pause for security owner signoff
    If the team wants live Tailnet proof or a staged migration policy first, hold this PR and keep the linked security issue open until that decision is made.

Next step before merge
The protected maintainer label plus intentional security/config fail-closed behavior require human merge judgment after proof review; there is no narrow code repair to queue from this review.

Security
Cleared: The dedicated security pass found the diff hardens the intended gateway auth boundary and does not add dependency, workflow, secret-handling, or supply-chain changes.

Review details

Best possible solution:

Land the shared fail-closed policy after maintainer security/compatibility signoff and redacted real rejection proof, then close #50630 and supersede #50631.

Do we have a high-confidence way to reproduce the issue?

Yes, source-reproducible: current main lets Tailscale Serve keep the gateway on loopback while auth.mode=none authorizes immediately, so the non-loopback auth guard never protects the Serve URL. I did not run a live Tailnet reproduction in this read-only review.

Is this the best way to solve the issue?

Yes, the proposed boundary is the maintainable fix shape: one shared policy applied to config validation, service install preflight, and runtime startup while preserving loopback-only no-auth. The remaining blocker is merge proof and maintainer acceptance of the fail-closed upgrade behavior, not a code finding.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 8e5183c60d4a.

Label changes

Label changes:

  • add P0: The linked bug concerns unauthenticated gateway exposure over Tailscale, which is a security-boundary failure for the gateway control plane.
  • add merge-risk: 🚨 compatibility: The PR intentionally makes previously accepted explicit no-auth Serve/Funnel configurations invalid during validation, install, and startup.
  • add merge-risk: 🚨 availability: Existing services using that configuration will stop at install or startup until an operator configures token, password, or trusted-proxy auth.
  • add rating: 🦪 silver shellfish: Overall readiness is 🦪 silver shellfish; proof is 🦪 silver shellfish and patch quality is 🐚 platinum hermit.
  • add status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: The PR body reports focused tests, Testbox, and autoreview, but no redacted live startup/install rejection output or real Tailnet proof; a contributor should add proof and update the PR body for automatic re-review.

Label justifications:

  • P0: The linked bug concerns unauthenticated gateway exposure over Tailscale, which is a security-boundary failure for the gateway control plane.
  • merge-risk: 🚨 compatibility: The PR intentionally makes previously accepted explicit no-auth Serve/Funnel configurations invalid during validation, install, and startup.
  • merge-risk: 🚨 availability: Existing services using that configuration will stop at install or startup until an operator configures token, password, or trusted-proxy auth.
  • rating: 🦪 silver shellfish: Overall readiness is 🦪 silver shellfish; proof is 🦪 silver shellfish and patch quality is 🐚 platinum hermit.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: The PR body reports focused tests, Testbox, and autoreview, but no redacted live startup/install rejection output or real Tailnet proof; a contributor should add proof and update the PR body for automatic re-review.
Evidence reviewed

PR surface:

Source +61, Tests +81. Total +142 across 7 files.

View PR surface stats
Area Files Added Removed Net
Source 4 61 0 +61
Tests 3 82 1 +81
Docs 0 0 0 0
Config 0 0 0 0
Generated 0 0 0 0
Other 0 0 0 0
Total 7 143 1 +142

What I checked:

  • Repository policy applied: Root AGENTS.md was read fully, and the scoped gateway AGENTS.md was read; gateway auth, config loading, startup checks, fallback behavior, and security boundaries are compatibility-sensitive review surfaces. (AGENTS.md:18, 8e5183c60d4a)
  • Current main root cause: On current main, runtime startup checks Funnel for password auth and Tailscale loopback binding, but there is no equivalent Serve-plus-no-auth guard before the non-loopback auth guard is skipped for loopback Serve. (src/gateway/server-runtime-config.ts:131, 8e5183c60d4a)
  • Current no-auth behavior: Current main authorizes auth.mode === "none" immediately with { ok: true, method: "none" }, which is the security boundary the PR is tightening for Tailscale exposure. (src/gateway/auth.ts:504, 8e5183c60d4a)
  • Dependency contract checked: Official Tailscale docs say Serve routes traffic from other tailnet devices to a local service and adds identity headers, while Funnel is for public internet exposure; that supports treating Serve/Funnel as remote exposure rather than local-only loopback. (tailscale.com)
  • Proposed shared policy: The PR head adds isUnsafeGatewayTailscaleNoAuth, which matches authMode === "none" with tailscaleMode === "serve" || "funnel", and centralizes the operator-facing error messages. (src/shared/gateway-tailscale-auth-policy.ts:3, 0b306e8e00eb)
  • Runtime enforcement: The PR head calls the shared policy immediately after the existing Funnel password guard and before the Tailscale loopback bind guard, so runtime startup fails closed for no-auth Serve/Funnel. (src/gateway/server-runtime-config.ts:141, 0b306e8e00eb)

Likely related people:

  • steipete: Current-main blame for gateway runtime auth checks, auth resolution, no-auth authorization, config validation, and install-token preflight points to Peter Steinberger’s recent gateway/auth refactor; the PR head is also authored by the same handle. (role: recent area contributor; confidence: high; commits: c89298f9f800, 8e5183c60d4a, 0b306e8e00eb; files: src/gateway/server-runtime-config.ts, src/gateway/auth.ts, src/gateway/auth-resolve.ts)
  • coygeek: The linked security report and older open runtime-only PR provide the original reproduction and competing fix context for the same gateway Tailscale no-auth issue. (role: reporter and prior fix proposer; confidence: medium; commits: 6bd6ea52f8dc; files: src/gateway/server-runtime-config.ts, src/gateway/server-runtime-config.test.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@steipete steipete force-pushed the codex/fix-tailscale-serve-none-auth branch from e46da14 to 0b306e8 Compare May 27, 2026 12:58
@openclaw-barnacle openclaw-barnacle Bot added gateway Gateway runtime commands Command implementations size: S maintainer Maintainer-authored PR labels May 27, 2026
@clawsweeper clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P0 Emergency: data loss, security bypass, crash loop, or unusable core runtime. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. merge-risk: 🚨 availability 🚨 May cause crashes, hangs, restart loops, stalls, or process outages. labels May 27, 2026
@clawsweeper

clawsweeper Bot commented May 27, 2026

Copy link
Copy Markdown
Contributor

ClawSweeper PR egg

🎁 Pass real behavior proof to wake the egg and unlock a hatchable treat.

Where did the egg go?
  • The egg game starts only after the PR passes the real-behavior proof check.
  • Before that, no creature or rarity is rolled. The treat waits for real proof.
  • This is still just collectible flavor: proof affects review readiness, not creature quality.

@steipete steipete merged commit dc5954b into main May 27, 2026
165 of 176 checks passed
@steipete steipete deleted the codex/fix-tailscale-serve-none-auth branch May 27, 2026 13:11
@steipete

Copy link
Copy Markdown
Contributor Author

Landed via squash as dc5954b.

Proof:

  • focused Vitest: node scripts/run-vitest.mjs src/config/config.gateway-tailscale-bind.test.ts src/gateway/server-runtime-config.test.ts src/commands/doctor-gateway-auth-token.test.ts
  • autoreview: .agents/skills/autoreview/scripts/autoreview --mode local, clean
  • remote changed check: node scripts/crabbox-wrapper.mjs run --shell -- "pnpm check:changed", run_5a999c1e11c0, exit 0
  • GitHub PR checks clean on 0b306e8 before merge; earlier failures were GitHub checkout/diff infrastructure and cleared after rebase.

@felipebridge

Copy link
Copy Markdown

Excellent hardening pass. The strongest part here is that the policy was enforced consistently across validation, install-token preflight, and runtime startup instead of patching only the runtime path. That closes the class of “configuration accepted but later unsafe in execution” issues very cleanly.

Also smart to preserve the loopback-only auth.mode=none case rather than overcorrecting and breaking legitimate local workflows. The fail-closed posture for Serve/Funnel exposure is absolutely the right tradeoff here.

I especially liked the explicit regression coverage around overrides and service-plan generation — those edge paths are exactly where auth-policy drift tends to reappear later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

commands Command implementations gateway Gateway runtime maintainer Maintainer-authored PR merge-risk: 🚨 availability 🚨 May cause crashes, hangs, restart loops, stalls, or process outages. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. P0 Emergency: data loss, security bypass, crash loop, or unusable core runtime. rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. size: S status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Tailscale serve + auth.mode=none exposes gateway to full Tailnet without authentication

2 participants