fix(backend): honour kubeconfig CA under Bun's native fetch #319
Merged
robert-cronin merged 2 commits on Jun 9, 2026
- Add `BunTlsHttpLibrary` and a `makeApiClient` helper in `kubeconfig.ts`. The SDK's default `IsomorphicFetchHttpLibrary` passes the kubeconfig CA via a Node.js `https.Agent`, which Bun's native `fetch` ignores: it only honours TLS material on the per-request `tls` option. This caused `UNABLE_TO_VERIFY_LEAF_SIGNATURE` on every request to clusters with a private CA (e.g. AKS). The subclass re-injects `ca`/`cert`/`key`/`rejectUnauthorized` via `tls`; auth headers are still applied upstream via `authMethods`.
- Route all client construction in `kubernetes.ts`, `auth.ts`, `autoscaler.ts`, `config.ts`, `registry.ts`, and `secrets.ts` through `makeApiClient(...)` instead of `kc.makeApiClient(...)`, making the helper the single source of truth for Bun-safe TLS.

Signed-off-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>
Pull request overview
This PR fixes Kubernetes API authentication failures that occur when the backend runs under Bun against clusters using a private CA (e.g., AKS). Bun's native `fetch` ignores the `https.Agent` that `@kubernetes/client-node` normally uses to supply TLS material, causing `UNABLE_TO_VERIFY_LEAF_SIGNATURE` errors. The fix introduces a custom HTTP library that re-injects TLS material through Bun's per-request `tls` option.
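As a rough illustration of the mechanism described above, the sketch below attaches TLS material to each request's init object instead of to an agent. The type names (`TlsMaterial`, `FetchInitWithTls`) and the `buildFetchInit` helper are hypothetical stand-ins, not the PR's code or Bun's published typings; only the per-request `tls` key itself comes from the PR.

```typescript
// Hypothetical stand-in types for illustration; not the PR's code.
interface TlsMaterial {
  ca?: string;
  cert?: string;
  key?: string;
  rejectUnauthorized?: boolean;
}

interface FetchInitWithTls {
  method: string;
  headers: Record<string, string>;
  body?: string;
  // Per-request TLS option that, per the PR, Bun's native fetch honours.
  tls?: TlsMaterial;
}

// Build the init object for one request. Auth headers are assumed to have
// been applied already by the caller; only TLS material is added here.
function buildFetchInit(
  method: string,
  headers: Record<string, string>,
  tls: TlsMaterial,
  body?: string,
): FetchInitWithTls {
  const init: FetchInitWithTls = { method, headers: { ...headers } };
  if (body !== undefined) init.body = body;
  // Skip the key entirely when there is no TLS material to pass.
  if (Object.keys(tls).length > 0) init.tls = { ...tls };
  return init;
}

// Example: a GET carrying a private CA bundle and a bearer token.
const exampleInit = buildFetchInit(
  "GET",
  { Authorization: "Bearer <token>" },
  { ca: "-----BEGIN CERTIFICATE-----..." },
);
```

Under Bun, an object of this shape would be passed as the second argument to `fetch(url, init)`. The `tls` key is Bun-specific and is not part of the standard fetch options.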
Changes:
- Adds `BunTlsHttpLibrary` (extends `IsomorphicFetchHttpLibrary`) that extracts kubeconfig TLS material via `applyToHTTPSOptions` and passes it through the Bun-native `tls` fetch option, mirroring the existing `proxyServiceRequest` pattern.
- Adds a `makeApiClient(kc, ApiClass)` factory function as a drop-in replacement for `kc.makeApiClient(...)` that wires the custom HTTP library into the SDK configuration.
- Migrates all backend service client construction to use the new helper.
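The factory idea in the list above can be sketched with local stand-in types. Everything prefixed `Sketch` below is hypothetical and exists only to show the shape of "build a configuration, swap in a different HTTP library, construct the API class"; it does not reproduce the SDK's actual configuration types or the PR's `makeApiClient`.

```typescript
// Hypothetical stand-ins for the SDK types; per the PR, the real wiring uses
// createConfiguration + ServerConfiguration from @kubernetes/client-node.
interface SketchCluster {
  server: string;
}

interface SketchKubeConfig {
  getCurrentCluster(): SketchCluster | null;
}

interface SketchHttpLibrary {
  readonly label: string;
}

interface SketchConfiguration {
  baseUrl: string;
  httpApi: SketchHttpLibrary;
}

type SketchApiConstructor<T> = new (config: SketchConfiguration) => T;

// Build an API client whose configuration carries a caller-supplied HTTP
// library, failing fast when the kubeconfig has no active cluster.
function makeApiClientSketch<T>(
  kc: SketchKubeConfig,
  ApiClass: SketchApiConstructor<T>,
  httpApi: SketchHttpLibrary,
): T {
  const cluster = kc.getCurrentCluster();
  if (cluster === null) {
    throw new Error("No active cluster");
  }
  return new ApiClass({ baseUrl: cluster.server, httpApi });
}

// Minimal API class used only to show the call shape.
class SketchCoreApi {
  readonly config: SketchConfiguration;
  constructor(config: SketchConfiguration) {
    this.config = config;
  }
}

const sketchClient = makeApiClientSketch(
  { getCurrentCluster: () => ({ server: "https://example.invalid:6443" }) },
  SketchCoreApi,
  { label: "bun-tls" },
);
```

Keeping the HTTP library as an explicit input is one way to make a single helper the only place where the transport is chosen, which is the property the PR relies on.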
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| backend/src/lib/kubeconfig.ts | Adds BunTlsHttpLibrary class and makeApiClient helper function |
| backend/src/services/auth.ts | Switches to makeApiClient for AuthenticationV1Api |
| backend/src/services/autoscaler.ts | Switches to makeApiClient for CoreV1Api, AppsV1Api, CustomObjectsApi |
| backend/src/services/config.ts | Switches to makeApiClient for CoreV1Api |
| backend/src/services/kubernetes.ts | Switches to makeApiClient for all API clients including per-user token clients |
| backend/src/services/registry.ts | Switches to makeApiClient for CoreV1Api, AppsV1Api |
| backend/src/services/secrets.ts | Switches to makeApiClient for CoreV1Api |
The TLS-extraction logic was duplicated across two call sites and dropped some kubeconfig fields.

- Extract a shared `kubeConfigToBunTls()` helper as the single source of truth; route both `BunTlsHttpLibrary.send()` and `proxyServiceRequest()` through it so the two TLS paths can no longer drift.
- Forward the SNI override, mapping the Node-side `servername` to Bun camelCase `serverName` (previously dropped, breaking clusters that set `tls-server-name`).
- Forward the client-key `passphrase`, previously dropped.
- Resolve the TLS material once per client and cache it, so `applyToHTTPSOptions` no longer re-runs the auth/cert pipeline (e.g. exec credential plugins) on every request.
- Document `pfx` and `cluster.proxy-url` as known Bun limitations, note the `@kubernetes/client-node@1.4.0` SDK coupling, and flag the multipart-body caveat.
- Add `kubeconfig.test.ts` regression tests, including a guard that the default path never disables certificate verification.

Signed-off-by: Suraj Deshmukh <suraj.deshmukh@microsoft.com>
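The field mapping this commit describes can be sketched as a small pure function. The interfaces and the name `toBunTlsSketch` are hypothetical and are not the PR's `kubeConfigToBunTls()` implementation; the Node-side field names follow Node's TLS options, and the `serverName` spelling on the Bun side is taken from the commit message. Forwarding `rejectUnauthorized` only when it is an explicit `false` is an assumption made here to illustrate the "never disables verification by default" guard.

```typescript
// Hypothetical shapes for illustration; not the PR's implementation.
interface NodeStyleTlsOptions {
  ca?: string;
  cert?: string;
  key?: string;
  passphrase?: string;
  servername?: string;
  rejectUnauthorized?: boolean;
}

interface BunStyleTlsOptions {
  ca?: string;
  cert?: string;
  key?: string;
  passphrase?: string;
  serverName?: string;
  rejectUnauthorized?: boolean;
}

function toBunTlsSketch(opts: NodeStyleTlsOptions): BunStyleTlsOptions {
  const tls: BunStyleTlsOptions = {};
  if (opts.ca !== undefined) tls.ca = opts.ca;
  if (opts.cert !== undefined) tls.cert = opts.cert;
  if (opts.key !== undefined) tls.key = opts.key;
  if (opts.passphrase !== undefined) tls.passphrase = opts.passphrase;
  // SNI override: lower-case on the Node side, camelCase for Bun.
  if (opts.servername !== undefined) tls.serverName = opts.servername;
  // Assumption: verification is only switched off by an explicit false,
  // so the default path cannot disable it by accident.
  if (opts.rejectUnauthorized === false) tls.rejectUnauthorized = false;
  return tls;
}

// Example input with an SNI override and an encrypted client key.
const mappedTls = toBunTlsSketch({
  ca: "CA-PEM",
  key: "KEY-PEM",
  passphrase: "secret",
  servername: "api.internal.example",
});
```

Routing every TLS path through one function like this is what keeps two call sites from drifting apart, since a newly supported field only has to be added in one place.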
robert-cronin approved these changes on Jun 9, 2026
Description
Fixes Kubernetes authentication failures in the backend when running under Bun against clusters that use a private CA (e.g. AKS). Every API-server request was failing with `UNABLE_TO_VERIFY_LEAF_SIGNATURE` because Bun's native `fetch` ignores the kubeconfig CA that `@kubernetes/client-node` supplies via a Node.js `https.Agent`. This PR re-injects the TLS material through the per-request `tls` option that Bun honours, routes all backend client construction through a single Bun-safe helper, and hardens that helper (shared TLS mapping, SNI, cached extraction) with regression tests.

Get a visual overview of the PR here: https://suraj.io/share/prs/github.com/kaito-project/airunway/pull/319/PR-319-review-dashboard.html
Type of Change
Changes Made
Core fix: honour the kubeconfig CA under Bun (`d44f5a0`)
- Add `BunTlsHttpLibrary` to `backend/src/lib/kubeconfig.ts`: a subclass of `k8s.IsomorphicFetchHttpLibrary` that overrides `send()` to call Bun's native `fetch` directly, translating the kubeconfig's TLS material into the per-request `tls` option Bun understands. Auth headers (Bearer tokens, etc.) are still applied upstream by the generated client via `authMethods`, so only the TLS material is re-injected.
- Add a `makeApiClient(kc, ApiClass)` helper: a drop-in replacement for `kc.makeApiClient(...)` that reproduces the SDK's wiring (`createConfiguration` + `ServerConfiguration`) but swaps in `BunTlsHttpLibrary`.
- Route client construction through `makeApiClient(...)` across the backend services: `auth.ts`, `autoscaler.ts`, `config.ts`, `kubernetes.ts`, `registry.ts`, and `secrets.ts`, including the per-user token clients built from `createUserKubeConfig(...)`.

Hardening from PR review (`c9157b7`)
- Extract a shared `kubeConfigToBunTls()` helper as the single source of truth for kubeconfig → Bun TLS mapping, and route both `BunTlsHttpLibrary.send()` and `proxyServiceRequest()` (`kubernetes.ts`) through it so the two TLS paths can no longer drift.
- Forward the SNI override, mapping the Node-side `servername` to Bun's camelCase `serverName` (previously dropped, which broke clusters that set `tls-server-name`).
- Forward the client-key `passphrase`, previously dropped.
- Resolve the TLS material once per client and cache it, so `applyToHTTPSOptions` no longer re-runs the auth/cert pipeline (e.g. exec credential plugins, disk reads) on every request.
- Document `pfx` and `cluster.proxy-url` as known Bun limitations, note the `@kubernetes/client-node@1.4.0` SDK coupling, and flag the multipart-body caveat.
- Add `backend/src/lib/kubeconfig.test.ts` regression tests, including a guard that the default path never disables certificate verification.

Testing
`bun run test`

Automated: `bun run test` passes with 864 tests, 0 failing (backend 736, frontend 128), including 13 new tests for the Bun TLS shim. The new tests cover:
- `makeApiClient` throws with no cluster;
- `send()` passes TLS material and the `Authorization` header to `fetch`;
- `skipTLSVerify` maps to `rejectUnauthorized: false`;
- the default path leaves verification on;
- SNI `serverName` mapping;
- non-2xx still yields a `ResponseContext`;
- the TLS material is resolved only once across requests.
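The "resolved only once across requests" property checked by the last test can be illustrated with a generic memoising wrapper. This is a minimal sketch under the assumption that the TLS extraction is an async function; `resolveOnce` and the fake resolver below are hypothetical names, not the PR's code.

```typescript
// Wrap an expensive async resolver so it runs at most once per wrapper.
// Caching the promise (not the value) also collapses concurrent first calls.
function resolveOnce<T>(resolve: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      cached = resolve();
    }
    return cached;
  };
}

// Hypothetical resolver standing in for the kubeconfig TLS extraction.
let resolverCalls = 0;
const getTlsOnce = resolveOnce(async () => {
  resolverCalls += 1;
  return { ca: "CA-PEM" };
});
```

One trade-off of this shape is that a rejected promise is cached as well, so a failed first resolution would be replayed until a new wrapper is created. Whether the PR handles that case differently is not stated.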
Manual: Verified end-to-end against a live AKS cluster (`aks-0`, private CA) via the running Web UI:
- `GET /api/cluster/status` → `200` with `{"connected":true,"clusterName":"aks-0","providerInstallation":{"installed":true,"crdFound":true}}`
- `GET /api/deployments` → `200` (CRD list succeeds; renders "No deployments yet")
- `GET /api/installation/gpu-capacity` → `200`
- No `UNABLE_TO_VERIFY_LEAF_SIGNATURE` / certificate errors in backend logs or browser console.

Manual Testing w/ & w/o changes
Without this PR's changes, you will start seeing logs like this in the terminal:
@airunway/backend dev $ bun --watch src/index.ts
│ [39 lines elided]
│ {"level":"error","time":"2026-06-08T22:12:13.274Z","pid":38386,"hostname":"Users-MacBook-Pro.local","error":{"code":"UNABLE_TO_VERIFY_LEAF_SIGNATURE","path":"https://suraj-cluster.hcp.southcentralus.azmk8s.io/api/v1/namespaces/kube-system/configmaps/cluster-autoscaler-status","errno":0},"msg":"Error getting autoscaler status"}
│ {"level":"error","time":"2026-06-08T22:12:13.434Z","pid":38386,"hostname":"Users-MacBook-Pro.local","error":{"code":"UNABLE_TO_VERIFY_LEAF_SIGNATURE","path":"https://suraj-cluster.hcp.southcentralus.azmk8s.io/apis/apps/v1/namespaces/kube-system/deployments?labelSelector=app%3Dcluster-autoscaler","errno":0},"msg":"Error detecting cluster-autoscaler"}
│ {"level":"error","time":"2026-06-08T22:12:13.440Z","pid":38386,"hostname":"Users-MacBook-Pro.local","operation":"listPVCs","namespace":"dynamo-system","errorMessage":"unable to verify the first certificate","statusCode":500,"rawError":{"message":"unable to verify the first certificate","stack":"Error: unable to verify the first certificate\n at async fetch (node-fetch:96:41)\n at async withRetry (/Users/user/code/kubeairunway/backend/src/lib/retry.ts:96:20)\n at async listPVCs (/Users/user/code/kubeairunway/backend/src/services/kubernetes.ts:2204:28)\n at async <anonymous> (/Users/user/code/kubeairunway/backend/src/routes/deployments.ts:1053:44)\n at async dispatch (/Users/user/code/kubeairunway/node_modules/.bun/hono@4.11.9/node_modules/hono/dist/compose.js:22:23)\n at async <anonymous> (/Users/user/code/kubeairunway/node_modules/.bun/hono@4.11.9/node_modules/hono/dist/validator/validator.js:81:18)\n at async dispatch (/Users/user/code/kubeairunway/node_modules/.bun/hono@4.11.9/node_modules/hono/dist/compose.js:22:23)\n at async dispatch (/Users/user/code/kubeairunway/node_modules/.bun/hono@4.11.9/node_modules/hono/dist/compose.js:22:23)\n at async <anonymous> (/Users/user/code/kubeairunway/backend/src/hono-app.ts:119:9)\n at async dispatch (/Users/user/code/kubeairunway/node_modules/.bun/hono@4.11.9/node_modules/hono/dist/compose.js:22:23)"},"msg":"Kubernetes API error: unable to verify the first certificate"}
│ {"level":"error","time":"2026-06-08T22:12:13.441Z","pid":38386,"hostname":"Users-MacBook-Pro.local","error":{"status":500},"stack":"Error: Failed to list storage disks: unable to verify the first certificate\n at <anonymous> (/Users/user/code/kubeairunway/backend/src/routes/deployments.ts:1061:17)\n at async dispatch (/Users/user/code/kubeairunway/node_modules/.bun/hono@4.11.9/node_modules/hono/dist/compose.js:22:23)\n at async <anonymous> (/Users/user/code/kubeairunway/node_modules/.bun/hono@4.11.9/node_modules/hono/dist/validator/validator.js:81:18)\n at async dispatch (/Users/user/code/kubeairunway/node_modules/.bun/hono@4.11.9/node_modules/hono/dist/compose.js:22:23)\n at processTicksAndRejections (native:7:39)","msg":"Error: Failed to list storage disks: unable to verify the first certificate"}
│ {"level":"info","time":"2026-06-08T22:12:21.390Z","pid":38386,"hostname":"Users-MacBook-Pro.local","method":"GET","url":"http://localhost:3001/api/installation/gpu-capacity","msg":"GET /api/installation/gpu-capacity"}
│ {"level":"info","time":"2026-06-08T22:12:21.402Z","pid":38386,"hostname":"Users-MacBook-Pro.local","method":"GET","url":"http://localhost:3001/api/cluster/status","msg":"GET /api/cluster/status"}
│ {"level":"error","time":"2026-06-08T22:12:21.543Z","pid":38386,"hostname":"Users-MacBook-Pro.local","error":{"code":"UNABLE_TO_VERIFY_LEAF_SIGNATURE","path":"https://suraj-cluster.hcp.southcentralus.azmk8s.io/api/v1/nodes","errno":0},"msg":"Error getting cluster GPU capacity"}
│ {"level":"info","time":"2026-06-08T22:12:51.561Z","pid":38386,"hostname":"Users-MacBook-Pro.local","method":"GET","url":"http://localhost:3001/api/installation/gpu-capacity","msg":"GET /api/installation/gpu-capacity"}
│ {"level":"info","time":"2026-06-08T22:12:51.577Z","pid":38386,"hostname":"Users-MacBook-Pro.local","method":"GET","url":"http://localhost:3001/api/cluster/status","msg":"GET /api/cluster/status"}
│ {"level":"error","time":"2026-06-08T22:12:51.720Z","pid":38386,"hostname":"Users-MacBook-Pro.local","error":{"code":"UNABLE_TO_VERIFY_LEAF_SIGNATURE","path":"https://suraj-cluster.hcp.southcentralus.azmk8s.io/api/v1/nodes","errno":0},"msg":"Error getting cluster GPU capacity"}

Now go to http://localhost:5173/ and you should see the Disconnected state in the top right.
With this PR's changes applied, you should see a Connected state instead.
Checklist
`bun run lint`

Additional Notes
- Changes are confined to the backend (`backend/src/`); no public API, CRD, or frontend changes.
- `package.json` and `bun.lock` are untouched.
- `bun run lint` is not ticked because it fails for a pre-existing, unrelated reason: ESLint v9+ requires a flat `eslint.config.js`, which is absent on this branch (the migration lives in a separate PR). It is not caused by these changes.
- Known limitations (see `BunTlsHttpLibrary`/`makeApiClient`): `pfx` client certs and `cluster.proxy-url` are not honoured under Bun's `fetch`.