feat(auth): add managed API key support#190
Conversation
📝 WalkthroughWalkthroughAdds a managed auth-keys subsystem (types, stores, service, factory), admin APIs to create/list/deactivate keys, bearer-token authenticator middleware integrated with master key handling, audit-log propagation/storage of auth key IDs across readers/stores, and tests for service, admin, auth middleware, and audit logging. Changes
Sequence DiagramsequenceDiagram
participant Client
participant AuthMiddleware as Auth Middleware
participant AuthService as Auth Service
participant Store as Auth Store
participant Handler
participant AuditLog
Client->>AuthMiddleware: HTTP request + Authorization header
AuthMiddleware->>AuthMiddleware: Check master key
alt Master key valid
AuthMiddleware->>Handler: proceed
else Attempt managed-key auth
AuthMiddleware->>AuthService: Authenticate(token)
AuthService->>Store: Read cached keys / refresh as needed
AuthService->>AuthService: parse token, hash secret, lookup active key
alt Valid active key
AuthService-->>AuthMiddleware: authKeyID
AuthMiddleware->>AuditLog: EnrichEntryWithAuthKeyID(authKeyID)
AuthMiddleware->>Handler: proceed (context contains authKeyID)
else Invalid / inactive / expired
AuthService-->>AuthMiddleware: error
AuthMiddleware->>AuditLog: Enrich auth failure
AuthMiddleware-->>Client: 401 Authentication Error
end
end
Handler->>AuditLog: Write log entry (includes AuthKeyID if set)
AuditLog->>AuditLog: Persist to DB (includes auth_key_id column)
AuditLog-->>Handler: ack
Handler-->>Client: Response
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 13
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/admin/handler_authkeys_test.go`:
- Around line 71-93: The test
TestAuthKeyEndpointsReturn503WhenServiceUnavailable misses asserting
DeactivateAuthKey returns 503 when h.authKeys is nil and also declares an unused
variable e; update the test to remove the unused echo variable (e) and add a
DeactivateAuthKey case that constructs an HTTP request (e.g., DELETE
/admin/api/v1/auth-keys/:id) and context, calls
h.DeactivateAuthKey(deactivateCtx), then assert deactivateRec.Code ==
http.StatusServiceUnavailable, keeping the existing pattern used for
ListAuthKeys and CreateAuthKey and referencing NewHandler and DeactivateAuthKey
to locate the code to change.
In `@internal/admin/handler.go`:
- Around line 736-738: The handler currently treats a nil IssuedKey as a valid
204 by returning when issued == nil; instead treat this as an unexpected service
bug: after calling h.authKeys.Create, if issued == nil and err == nil, log or
create a descriptive error and return an internal server error (e.g., HTTP 500)
rather than No Content. Update the code around the issued check in the handler
that calls h.authKeys.Create to surface the failure (include context like
request identifiers) and return an error response so the unexpected nil is
visible to callers/operators.
In `@internal/auditlog/store_postgresql_test.go`:
- Around line 67-68: The test's fatal error message references the wrong index:
when asserting the value of args[18] (in the test that checks "log-2") update
the t.Fatalf call inside the block that reads `if got := args[18]; got !=
"log-2" { ... }` to report `args[18]` (e.g., `t.Fatalf("args[18] = %v, want
log-2", got)`) so the message matches the checked index and value.
In `@internal/authkeys/service.go`:
- Around line 97-105: The Enabled() method currently returns true if any rows
exist in s.snapshot.byID, which can be true for only expired or deactivated
keys; change Enabled() (in type Service) to instead iterate the snapshot
(s.snapshot.byID) under RLock and return true only if at least one key is
currently usable by evaluating its fields against time.Now() (e.g., not
deactivated and not expired based on the key's expiry/active fields). Ensure you
use time.Now() for the check so AuthMiddlewareWithAuthenticator will not treat
snapshots with only unusable keys as enabling auth.
- Around line 177-182: When persisting changes (e.g., in s.store.Create and
s.store.Deactivate) don't make the API return an error if the subsequent
s.Refresh() reconciliation fails; instead apply the corresponding mutation to
the in-memory snapshot first (update the in-memory keys/snapshot state inside
the service), then call s.Refresh() as a best-effort reconciliation and
log/ignore refresh errors while returning success for the write. Update the
create path (where s.store.Create is called) and the deactivate path (where
s.store.Deactivate is called, also referenced around the other block at lines
~203-209) to mutate the in-memory snapshot before invoking s.Refresh(), and only
surface errors from the persistent store operations while treating s.Refresh()
failures as non-fatal reconciliation issues.
- Around line 215-243: The Authenticate method returns early when a secret is
found in snapshot.activeByHash, which bypasses per-request expiry checks and
allows expired keys to authenticate until the next Refresh; update Authenticate
(and/or snapshot population logic) so that finding an entry in activeByHash does
not skip time-sensitive checks: after locating the active entry in
snapshot.activeByHash within Authenticate, re-check the corresponding key's
ExpiresAt (and Enabled/DeactivatedAt) against time.Now().UTC() before returning,
or remove time-dependent "active" caching and instead always consult
snapshot.bySecretHash (or include expiration metadata in the active entry) so
expiry is validated on every request.
In `@internal/authkeys/store_mongodb.go`:
- Around line 95-105: The Deactivate method currently uses "$max" which can
overwrite a previously set deactivated_at; change the update to use an
aggregation update pipeline so deactivated_at is assigned only when missing
(preserving the original timestamp). In the MongoDBStore.Deactivate call
(s.collection.UpdateOne with mongoAuthKeyIDFilter), replace the document-style
update that contains "$max" with an aggregation pipeline update that sets
"enabled" to false and "updated_at" to now.UTC() and sets "deactivated_at" using
an expression like $ifNull/$cond to keep the existing value if present (i.e.,
deactivated_at = ifNull("$deactivated_at", now.UTC())). Ensure the UpdateOne
call still targets the same filter and behavior but no longer mutates an
existing deactivated_at.
In `@internal/authkeys/store_postgresql.go`:
- Around line 43-45: Remove the redundant unique index creation for secret_hash:
delete the entry that creates `idx_auth_keys_secret_hash` from the index list so
you only keep `CREATE INDEX IF NOT EXISTS idx_auth_keys_enabled ON
auth_keys(enabled)`; this avoids duplicating the `UNIQUE` constraint already
declared on `auth_keys(secret_hash)`.
In `@internal/authkeys/store_sqlite.go`:
- Around line 22-48: The auth_keys table schema already declares secret_hash
TEXT NOT NULL UNIQUE, so remove the redundant explicit unique index creation
`CREATE UNIQUE INDEX IF NOT EXISTS idx_auth_keys_secret_hash ON
auth_keys(secret_hash)` from the index loop; keep the UNIQUE constraint in the
table definition and leave the other indexes (`idx_auth_keys_enabled`,
`idx_auth_keys_created_at`) intact to avoid duplicate indexes on the secret_hash
column.
In `@internal/authkeys/types.go`:
- Around line 38-42: The CreateInput struct fields (Name, Description,
ExpiresAt) lack JSON tags so snake_case requests like "expires_at" won't bind to
ExpiresAt; update the CreateInput type to add JSON tags mapping "name",
"description", and "expires_at" to the respective fields (e.g., ExpiresAt ->
"expires_at") so standard JSON decoding correctly populates the struct during
admin key creation.
In `@internal/server/auth.go`:
- Around line 81-82: The audit currently uses authFailureMessage(err) and that
same string is returned to clients; change the flow so the detailed reason
(authFailureMessage(err)) is only used to enrich the audit/logging context,
while the JSON response uses a normalized message (e.g., "authentication
failed"). Concretely, update the code paths calling authenticationError(c,
authFailureMessage(err)) (and similar calls in the 91-107 block) to: 1) call
your audit/enrichment with authFailureMessage(err), and 2) call
authenticationError(c, "authentication failed") to construct the authErr
returned via authErr.ToJSON(), leaving the rich reason out of the response body
but preserved in logs.
In `@internal/server/http.go`:
- Around line 135-139: The current condition that appends "/admin/api/v1/*" to
authSkipPaths allows unauthenticated admin access when cfg.MasterKey == "" even
if no authenticator exists; change the guard so the skip is only applied when an
authentication mechanism is present (e.g., require cfg.Authenticator != nil or a
helper like cfg.HasAuthenticator() in the if condition alongside
cfg.AdminEndpointsEnabled and cfg.AdminHandler) and update the comment to note
this is only for bootstrapping; ensure you reference authSkipPaths,
cfg.MasterKey, cfg.Authenticator, cfg.AdminEndpointsEnabled and cfg.AdminHandler
when making the change.
In `@tests/integration/dbassert/auditlog.go`:
- Around line 62-65: The DB scan is currently scanning auth_key_id into a plain
string which will error on NULL; change the scan to use sql.NullString (or a
local *string/nullable carrier) when reading rows in the audit log query loop,
then set entry.AuthKeyID = authKeyID.String if authKeyID.Valid (or nil/empty
otherwise). Update the scan call that currently contains &entry.AuthKeyID among
the targets and the subsequent assignment to normalize NULL -> "" (or keep nil)
so NULL auth_key_id values are handled safely.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: a13ad73e-a404-44bd-8a3b-546918fffd12
📒 Files selected for processing (29)
internal/admin/dashboard/templates/index.htmlinternal/admin/handler.gointernal/admin/handler_authkeys_test.gointernal/app/app.gointernal/auditlog/auditlog.gointernal/auditlog/auditlog_test.gointernal/auditlog/middleware.gointernal/auditlog/reader_mongodb.gointernal/auditlog/reader_postgresql.gointernal/auditlog/reader_sqlite.gointernal/auditlog/store_mongodb.gointernal/auditlog/store_postgresql.gointernal/auditlog/store_postgresql_test.gointernal/auditlog/store_sqlite.gointernal/auditlog/stream_wrapper.gointernal/authkeys/factory.gointernal/authkeys/service.gointernal/authkeys/service_test.gointernal/authkeys/store.gointernal/authkeys/store_mongodb.gointernal/authkeys/store_postgresql.gointernal/authkeys/store_sqlite.gointernal/authkeys/types.gointernal/core/context.gointernal/server/auth.gointernal/server/auth_test.gointernal/server/http.gointernal/server/http_test.gotests/integration/dbassert/auditlog.go
| // Enabled reports whether managed auth keys should be enforced. | ||
| func (s *Service) Enabled() bool { | ||
| if s == nil { | ||
| return false | ||
| } | ||
| s.mu.RLock() | ||
| defer s.mu.RUnlock() | ||
| return len(s.snapshot.byID) > 0 | ||
| } |
There was a problem hiding this comment.
Enabled() should reflect usable keys, not stored rows.
AuthMiddlewareWithAuthenticator uses this to decide whether auth can be bypassed. Returning true for a snapshot that only contains expired or deactivated keys makes a deployment with no master key permanently unreachable, which breaks the recovery path this PR is trying to preserve. Evaluate the current snapshot against time.Now() instead of len(s.snapshot.byID).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/authkeys/service.go` around lines 97 - 105, The Enabled() method
currently returns true if any rows exist in s.snapshot.byID, which can be true
for only expired or deactivated keys; change Enabled() (in type Service) to
instead iterate the snapshot (s.snapshot.byID) under RLock and return true only
if at least one key is currently usable by evaluating its fields against
time.Now() (e.g., not deactivated and not expired based on the key's
expiry/active fields). Ensure you use time.Now() for the check so
AuthMiddlewareWithAuthenticator will not treat snapshots with only unusable keys
as enabling auth.
| type CreateInput struct { | ||
| Name string | ||
| Description string | ||
| ExpiresAt *time.Time | ||
| } |
There was a problem hiding this comment.
Add JSON tags to CreateInput to preserve snake_case request compatibility.
Without tags, expires_at won’t bind to ExpiresAt in standard JSON decoding. This can silently drop expiry during admin key creation (Line 41).
🔧 Proposed fix
type CreateInput struct {
- Name string
- Description string
- ExpiresAt *time.Time
+ Name string `json:"name"`
+ Description string `json:"description"`
+ ExpiresAt *time.Time `json:"expires_at"`
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| type CreateInput struct { | |
| Name string | |
| Description string | |
| ExpiresAt *time.Time | |
| } | |
| type CreateInput struct { | |
| Name string `json:"name"` | |
| Description string `json:"description"` | |
| ExpiresAt *time.Time `json:"expires_at"` | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/authkeys/types.go` around lines 38 - 42, The CreateInput struct
fields (Name, Description, ExpiresAt) lack JSON tags so snake_case requests like
"expires_at" won't bind to ExpiresAt; update the CreateInput type to add JSON
tags mapping "name", "description", and "expires_at" to the respective fields
(e.g., ExpiresAt -> "expires_at") so standard JSON decoding correctly populates
the struct during admin key creation.
| // When no bootstrap master key is configured, keep admin APIs reachable so | ||
| // the dashboard can recover managed-key access instead of locking itself out. | ||
| if cfg != nil && cfg.MasterKey == "" && cfg.AdminEndpointsEnabled && cfg.AdminHandler != nil { | ||
| authSkipPaths = append(authSkipPaths, "/admin/api/v1/*") | ||
| } |
There was a problem hiding this comment.
Security: Admin APIs exposed without authentication when no master key is configured.
When cfg.MasterKey == "" and cfg.AdminEndpointsEnabled is true, /admin/api/v1/* is added to authSkipPaths. However, if cfg.Authenticator is also nil (no managed keys configured), all admin API endpoints become completely unauthenticated—including sensitive operations like creating/deactivating auth keys.
Consider gating this skip-path logic on the presence of an authenticator, or adding a check that at least one auth mechanism exists:
- if cfg != nil && cfg.MasterKey == "" && cfg.AdminEndpointsEnabled && cfg.AdminHandler != nil {
+ if cfg != nil && cfg.MasterKey == "" && cfg.Authenticator != nil && cfg.AdminEndpointsEnabled && cfg.AdminHandler != nil {
authSkipPaths = append(authSkipPaths, "/admin/api/v1/*")
}Alternatively, if intentional for bootstrap scenarios, document the security implications clearly.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // When no bootstrap master key is configured, keep admin APIs reachable so | |
| // the dashboard can recover managed-key access instead of locking itself out. | |
| if cfg != nil && cfg.MasterKey == "" && cfg.AdminEndpointsEnabled && cfg.AdminHandler != nil { | |
| authSkipPaths = append(authSkipPaths, "/admin/api/v1/*") | |
| } | |
| // When no bootstrap master key is configured, keep admin APIs reachable so | |
| // the dashboard can recover managed-key access instead of locking itself out. | |
| if cfg != nil && cfg.MasterKey == "" && cfg.Authenticator != nil && cfg.AdminEndpointsEnabled && cfg.AdminHandler != nil { | |
| authSkipPaths = append(authSkipPaths, "/admin/api/v1/*") | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/server/http.go` around lines 135 - 139, The current condition that
appends "/admin/api/v1/*" to authSkipPaths allows unauthenticated admin access
when cfg.MasterKey == "" even if no authenticator exists; change the guard so
the skip is only applied when an authentication mechanism is present (e.g.,
require cfg.Authenticator != nil or a helper like cfg.HasAuthenticator() in the
if condition alongside cfg.AdminEndpointsEnabled and cfg.AdminHandler) and
update the comment to note this is only for bootstrapping; ensure you reference
authSkipPaths, cfg.MasterKey, cfg.Authenticator, cfg.AdminEndpointsEnabled and
cfg.AdminHandler when making the change.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/auditlog/store_postgresql_test.go (1)
58-81: 🧹 Nitpick | 🔵 TrivialAdd direct assertions for
auth_key_idbind slots.The test validates SQL text and shifted indices, but it still doesn’t assert the new
auth_key_idargument positions. A bind-order swap with nearby fields (likeclient_ip) could pass unnoticed.Proposed test hardening
if got := args[8]; got != CacheTypeExact { t.Fatalf("args[8] = %v, want %q", got, CacheTypeExact) } + if got := args[11]; got != "" { + t.Fatalf("args[11] = %v, want empty auth_key_id", got) + } + if got := args[12]; got != "127.0.0.1" { + t.Fatalf("args[12] = %v, want 127.0.0.1", got) + } if got := args[18]; got != "log-2" { t.Fatalf("args[18] = %v, want log-2", got) } + if got := args[29]; got != "" { + t.Fatalf("args[29] = %v, want empty auth_key_id", got) + } + if got := args[30]; got != "10.0.0.1" { + t.Fatalf("args[30] = %v, want 10.0.0.1", got) + } if got := string(args[17].([]byte)); got != `{"user_agent":"test-agent"}` { t.Fatalf("args[17] = %q, want %q", got, `{"user_agent":"test-agent"}`) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/auditlog/store_postgresql_test.go` around lines 58 - 81, Add explicit assertions in the test (internal/auditlog/store_postgresql_test.go) that verify the bind-slot values for the auth_key_id parameter so a bind-order regression is caught; locate where args is being examined (the existing checks around args[0], args[8], args[17], args[18], args[26], args[35]) and add checks for the auth_key_id slots (the index/indices adjacent to client_ip and other nearby fields) asserting the exact expected value or nil and the expected type (e.g., string, []byte, or nil) to ensure auth_key_id's position and value are validated.
♻️ Duplicate comments (1)
internal/authkeys/service.go (1)
98-106:⚠️ Potential issue | 🟠 Major
Enabled()still returns true for snapshots with only expired/deactivated keys.The method checks
len(s.snapshot.byID) > 0but doesn't verify that at least one key is currently usable. This could causeAuthMiddlewareWithAuthenticatorto enforce auth when no valid keys exist, potentially making the deployment unreachable if all keys are expired/deactivated and no master key is configured.Consider checking
len(s.snapshot.activeByHash) > 0or iterating to verify at least one key isActive(now).🔧 Proposed fix using activeByHash
func (s *Service) Enabled() bool { if s == nil { return false } s.mu.RLock() defer s.mu.RUnlock() - return len(s.snapshot.byID) > 0 + return len(s.snapshot.activeByHash) > 0 }Note: Using
activeByHashis a fast approximation. If expiry can occur between refreshes, you may need to evaluateActive(time.Now().UTC())on at least one key for precise behavior.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/authkeys/service.go` around lines 98 - 106, Enabled() currently returns true if any key exists in s.snapshot.byID which can be true even when all keys are expired/deactivated; update Enabled() to check for usable keys instead (either return len(s.snapshot.activeByHash) > 0 or iterate s.snapshot.byID and call key.Active(time.Now().UTC()) and return true on the first active key) so that AuthMiddlewareWithAuthenticator will only enforce auth when at least one key is currently usable.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/authkeys/store_postgresql.go`:
- Around line 120-125: Replace the direct equality check against pgx.ErrNoRows
with errors.Is to handle wrapped errors: in the error-handling block that
returns AuthKey{}, ErrNotFound (the code comparing err == pgx.ErrNoRows), change
it to use errors.Is(err, pgx.ErrNoRows) and ensure the "errors" package is
imported; keep the existing return semantics for ErrNotFound and other errors.
---
Outside diff comments:
In `@internal/auditlog/store_postgresql_test.go`:
- Around line 58-81: Add explicit assertions in the test
(internal/auditlog/store_postgresql_test.go) that verify the bind-slot values
for the auth_key_id parameter so a bind-order regression is caught; locate where
args is being examined (the existing checks around args[0], args[8], args[17],
args[18], args[26], args[35]) and add checks for the auth_key_id slots (the
index/indices adjacent to client_ip and other nearby fields) asserting the exact
expected value or nil and the expected type (e.g., string, []byte, or nil) to
ensure auth_key_id's position and value are validated.
---
Duplicate comments:
In `@internal/authkeys/service.go`:
- Around line 98-106: Enabled() currently returns true if any key exists in
s.snapshot.byID which can be true even when all keys are expired/deactivated;
update Enabled() to check for usable keys instead (either return
len(s.snapshot.activeByHash) > 0 or iterate s.snapshot.byID and call
key.Active(time.Now().UTC()) and return true on the first active key) so that
AuthMiddlewareWithAuthenticator will only enforce auth when at least one key is
currently usable.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8a276386-d595-4344-9a42-bd00650e848b
📒 Files selected for processing (11)
internal/admin/handler.gointernal/admin/handler_authkeys_test.gointernal/auditlog/store_postgresql_test.gointernal/authkeys/service.gointernal/authkeys/service_test.gointernal/authkeys/store_mongodb.gointernal/authkeys/store_postgresql.gointernal/authkeys/store_sqlite.gointernal/server/auth.gointernal/server/auth_test.gotests/integration/dbassert/auditlog.go
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/authkeys/store_postgresql.go`:
- Around line 48-50: The DDL error omits which index statement failed: when
calling pool.Exec(ctx, index) update the error wrap to include the index (or its
name) for context (e.g., change the fmt.Errorf call around pool.Exec to include
the index string or index name variable), so replace the current
fmt.Errorf("failed to create auth_keys index: %w", err) with one that embeds the
index/statement (for example "failed to create auth_keys index (%s): %w", index,
err) to make failures immediately diagnosable; locate the pool.Exec(ctx, index)
call and the surrounding error return in store_postgresql.go to apply this
change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 91f064af-5b0b-4e4e-b2c7-dad3b2b1df2c
📒 Files selected for processing (2)
internal/auditlog/store_postgresql_test.gointernal/authkeys/store_postgresql.go
Summary
Testing
Summary by CodeRabbit