Skip to content

feat: add resolver for token verification#7

Merged
lakhansamani merged 1 commit intomainfrom
implement-token-verification-resolver
Jul 13, 2021
Merged

feat: add resolver for token verification#7
lakhansamani merged 1 commit intomainfrom
implement-token-verification-resolver

Conversation

@lakhansamani
Copy link
Copy Markdown
Contributor

Resolves #6

@lakhansamani lakhansamani merged commit 336fe10 into main Jul 13, 2021
@lakhansamani lakhansamani deleted the implement-token-verification-resolver branch January 14, 2022 12:02
lakhansamani added a commit that referenced this pull request Jan 15, 2022
…lver

feat: add resolver for token verification
harisato added a commit to harisato/authorizer that referenced this pull request Oct 8, 2025
lakhansamani added a commit that referenced this pull request Apr 7, 2026
…migration (#7, #8)

#7 — OTPs stored in plaintext
schemas/otp.go.Otp held the raw 6-digit code. An offline DB dump revealed
every active password-reset / signup-verification / MFA code, allowing
account takeover for any user with an outstanding OTP request.

OTPs are short-lived (minutes) so a reversible primitive is unnecessary —
the verifier only needs to recompute the digest from the candidate. Add:
- crypto.HashOTP(plain, key)        — HMAC-SHA256 hex
- crypto.VerifyOTPHash(plain, stored, key) — constant-time compare

All write sites now hash before UpsertOTP. The plaintext stays in a local
variable for the email/SMS body. Affected handlers:
- internal/graphql/login.go (generateOTP closure: store hash, restore
  plaintext on the returned struct so callers can read otpData.Otp)
- internal/graphql/signup.go (mobile SMS verification)
- internal/graphql/forgot_password.go (mobile reset OTP)
- internal/graphql/resend_otp.go (same closure pattern as login)

Verify sites accept BOTH hash and legacy plaintext for the duration of a
rolling upgrade — old in-flight rows expire within minutes anyway:
- internal/graphql/verify_otp.go
- internal/graphql/reset_password.go

Migration: NONE required. All in-flight plaintext rows naturally expire
during the upgrade window.

#8 — TOTP secrets stored in plaintext
schemas/authenticators.go.Secret held the base32 TOTP shared secret. TOTP
secrets are long-lived (enrolled once, used forever) so a DB dump enables
permanent MFA bypass.

Add a versioned ciphertext format (so migration is idempotent and a future
algorithm rotation is straightforward):
- crypto.EncryptTOTPSecret(plain, key) → "enc:v1:" + AES-256-GCM ciphertext
- crypto.DecryptTOTPSecret(stored, key) — transparent legacy passthrough
- crypto.IsEncryptedTOTPSecret(stored) — prefix check used by lazy migration

The TOTP authenticator now:
- Encrypts the secret in Generate() before storing
- Decrypts in Validate() before computing the expected code
- Lazy-migrates legacy plaintext rows on the first successful Validate by
  re-encrypting and updating the row in place. The lazy approach avoids
  having to add a ListAuthenticators API to all 13 storage providers.
  DEPRECATED: the lazy migration code is intended to be removed two minor
  versions after this lands; by then all rolling upgrades will have
  completed.

EncryptionKey is wired into totp.Dependencies from cfg.JWTSecret so the
TOTP package does not need direct access to the broader Config struct.

Properties:
- Idempotent: encrypted rows are recognised by the enc:v1: prefix and
  never re-encrypted
- Safe under partial migration / mid-rollout failure: read path handles
  both forms; write path always produces the new form
- Mixed cluster during a rolling deploy: old nodes still read enc:v1:
  values successfully because they fail open in DecryptTOTPSecret only
  if the secret happens to be plaintext-shaped — actual mixed clusters
  will need both versions to ship together, but the read path is
  forwards-compatible
lakhansamani added a commit that referenced this pull request Apr 7, 2026
…migration (#7, #8)

#7 — OTPs stored in plaintext
schemas/otp.go.Otp held the raw 6-digit code. An offline DB dump revealed
every active password-reset / signup-verification / MFA code, allowing
account takeover for any user with an outstanding OTP request.

OTPs are short-lived (minutes) so a reversible primitive is unnecessary —
the verifier only needs to recompute the digest from the candidate. Add:
- crypto.HashOTP(plain, key)        — HMAC-SHA256 hex
- crypto.VerifyOTPHash(plain, stored, key) — constant-time compare

All write sites now hash before UpsertOTP. The plaintext stays in a local
variable for the email/SMS body. Affected handlers:
- internal/graphql/login.go (generateOTP closure: store hash, restore
  plaintext on the returned struct so callers can read otpData.Otp)
- internal/graphql/signup.go (mobile SMS verification)
- internal/graphql/forgot_password.go (mobile reset OTP)
- internal/graphql/resend_otp.go (same closure pattern as login)

Verify sites accept BOTH hash and legacy plaintext for the duration of a
rolling upgrade — old in-flight rows expire within minutes anyway:
- internal/graphql/verify_otp.go
- internal/graphql/reset_password.go

Migration: NONE required. All in-flight plaintext rows naturally expire
during the upgrade window.

#8 — TOTP secrets stored in plaintext
schemas/authenticators.go.Secret held the base32 TOTP shared secret. TOTP
secrets are long-lived (enrolled once, used forever) so a DB dump enables
permanent MFA bypass.

Add a versioned ciphertext format (so migration is idempotent and a future
algorithm rotation is straightforward):
- crypto.EncryptTOTPSecret(plain, key) → "enc:v1:" + AES-256-GCM ciphertext
- crypto.DecryptTOTPSecret(stored, key) — transparent legacy passthrough
- crypto.IsEncryptedTOTPSecret(stored) — prefix check used by lazy migration

The TOTP authenticator now:
- Encrypts the secret in Generate() before storing
- Decrypts in Validate() before computing the expected code
- Lazy-migrates legacy plaintext rows on the first successful Validate by
  re-encrypting and updating the row in place. The lazy approach avoids
  having to add a ListAuthenticators API to all 13 storage providers.
  DEPRECATED: the lazy migration code is intended to be removed two minor
  versions after this lands; by then all rolling upgrades will have
  completed.

EncryptionKey is wired into totp.Dependencies from cfg.JWTSecret so the
TOTP package does not need direct access to the broader Config struct.

Properties:
- Idempotent: encrypted rows are recognised by the enc:v1: prefix and
  never re-encrypted
- Safe under partial migration / mid-rollout failure: read path handles
  both forms; write path always produces the new form
- Mixed cluster during a rolling deploy: old nodes still read enc:v1:
  values successfully because they fail open in DecryptTOTPSecret only
  if the secret happens to be plaintext-shaped — actual mixed clusters
  will need both versions to ship together, but the read path is
  forwards-compatible
lakhansamani added a commit that referenced this pull request Apr 7, 2026
…migration (#7, #8) (#590)

* security: hash OTPs and encrypt TOTP secrets at rest with idempotent migration (#7, #8)

#7 — OTPs stored in plaintext
schemas/otp.go.Otp held the raw 6-digit code. An offline DB dump revealed
every active password-reset / signup-verification / MFA code, allowing
account takeover for any user with an outstanding OTP request.

OTPs are short-lived (minutes) so a reversible primitive is unnecessary —
the verifier only needs to recompute the digest from the candidate. Add:
- crypto.HashOTP(plain, key)        — HMAC-SHA256 hex
- crypto.VerifyOTPHash(plain, stored, key) — constant-time compare

All write sites now hash before UpsertOTP. The plaintext stays in a local
variable for the email/SMS body. Affected handlers:
- internal/graphql/login.go (generateOTP closure: store hash, restore
  plaintext on the returned struct so callers can read otpData.Otp)
- internal/graphql/signup.go (mobile SMS verification)
- internal/graphql/forgot_password.go (mobile reset OTP)
- internal/graphql/resend_otp.go (same closure pattern as login)

Verify sites accept BOTH hash and legacy plaintext for the duration of a
rolling upgrade — old in-flight rows expire within minutes anyway:
- internal/graphql/verify_otp.go
- internal/graphql/reset_password.go

Migration: NONE required. All in-flight plaintext rows naturally expire
during the upgrade window.

#8 — TOTP secrets stored in plaintext
schemas/authenticators.go.Secret held the base32 TOTP shared secret. TOTP
secrets are long-lived (enrolled once, used forever) so a DB dump enables
permanent MFA bypass.

Add a versioned ciphertext format (so migration is idempotent and a future
algorithm rotation is straightforward):
- crypto.EncryptTOTPSecret(plain, key) → "enc:v1:" + AES-256-GCM ciphertext
- crypto.DecryptTOTPSecret(stored, key) — transparent legacy passthrough
- crypto.IsEncryptedTOTPSecret(stored) — prefix check used by lazy migration

The TOTP authenticator now:
- Encrypts the secret in Generate() before storing
- Decrypts in Validate() before computing the expected code
- Lazy-migrates legacy plaintext rows on the first successful Validate by
  re-encrypting and updating the row in place. The lazy approach avoids
  having to add a ListAuthenticators API to all 13 storage providers.
  DEPRECATED: the lazy migration code is intended to be removed two minor
  versions after this lands; by then all rolling upgrades will have
  completed.

EncryptionKey is wired into totp.Dependencies from cfg.JWTSecret so the
TOTP package does not need direct access to the broader Config struct.

Properties:
- Idempotent: encrypted rows are recognised by the enc:v1: prefix and
  never re-encrypted
- Safe under partial migration / mid-rollout failure: read path handles
  both forms; write path always produces the new form
- Mixed cluster during a rolling deploy: old nodes still read enc:v1:
  values successfully because they fail open in DecryptTOTPSecret only
  if the secret happens to be plaintext-shaped — actual mixed clusters
  will need both versions to ship together, but the read path is
  forwards-compatible

* test: cover OTP-at-rest and TOTP lazy migration end-to-end

Adds dedicated unit and integration tests for the at-rest hardening so
each piece of the new behaviour is exercised explicitly:

internal/crypto/otp_test.go (new)
- HashOTP round-trip with the original plaintext
- VerifyOTPHash rejects: wrong plaintext, wrong server key, mismatched
  digest length
- VerifyOTPHash MUST reject the stored digest itself when used as input
  (otherwise the digest is a replayable credential)
- HashOTP is deterministic for a fixed (plaintext, key) pair

internal/crypto/totp_test.go (new)
- EncryptTOTPSecret produces enc:v1: ciphertext that decrypts to the
  original plaintext
- AES-GCM nonce randomness — same input yields different ciphertext
- DecryptTOTPSecret transparently passes through legacy plaintext rows
- DecryptTOTPSecret detects tampered ciphertext (GCM authentication)
- DecryptTOTPSecret fails on the wrong key
- EncryptTOTPSecret on empty input is a no-op
- IsEncryptedTOTPSecret prefix detection

internal/integration_tests/totp_at_rest_test.go (new)
- Generate() writes the secret as enc:v1:<ciphertext>, never as the raw
  base32 string. Decrypted value matches the plaintext returned to the
  caller for QR rendering.
- Validate() accepts a code computed from the encrypted secret.
- Validate() lazy-migrates a legacy plaintext row: a row written without
  the enc:v1: prefix still validates, AND the row is rewritten into the
  enc:v1: form on the first successful Validate.
- Validate() rejects an obviously wrong code on an encrypted row.
- Validate() is idempotent on already-encrypted rows: the underlying
  plaintext does not change across multiple validations.

internal/integration_tests/{verify_otp,resend_otp,reset_password}_test.go
- Reworked to seed a known plaintext / digest pair via UpsertOTP and
  verify with the plaintext, instead of reading the stored digest back
  out of the database and replaying it. The previous flow only passed
  because of the legacy plaintext fallback (see below).
- Added new "should fail when stored hash is replayed as input"
  subtest to lock the security guarantee in.

internal/integration_tests/test_helper.go
- Expose Config and AuthenticatorProvider on testSetup so tests can
  access cfg.JWTSecret and the totp authenticator. Config was already
  in the struct but never assigned by initTestSetup; this fixes that
  oversight.

internal/graphql/verify_otp.go and reset_password.go
- Drop the legacy plaintext fallback in the OTP comparison. The
  fallback was originally added to keep in-flight OTPs verifying during
  a rolling upgrade, but it ALSO accepted the stored digest itself as
  valid input — turning the digest into a replayable credential for
  anyone with DB read access. OTPs expire in minutes so the rolling-
  upgrade window is naturally bounded; users on either side simply
  request a fresh code.

* security: gate TOTP lazy migration behind --enable-totp-migration flag

The lazy migration that rewrites legacy plaintext TOTP rows into the
enc:v1: ciphertext format was always-on. This is unsafe for a rolling
deploy from a pre-encryption release: replicas still on the old version
do not understand the enc:v1: prefix and call totp.Validate(code, "enc:v1:...")
which treats the prefix as part of a base32 secret and fails to decode,
locking out users who happen to be routed to those replicas after a new
replica has migrated their row.

Changes:

1) New CLI flag --enable-totp-migration (Config.EnableTOTPMigration),
   default OFF. Plumbed through internal/authenticators.New() into
   totp.Dependencies.EnableLazyMigration. The READ path is unaffected:
   DecryptTOTPSecret transparently handles both legacy plaintext and
   enc:v1: rows regardless of the flag, so a fresh deploy never breaks
   in-flight users.

2) Recommended deploy sequence is documented inline on
   totp.Dependencies.EnableLazyMigration:
     a. Roll out this release to all replicas with the flag OFF.
     b. Confirm every replica is on the new version.
     c. Restart with --enable-totp-migration=true.
     d. After enough time for active users to validate at least once,
        the flag may be left on indefinitely (no-op for already-encrypted
        rows).

3) totp.Validate() is now best-effort on the post-validate row write.
   A failure to update the authenticator row no longer fails the login —
   the user supplied a valid TOTP code, refusing it because of a transient
   DB hiccup or a migration encrypt error would be a worse outcome than
   a delayed VerifiedAt or a delayed migration. Failures are logged with
   structured fields so operators can spot them.

4) Decrypt failure on read is logged loudly with a hint that the most
   likely cause is a --jwt-secret rotation since enrollment.

5) Concurrency: two replicas observing the same legacy row may both
   decrypt, re-encrypt, and write before either commits. The two writes
   carry semantically identical plaintext under different AES-GCM nonces;
   last writer wins, the row ends up in enc:v1: form, and the
   IsEncryptedTOTPSecret short-circuit prevents any further re-encryption.
   This is documented in the function comment.

Tests:

internal/integration_tests/totp_at_rest_test.go gains two subtests:
  - "Validate without migration leaves legacy plaintext rows unchanged"
    spins up a second test setup with EnableTOTPMigration=false and
    asserts that a legacy plaintext row still validates BUT is not
    rewritten.
  - "Validate succeeds even if the migration write fails (best-effort)"
    documents the contract that a successful Validate returns true
    regardless of the row update outcome.

The "Validate lazy-migrates a legacy plaintext row" subtest now runs
against a setup with EnableTOTPMigration=true, so both flag states are
covered end-to-end.

* security: simplify TOTP migration — drop the flag, branch on prefix in Validate

Replaces the --enable-totp-migration flag with an in-Validate
branch that handles both forms in one place. The migration is now
always-on and the storage helper is no longer permitted to silently
pass legacy values through.

Why
---
The flag-gated lazy migration was over-engineered for the typical
deploy model. It existed to protect rolling deploys across multiple
replicas, but the protection was conditional: operators had to remember
to flip the flag at the right time, and the implicit "DecryptTOTPSecret
returns plaintext for legacy rows" behaviour was easy to miss when
reading the code.

The new shape is closer to the obvious algorithm: decrypt if you can,
fall back to the raw value if you can't, migrate if the fallback
matched.

Changes
-------

internal/crypto/totp.go
- DecryptTOTPSecret is now strict: it returns the new sentinel
  ErrTOTPSecretNotEncrypted on a non-prefixed row instead of silently
  echoing the input. The "decrypt" function now does exactly one thing,
  which is much easier to reason about.
- Test updated to assert the sentinel error.

internal/authenticators/totp/totp.go::Validate
- Branches explicitly on the decrypt result. Three outcomes:
    decErr == nil                              → enc:v1: row, use plaintext
    errors.Is(decErr, ErrTOTPSecretNotEncrypted) → legacy row, use raw value, migrate
    other decErr                               → fail closed, log loudly
- Migration is always attempted on a successful legacy validation; the
  write is best-effort (failures logged, login still succeeds).
- The function comment now documents the rolling-deploy tradeoff
  explicitly so operators with multi-replica clusters can plan around
  it (atomic deploy preferred; brief maintenance window otherwise).

internal/authenticators/totp/provider.go
internal/authenticators/providers.go
internal/config/config.go
cmd/root.go
- Drop EnableLazyMigration / Config.EnableTOTPMigration / the
  --enable-totp-migration CLI flag. There is no operator-facing knob
  any more.

internal/integration_tests/totp_at_rest_test.go
- Drop the "without migration" subtest (the flag it tested no longer
  exists).
- The "lazy migration" subtest is unchanged structurally but now
  exercises the always-on path.

Concurrency, write semantics, and idempotency are unchanged from the
previous commit:
- Two replicas migrating the same row write semantically identical
  plaintexts under different AES-GCM nonces; last writer wins; the
  IsEncryptedTOTPSecret short-circuit prevents re-encryption.
- A failed migration write does NOT fail the login.
- A decrypt failure on a prefixed row (e.g. wrong --jwt-secret) is
  fail-closed.

Known tradeoff
--------------
For a rolling deploy from a pre-encryption release: as soon as one
replica handles a TOTP login, that user's row is in enc:v1: form. Any
old replica that subsequently reads the row will treat the prefix as
part of a base32 secret and fail. Either deploy atomically, or accept
a short window during which migrated users may be locked out of old
replicas (TOTP codes regenerate every 30s so the user-visible impact
is bounded).
lakhansamani added a commit that referenced this pull request Apr 8, 2026
Token-layer hardening on top of the phase 3 JWKS multi-key work.
Addresses the OIDC review findings listed below.

- C2: SignJWTToken now stamps the `kid` JOSE header with ClientID so
  verifiers (including /.well-known/jwks.json) can pick the right
  published JWK during manual key rotation. ParseJWTToken honors the
  header — primary vs secondary vs legacy try-both fallback — via the
  new parseJWTWithKidSelection helper.
  (internal/token/jwt.go SignJWTToken, parseJWTWithKidSelection)

- H1 / Logic #7: Secondary-key fallback no longer fires for every
  error. It is now gated on errors.Is(err, jwt.ErrTokenSignatureInvalid)
  so malformed tokens, alg mismatches, and claim errors short-circuit.
  Both the primary and combined errors are wrapped with %w so callers
  can use errors.Is (Logic #13). Successful secondary verification
  logs a single debug line ("token verified by secondary key") as a
  rotation signal — no token contents are logged.
  (internal/token/jwt.go parseJWTWithKidSelection lines ~174-200)

- Logic #4: `iat` is optional per RFC 7519 §4.1.6 and OIDC Core for
  non-self-issued tokens. ParseJWTToken no longer treats a missing iat
  as an error; present iat is still normalized to int64.
  (internal/token/jwt.go ParseJWTToken lines ~229-238)

- M4: `aud` comparison now accepts arrays. New helper
  token.AudienceMatches handles string, []string, []interface{}, and
  jwt.ClaimStrings (plus nil/empty). Used in both ValidateJWTClaims and
  ValidateJWTTokenWithoutNonce.
  (internal/token/audience.go, internal/token/jwt.go ValidateJWTClaims)

- M5: ValidateJWTClaims now requires sub == User.ID for OIDC-visible
  tokens (access, refresh, ID). The legacy email-as-sub fallback is
  restricted to verification token types (signup, magic-link,
  forgot-password, invite, OTP) where the user ID does not yet exist.
  OIDC Core §2 requires sub to never be reassigned.
  (internal/token/jwt.go ValidateJWTClaims, verificationTokenTypes)

- New helper ParseIDTokenHint: verifies the signature of an
  id_token_hint (OIDC Core §3.1.2.1, RP-Initiated Logout §2) without
  running the exp/nbf/iat validators, as the OP SHOULD accept expired
  hints. Reuses the kid-aware key selection from ParseJWTToken.
  (internal/token/id_token_hint.go)

- AMR / ACR / ResponseType constants: new internal/constants/amr.go
  and internal/constants/acr.go define the RFC 8176 amr values
  (AMRPassword, AMROTP, AMRFederated, AMRMFA) and the OIDC Core §2
  baseline acr value (ACRBaseline = "0"). Replaces the string literals
  in auth_token.go loginMethodToAMR (~L65-83) and CreateIDToken
  (~L478).

Tests: unit tests for AudienceMatches (4 shapes + nil/empty), kid
header stamping, legacy kid-less fallback, direct secondary-kid
selection, signature-only fallback (no retry on malformed), wrapped
error propagation via errors.Is, optional iat, strict sub for OIDC
tokens, email-as-sub allowed for verification tokens, ParseIDTokenHint
accept-expired / reject-bad-sig / reject-malformed / legacy fallback.

make test-sqlite: all packages PASS.
lakhansamani added a commit that referenced this pull request Apr 8, 2026
Token-layer hardening on top of the phase 3 JWKS multi-key work.
Addresses the OIDC review findings listed below.

- C2: SignJWTToken now stamps the `kid` JOSE header with ClientID so
  verifiers (including /.well-known/jwks.json) can pick the right
  published JWK during manual key rotation. ParseJWTToken honors the
  header — primary vs secondary vs legacy try-both fallback — via the
  new parseJWTWithKidSelection helper.
  (internal/token/jwt.go SignJWTToken, parseJWTWithKidSelection)

- H1 / Logic #7: Secondary-key fallback no longer fires for every
  error. It is now gated on errors.Is(err, jwt.ErrTokenSignatureInvalid)
  so malformed tokens, alg mismatches, and claim errors short-circuit.
  Both the primary and combined errors are wrapped with %w so callers
  can use errors.Is (Logic #13). Successful secondary verification
  logs a single debug line ("token verified by secondary key") as a
  rotation signal — no token contents are logged.
  (internal/token/jwt.go parseJWTWithKidSelection lines ~174-200)

- Logic #4: `iat` is optional per RFC 7519 §4.1.6 and OIDC Core for
  non-self-issued tokens. ParseJWTToken no longer treats a missing iat
  as an error; present iat is still normalized to int64.
  (internal/token/jwt.go ParseJWTToken lines ~229-238)

- M4: `aud` comparison now accepts arrays. New helper
  token.AudienceMatches handles string, []string, []interface{}, and
  jwt.ClaimStrings (plus nil/empty). Used in both ValidateJWTClaims and
  ValidateJWTTokenWithoutNonce.
  (internal/token/audience.go, internal/token/jwt.go ValidateJWTClaims)

- M5: ValidateJWTClaims now requires sub == User.ID for OIDC-visible
  tokens (access, refresh, ID). The legacy email-as-sub fallback is
  restricted to verification token types (signup, magic-link,
  forgot-password, invite, OTP) where the user ID does not yet exist.
  OIDC Core §2 requires sub to never be reassigned.
  (internal/token/jwt.go ValidateJWTClaims, verificationTokenTypes)

- New helper ParseIDTokenHint: verifies the signature of an
  id_token_hint (OIDC Core §3.1.2.1, RP-Initiated Logout §2) without
  running the exp/nbf/iat validators, as the OP SHOULD accept expired
  hints. Reuses the kid-aware key selection from ParseJWTToken.
  (internal/token/id_token_hint.go)

- AMR / ACR / ResponseType constants: new internal/constants/amr.go
  and internal/constants/acr.go define the RFC 8176 amr values
  (AMRPassword, AMROTP, AMRFederated, AMRMFA) and the OIDC Core §2
  baseline acr value (ACRBaseline = "0"). Replaces the string literals
  in auth_token.go loginMethodToAMR (~L65-83) and CreateIDToken
  (~L478).

Tests: unit tests for AudienceMatches (4 shapes + nil/empty), kid
header stamping, legacy kid-less fallback, direct secondary-kid
selection, signature-only fallback (no retry on malformed), wrapped
error propagation via errors.Is, optional iat, strict sub for OIDC
tokens, email-as-sub allowed for verification tokens, ParseIDTokenHint
accept-expired / reject-bad-sig / reject-malformed / legacy fallback.

make test-sqlite: all packages PASS.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add verify signup token resolver

1 participant