app-server: persist device key bindings in sqlite#19206
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cba35b707f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| let binding = self | ||
| .bindings | ||
| .get_binding(&request.key_id) | ||
| .await? | ||
| .ok_or(DeviceKeyError::KeyNotFound)?; |
There was a problem hiding this comment.
Add fallback for missing SQLite key bindings
DeviceKeyStore::sign now hard-fails when device_key_bindings has no row for key_id. Keys created before this migration had binding metadata only in providers, so after deploy they will be treated as missing and cannot sign. Add migration/backfill or a temporary legacy fallback path to avoid breaking existing enrolled keys.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
not applicable yet as no users yet
3c1f8d4 to
763ee60
Compare
etraut-openai
left a comment
There was a problem hiding this comment.
Looks mostly good to me. I found one small hole that you might want to fix — up to you.
| key_id_random: &key_id_random, | ||
| protection_policy: request.protection_policy, | ||
| binding: &request.binding, | ||
| let provider = Arc::clone(&self.provider); |
There was a problem hiding this comment.
create now creates the provider key first and writes the SQLite binding afterward. If put_binding fails after the provider succeeds, the API returns an error but leaves private key material behind with no binding record so future signing cannot reconcile or clean it up. This is an unlikely failure, so not sure how important it is. You could add a delete on failure.
| }) | ||
| }) | ||
| .await?; | ||
| self.bindings |
There was a problem hiding this comment.
Non-blocking: create() now creates provider key material before writing the SQLite binding. If the binding write fails after provider creation succeeds, this returns an error but leaves behind a stranded key with no binding row, so future signing cannot reconcile or use it. I do not think this is a security blocker, but it would be cleaner to add cleanup/delete-on-failure or otherwise make this failure mode recoverable.
## Why Device-key providers should only own platform key material. The account/client binding used to authorize a signing payload is app-server state, and keeping that state in provider-specific metadata makes the same check harder to audit and harder to share across platform implementations. Persisting the binding in the shared state database gives the device-key crate a platform-neutral source of truth before it asks a provider to sign. It also lets app-server move potentially blocking key operations off the main message processor path, which matters once providers may wait for OS authentication prompts. Because key creation now spans provider-owned key material and a sqlite binding row, create must also clean up after partial failure. If sqlite persistence fails after provider creation succeeds, the new key would otherwise be left behind without a usable binding. ## What changed - Add a device_key_bindings state migration plus StateRuntime helpers keyed by key_id. - Add an async DeviceKeyBindingStore abstraction to codex-device-key and use it from DeviceKeyStore::create and DeviceKeyStore::sign. - Delete newly created provider key material when the binding write fails, preserving the original binding-store error when cleanup succeeds. - Keep provider calls behind async store methods and run the synchronous provider work through spawn_blocking. - Wire app-server device-key RPC handling to the SQLite-backed binding store. - Route device-key RPC handling through a shared spawned helper so transport rejection, async store/provider calls, and response delivery do not block the message processor loop. ## Validation - just fmt - cargo test -p codex-device-key - cargo test -p codex-state device_key - cargo test -p codex-app-server device_key - just fix -p codex-device-key - just fix -p codex-state - just fix -p codex-app-server - git diff --check
763ee60 to
99d4f66
Compare
Why
Device-key providers should only own platform key material. The account/client binding used to authorize a signing payload is app-server state, and keeping that state in provider-specific metadata makes the same check harder to audit and harder to share across platform implementations.
Persisting the binding in the shared state database gives the device-key crate a platform-neutral source of truth before it asks a provider to sign. It also lets app-server move potentially blocking key operations off the main message processor path, which matters once providers may wait for OS authentication prompts.
What changed
device_key_bindingsstate migration plusStateRuntimehelpers keyed bykey_id.DeviceKeyBindingStoreabstraction tocodex-device-keyand use it fromDeviceKeyStore::createandDeviceKeyStore::sign.spawn_blocking.Validation
cargo test -p codex-device-keycargo test -p codex-state device_keycargo test -p codex-statecargo test -p codex-app-server device_keycargo test -p codex-app-server message_processor::tracing_tests::turn_start_jsonrpc_span_parents_core_turn_spanscargo test -p codex-app-serverjust fix -p codex-device-keyjust fix -p codex-statejust fix -p codex-app-serverjust bazel-lock-updatejust bazel-lock-checkgit diff --check