🐛 fix: first inject the cloudecc runtime session should use the existingStatus #14592
…ction

Race condition on new-topic first message:
1. switchTopic loads runningOperation → useGatewayReconnect fires
2. executeGatewayAgent calls connectToGateway (status: connecting)
3. reconnectToGatewayOperation overwrites with resumeOnConnect:true
4. Gateway sees resume on a brand-new session → no events → stuck

The second message works because the client store's runningOperation is stale (from the first op), so SWR deduplicates the request and no reconnect fires.

Fix: bail out of reconnectToGatewayOperation if gatewayConnections already shows connecting/connected for that operationId.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…sume

CC stores session files at ~/.claude/projects/<encoded-cwd>/. Without an explicit --cwd the actual working directory can differ between sandbox invocations, so --resume <heteroSessionId> fails to locate the previous session files even though the container is persistent and the ID is correctly stored in topic.metadata.

Default cwd to /workspace for cloud runs (desktop keeps its own explicit path), guaranteeing a stable session-file location across page reloads within the same sandbox lifecycle.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
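The cwd defaulting described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `SpawnOptions`, `resolveCwd`, and `buildArgs` are hypothetical names, and only the `--cwd` and `--resume` flags and the `/workspace` default come from the commit message.

```typescript
// Hypothetical option shape; the real runtime config may look different.
type Runtime = 'cloud' | 'desktop';

interface SpawnOptions {
  runtime: Runtime;
  cwd?: string; // explicit working directory, if the caller supplies one
  heteroSessionId?: string; // session ID stored in topic.metadata
}

// Default taken from the commit message.
const CLOUD_DEFAULT_CWD = '/workspace';

// An explicit cwd always wins; cloud runs fall back to a stable default so the
// session-file directory derived from the cwd stays the same across invocations.
function resolveCwd(opts: SpawnOptions): string | undefined {
  if (opts.cwd) return opts.cwd;
  return opts.runtime === 'cloud' ? CLOUD_DEFAULT_CWD : undefined;
}

// Assemble the argument list (assumed layout: flag followed by its value).
function buildArgs(opts: SpawnOptions): string[] {
  const args: string[] = [];
  const cwd = resolveCwd(opts);
  if (cwd) args.push('--cwd', cwd);
  if (opts.heteroSessionId) args.push('--resume', opts.heteroSessionId);
  return args;
}
```

With this shape, a cloud run that omits `cwd` resolves to the same directory on every invocation, which is the property the resume lookup depends on.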
Codecov Report
@@ Coverage Diff @@
## canary #14592 +/- ##
===========================================
- Coverage 81.78% 68.91% -12.87%
===========================================
Files 671 2637 +1966
Lines 44839 232546 +187707
Branches 6632 29627 +22995
===========================================
+ Hits 36670 160257 +123587
- Misses 8019 72139 +64120
Partials 150 150
The previous guard only skipped reconnect for 'connecting'/'connected', but the connection can already be in 'authenticating' or 'reconnecting' by the time useGatewayReconnect fires, leaving the race window open.

Flip the condition: skip for any status that is not 'disconnected'.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
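A minimal sketch of the flipped guard, under stated assumptions: the status names come from the commit messages, while the `GatewayConnection` shape, the `shouldReconnect` helper, and the choice to treat a missing entry as reconnectable are illustrative and not confirmed by the source.

```typescript
// Status names as quoted in the commit messages.
type GatewayStatus =
  | 'disconnected'
  | 'connecting'
  | 'authenticating'
  | 'connected'
  | 'reconnecting';

// Hypothetical store shape keyed by operationId.
interface GatewayConnection {
  status: GatewayStatus;
  resumeOnConnect?: boolean;
}
type GatewayConnections = Record<string, GatewayConnection | undefined>;

// Reconnect only when nothing is in flight for this operation.
// Assumption: an absent entry means no connection attempt has started.
function shouldReconnect(conns: GatewayConnections, operationId: string): boolean {
  const existing = conns[operationId];
  if (!existing) return true;
  return existing.status === 'disconnected';
}

// Illustrative stand-in for the store action: returns the store unchanged when
// the guard trips, so an in-flight connection is never overwritten with
// resumeOnConnect: true.
function reconnectToGatewayOperation(
  conns: GatewayConnections,
  operationId: string,
): GatewayConnections {
  if (!shouldReconnect(conns, operationId)) return conns;
  return { ...conns, [operationId]: { status: 'connecting', resumeOnConnect: true } };
}
```

Checking for "not disconnected" rather than listing in-flight statuses means a status added later is covered by default.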
Vercel serverless functions are stateless per-request, so `operationStates` is empty on every `heteroIngest` call. loadOrCreateState always cold-creates. #14539 fixed `toolMsgIdByCallId` restoration but left `accumulatedContent`, `toolState.payloads`, and `toolState.persistedIds` empty on cold load, causing two bugs:

- Content truncation: cold instance starts with `accumulatedContent=''`, accumulates only the current batch's text, then writes that shorter string on the next step boundary or terminal, overwriting the longer content the previous write had already stored in DB.
- Tool duplication / tools[] overwrite: `persistedIds={}` on cold load means every `tools_calling` event re-creates already-persisted tool messages, and `payloads=[]` means phase 1/3 writes only the current batch's tools, wiping previous tools from `assistant.tools[]`.

Fix: in `loadOrCreateState`, fetch the current assistant message and restore `accumulatedContent`, `accumulatedReasoning`, `toolState.payloads`, and `toolState.persistedIds` from it. Cold load is now equivalent to warm load.

Also adds two regression tests covering the cold-replica scenarios.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
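The cold-load restoration might look roughly like the sketch below. The field names `accumulatedContent`, `accumulatedReasoning`, `toolState.payloads`, and `toolState.persistedIds` come from the commit message; everything else (the `AssistantMessage` and `ToolPayload` shapes, the `restoreState` helper, the `fetchAssistant` callback, and modelling `persistedIds` as a plain record) is an assumption made for illustration.

```typescript
// Hypothetical message and tool shapes.
interface ToolPayload {
  id: string;
  name: string;
}
interface AssistantMessage {
  content?: string;
  reasoning?: string;
  tools?: ToolPayload[];
}

interface OperationState {
  accumulatedContent: string;
  accumulatedReasoning: string;
  toolState: { payloads: ToolPayload[]; persistedIds: Record<string, true> };
}

// Rebuild in-memory state from what is already persisted, so a cold instance
// appends to the stored content instead of starting from an empty string.
function restoreState(persisted: AssistantMessage | undefined): OperationState {
  const payloads = [...(persisted?.tools ?? [])];
  const persistedIds: Record<string, true> = {};
  for (const tool of payloads) persistedIds[tool.id] = true;
  return {
    accumulatedContent: persisted?.content ?? '',
    accumulatedReasoning: persisted?.reasoning ?? '',
    toolState: { payloads, persistedIds },
  };
}

// Per-instance cache; empty on every cold start.
const operationStates = new Map<string, OperationState>();

// Warm path returns the cached state; cold path restores from the database
// through the caller-supplied (hypothetical) fetchAssistant callback.
async function loadOrCreateState(
  operationId: string,
  assistantMessageId: string,
  fetchAssistant: (id: string) => Promise<AssistantMessage | undefined>,
): Promise<OperationState> {
  const warm = operationStates.get(operationId);
  if (warm) return warm;
  const state = restoreState(await fetchAssistant(assistantMessageId));
  operationStates.set(operationId, state);
  return state;
}
```

Keeping the restoration in a pure helper makes the cold-replica behaviour easy to cover with a unit test that needs no database.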
…ingStatus (lobehub#14592)

* 🐛 fix: skip reconnect when gateway action already established a connection

Race condition on new-topic first message:
1. switchTopic loads runningOperation → useGatewayReconnect fires
2. executeGatewayAgent calls connectToGateway (status: connecting)
3. reconnectToGatewayOperation overwrites with resumeOnConnect:true
4. Gateway sees resume on a brand-new session → no events → stuck

The second message works because the client store's runningOperation is stale (from the first op), so SWR deduplicates the request and no reconnect fires.

Fix: bail out of reconnectToGatewayOperation if gatewayConnections already shows connecting/connected for that operationId.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* 🐛 fix: always pass --cwd /workspace for cloud CC to ensure session resume

CC stores session files at ~/.claude/projects/<encoded-cwd>/. Without an explicit --cwd the actual working directory can differ between sandbox invocations, so --resume <heteroSessionId> fails to locate the previous session files even though the container is persistent and the ID is correctly stored in topic.metadata.

Default cwd to /workspace for cloud runs (desktop keeps its own explicit path), guaranteeing a stable session-file location across page reloads within the same sandbox lifecycle.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* 🐛 fix: extend reconnect guard to cover all in-flight connection statuses

The previous guard only skipped reconnect for 'connecting'/'connected', but the connection can already be in 'authenticating' or 'reconnecting' by the time useGatewayReconnect fires, leaving the race window open.

Flip the condition: skip for any status that is not 'disconnected'.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* 🐛 fix: restore cold replica state in HeterogeneousPersistenceHandler

Vercel serverless functions are stateless per-request, so `operationStates` is empty on every `heteroIngest` call. loadOrCreateState always cold-creates. lobehub#14539 fixed `toolMsgIdByCallId` restoration but left `accumulatedContent`, `toolState.payloads`, and `toolState.persistedIds` empty on cold load, causing two bugs:

- Content truncation: cold instance starts with `accumulatedContent=''`, accumulates only the current batch's text, then writes that shorter string on the next step boundary or terminal, overwriting the longer content the previous write had already stored in DB.
- Tool duplication / tools[] overwrite: `persistedIds={}` on cold load means every `tools_calling` event re-creates already-persisted tool messages, and `payloads=[]` means phase 1/3 writes only the current batch's tools, wiping previous tools from `assistant.tools[]`.

Fix: in `loadOrCreateState`, fetch the current assistant message and restore `accumulatedContent`, `accumulatedReasoning`, `toolState.payloads`, and `toolState.persistedIds` from it. Cold load is now equivalent to warm load.

Also adds two regression tests covering the cold-replica scenarios.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>