refactor(chat): execution graph optimize #873
Conversation
- Removed unnecessary status labels and hints from the execution graph.
- Introduced a collapsed summary for tool calls and process messages.
- Enhanced the handling of narration steps to prevent duplication in the chat stream.
- Updated the `ExecutionGraphCard` component to manage expanded/collapsed states more effectively.
- Added functionality to suppress intermediate pure-text assistant messages in the chat display.
…nagement
- Added `suppressAssistantText` prop to `ChatMessage` to control visibility of assistant text bubbles during execution graph rendering.
- Updated `ExecutionGraphCard` to support controlled expanded/collapsed states via `expanded` and `onExpandedChange` props (a rough sketch of this contract follows below).
- Improved logic for determining final reply messages to include those with mixed text and tool calls, preventing orphan narration bubbles in the chat stream.
- Introduced persistent state management for execution graph expansion across runs.
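A minimal sketch of the prop surface named in this commit, assuming standard React controlled-component conventions. Only `suppressAssistantText`, `expanded`, and `onExpandedChange` come from the commit message; the surrounding shapes are illustrative, not from the PR.

```ts
// Illustrative only: prop names come from the commit message; everything else is assumed.
interface ChatMessageProps {
  suppressAssistantText?: boolean; // hide the assistant text bubble while the execution graph owns that text
}

interface ExecutionGraphCardProps {
  expanded: boolean;                             // controlled expand/collapse state
  onExpandedChange: (expanded: boolean) => void; // parent persists the state across runs
}
```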
…n graph
- Updated comments for clarity regarding the suppression of intermediate assistant messages in the chat stream.
- Enhanced logic to ensure that both narration text and thinking outputs are properly managed and displayed in the execution graph.
- Added functionality to include stream-time narration in the execution graph, preventing it from appearing in the assistant reply area.
- Adjusted tests to reflect changes in message handling and ensure proper rendering of in-flight prompts.
…improvements
- Removed unused imports and added an AnimatedDots component for visual feedback during tool execution.
- Improved the layout and styling of the ExecutionGraphCard to better handle tool and narration steps.
- Updated logic to conditionally display status indicators and detail previews based on step types.
- Adjusted tests to verify the visibility of execution details and ensure proper interaction with the execution graph.
…proved state management and caching
- Updated ExecutionGraphCard to ensure active runs remain expanded by default, improving user experience during live execution.
- Refined logic for handling expanded/collapsed states based on user interactions and run completion.
- Introduced caching for execution steps in the Chat component to optimize rendering and prevent unnecessary re-renders.
- Enhanced tests to validate the visibility and state of the execution graph during various user interactions.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 64ac4e1f53
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you:
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
if (steps.length === 0) {
  const cached = graphStepCache[runKey];
  if (!cached) return [];
  return [{
Avoid reusing cached graph for runs with no final process steps
This fallback resurrects graphStepCache whenever deriveTaskSteps returns no steps, which breaks completed runs that streamed only plain assistant text (no tools/thinking). In that path, the active run temporarily cached a stream-message step, but after the final assistant message is persisted the step list is intentionally empty; reusing cached steps makes the finished run keep an execution graph that duplicates the final answer text and shows a misleading process summary. This regression is specific to streamed direct-reply runs.
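A minimal sketch of the guard the reviewer appears to be asking for, assuming an `isRunActive` flag is available at this call site; the `GraphStep` shape and the helper name are illustrative, not taken from the PR.

```ts
type GraphStep = { id: string; kind: "tool" | "thinking" | "message"; text: string };

function resolveGraphSteps(
  steps: GraphStep[],
  graphStepCache: Record<string, GraphStep[]>,
  runKey: string,
  isRunActive: boolean,
): GraphStep[] {
  if (steps.length > 0) return steps;
  // Completed direct-reply runs legitimately finish with zero process steps;
  // reusing the cache here would duplicate the final answer inside the graph.
  if (!isRunActive) return [];
  return graphStepCache[runKey] ?? [];
}
```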
- Introduced a new caching mechanism for execution graph steps, allowing for better performance and state retention across sessions.
- Replaced local state management with a centralized cache to streamline updates and reduce unnecessary re-renders.
- Enhanced logic for managing streaming timestamps, ensuring accurate tracking during message sending.
- Updated related effects to utilize the new caching structure, improving overall component efficiency.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 55068108b8
// the graph to prevent duplication. When a run is still streaming, the
// reply lives in `streamingMessage`, so every pure-text assistant message in
// `messages` is treated as intermediate narration.
const replyIndex = findReplyMessageIndex(messages, streamMessage != null);
Treat pending-final runs as still streaming for reply detection
In deriveTaskSteps, the final-reply guard is keyed to streamMessage != null, so when a run is still active but pendingFinal has cleared streamingMessage, the latest assistant text in history is misclassified as the final reply and excluded from graph narration. At the same time, Chat/index.tsx folds active-run assistant narration out of the chat list, which means that message can disappear from both the bubble stream and the execution graph during the pending-final window (common after tool-result/finalization transitions). This should use an active-run signal (e.g. sending || pendingFinal) rather than stream-message presence.
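A small sketch of the signal change the reviewer suggests, assuming `sending` and `pendingFinal` are the flags available to `deriveTaskSteps`; the wrapper function and its shape are illustrative.

```ts
interface RunFlags {
  sending: boolean;      // a request is still in flight
  pendingFinal: boolean; // stream ended but the final message is not yet persisted
}

// Treat the run as active while either flag is set, not only while a stream
// frame exists, so pending-final narration stays in the execution graph.
function isRunStillActive({ sending, pendingFinal }: RunFlags): boolean {
  return sending || pendingFinal;
}

// e.g. const replyIndex = findReplyMessageIndex(messages, isRunStillActive(flags));
```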
… handling
- Added `textOverride` prop to `ChatMessage` for custom message text rendering.
- Introduced new utility functions to split and manage progressive message segments, improving clarity in execution steps.
- Updated `deriveTaskSteps` to handle multiple thinking segments and narration more effectively, ensuring accurate representation in the execution graph.
- Enhanced tests to validate the new message handling logic and ensure proper rendering of segmented thinking and narration details.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: da6354b724
const trimmedReplyText = stripProcessMessagePrefix(streamText, getPrimaryMessageStepTexts(steps));
const hasReplyText = trimmedReplyText.trim().length > 0;
if (hasReplyText || hasStreamImages) {
Require stripped suffix before marking a streaming final reply
In the pending-final path, this code treats any non-empty trimmedReplyText as a completed reply. If the stream frame is only a duplicate of already-folded process text (common with cumulative/duplicate deltas), stripProcessMessagePrefix can still leave trimmedReplyText non-empty and streamingReplyText gets set, which flips the run to inactive and collapses the execution graph before new final content actually arrives. That surfaces process narration as the final bubble and hides active-run state.
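One possible shape of the stricter check, kept self-contained by passing in `stripProcessMessagePrefix` (the helper from the diff); the predicate name and exact duplicate test are assumptions, not the PR's implementation.

```ts
function hasNewReplySuffix(
  streamText: string,
  foldedProcessTexts: string[],
  stripProcessMessagePrefix: (text: string, folded: string[]) => string,
): boolean {
  const suffix = stripProcessMessagePrefix(streamText, foldedProcessTexts).trim();
  if (suffix.length === 0) return false;
  // A frame that merely repeats already-folded process text is not a final reply.
  return !foldedProcessTexts.some((text) => text.trim() === suffix);
}
```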
const hasNarrationText = extractText(candidate).trim().length > 0;
const hasThinking = !!extractThinking(candidate);
if (!hasNarrationText && !hasThinking) continue;
foldedNarrationIndices.add(idx + 1 + offset);
Preserve folded assistant turns that carry visual/file outputs
This folding predicate hides every intermediate assistant message that has narration/thinking, even if that same message also contains images or attachments. In runs where the assistant emits an artifact plus narration before the final text, the whole message is dropped from the chat stream, and the execution graph does not render those artifacts, so user-visible outputs disappear from the transcript.
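A sketch of a folding predicate that keeps turns carrying artifacts, assuming a simplified message shape; only the narration/thinking checks mirror the diff, the artifact fields are illustrative.

```ts
interface AssistantMessage {
  text?: string;
  thinking?: string;
  images?: string[];
  attachments?: { name: string; url: string }[];
}

function shouldFoldNarration(candidate: AssistantMessage): boolean {
  const hasNarrationText = (candidate.text ?? "").trim().length > 0;
  const hasThinking = Boolean(candidate.thinking);
  const hasArtifacts =
    (candidate.images?.length ?? 0) > 0 || (candidate.attachments?.length ?? 0) > 0;
  // Fold only pure narration/thinking turns; keep anything with user-visible outputs.
  return (hasNarrationText || hasThinking) && !hasArtifacts;
}
```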
…d properties
- Removed `sending` and `pendingFinal` properties from the Chat component and related functions to simplify state handling.
- Updated the `deriveTaskSteps` function to reflect these changes, ensuring cleaner logic and improved performance.
- Adjusted unit tests to remove references to the removed properties, maintaining test integrity.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8bc625fc83
const graphNarrationSegments = messageIndex === replyIndex
  ? narrationSegments.slice(0, -1)
Keep full assistant reply when text blocks are non-cumulative
This logic always drops the last text segment from the detected reply message and folds earlier segments into the execution graph. When an assistant reply contains multiple non-cumulative text blocks (for example chunked streaming frames or multi-block final content), valid answer content is misclassified as process narration; later prefix stripping removes it from the chat bubble, so users only see the tail of the reply instead of the full answer.
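A sketch of one way to distinguish cumulative frames from genuine multi-block replies, assuming the segments are plain strings; the split heuristic is an assumption, not the PR's logic.

```ts
function splitReplySegments(segments: string[]): { graph: string[]; reply: string[] } {
  if (segments.length <= 1) return { graph: [], reply: segments };
  const last = segments[segments.length - 1];
  const earlier = segments.slice(0, -1);
  // Earlier segments count as process narration only when the final segment
  // supersedes them (cumulative streaming); otherwise keep the full reply.
  const cumulative = earlier.every((segment) => last.startsWith(segment));
  return cumulative ? { graph: earlier, reply: [last] } : { graph: [], reply: segments };
}
```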
for (const runKey of runKeysToCollapse) {
  if (next[runKey] === false) continue;
  next[runKey] = false;
Allow users to reopen auto-collapsed execution graphs
The auto-collapse effect rewrites the override to false whenever a run key is in runKeysToCollapse, and that condition remains true for runs with reply text overrides. If a user clicks a collapsed graph to expand it, the next effect pass forces it back to collapsed, so those runs cannot stay expanded after manual interaction.
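One way to address this would be to collapse each run key at most once, so a later manual expand is not overwritten on the next pass. The hook below is a sketch under that assumption; the ref bookkeeping and updater signature are illustrative (the PR later moves to a `useMemo`-derived approach instead).

```ts
import { useEffect, useRef } from "react";

type Overrides = Record<string, boolean>;

function useAutoCollapseOnce(
  runKeysToCollapse: string[],
  setOverrides: (update: (prev: Overrides) => Overrides) => void,
): void {
  const alreadyCollapsed = useRef<Set<string>>(new Set());

  useEffect(() => {
    const pending = runKeysToCollapse.filter((key) => !alreadyCollapsed.current.has(key));
    if (pending.length === 0) return;
    pending.forEach((key) => alreadyCollapsed.current.add(key));
    // Collapse each run key once; a subsequent user expand is left alone.
    setOverrides((prev) => {
      const next = { ...prev };
      for (const key of pending) next[key] = false;
      return next;
    });
  }, [runKeysToCollapse, setOverrides]);
}
```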
…eedback
- Introduced a new "thinking" label in English, Japanese, Russian, and Chinese locales to enhance clarity during execution.
- Updated the ExecutionGraphCard to display the localized thinking label when the step status is "thinking".
- Adjusted tests to verify the visibility and correct rendering of the thinking label in the execution graph.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e77b98781a
const hasNarrationText = extractText(candidate).trim().length > 0;
const hasThinking = !!extractThinking(candidate);
if (!hasNarrationText && !hasThinking) continue;
foldedNarrationIndices.add(idx + 1 + offset);
Fold only narration that is still present in the graph
This unconditionally folds every intermediate assistant turn with text/thinking out of the chat stream, but deriveTaskSteps later truncates graph steps to MAX_TASK_STEPS (8). In long runs, early narration messages get hidden here while their corresponding graph steps are already dropped, so users lose process output entirely (neither bubble nor graph shows it). The folding decision should be based on the retained card.steps (or avoid truncating folded narration).
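A sketch of the first option the reviewer names: fold a narration turn only if a corresponding step survived in the retained `card.steps`. The step shape and the matching key are assumptions.

```ts
interface RetainedStep {
  sourceMessageIndex: number; // assumed link from a graph step back to its message
}

function foldableNarrationIndices(
  candidateIndices: number[],
  retainedSteps: RetainedStep[],
): Set<number> {
  const retained = new Set(retainedSteps.map((step) => step.sourceMessageIndex));
  // Narration whose step was truncated out of the graph keeps its chat bubble.
  return new Set(candidateIndices.filter((idx) => retained.has(idx)));
}
```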
- Removed the MAX_TASK_STEPS constant to allow all steps to be retained in the execution graph.
- Updated the deriveTaskSteps function to return all steps without limiting their count.
- Modified unit tests to reflect the change in behavior, ensuring all steps are accounted for in the execution graph.
- Added a new test case to verify the visibility of long execution histories and the correct handling of reply prefixes.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4e51900cd2
next[runKey] = {
  steps: card.steps,
  agentLabel: card.agentLabel,
  sessionLabel: card.sessionLabel,
  segmentEnd: card.segmentEnd,
Evict stale execution-graph cache entries
The cache write path stores every run's full steps payload in the module-level graphStepCacheStore, but there is no pruning when runs disappear from history or sessions are closed. Because this refactor also removed the old step cap, long-lived app sessions can retain large historical step arrays indefinitely, causing avoidable memory growth and slower renders over time when users browse many sessions/runs.
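A minimal pruning sketch, assuming the cache is keyed by run key and that a set of still-reachable run keys can be computed when sessions load or close; the store and card shapes are illustrative, not the PR's.

```ts
interface CachedGraphCard {
  steps: unknown[];
  agentLabel?: string;
  sessionLabel?: string;
  segmentEnd?: number;
}

function pruneGraphStepCache(
  store: Map<string, CachedGraphCard>,
  liveRunKeys: Set<string>,
): void {
  for (const key of Array.from(store.keys())) {
    // Drop entries for runs no longer reachable from the loaded sessions.
    if (!liveRunKeys.has(key)) store.delete(key);
  }
}
```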
Derive autoCollapsedRunKeys during render with useMemo instead of writing to state from useEffect, which violated react-hooks/set-state-in-effect and broke the lint CI check. Explicit user toggles (stored in graphExpandedOverrides) still win; the derived auto-collapse is only used when the user has not toggled the graph.

Co-authored-by: Haze <hazeone@users.noreply.github.com>
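A minimal sketch of the approach this commit describes, assuming runs expose a `hasReplyText` flag; the run and override shapes are illustrative rather than the actual component code.

```ts
import { useMemo } from "react";

interface RunInfo {
  runKey: string;
  hasReplyText: boolean; // runs with a final reply default to collapsed
}

function useAutoCollapsedRunKeys(
  runs: RunInfo[],
  graphExpandedOverrides: Record<string, boolean>,
): Set<string> {
  return useMemo(() => {
    const collapsed = new Set<string>();
    for (const run of runs) {
      // An explicit user toggle always wins over the derived auto-collapse.
      if (graphExpandedOverrides[run.runKey] !== undefined) continue;
      if (run.hasReplyText) collapsed.add(run.runKey);
    }
    return collapsed;
  }, [runs, graphExpandedOverrides]);
}
```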
This reverts commit 1b2dcce.
… dedupe (ValueCell-ai#821 ValueCell-ai#845 ValueCell-ai#870 ValueCell-ai#873 ValueCell-ai#875 ValueCell-ai#878 ValueCell-ai#880 ValueCell-ai#885 ValueCell-ai#886 ValueCell-ai#887 ValueCell-ai#891 ValueCell-ai#903)
Overhaul execution graph card (collapse/expand, narration steps, web_fetch links), separate thinking messages, render LaTeX math, dedupe optimistic messages, hide recoverable gateway timeouts, and add startup history recovery.