Skip to content

feat: add edit-and-retry DAG runs#2010

Merged
yohamta0 merged 18 commits into
mainfrom
feat/edit-retry-dag-run
Apr 19, 2026
Merged

feat: add edit-and-retry DAG runs#2010
yohamta0 merged 18 commits into
mainfrom
feat/edit-retry-dag-run

Conversation

@yohamta0

@yohamta0 yohamta0 commented Apr 19, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • add edit-and-retry DAG-run API endpoints that validate edited specs, preview/create retry runs, and preserve outputs for selected skipped successful steps
  • mark retry-skipped nodes so downstream dependencies can continue while ordinary skipped dependencies keep existing behavior
  • add web UI support to edit a stored DAG-run spec, review the server preview with graph/step controls, confirm before creating the retry run, and save source DAG changes through the existing spec update API after a diff confirmation
  • clear transient run source/working-directory metadata for edited retry runs and harden the flaky Windows agent HTTP cancel test

Testing

  • make api
  • cd ui && pnpm gen:api
  • make fmt
  • make lint
  • go test ./internal/service/frontend/api/v1 ./internal/runtime -count=1
  • go test ./internal/intg -run 'TestAPIEditRetrySkipsCompletedStepsAndPreservesOutputs|TestWorkingDirectoryResolution|TestParallelExecution_OutputCaptureWithRetry|TestEnv/DAGRunWorkDirWithExplicitWorkingDir' -count=1
  • go test ./internal/runtime/agent -count=1
  • cd ui && pnpm exec vitest run src/features/dags/components/dag-editor/tests/DAGSpecReadOnly.test.tsx
  • cd ui && pnpm typecheck
  • cd ui && pnpm build
  • GitHub Actions checks passed on PR feat: add edit-and-retry DAG runs #2010 at head 5185027

@coderabbitai

coderabbitai Bot commented Apr 19, 2026

Copy link
Copy Markdown

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 67193b8e-0226-4cde-9752-be5c77234656

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR introduces an "edit-retry" feature enabling users to modify a DAG run specification and selectively retry it with completed steps either skipped or re-executed. It includes new API endpoints for preview and execution, data model extensions to track skipped nodes, runtime logic to permit downstream dependencies of skipped nodes, frontend UI for editing and launching retries, and comprehensive tests.

Changes

Cohort / File(s) Summary
API Specification
api/v1/api.yaml, ui/src/api/v1/schema.ts
Added two new POST endpoints: edit-retry/preview for validating edited specs and previewing skip/runnable steps, and edit-retry for executing a retry with optional step selection and persistence options.
Data Model Extensions
internal/core/exec/node.go, internal/runtime/data.go, internal/runtime/transform/node.go
Added SkippedByRetry boolean field to node state structures across persistence, runtime, and transformation layers to explicitly track nodes skipped during edit-retry.
Runtime Execution Logic
internal/runtime/runner.go, internal/runtime/runner_test.go
Updated dependency-readiness logic in isReady to allow downstream nodes to execute even when upstream dependencies are marked NodeSkipped with SkippedByRetry=true; added test cases validating skip-by-retry vs. ordinary skip behaviors.
Edit-Retry API Implementation
internal/service/frontend/api/v1/dagruns_edit_retry.go, internal/service/frontend/api/v1/dagruns_edit_retry_internal_test.go
Comprehensive implementation of preview and execution handlers with planning logic, step eligibility validation, DAG loading, node seeding with output preservation, enqueuing/dispatching, and audit logging; includes unit tests for preview, execution with skip selections, and validation error cases.
Integration Tests
internal/intg/edit_retry_api_test.go
End-to-end integration test verifying edit-retry workflow: DAG execution, failure, spec editing, preview validation, retry execution with step skipping, output preservation from skipped nodes, and final successful completion.
Frontend UI
ui/src/features/dag-runs/components/common/DAGRunActions.tsx
Added edit-retry modal flow with YAML spec editor, preview button, selectable skip-steps UI, persistence toggle, and branched submission logic; integrates preview and execute endpoints with state management for edited specs and preview results.

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant UI as Frontend UI
    participant API as API Server
    participant Store as DAG Run Store
    participant Runtime as Runtime Executor
    
    User->>UI: Load DAG run details
    UI->>API: GET /dag-runs/{name}/{dagRunId}/spec
    API->>Store: Retrieve stored DAG spec
    Store-->>API: DAG spec + outputs
    API-->>UI: Prefill editor with spec
    
    User->>UI: Edit spec + click Preview
    UI->>API: POST /dag-runs/{name}/{dagRunId}/edit-retry/preview
    API->>Store: Load source DAG-run attempt status
    API->>API: Validate edited spec, resolve DAG
    API->>API: Plan eligible steps & compute defaults
    API-->>UI: skippedSteps, runnableSteps, errors, warnings
    
    User->>UI: Review preview, optionally select skip-steps
    User->>UI: Click Execute
    UI->>API: POST /dag-runs/{name}/{dagRunId}/edit-retry
    API->>Store: Create new attempt with edited DAG
    API->>API: Seed nodes: mark prior steps as skipped
    API->>API: Copy output variables to skipped nodes
    API->>API: Enqueue/dispatch to runtime
    API-->>UI: New dagRunId, queued status
    
    API->>Runtime: Launch retry with seeded nodes
    Runtime->>Runtime: Process DAG: skip marked nodes (use outputs)
    Runtime->>Runtime: Execute runnable steps
    Runtime->>Store: Update node statuses & outputs
Loading
sequenceDiagram
    participant Store as DAG Run Store
    participant Validator as Edit-Retry Validator
    participant Seeder as Node Seeder
    participant Queue as Task Queue/Coordinator
    
    Validator->>Store: Load source DAG-run attempt
    Validator->>Validator: Check source status (must be terminal)
    Validator->>Validator: Validate spec, DAG name, persistSpec constraints
    Validator->>Validator: Load edited DAG, resolve params
    Validator->>Validator: Compute eligible steps from source outputs
    
    Validator->>Validator: Plan skipSteps (explicit or default)
    Validator->>Validator: Validate skipSteps: no unknowns, only eligible
    
    Seeder->>Store: Create new attempt for retry
    Seeder->>Seeder: Mark prior steps as NodeSkipped+SkippedByRetry
    Seeder->>Seeder: Copy output variables to skipped nodes
    Seeder->>Store: Persist seeded node states & attempt
    
    Seeder->>Queue: Enqueue/dispatch retry run
    Queue-->>Seeder: Confirmation
    Seeder->>Store: Update attempt to Queued (atomic)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related issues

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 8.82% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: add edit-and-retry DAG runs' directly and accurately summarizes the main feature addition across all changed files—new API endpoints and UI for editing and retrying DAG runs with preserved outputs.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/edit-retry-dag-run

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

🧹 Nitpick comments (4)
ui/src/features/dag-runs/components/common/DAGRunActions.tsx (2)

604-619: Toggling Reschedule leaves stale Edit-Retry state in memory.

When the user enables retryAsNew, you disable editRetry but do not clear editRetrySpec, editRetryPreview, editRetrySkipSteps, or editRetryPersistSpec. The UI hides those panels so it isn't user-visible, but if the user toggles back into edit-retry, loadEditRetrySpec will overwrite some state while preserving e.g. the previously-selected editRetrySkipSteps until a new preview lands, which can be confusing. Mirror the cleanup you already do in the edit-retry checkbox's else branch (Line 631-636) here for symmetry.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/src/features/dag-runs/components/common/DAGRunActions.tsx` around lines
604 - 619, When enabling the "reschedule" Checkbox (the onCheckedChange handler
for id "reschedule" in DAGRunActions), you currently set setRetryAsNew(true) and
setEditRetry(false) but leave stale edit-retry state in memory; update that
handler to also clear the edit-retry state by setting editRetrySpec,
editRetryPreview, editRetrySkipSteps, and editRetryPersistSpec to their
initial/empty values (same cleanup performed in the edit-retry checkbox's else
branch) so toggling into reschedule fully resets the edit-retry state.

770-780: Use the Monaco YAML editor for spec editing.

Per the frontend library guidelines, YAML editing should go through the Monaco editor rather than a plain <Textarea>. The existing DAG edit flow already uses Monaco and benefits from syntax highlighting, validation affordances, and consistent UX; swapping to it here will also make multi-line YAML far easier to author than in a textarea with a small min-height.

As per coding guidelines: "Use React 19 with TypeScript, Tailwind CSS 4, Radix UI/shadcn components, Monaco editor for YAML, xterm.js for terminal, and SWR for data fetching in frontend code".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/src/features/dag-runs/components/common/DAGRunActions.tsx` around lines
770 - 780, Replace the plain Textarea used for edited DAG spec with the Monaco
YAML editor component used elsewhere: locate the block rendering the Textarea
with id "edit-retry-spec" and state variables editRetrySpec and
setEditRetrySpec, remove the <Textarea> and swap in the project's Monaco wrapper
(e.g., MonacoEditor or YamlMonacoEditor) configured for language="yaml",
value={editRetrySpec} and onChange to call setEditRetrySpec(newValue); preserve
existing props like id and className (min-h-56, resize-y, font-mono, text-xs)
and ensure the Monaco component exposes onChange that supplies the new string so
validation/highlighting behaves the same as the other DAG editing flow.
internal/intg/edit_retry_api_test.go (1)

108-114: Potential race: ProcessQueuedInlineRun followed by polling for core.Succeeded assumes the edit-retry status transitions through queued→succeeded quickly.

waitForEditRetryStoredStatus is polled for up to intgTestTimeout(15*time.Second), which should be fine, but note the source run at Line 60 waits for core.Failed and the retry at Line 109 waits for core.Succeeded, while ProcessQueuedInlineRun only drains the queue processor once. On slower CI (Windows especially) the retry dispatch may still be in-flight. If you see flakiness here, consider asserting status.Status.IsTerminal() and then the expected status separately, or looping the queue processor similarly to the helper in internal/test/reschedule_inline_helpers.go.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/intg/edit_retry_api_test.go` around lines 108 - 114, The test may
race because calling ProcessQueuedInlineRun once then polling
waitForEditRetryStoredStatus(retryRunID, core.Succeeded) can observe the retry
while it's still in-flight on slow CI; update the test around
ProcessQueuedInlineRun and waitForEditRetryStoredStatus to first poll for a
terminal state using waitForEditRetryStoredStatus(...,
status.Status.IsTerminal()) or assert status.Status.IsTerminal() before
asserting specific terminal value, or instead drain the queue processor
repeatedly like the helper in internal/test/reschedule_inline_helpers.go so the
retry dispatch completes deterministically; adjust assertions on
retryStatus.Nodes (e.g., Nodes[0].Status, Nodes[0].SkippedByRetry,
Nodes[1].Status, Nodes[2].Status) after confirming terminal state.
internal/service/frontend/api/v1/dagruns_edit_retry.go (1)

450-465: Output-eligibility check treats non-string outputs as ineligible, but source Store values are typically strings from shell execution.

missingEditedRetryOutputReason rejects reuse if OutputVariables.Load returns a non-string, with reason "previous output %q is not a string". Given OutputVariables values are populated from command output (strings), this path is mostly defensive — but any caller that stored a non-string (e.g., a struct from future executor types) would silently be reported as ineligible with a confusing message instead of being accepted as-is.

Since downstream consumers treat outputs as strings, consider either coercing to string (fmt.Sprint(raw)) or dropping the type assertion and accepting any value; otherwise document the contract explicitly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/service/frontend/api/v1/dagruns_edit_retry.go` around lines 450 -
465, The function missingEditedRetryOutputReason currently treats any non-string
stored in node.OutputVariables as ineligible; instead, accept non-string values
by coercing them to string for downstream consumers. Replace the type assertion
branch that returns "previous output %q is not a string" with logic that treats
a loaded raw value as available (e.g., use fmt.Sprint(raw) when the caller needs
a string) so OutputVariables.Load(editedStep.Output) no longer causes a
rejection for non-string types; update the function to only return an error when
the key is missing or OutputVariables is nil, and remove the strict raw.(string)
assertion.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@api/v1/api.yaml`:
- Around line 2247-2250: The dagRunId property currently references the reusable
DAGRunId schema which permits the special alias "latest" (causing ambiguity for
a newly-created retry run); change dagRunId to use a concrete schema that
disallows the "latest" alias (either by referencing a new schema like
DAGRunCreateId or replacing the allOf with an inline string schema/pattern that
excludes "latest"). Update the OpenAPI definition where dagRunId appears so the
new retry-run contract accepts only explicit IDs (no "latest") and add/adjust
the new schema definition (e.g., DAGRunCreateId) to mirror DAGRunId format but
without the "latest" alternative.

In `@internal/intg/edit_retry_api_test.go`:
- Around line 49-57: The test uses new(sourceRunID) and new(retryRunID)
incorrectly (new expects a type); update the request bodies so DagRunId receives
pointers to the string variables by replacing new(sourceRunID) with &sourceRunID
and new(retryRunID) with &retryRunID where
api.ExecuteDAGJSONRequestBody{DagRunId: ...} is constructed (refer to the
sourceRunID and retryRunID variables and the DagRunId field).

In `@internal/service/frontend/api/v1/dagruns_edit_retry.go`:
- Around line 538-589: seedEditRetryAttempt creates an attempt via
dagRunStore.CreateAttempt but never rolls it back on later errors, leaving
orphaned attempts; update seedEditRetryAttempt to introduce a committed bool
after CreateAttempt, add a deferred closure that calls
a.dagRunStore.RollbackAttempt(ctx, attempt) when committed is false (logging any
rollback error), and set committed = true only after successful attempt.Open and
attempt.Write complete so failures before commit release the run ID; reference
CreateAttempt, RollbackAttempt, seedEditRetryAttempt, attempt.Open, and
attempt.Write when making the change.
- Around line 717-726: The function editRetryWarnings currently emits "no
previously successful steps are eligible to skip" whenever skipped is empty;
change it to only emit that message when skipped is empty AND there was actually
something to skip or the caller explicitly requested skipping — e.g., add a
boolean parameter skipRequested (or an int successfulSourceCount) to
editRetryWarnings and update all callers, then change the condition to: if
len(skipped) == 0 && (skipRequested || len(ineligible) > 0) { append the "no
previously successful steps are eligible to skip" warning }; keep the existing
ineligible branch unchanged so "some previously successful steps cannot be
skipped..." still appears when applicable.
- Around line 144-153: The code currently calls a.dagStore.UpdateSpec (guarded
by opts.persistSpec) before launching the retry, which can mutate stored DAGs
even if launchEditRetryDAGRun fails; move the persistence so it runs only after
a successful launch: call a.launchEditRetryDAGRun(ctx, plan) first, check err
(and queued if needed), then if opts.persistSpec call a.dagStore.UpdateSpec(ctx,
request.Name, []byte(opts.specContent)) and handle any UpdateSpec error (return
or wrap as appropriate). Use the existing symbols (opts.persistSpec,
a.dagStore.UpdateSpec, a.launchEditRetryDAGRun, queued) to locate and reorder
the operations.
- Around line 247-259: The current buildEditRetryPlan calls
validateDAGRunID(opts.newDAGRunID) even when opts.newDAGRunID is empty (preview
path) which is redundant; update buildEditRetryPlan to skip calling
validateDAGRunID when opts.newDAGRunID == "" (only perform the validation when a
non-empty newDAGRunID is provided), leaving the execute-path logic that
regenerates and revalidates the ID (the block that handles empty newDAGRunID
later) unchanged; reference validateDAGRunID, buildEditRetryPlan, and
opts.newDAGRunID to locate the change.

In `@ui/src/features/dag-runs/components/common/DAGRunActions.tsx`:
- Around line 737-750: The DAG Name Override input (controlled by
dagNameOverride and setDagNameOverride) must be disabled and cleared when
editRetryPersistSpec is true to avoid backend 400s; update the Input with
disabled={editRetryPersistSpec} and onChange behavior so toggling
editRetryPersistSpec to true calls setDagNameOverride('') to clear it, and
ensure the request-building code that sends the preview / edit-retry payload
(where dagNameOverride is included) omits dagName when editRetryPersistSpec ===
true (i.e., only include dagName in the body if !editRetryPersistSpec &&
dagNameOverride is non-empty).
- Around line 132-229: Both previewEditRetrySpec and loadEditRetrySpec currently
use try/finally but lack a catch block, so thrown rejections from
client.POST/GET become unhandled; add a catch in each function around the await
client calls that calls showError with a helpful message (e.g., include
error.message or a default) and then rethrows or returns as appropriate, keeping
the existing finally block to ensure setEditRetryLoading(false) still runs;
update previewEditRetrySpec and loadEditRetrySpec to handle network/abort
exceptions via try/catch/finally so errors surface to the user.

---

Nitpick comments:
In `@internal/intg/edit_retry_api_test.go`:
- Around line 108-114: The test may race because calling ProcessQueuedInlineRun
once then polling waitForEditRetryStoredStatus(retryRunID, core.Succeeded) can
observe the retry while it's still in-flight on slow CI; update the test around
ProcessQueuedInlineRun and waitForEditRetryStoredStatus to first poll for a
terminal state using waitForEditRetryStoredStatus(...,
status.Status.IsTerminal()) or assert status.Status.IsTerminal() before
asserting specific terminal value, or instead drain the queue processor
repeatedly like the helper in internal/test/reschedule_inline_helpers.go so the
retry dispatch completes deterministically; adjust assertions on
retryStatus.Nodes (e.g., Nodes[0].Status, Nodes[0].SkippedByRetry,
Nodes[1].Status, Nodes[2].Status) after confirming terminal state.

In `@internal/service/frontend/api/v1/dagruns_edit_retry.go`:
- Around line 450-465: The function missingEditedRetryOutputReason currently
treats any non-string stored in node.OutputVariables as ineligible; instead,
accept non-string values by coercing them to string for downstream consumers.
Replace the type assertion branch that returns "previous output %q is not a
string" with logic that treats a loaded raw value as available (e.g., use
fmt.Sprint(raw) when the caller needs a string) so
OutputVariables.Load(editedStep.Output) no longer causes a rejection for
non-string types; update the function to only return an error when the key is
missing or OutputVariables is nil, and remove the strict raw.(string) assertion.

In `@ui/src/features/dag-runs/components/common/DAGRunActions.tsx`:
- Around line 604-619: When enabling the "reschedule" Checkbox (the
onCheckedChange handler for id "reschedule" in DAGRunActions), you currently set
setRetryAsNew(true) and setEditRetry(false) but leave stale edit-retry state in
memory; update that handler to also clear the edit-retry state by setting
editRetrySpec, editRetryPreview, editRetrySkipSteps, and editRetryPersistSpec to
their initial/empty values (same cleanup performed in the edit-retry checkbox's
else branch) so toggling into reschedule fully resets the edit-retry state.
- Around line 770-780: Replace the plain Textarea used for edited DAG spec with
the Monaco YAML editor component used elsewhere: locate the block rendering the
Textarea with id "edit-retry-spec" and state variables editRetrySpec and
setEditRetrySpec, remove the <Textarea> and swap in the project's Monaco wrapper
(e.g., MonacoEditor or YamlMonacoEditor) configured for language="yaml",
value={editRetrySpec} and onChange to call setEditRetrySpec(newValue); preserve
existing props like id and className (min-h-56, resize-y, font-mono, text-xs)
and ensure the Monaco component exposes onChange that supplies the new string so
validation/highlighting behaves the same as the other DAG editing flow.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3e2e319c-0bfd-4f1d-9e25-ecd7175846b9

📥 Commits

Reviewing files that changed from the base of the PR and between b22c3e6 and 3874bcb.

📒 Files selected for processing (12)
  • api/v1/api.gen.go
  • api/v1/api.yaml
  • internal/core/exec/node.go
  • internal/intg/edit_retry_api_test.go
  • internal/runtime/data.go
  • internal/runtime/runner.go
  • internal/runtime/runner_test.go
  • internal/runtime/transform/node.go
  • internal/service/frontend/api/v1/dagruns_edit_retry.go
  • internal/service/frontend/api/v1/dagruns_edit_retry_internal_test.go
  • ui/src/api/v1/schema.ts
  • ui/src/features/dag-runs/components/common/DAGRunActions.tsx

Comment thread api/v1/api.yaml
Comment thread internal/intg/edit_retry_api_test.go
Comment on lines +144 to +153
if opts.persistSpec {
if err := a.dagStore.UpdateSpec(ctx, request.Name, []byte(opts.specContent)); err != nil {
return nil, err
}
}

queued, err := a.launchEditRetryDAGRun(ctx, plan)
if err != nil {
return nil, err
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

persistSpec persists even when the retry launch fails.

a.dagStore.UpdateSpec runs before launchEditRetryDAGRun, so if launch fails (queue error, dispatch error, subprocess start error), the DAG file is already mutated to the edited spec while no corresponding run exists. That is surprising for an "Edit & Retry" action and can leave the stored DAG in a state the user didn't expect to ship without a successful run.

Consider either (a) persisting only after the launch succeeds, or (b) documenting this semantics in the API spec so callers know the spec write is independent of launch outcome.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/service/frontend/api/v1/dagruns_edit_retry.go` around lines 144 -
153, The code currently calls a.dagStore.UpdateSpec (guarded by
opts.persistSpec) before launching the retry, which can mutate stored DAGs even
if launchEditRetryDAGRun fails; move the persistence so it runs only after a
successful launch: call a.launchEditRetryDAGRun(ctx, plan) first, check err (and
queued if needed), then if opts.persistSpec call a.dagStore.UpdateSpec(ctx,
request.Name, []byte(opts.specContent)) and handle any UpdateSpec error (return
or wrap as appropriate). Use the existing symbols (opts.persistSpec,
a.dagStore.UpdateSpec, a.launchEditRetryDAGRun, queued) to locate and reorder
the operations.

Comment on lines +247 to +259
if err := validateDAGRunID(opts.newDAGRunID); err != nil {
validationErrors = append(validationErrors, err.Error())
}
if len(validationErrors) > 0 {
return &editRetryPlan{
sourceAttempt: attempt,
sourceDAGRunID: sourceDAGRunID,
sourceStatus: status,
sourceDAG: sourceDAG,
editedDAG: &core.DAG{Name: dagName},
newDAGRunID: opts.newDAGRunID,
}, validationErrors, nil
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
rg -nP -C4 '^\s*func\s+validateDAGRunID\b'

Repository: dagucloud/dagu

Length of output: 768


🏁 Script executed:

#!/bin/bash
# Check the buildEditRetryPlan function and its context
rg -n "buildEditRetryPlan" internal/service/frontend/api/v1/dagruns_edit_retry.go | head -20

Repository: dagucloud/dagu

Length of output: 294


🏁 Script executed:

#!/bin/bash
# Check lines around 130-139 where execute path revalidates ID
sed -n '125,145p' internal/service/frontend/api/v1/dagruns_edit_retry.go

Repository: dagucloud/dagu

Length of output: 669


🏁 Script executed:

#!/bin/bash
# Check the broader context of the code snippet at 247-259
sed -n '240,270p' internal/service/frontend/api/v1/dagruns_edit_retry.go

Repository: dagucloud/dagu

Length of output: 1107


🏁 Script executed:

#!/bin/bash
# Check how validateDAGRunID is called in preview vs execute paths
rg -n "validateDAGRunID" internal/service/frontend/api/v1/dagruns_edit_retry.go

Repository: dagucloud/dagu

Length of output: 185


validateDAGRunID accepts empty IDs and is redundantly called during preview.

buildEditRetryPlan is shared between preview (line 73) and execute (line 122) paths. At preview time, opts.newDAGRunID is typically empty, and validateDAGRunID("") correctly returns nil. However, this check is redundant because the execute path regenerates and revalidates the ID at lines 130-139 if empty. Consider skipping validateDAGRunID(opts.newDAGRunID) at line 247 when opts.newDAGRunID == "" for clarity and to avoid unnecessary validation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/service/frontend/api/v1/dagruns_edit_retry.go` around lines 247 -
259, The current buildEditRetryPlan calls validateDAGRunID(opts.newDAGRunID)
even when opts.newDAGRunID is empty (preview path) which is redundant; update
buildEditRetryPlan to skip calling validateDAGRunID when opts.newDAGRunID == ""
(only perform the validation when a non-empty newDAGRunID is provided), leaving
the execute-path logic that regenerates and revalidates the ID (the block that
handles empty newDAGRunID later) unchanged; reference validateDAGRunID,
buildEditRetryPlan, and opts.newDAGRunID to locate the change.

Comment thread internal/service/frontend/api/v1/dagruns_edit_retry.go
Comment on lines +717 to +726
func editRetryWarnings(skipped []string, ineligible []editRetryIneligibleStep) []string {
var warnings []string
if len(skipped) == 0 {
warnings = append(warnings, "no previously successful steps are eligible to skip")
}
if len(ineligible) > 0 {
warnings = append(warnings, "some previously successful steps cannot be skipped with the edited DAG")
}
return warnings
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Spurious “no eligible steps” warning when the edited DAG genuinely has none.

editRetryWarnings always emits "no previously successful steps are eligible to skip" whenever skipped is empty — including the case where no prior node succeeded at all, or the edit intentionally keeps no previously-successful step. This surfaces as a warning in the preview UI for perfectly valid edits (e.g., a retry of a DAG that had no successful steps yet).

Consider only emitting the warning when there were actually successful source steps that could have been skipped (e.g., when ineligible is non-empty OR when the caller explicitly opted in via skipSteps), and keeping the other branch to distinguish "couldn't skip anything" from "nothing to skip".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/service/frontend/api/v1/dagruns_edit_retry.go` around lines 717 -
726, The function editRetryWarnings currently emits "no previously successful
steps are eligible to skip" whenever skipped is empty; change it to only emit
that message when skipped is empty AND there was actually something to skip or
the caller explicitly requested skipping — e.g., add a boolean parameter
skipRequested (or an int successfulSourceCount) to editRetryWarnings and update
all callers, then change the condition to: if len(skipped) == 0 &&
(skipRequested || len(ineligible) > 0) { append the "no previously successful
steps are eligible to skip" warning }; keep the existing ineligible branch
unchanged so "some previously successful steps cannot be skipped..." still
appears when applicable.

Comment on lines +132 to +229
const previewEditRetrySpec = React.useCallback(
async (spec: string, persistSpec = editRetryPersistSpec) => {
if (!dagRun?.dagRunId || !spec.trim()) {
setEditRetryPreview(null);
setEditRetrySkipSteps([]);
return;
}

setEditRetryLoading(true);
try {
const { data, error } = await client.POST(
'/dag-runs/{name}/{dagRunId}/edit-retry/preview',
{
params: {
path: {
name,
dagRunId: dagRun.dagRunId,
},
query: {
remoteNode: appBarContext.selectedRemoteNode || 'local',
},
},
body: {
spec,
...(dagNameOverride ? { dagName: dagNameOverride } : {}),
persistSpec,
},
}
);
if (error) {
showError(
error.message || 'Failed to preview edited retry',
'Check if the DAG run is available and the YAML can be loaded.'
);
return;
}
if (data) {
setEditRetryPreview(data);
setEditRetrySkipSteps(data.skippedSteps ?? []);
}
} finally {
setEditRetryLoading(false);
}
},
[
appBarContext.selectedRemoteNode,
client,
dagNameOverride,
dagRun?.dagRunId,
editRetryPersistSpec,
name,
showError,
]
);

const loadEditRetrySpec = React.useCallback(async () => {
if (!dagRun?.dagRunId) {
return;
}

setEditRetryLoading(true);
try {
const { data, error } = await client.GET(
'/dag-runs/{name}/{dagRunId}/spec',
{
params: {
path: {
name,
dagRunId: dagRun.dagRunId,
},
query: {
remoteNode: appBarContext.selectedRemoteNode || 'local',
},
},
}
);
if (error) {
showError(
error.message || 'Failed to load DAG run spec',
'The stored DAG-run snapshot may no longer be available.'
);
return;
}
const spec = data?.spec ?? '';
setEditRetrySpec(spec);
await previewEditRetrySpec(spec, editRetryPersistSpec);
} finally {
setEditRetryLoading(false);
}
}, [
appBarContext.selectedRemoteNode,
client,
dagRun?.dagRunId,
editRetryPersistSpec,
name,
previewEditRetrySpec,
showError,
]);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Missing catch on preview/load: thrown errors become unhandled rejections.

previewEditRetrySpec and loadEditRetrySpec use try/finally but no catch. If client.POST / client.GET rejects (network failure, abort, etc.) instead of returning { error }, the promise rejection bubbles out of the void previewEditRetrySpec(...) / void loadEditRetrySpec() call sites as an unhandled rejection and the user sees nothing — only editRetryLoading gets cleared.

Add a catch that surfaces the failure via showError so the user is informed instead of silently losing the click.

As per coding guidelines: "Use error handling with try/catch for async operations".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/src/features/dag-runs/components/common/DAGRunActions.tsx` around lines
132 - 229, Both previewEditRetrySpec and loadEditRetrySpec currently use
try/finally but lack a catch block, so thrown rejections from client.POST/GET
become unhandled; add a catch in each function around the await client calls
that calls showError with a helpful message (e.g., include error.message or a
default) and then rethrows or returns as appropriate, keeping the existing
finally block to ensure setEditRetryLoading(false) still runs; update
previewEditRetrySpec and loadEditRetrySpec to handle network/abort exceptions
via try/catch/finally so errors surface to the user.

Comment on lines +737 to +750
<div className="space-y-1.5">
<Label htmlFor="edit-retry-dag-name" className="text-sm">
DAG Name Override (optional)
</Label>
<Input
id="edit-retry-dag-name"
placeholder={`Leave empty to use: ${dagRun?.name || 'original'}`}
value={dagNameOverride}
onChange={(event) =>
setDagNameOverride(event.target.value)
}
className="h-8 text-sm"
/>
</div>

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

DAG Name Override is silently invalid when persistSpec is enabled.

The backend rejects any non-empty dagName when persistSpec is true (dagName cannot be used when persistSpec is true in buildEditRetryPlan), but the UI keeps this Input active alongside the "Save edited spec to DAG file" checkbox, and the override value is also sent to the preview endpoint. Users can enter an override, toggle persist, click Edit & Retry, and only then get a 400 back.

Consider disabling/clearing the DAG Name Override input (and the value sent in the request body) when editRetryPersistSpec is true, or gating the persist checkbox when an override is set.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ui/src/features/dag-runs/components/common/DAGRunActions.tsx` around lines
737 - 750, The DAG Name Override input (controlled by dagNameOverride and
setDagNameOverride) must be disabled and cleared when editRetryPersistSpec is
true to avoid backend 400s; update the Input with
disabled={editRetryPersistSpec} and onChange behavior so toggling
editRetryPersistSpec to true calls setDagNameOverride('') to clear it, and
ensure the request-building code that sends the preview / edit-retry payload
(where dagNameOverride is included) omits dagName when editRetryPersistSpec ===
true (i.e., only include dagName in the body if !editRetryPersistSpec &&
dagNameOverride is non-empty).

@yohamta0 yohamta0 merged commit 6d6f75d into main Apr 19, 2026
11 checks passed
@yohamta0 yohamta0 deleted the feat/edit-retry-dag-run branch April 19, 2026 13:41
zhnq pushed a commit to zhnq/dagu that referenced this pull request Apr 23, 2026
Merged features from upstream:
- feat: add embedded Dagu API (dagucloud#2011)
- feat: add edit-and-retry DAG runs (dagucloud#2010)
- feat: add bulk DAG-run deletion in web UI (dagucloud#2009)
- feat: add kubernetes secret provider (dagucloud#2006)
- feat: make workspace selection global (dagucloud#2015)
- feat: show cockpit run artifacts (dagucloud#2017)
- feat: allow disabling DAG retry policy (dagucloud#2018)
- feat: make DAG labels canonical, deprecate tags (dagucloud#2013)
- feat: add workflow design workspace (dagucloud#2012)
- feat: add step with/config alias (dagucloud#2021)
- fix: allow runtime custom step inputs (dagucloud#2005)
- fix: align embedded engine run labels (dagucloud#2014)
- Various CI/test/docs improvements

Conflict resolution:
- Kept fork i18n files (ui/src/i18n/, LanguageSwitcher)
- Merged i18n hooks with upstream structural changes in
  App.tsx, Layout.tsx, menu.tsx, DAGStatus.tsx, etc.
- Adopted upstream labels-over-tags rename (TagCombobox -> LabelCombobox)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant