DataFusion: support stopping a specific pipeline run (CDAP Stop a Program Run) in DataFusionHook / Stop operator #61224

@shahar1

Description

The Google provider’s Data Fusion integration can start a pipeline and return a run_id (called “pipeline_id” in the Airflow operators), but its stop functionality stops only the program, not a specific run. CDAP/Data Fusion supports stopping a specific run via “Stop a Program Run”; Airflow should expose that so it does not stop an arbitrary run when multiple runs are active.

Use case/motivation

In CDAP, workflows can have multiple concurrent runs. The current “Stop a Program” endpoint:

POST /v3/namespaces/<namespace-id>/apps/<app-id>/<program-type>/<program-id>/stop

“…will stop one of the runs, but not all of the runs.” (CDAP Lifecycle Microservices docs)

Airflow already tracks a specific runId returned by DataFusionStartPipelineOperator (via XCom / returned value). Users need to stop that specific run deterministically, e.g. on DAG cancellation, failure cleanup, or manual stop workflows.

CDAP provides a precise endpoint:

POST /v3/namespaces/<namespace-id>/apps/<app-id>/<program-type>/<program-id>/runs/<run-id>/stop

Airflow should support calling that endpoint when a runId is available.
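
As a rough illustration of the requested behavior, the sketch below builds the run-specific stop URL when a run ID is given and falls back to the program-level stop URL otherwise. The function name `build_stop_url`, its parameters, and the default `program_type` / `program_id` values (`workflows` / `DataPipelineWorkflow`, the usual values for batch pipelines) are assumptions made for this example, not the provider's actual API.

```python
from typing import Optional
from urllib.parse import quote


def build_stop_url(
    instance_url: str,
    pipeline_name: str,
    namespace: str = "default",
    run_id: Optional[str] = None,
    program_type: str = "workflows",           # assumed default (batch pipelines)
    program_id: str = "DataPipelineWorkflow",  # assumed default (batch pipelines)
) -> str:
    """Return the CDAP stop URL for a program or, if run_id is set, for one run.

    Hypothetical helper: it only assembles the path from the two endpoints
    quoted in this issue and does not perform any HTTP request.
    """
    base = "/".join(
        [
            instance_url.rstrip("/"),
            "v3",
            "namespaces",
            quote(namespace, safe=""),
            "apps",
            quote(pipeline_name, safe=""),
            program_type,
            quote(program_id, safe=""),
        ]
    )
    if run_id:
        # "Stop a Program Run": targets exactly one run.
        return f"{base}/runs/{quote(run_id, safe='')}/stop"
    # "Stop a Program": stops one of the active runs, not a chosen one.
    return f"{base}/stop"


if __name__ == "__main__":
    # Placeholder instance URL and run ID, for illustration only.
    print(build_stop_url("https://example.invalid/api", "my_pipeline", run_id="abc-123"))
```

Keeping `run_id` optional would let existing callers keep the current program-level behavior, while callers that pass the run ID returned by DataFusionStartPipelineOperator get a deterministic stop.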

Related issues

#60688

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Disclaimer: This issue was generated by GPT 5.2, under my supervision.
