Skip to content

Route lightweight transforms to local samping strategy#272

Merged
Edwardvaneechoud merged 14 commits intomainfrom
claude/add-flowfile-cor-tests-HSZny
Jan 29, 2026
Merged

Route lightweight transforms to local samping strategy#272
Edwardvaneechoud merged 14 commits intomainfrom
claude/add-flowfile-cor-tests-HSZny

Conversation

@Edwardvaneechoud
Copy link
Copy Markdown
Owner

Summary

This PR implements intelligent routing of execution strategies based on node types. Lightweight transforms (select, sample, union, sort, record_count) now use LOCAL_WITH_SAMPLING strategy when running remotely, enabling fast local computation with external sampling for preview data. Wide transforms are further optimized by overriding back to REMOTE when optimizing for downstream nodes.

Key Changes

  • Strategy Determination Logic: Modified _determine_strategy() to check if a node has has_default_settings=True and route to LOCAL_WITH_SAMPLING for remote execution, while maintaining FULL_LOCAL for local execution and REMOTE for nodes without defaults.

  • Wide Transform Override: Added logic in execute() to override wide transforms (sort, record_count) from LOCAL_WITH_SAMPLING back to REMOTE when optimizing for downstream nodes, ensuring full data is available for dependent operations.

  • Enhanced Logging: Improved execution logging to include node type, transform type, default settings status, and strategy override information for better debugging and observability.

  • Comprehensive Test Coverage: Added extensive test suite covering:

    • Strategy determination for different node types (select, sort, manual_input)
    • Execution decisions based on invalidation reasons
    • Wide transform override behavior
    • Verification of all nodes with defaults and their transform types

Implementation Details

  • Nodes with node_default.has_default_settings=True are considered lightweight transforms suitable for local execution with sampling
  • The override mechanism preserves the original decision reason while changing only the strategy
  • Local execution mode (run_location="local") always uses FULL_LOCAL, bypassing the sampling strategy
  • Test helpers were refactored to support creating graphs with different node types and configurations

https://claude.ai/code/session_01DaTkkXLAJyC2EfopwiY55d

_determine_strategy never returned LOCAL_WITH_SAMPLING — all remote
execution went to REMOTE regardless of node type. Nodes registered in
nodes_with_defaults (select, sample, union, sort, record_count) are
lightweight transforms that can compute locally with an external
sampler providing preview data.

Changes:
- Route nodes with has_default_settings to LOCAL_WITH_SAMPLING in
  _determine_strategy (wide transforms are still overridden to REMOTE
  by the existing optimize_for_downstream logic in execute())
- Add detailed strategy logging: node_type, transform_type,
  has_default, and whether the strategy was overridden
- Add comprehensive executor tests covering strategy determination
  for select, sort, and manual_input nodes, the wide-transform
  override logic, and parametrized tests for all nodes_with_defaults

https://claude.ai/code/session_01DaTkkXLAJyC2EfopwiY55d
…PLING routing

The condition for LOCAL_WITH_SAMPLING should be based on whether the
node is a narrow transform, not whether it has default settings.
nodes_with_defaults is about whether a node can run without user
config — it's unrelated to execution strategy.

Narrow transforms (select, sample, union) are lightweight column-level
operations safe to compute locally. Wide transforms (sort, record_count)
reshape data and should always go REMOTE.

Updated tests to reflect that sort goes directly to REMOTE from
_determine_strategy, not via the override in execute().

https://claude.ai/code/session_01DaTkkXLAJyC2EfopwiY55d
When cache_results is enabled on a node, the result must be fully
materialized and stored by the remote worker. This means even narrow
transforms (select, sample, union) that would normally use
LOCAL_WITH_SAMPLING must go REMOTE.

Strategy priority in _determine_strategy:
1. local → FULL_LOCAL
2. cache_results → REMOTE
3. narrow transform → LOCAL_WITH_SAMPLING
4. everything else → REMOTE

Added TestCacheResultsStrategy with 5 tests covering cache override
for narrow/wide transforms and the full decision flow.

https://claude.ai/code/session_01DaTkkXLAJyC2EfopwiY55d
@netlify
Copy link
Copy Markdown

netlify bot commented Jan 28, 2026

Deploy Preview for flowfile-wasm canceled.

Name Link
🔨 Latest commit 821b762
🔍 Latest deploy log https://app.netlify.com/projects/flowfile-wasm/deploys/697bac25a8117700083796b1

claude and others added 10 commits January 28, 2026 22:04
Both the fast-path (_can_skip_execution_fast) and the executor
(_decide_execution) used to fall through to re-execution when a node
had already run but no external cache file existed on disk. This
happens for LOCAL_WITH_SAMPLING nodes where results live in memory.

Now:
- Already ran + no cache_results → skip (results in memory)
- Already ran + cache_results + cache present → skip
- Already ran + cache_results + cache missing → re-run (rebuild cache)
- Never ran → run with appropriate strategy
- Forced refresh / output node / source changed → run

Removed the PERFORMANCE_MODE and SETTINGS_CHANGED re-run paths for
already-ran nodes. The fast-path on FlowNode already skipped
performance mode, and SETTINGS_CHANGED was misleading (settings
hadn't changed — the cache file was just missing).

https://claude.ai/code/session_01DaTkkXLAJyC2EfopwiY55d
Remove verbose debug info from log line ([node_type=..., transform_type=...,
has_default=...]) and the unused initial_strategy variable. The log now
shows just: "Starting to run select (NEVER_RAN -> LOCAL_WITH_SAMPLING)"

Add "Execution Strategy" section to the Core Internals developer docs
covering the three strategies, the decision pipeline, LOCAL_WITH_SAMPLING
vs REMOTE flow diagrams, and node classification table.

https://claude.ai/code/session_01DaTkkXLAJyC2EfopwiY55d
The docs described the strategy routing but didn't connect it to the
execution modes visible in the frontend. Restructured to lead with
what users see (Development vs Performance mode) and then explain the
underlying strategies. Added a section for each mode, clarified that
LOCAL_WITH_SAMPLING only applies in Development mode, and updated the
node classification table with a Performance mode column.

https://claude.ai/code/session_01DaTkkXLAJyC2EfopwiY55d
Sort (reordering) is a wide operation. Narrow transforms do dropping,
renaming, and filtering.

https://claude.ai/code/session_01DaTkkXLAJyC2EfopwiY55d
Tests the full preview data path:
- TestPreviewAfterExecution: preview returns sample data after run,
  caps at 100 rows, returns empty before run, doesn't recompute,
  caches across repeated calls
- TestPreviewAfterUpstreamChange: documents that node_stats.has_completed_last_run
  is NOT reset when settings change, causing stale preview data.
  Also verifies preview is correct after re-running.

Adds create_graph_with_read_and_select helper using fake_data.parquet
(1000 rows) for integration-level preview testing.

https://claude.ai/code/session_01DaTkkXLAJyC2EfopwiY55d
When sibling nodes run in parallel and both serialize the same
upstream LazyFrame, Polars' internal RefCell can raise "Already
borrowed". This is a thread contention issue in the core process
during LazyFrame serialization, not a logic error.

After a parallel stage completes, any node that failed with
"Already borrowed" is now retried sequentially — the contention
is gone because no other thread is accessing the shared LazyFrame.

https://claude.ai/code/session_01DaTkkXLAJyC2EfopwiY55d
When sibling nodes run in parallel and both access the same upstream
LazyFrame, Polars' internal RefCell raises "Already borrowed".

Fix: each node acquires its input nodes' _execution_lock before
calling _function(*input_data), holding it through the LazyFrame
access. Sibling nodes sharing a parent now serialize their access
to the parent's data. Nodes with different parents still run fully
parallel.

Reverts the retry-based approach in favor of preventing the
conflict entirely.

https://claude.ai/code/session_01DaTkkXLAJyC2EfopwiY55d
@codecov-commenter
Copy link
Copy Markdown

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@Edwardvaneechoud Edwardvaneechoud merged commit 17c700c into main Jan 29, 2026
23 checks passed
@Edwardvaneechoud Edwardvaneechoud deleted the claude/add-flowfile-cor-tests-HSZny branch January 29, 2026 19:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants