Releases: Edwardvaneechoud/Flowfile

Release v0.8.1

30 Mar 19:01
e5ca96c

Highlights
Delta Lake Catalog Storage
Flowfile now uses Delta Lake as the storage layer for the catalog, replacing static Parquet files with versioned, ACID-compliant tables. You can query historical versions with time travel, view transaction logs, and run atomic merges (upserts) without risk of data corruption.

Flow Parameters
Introducing flow-level variables that can be referenced across node settings using ${parameter_name} syntax. Parameters can be managed via a new Designer panel or overridden at runtime through the CLI using the --param flag.


Delta Lake Storage

The catalog storage layer has moved from standalone Parquet files to Delta Lake tables. This change enables managed storage with advanced data operations:

  • Time Travel: Access previous versions of any catalog table directly from the UI.
  • Advanced Write Modes: The Catalog Writer now supports append, upsert, update, and delete using configurable key columns.
  • Schema Evolution: Appending data now allows for automatic schema merging when new columns are detected.
  • Metadata Offloading: Row counts, schema detection, and size calculations are now offloaded to the worker process to keep the UI responsive.
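The write modes above can be pictured as a key-based merge. The sketch below is a toy model of the append/upsert/update/delete semantics over configurable key columns, not Flowfile's actual Delta Lake implementation; the function and variable names are hypothetical.

```python
# Toy model of the Catalog Writer's key-based write modes. The real
# implementation performs these merges against Delta Lake tables.

def apply_write(table: list[dict], incoming: list[dict], mode: str, keys: list[str]) -> list[dict]:
    def key_of(row):
        return tuple(row[k] for k in keys)

    existing = {key_of(r): r for r in table}
    if mode == "append":
        return table + incoming
    if mode == "upsert":        # update matching keys, insert the rest
        for row in incoming:
            existing[key_of(row)] = row
        return list(existing.values())
    if mode == "update":        # only touch rows whose key already exists
        for row in incoming:
            if key_of(row) in existing:
                existing[key_of(row)] = row
        return list(existing.values())
    if mode == "delete":        # remove rows whose key appears in incoming
        drop = {key_of(r) for r in incoming}
        return [r for r in table if key_of(r) not in drop]
    raise ValueError(f"unknown mode: {mode}")

table = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]
print(apply_write(table, [{"id": 2, "v": "B"}, {"id": 3, "v": "c"}], "upsert", ["id"]))
```

With Delta Lake underneath, each of these merges is atomic: a failed write never leaves the table half-updated.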

Flow Parameters

Flows can now be parameterized for dynamic execution. By defining parameters at the flow level, you can inject values into file paths, SQL queries, or formulas.

  • Dynamic Resolution: Use ${} syntax in node settings (e.g., a Read node path: C:/data/${current_month}/report.csv).
  • CLI Overrides: Pass parameters during execution via the command line:
    flowfile run flow my_flow.json --param input_dir=/production/data --param threshold=50
  • Designer Panel: A new draggable panel in the Designer lets you define parameters, set defaults, and add descriptions as you build.
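The resolution step can be sketched as a recursive walk that substitutes ${name} placeholders in strings nested inside node settings. This is an illustrative model (the actual resolver operates on Pydantic models at runtime, and these names are hypothetical):

```python
import re

# Recursively substitute ${name} placeholders in strings nested inside
# dicts and lists, mimicking flow-parameter resolution in node settings.

_PATTERN = re.compile(r"\$\{(\w+)\}")

def resolve(value, params: dict):
    if isinstance(value, str):
        return _PATTERN.sub(lambda m: str(params.get(m.group(1), m.group(0))), value)
    if isinstance(value, dict):
        return {k: resolve(v, params) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve(v, params) for v in value]
    return value  # numbers, booleans, None pass through unchanged

settings = {"path": "C:/data/${current_month}/report.csv", "threshold": "${threshold}"}
print(resolve(settings, {"current_month": "2025-03", "threshold": 50}))
```

Note that unknown placeholders are left intact rather than erased, so a missing parameter is visible in the resolved settings instead of silently producing an empty path.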

Changes

Core

  • Bidirectional Formula Translation: Implemented _ff_repr tracking in the FlowFrame API; Python expressions now automatically translate to visual Formula nodes when saved.
  • Idiomatic Code Export: The Python code generator now converts visual formulas into native Polars expressions (pl.col(...)) instead of proprietary helper functions.
  • Expanded Math Formulas: Added support for abs, round, ceil, and floor in the formula engine.
  • Delta Utility Layer: Centralized format detection and I/O logic for Delta and legacy Parquet tables.
  • Parameter Resolver: Recursive resolution engine for substituting ${} patterns in Pydantic models at runtime.
  • Catalog Migration: Added migrate_parquet_to_delta.py utility to convert existing catalog storage to the new format.
  • Pivot Optimization: Added zero-fill logic to sum, count, and len aggregations to maintain consistency with native Polars behavior.
  • Expanded Scope: Added selectors (cs) and base64 to the execution scope for Polars Code nodes.
  • Subprocess Execution: Enhanced support for "frozen" environments (PyInstaller) when spawning flow runs via CLI.
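The idiomatic code export can be pictured as a small tree-to-expression translation: a visual formula becomes a native pl.col(...) expression instead of a helper-function call. A toy sketch — the AST shape here is invented for illustration and is not Flowfile's internal representation:

```python
# Translate a tiny, invented formula AST into a native Polars expression
# string, illustrating the idea behind idiomatic code export.

OPS = {"add": "+", "sub": "-", "mul": "*", "div": "/"}

def to_polars_expr(node: dict) -> str:
    if "col" in node:
        return f'pl.col("{node["col"]}")'
    if "lit" in node:
        return f'pl.lit({node["lit"]!r})'
    if "call" in node:                      # math helpers: abs, round, ceil, floor
        return f"{to_polars_expr(node['arg'])}.{node['call']}()"
    left, right = to_polars_expr(node["left"]), to_polars_expr(node["right"])
    return f"({left} {OPS[node['op']]} {right})"

ast = {"call": "abs", "arg": {"op": "sub", "left": {"col": "price"}, "right": {"lit": 100}}}
print(to_polars_expr(ast))
# -> (pl.col("price") - pl.lit(100)).abs()
```

The bidirectional part works the other way as well: tracking the representation of FlowFrame expressions lets saved Python code reappear as visual Formula nodes.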

UI

  • Parameters Panel: New draggable UI for global parameter management.
  • Version History UI: Added a historical version list and a "Viewing Historical Version" banner to the Catalog.
  • Read Node Path Input: Added a manual path entry field to the Read node to support parameter injection.
  • Merge Configuration: Added key column selection and mode descriptions to the Catalog Writer.
  • Custom Cursors: Implemented high-contrast SVG cursors for the Canvas to improve visibility on Windows systems.

Infrastructure

  • Dependency Updates:
    • Polars bumped to < 1.40.
    • polars-expr-transformer updated to >= 0.5.3 for bidirectional mapping.
    • pl-fuzzy-frame-match updated to >= 0.6.0 for hybrid matching and fuzzy filter strategies.
  • Database Schema: Added storage_format column to the catalog_tables table (defaults to delta).
  • Data Types: Added support for Int128, UInt128, and Float16 across the engine and UI.

Fixes

  • Pivot Alignment: Fixed missing combinations in pivot nodes to correctly fill with zeros instead of nulls for specific aggregations.
  • Path Validation: Added strict validation for catalog table paths to prevent directory traversal.
  • Unpivot Styling: Fixed background color issues for unpivot text selection in dark mode.
  • Duration Handling: Corrected duration calculations for runs involving a mix of naive and timezone-aware datetimes.
  • Node Resets: Fixed a bug where parameter substitution would trigger spurious node resets and clear cached analysis data.

What's Changed

Full Changelog: v0.8.0...v0.8.1

Release v0.8.0

25 Mar 19:58
f29f313

Highlights

Flowfile now runs flows on a schedule. Set an interval, trigger on table updates, or run manually from the catalog — no external orchestrator needed. The scheduler runs embedded, standalone, or in Docker. This release also adds run management: trigger flows on demand, cancel running ones, and inspect execution logs.

Flow Scheduling

Run flows on a timer or trigger them when catalog tables update. Three schedule types: interval (every N minutes), table trigger (fires when a specific table is refreshed), and table set trigger (fires when all tables in a group have been refreshed). One active run per flow — if a flow is already running, new triggers are skipped until it finishes.
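The one-active-run rule amounts to a simple guard before spawning a run. A minimal sketch (names are hypothetical, not Flowfile's API):

```python
# One active run per flow: a trigger is skipped if that flow already has
# a run in progress. Illustrative only.

active_runs: set[int] = set()

def try_trigger(flow_id: int) -> bool:
    if flow_id in active_runs:
        return False            # skip: flow is already running
    active_runs.add(flow_id)    # the scheduler would spawn the run here
    return True

def finish(flow_id: int) -> None:
    active_runs.discard(flow_id)

assert try_trigger(7) is True
assert try_trigger(7) is False   # second trigger skipped while running
finish(7)
assert try_trigger(7) is True
```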

Schedules are managed from a new Schedules tab in the Catalog, or directly from a flow's detail panel. Create, enable/disable, run now, or delete — all inline.


Scheduler Modes

The scheduler runs wherever Flowfile runs:

  • Embedded (desktop / pip install flowfile) — start and stop from the Schedules tab
  • Standalone — flowfile run flowfile_scheduler as an independent background service
  • Docker — set FLOWFILE_SCHEDULER_ENABLED=true in your compose file

Only one scheduler instance runs at a time, enforced via an advisory database lock with heartbeat. If a scheduler dies, another can take over after 90 seconds.
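A sketch of heartbeat-based lock takeover, assuming a single lock record with an owner and a last-heartbeat timestamp. The 90-second threshold comes from the notes above; everything else here is invented for illustration (the real lock lives in a database row):

```python
import time

STALE_AFTER = 90.0  # seconds without a heartbeat before takeover is allowed

# Shared lock record: (owner, last heartbeat). In Flowfile this is an
# advisory database lock; a dict stands in for it here.
lock = {"owner": None, "heartbeat": 0.0}

def try_acquire(me: str, now: float) -> bool:
    free = lock["owner"] is None
    stale = now - lock["heartbeat"] > STALE_AFTER
    if free or stale or lock["owner"] == me:
        lock["owner"], lock["heartbeat"] = me, now  # acquire and heartbeat
        return True
    return False

t = time.time()
assert try_acquire("scheduler-a", t)
assert not try_acquire("scheduler-b", t + 10)   # a is alive, b must wait
assert try_acquire("scheduler-b", t + 120)      # a went silent, b takes over
```

A live scheduler re-acquires periodically, which refreshes the heartbeat; a crashed one simply stops, and the stale check lets a standby instance take over.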

Run Flows from the Catalog

Trigger any registered flow directly from its detail panel — no schedule required. The Run Flow button spawns a subprocess, tracks it in the run history, and writes execution logs to ~/.flowfile/logs/. Cancel a running flow at any time (sends SIGTERM to the process).


Table Trigger Architecture

Table triggers use a dual-path mechanism. The push path fires immediately when a Catalog Writer overwrites a table — no waiting for the next poll tick. The poll path (every ~30 seconds) acts as a safety net in case the push path fails. Double-firing is prevented by active run checks and timestamp comparison.
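The double-fire guard can be sketched as a comparison between the table's last-refresh time and the trigger's last-fired time, plus the active-run check. Names here are invented; only the mechanism is from the notes:

```python
# Dual-path trigger: both the push path and the poll path call maybe_fire.
# Firing is skipped when a run is active or when the table has not been
# refreshed since the last firing, so the two paths cannot double-fire.

def maybe_fire(trigger: dict, table_refreshed_at: float, run_active: bool) -> bool:
    if run_active:
        return False
    if table_refreshed_at <= trigger["last_fired_at"]:
        return False            # already fired for this refresh
    trigger["last_fired_at"] = table_refreshed_at
    return True

trig = {"last_fired_at": 0.0}
assert maybe_fire(trig, 100.0, run_active=False) is True   # push path fires
assert maybe_fire(trig, 100.0, run_active=False) is False  # poll path sees same refresh: skipped
assert maybe_fire(trig, 200.0, run_active=False) is True   # new refresh fires again
```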

Run Logs

Scheduled, manual, and on-demand runs write output to log files. Click View log in a run's detail panel to see the full execution output.


Changes

Core

  • Full scheduling system with interval, table trigger, and table set trigger types
  • Scheduler engine with advisory lock, heartbeat, and stale-takeover logic
  • flowfile_scheduler as a new standalone package — lightweight, no flowfile_core dependency
  • flowfile run flowfile_scheduler CLI command for standalone mode
  • FLOWFILE_SCHEDULER_ENABLED environment variable for Docker auto-start
  • Run Flow from catalog (manual trigger without schedule)
  • Cancel Run support — sends SIGTERM, marks run as failed
  • Active runs tracking with live polling
  • In-place table overwrite — Catalog Writer now preserves the table's ID and all foreign key references (schedules, read links, favorites) instead of delete-and-recreate
  • Push-driven table trigger firing on overwrite_table_data
  • Paginated run history with status counts (total, success, failed, running)
  • Run types: in_designer_run, scheduled, manual, on_demand
  • Run log file access from API
  • get_database_url() centralized in shared storage config
  • Shared lightweight SQLAlchemy models for cross-package database access
  • Shared subprocess utilities for spawning flow runs
  • Flow handler rekey for Save As operations
  • Local execution mode — CLI runs skip worker offloading, write parquet directly, collect analysis data in-memory

UI

  • Schedules tab in the Catalog with overview, summary cards, and schedule list
  • Create Schedule modal with flow selector and type configuration
  • Schedule detail panel with run history filtered by schedule
  • Run overview panel with status breakdown
  • Run Flow and Cancel Run buttons in flow detail panel
  • Run log viewer in run detail panel
  • Save As flow identity switching
  • Fix background color for unpivot text in dark mode

Infrastructure

  • Docker image now bundles flowfile_scheduler, flowfile, and flowfile_frame
  • flowfile_scheduler added to pyproject.toml packages and scripts
  • Scheduler lock table for single-instance enforcement

Fixes

  • Catalog Writer table overwrite now preserves table ID and foreign keys instead of deleting and recreating
  • Local (CLI) execution no longer attempts worker offloading for record counts and analysis data
  • Duration calculation handles naive vs timezone-aware datetime correctly
  • Docker kernel E2E test timing — added delay before polling to avoid false-positive early completion
  • Lazy module import in CLI for faster startup

What's Changed

Release v0.7.3

19 Mar 07:02
e2b1201

What's Changed

Full Changelog: v0.7.2...v0.7.3

Release v0.7.2

16 Mar 21:13
a4c97db


What's Changed

Full Changelog: v0.7.1...v0.7.2

Release v0.7.1

16 Mar 20:07
b4e723f


Full Changelog: https://github.com/Edwardvaneechoud/Flowfile/compare/v0.7.0..v0.7.1

Release v0.7.0

15 Mar 20:46
235a21c


Flowfile v0.7.0 Release Notes

Highlights

Docker-Based Kernel Execution

Run your custom Python code in isolated Docker containers with your own packages and resource limits. Install any pip package you need — scikit-learn, transformers, custom internal libraries — and use them directly in your flow. Kernels are managed through a dedicated UI with live status monitoring, memory tracking, and auto-restart on failure.


Jupyter-Style Code Editor

Python Script nodes feature a full notebook editor with cell-by-cell execution, CodeMirror 6 with Python syntax highlighting, and flowfile API autocompletions. Variables persist across cells, outputs render inline, and the editor expands to fullscreen for focused work.


Rich Display Outputs

flowfile.display() renders matplotlib figures, plotly charts, PIL images, and HTML directly in the notebook. plt.show() is auto-captured — no explicit display call needed.


Flow Catalog

Unity Catalog-style namespace hierarchy (catalog > schema > flow) for organizing pipelines. Register flows with run history, version snapshots, and node-level results. Favorite flows, track table lineage, and open any historical version directly in the designer.


Artifact System

Publish, consume, and track Python objects across nodes within a flow. The system is DAG-aware — only upstream artifacts are visible. Artifacts persist through container restarts and can be shared globally via the catalog.

Named Inputs & Outputs

Python Script nodes support named connections with visual edge labels. read_input("orders") reads a specific named input, publish_output(df, name="cleaned") writes to a named output. Up to 10 inputs per node (#344).
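A toy model of named inputs/outputs with DAG-aware visibility: a node may only read artifacts published by its upstream nodes. Only the read_input/publish_output names come from the release notes; the registry and DAG shape below are invented for illustration.

```python
# Minimal sketch: artifacts are keyed by (node, name), and a node may only
# read outputs from nodes upstream of it in the DAG.

outputs: dict[tuple[str, str], object] = {}
upstream = {"clean": {"load"}, "report": {"load", "clean"}}  # node -> upstream set

def publish_output(node: str, value, name: str = "main") -> None:
    outputs[(node, name)] = value

def read_input(node: str, name: str = "main"):
    for src in upstream.get(node, set()):
        if (src, name) in outputs:
            return outputs[(src, name)]
    raise KeyError(f"no upstream output named {name!r} visible to {node!r}")

publish_output("load", [1, 2, 3], name="orders")
print(read_input("clean", name="orders"))   # visible: load is upstream of clean
```

The DAG-awareness is the key property: a node with no path to the publisher simply cannot see the artifact, which keeps flows deterministic.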


Changes

Core

  • Docker kernel system with container lifecycle management (#284)
  • Kernel Manager UI with status monitoring and memory tracking (#284)
  • Kernel auto-restart for stopped/errored kernels (#284)
  • Kernel execution cancellation (#284)
  • Flow Catalog with namespace hierarchy and flow registration (#285)
  • Catalog table management and lineage tracking (#346)
  • Documentation and favorite handling for catalog tables (#349)
  • Catalog upload improvements (#348)
  • Artifact publishing, consuming, and DAG-aware availability (#284)
  • Global artifact sharing tied to catalog (#284)
  • Named inputs/outputs for Python Script nodes (#344)
  • File Manager for Docker mode file uploads (#326)
  • Update functionality for database and cloud storage connections (#351)
  • Added psycopg2-binary and pandas/sqlalchemy as production dependencies for database writes

UI

  • Jupyter-style notebook editor with cell execution (#284)
  • CodeMirror 6 editor with flowfile API autocompletions (#284)
  • Rich display outputs — matplotlib, plotly, PIL, HTML (#284)
  • Kernel execution in custom node designer (#284)
  • Auto-generated node descriptions from config (#313)
  • Embeddable FlowfileEditor as Vue component library (#338, #341)
  • Z-index overflow fix with bounded constants (#314)
  • DraggablePanel layout fix on viewport change (#305)

Infrastructure

  • Fixed all ruff linting issues across the codebase (#347)

Fixes

  • API calls failing when Docker deployment accessed remotely (#324)
  • Parquet corruption in Docker volumes (#284)

Documentation

  • Project documentation (#350)

Release feauture/kernel-implementation

22 Feb 17:47
0bb3761


Claude/embeddable flowfile wasm q fqi1 (#341)

Release v0.6.3

02 Feb 16:03
2c96c59


What's Changed

Full Changelog: v0.6.2...v0.6.3

Release v0.6.2

30 Jan 12:07
ea01ae2


Multiple hotfixes to improve stability

Full Changelog: v0.6.1...v0.6.2

Release v0.6.1

29 Jan 19:46
17c700c


What's Changed

Full Changelog: v0.6.0...v0.6.1