Releases: Edwardvaneechoud/Flowfile
Release v0.8.1
Highlights
Delta Lake Catalog Storage
Flowfile now uses Delta Lake as the storage layer for the catalog. This replaces static Parquet files with versioned, ACID-compliant tables. You can now perform time travel to query historical versions, view transaction logs, and perform atomic merges (upserts) without risk of data corruption.
Flow Parameters
Introducing flow-level variables that can be referenced across node settings using ${parameter_name} syntax. Parameters can be managed via a new Designer panel or overridden at runtime through the CLI using the --param flag.
Delta Lake Storage
The catalog storage layer has moved from standalone Parquet files to Delta Lake tables. This change enables managed storage with advanced data operations:
- Time Travel: Access previous versions of any catalog table directly from the UI.
- Advanced Write Modes: The Catalog Writer now supports `append`, `upsert`, `update`, and `delete` using configurable key columns.
- Schema Evolution: Appending data now allows for automatic schema merging when new columns are detected.
- Metadata Offloading: Row counts, schema detection, and size calculations are now offloaded to the worker process to keep the UI responsive.
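To make the four write modes concrete, here is a toy in-memory sketch of how key-based `append`, `upsert`, `update`, and `delete` could differ. It is not Flowfile's or Delta Lake's implementation: the `apply_write` helper and the list-of-dicts table are hypothetical, and the exact semantics (for example, that `update` never inserts new keys) are an assumption made for illustration.

```python
from typing import Any

Row = dict[str, Any]


def apply_write(table: list[Row], incoming: list[Row], mode: str, keys: list[str]) -> list[Row]:
    """Return a new table after applying `incoming` in the given mode (hypothetical helper)."""

    def key_of(row: Row) -> tuple:
        return tuple(row[k] for k in keys)

    if mode == "append":
        # Append ignores the key columns and adds every incoming row.
        return table + incoming

    incoming_by_key = {key_of(r): r for r in incoming}
    if mode == "delete":
        # Delete drops existing rows whose key appears in the incoming data.
        return [r for r in table if key_of(r) not in incoming_by_key]
    if mode not in ("update", "upsert"):
        raise ValueError(f"unknown mode: {mode}")

    # Update and upsert both overwrite rows whose key matches.
    result = [
        {**r, **incoming_by_key[key_of(r)]} if key_of(r) in incoming_by_key else r
        for r in table
    ]
    if mode == "upsert":
        # Upsert additionally inserts rows whose key was not present before.
        existing = {key_of(r) for r in table}
        result += [r for r in incoming if key_of(r) not in existing]
    return result


table = [{"id": 1, "qty": 10}, {"id": 2, "qty": 20}]
changes = [{"id": 2, "qty": 25}, {"id": 3, "qty": 30}]
upserted = apply_write(table, changes, "upsert", ["id"])
updated = apply_write(table, changes, "update", ["id"])
deleted = apply_write(table, changes, "delete", ["id"])
```

With these inputs, `upsert` changes row 2 and adds row 3, `update` only changes row 2, and `delete` leaves only row 1.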
Flow Parameters
Flows can now be parameterized for dynamic execution. By defining parameters at the flow level, you can inject values into file paths, SQL queries, or formulas.
- Dynamic Resolution: Use `${}` syntax in node settings (e.g., a Read node path: `C:/data/${current_month}/report.csv`).
- CLI Overrides: Pass parameters during execution via the command line: `flowfile run flow my_flow.json --param input_dir=/production/data --param threshold=50`
- Designer Panel: A new draggable panel in the Designer allows you to define parameters, set default values, and add descriptions as you build.
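The substitution described above can be sketched in a few lines. This is only an illustration of recursive `${name}` replacement with CLI-style overrides layered on top of defaults. The `resolve` and `parse_cli_params` functions are hypothetical names, plain dicts stand in for Flowfile's node settings, and leaving unknown parameters untouched is an assumption rather than documented behavior.

```python
import re
from typing import Any

PATTERN = re.compile(r"\$\{(\w+)\}")


def resolve(value: Any, params: dict[str, str]) -> Any:
    """Recursively replace ${name} in strings nested inside dicts and lists."""
    if isinstance(value, str):
        # Assumption: unknown parameter names are left as-is rather than raising.
        return PATTERN.sub(lambda m: params.get(m.group(1), m.group(0)), value)
    if isinstance(value, dict):
        return {k: resolve(v, params) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve(v, params) for v in value]
    return value


def parse_cli_params(pairs: list[str]) -> dict[str, str]:
    """Turn ["key=value", ...] (the values given after --param) into a dict."""
    return dict(pair.split("=", 1) for pair in pairs)


defaults = {"current_month": "2024-01", "threshold": "10"}
overrides = parse_cli_params(["threshold=50"])
settings = {
    "path": "C:/data/${current_month}/report.csv",
    "filters": ["value > ${threshold}", "${unknown}"],
}
resolved = resolve(settings, {**defaults, **overrides})
```

Here the runtime override wins over the default for `threshold`, while `current_month` falls back to its default value.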
Changes
Core
- Bidirectional Formula Translation: Implemented `_ff_repr` tracking in the `FlowFrame` API; Python expressions now automatically translate to visual Formula nodes when saved.
- Idiomatic Code Export: The Python code generator now converts visual formulas into native Polars expressions (`pl.col(...)`) instead of proprietary helper functions.
- Expanded Math Formulas: Added support for `abs`, `round`, `ceil`, and `floor` in the formula engine.
- Delta Utility Layer: Centralized format detection and I/O logic for Delta and legacy Parquet tables.
- Parameter Resolver: Recursive resolution engine for substituting `${}` patterns in Pydantic models at runtime.
- Catalog Migration: Added `migrate_parquet_to_delta.py` utility to convert existing catalog storage to the new format.
- Pivot Optimization: Added zero-fill logic to `sum`, `count`, and `len` aggregations to maintain consistency with native Polars behavior.
- Expanded Scope: Added `selectors` (`cs`) and `base64` to the execution scope for Polars Code nodes.
- Subprocess Execution: Enhanced support for "frozen" environments (PyInstaller) when spawning flow runs via CLI.
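As a rough illustration of the pivot zero-fill item above, the sketch below pivots a few rows with a `sum` aggregation and fills index/column combinations that never occur with 0 instead of leaving them empty. It uses plain Python rather than Polars, and `pivot_sum` is a hypothetical helper, not Flowfile's pivot node.

```python
from collections import defaultdict


def pivot_sum(rows, index, column, value, fill=0):
    """Pivot `rows` into {index: {column: sum}}, filling missing combinations."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row[index], row[column])] += row[value]
    index_values = sorted({r[index] for r in rows})
    column_values = sorted({r[column] for r in rows})
    # Every index gets every column; absent combinations receive `fill`.
    return {
        i: {c: totals.get((i, c), fill) for c in column_values}
        for i in index_values
    }


rows = [
    {"region": "EU", "product": "A", "sales": 5},
    {"region": "EU", "product": "B", "sales": 7},
    {"region": "US", "product": "A", "sales": 3},
]
pivoted = pivot_sum(rows, "region", "product", "sales")
```

The `US`/`B` combination has no source rows, so it appears in the result as 0.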
UI
- Parameters Panel: New draggable UI for global parameter management.
- Version History UI: Added a historical version list and a "Viewing Historical Version" banner to the Catalog.
- Read Node Path Input: Added a manual path entry field to the Read node to support parameter injection.
- Merge Configuration: Added key column selection and mode descriptions to the Catalog Writer.
- Custom Cursors: Implemented high-contrast SVG cursors for the Canvas to improve visibility on Windows systems.
Infrastructure
- Dependency Updates:
  - Polars bumped to `< 1.40`.
  - `polars-expr-transformer` updated to `>= 0.5.3` for bidirectional mapping.
  - `pl-fuzzy-frame-match` updated to `>= 0.6.0` for hybrid matching and fuzzy filter strategies.
- Database Schema: Added `storage_format` column to the `catalog_tables` table (defaults to `delta`).
- Data Types: Added support for `Int128`, `UInt128`, and `Float16` across the engine and UI.
Fixes
- Pivot Alignment: Fixed missing combinations in pivot nodes to correctly fill with zeros instead of nulls for specific aggregations.
- Path Validation: Added strict validation for catalog table paths to prevent directory traversal.
- Unpivot Styling: Fixed background color issues for unpivot text selection in dark mode.
- Duration Handling: Corrected duration calculations for runs involving a mix of naive and timezone-aware datetimes.
- Node Resets: Fixed a bug where parameter substitution would trigger spurious node resets and clear cached analysis data.
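For the path validation fix listed above, one common way to block directory traversal is to resolve the requested path and check that it still sits under the allowed root. The sketch below shows that general technique. `safe_table_path` is a hypothetical name, and this is not necessarily how Flowfile validates catalog table paths.

```python
import tempfile
from pathlib import Path


def safe_table_path(root: Path, relative: str) -> Path:
    """Resolve `relative` under `root`, rejecting anything that escapes it."""
    root = root.resolve()
    candidate = (root / relative).resolve()
    # After resolution, ".." segments and absolute paths can no longer hide.
    if not candidate.is_relative_to(root):
        raise ValueError(f"path escapes catalog root: {relative}")
    return candidate


catalog_root = Path(tempfile.mkdtemp())
accepted = safe_table_path(catalog_root, "sales/orders")

try:
    safe_table_path(catalog_root, "../outside")
    rejected = False
except ValueError:
    rejected = True
```

`Path.is_relative_to` requires Python 3.9 or newer.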
What's Changed
- Bump rollup in /flowfile_wasm by @dependabot[bot] in #369
- Bump happy-dom from 20.8.4 to 20.8.9 in /flowfile_wasm by @dependabot[bot] in #372
- Bump polars version to 1.39.3 by @Edwardvaneechoud in #371
- Feature/add flow parameters by @Edwardvaneechoud in #364
- Handling flowfile as application by @Edwardvaneechoud in #373
- Feature/implement polars code to flowfile function by @Edwardvaneechoud in #370
- Modernize frontend: design tokens, unified tables, UX polish by @Edwardvaneechoud in #368
- add support for flowfile function to polars expression in to code module by @Edwardvaneechoud in #374
- Feature/catatalog to delta by @Edwardvaneechoud in #376
- Feature/implement crud operations catalog by @Edwardvaneechoud in #381
- Add refresh token support for Docker mode authentication by @Edwardvaneechoud in #375
Full Changelog: v0.8.0...v0.8.1
Release v0.8.0
Highlights
Flowfile now runs flows on a schedule. Set an interval, trigger on table updates, or run manually from the catalog — no external orchestrator needed. The scheduler runs embedded, standalone, or in Docker. This release also adds run management: trigger flows on demand, cancel running ones, and inspect execution logs.
Flow Scheduling
Run flows on a timer or trigger them when catalog tables update. Three schedule types: interval (every N minutes), table trigger (fires when a specific table is refreshed), and table set trigger (fires when all tables in a group have been refreshed). One active run per flow — if a flow is already running, new triggers are skipped until it finishes.
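The table set trigger and the one-active-run rule can be pictured with a small decision function. This is a sketch under stated assumptions: `TableSetTrigger` and `should_fire` are hypothetical names, and "refreshed" is taken to mean "refreshed since the trigger last fired".

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TableSetTrigger:
    tables: frozenset
    last_fired: datetime


def should_fire(trigger: TableSetTrigger, refreshed_at: dict, flow_is_running: bool) -> bool:
    """Fire only when every table in the set was refreshed after the last firing."""
    if flow_is_running:
        # One active run per flow: new triggers are skipped while it runs.
        return False
    return all(
        table in refreshed_at and refreshed_at[table] > trigger.last_fired
        for table in trigger.tables
    )


trigger = TableSetTrigger(frozenset({"orders", "customers"}), datetime(2024, 1, 1, 8, 0))
partial = {"orders": datetime(2024, 1, 1, 9, 0)}
complete = {**partial, "customers": datetime(2024, 1, 1, 9, 30)}

fires_on_partial = should_fire(trigger, partial, flow_is_running=False)
fires_on_complete = should_fire(trigger, complete, flow_is_running=False)
fires_while_running = should_fire(trigger, complete, flow_is_running=True)
```

Only the second call returns `True`: the first is missing a refresh of `customers`, and the third is skipped because a run is already active.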
Schedules are managed from a new Schedules tab in the Catalog, or directly from a flow's detail panel. Create, enable/disable, run now, or delete — all inline.
Scheduler Modes
The scheduler runs wherever Flowfile runs:
- Embedded (desktop / `pip install flowfile`): start and stop from the Schedules tab
- Standalone: `flowfile run flowfile_scheduler` as an independent background service
- Docker: set `FLOWFILE_SCHEDULER_ENABLED=true` in your compose file
Only one scheduler instance runs at a time, enforced via an advisory database lock with heartbeat. If a scheduler dies, another can take over after 90 seconds.
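The takeover rule can be sketched as a lock record holding an owner and a last-heartbeat time. The version below is an in-memory illustration only: `SchedulerLock` and `try_acquire` are hypothetical names, and a real implementation would need the check and update to happen atomically in the database.

```python
from dataclasses import dataclass
from typing import Optional

STALE_AFTER_SECONDS = 90  # takeover window quoted in the release notes


@dataclass
class SchedulerLock:
    holder: Optional[str] = None
    heartbeat_at: float = 0.0


def try_acquire(lock: SchedulerLock, instance_id: str, now: float) -> bool:
    """Acquire or renew the lock; take over only if the heartbeat is stale."""
    is_free = lock.holder is None
    is_mine = lock.holder == instance_id
    is_stale = now - lock.heartbeat_at > STALE_AFTER_SECONDS
    if is_free or is_mine or is_stale:
        lock.holder = instance_id
        lock.heartbeat_at = now  # doubles as the heartbeat when renewing
        return True
    return False


lock = SchedulerLock()
a_starts = try_acquire(lock, "scheduler-a", now=0.0)
b_blocked = try_acquire(lock, "scheduler-b", now=30.0)
a_heartbeat = try_acquire(lock, "scheduler-a", now=60.0)
b_still_blocked = try_acquire(lock, "scheduler-b", now=120.0)
b_takes_over = try_acquire(lock, "scheduler-b", now=151.0)
```

In this run, `scheduler-b` is refused while `scheduler-a` keeps heartbeating, and takes over once more than 90 seconds have passed since the last heartbeat at `now=60.0`.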
Run Flows from the Catalog
Trigger any registered flow directly from its detail panel — no schedule required. The Run Flow button spawns a subprocess, tracks it in the run history, and writes execution logs to ~/.flowfile/logs/. Cancel a running flow at any time (sends SIGTERM to the process).
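Spawning a run as a subprocess and cancelling it with a termination signal can be sketched with the standard library. The `start_run` and `cancel_run` helpers are hypothetical, the child process is a stand-in sleep command rather than a real flow, and logging and run history tracking are left out.

```python
import subprocess
import sys


def start_run(args: list) -> subprocess.Popen:
    """Spawn the run as a child process and return its handle."""
    return subprocess.Popen(args)


def cancel_run(proc: subprocess.Popen, timeout: float = 5.0) -> int:
    """Ask the process to stop (SIGTERM on POSIX) and wait for it to exit."""
    proc.terminate()
    return proc.wait(timeout=timeout)


# Stand-in for a long-running flow: a child Python process that sleeps.
proc = start_run([sys.executable, "-c", "import time; time.sleep(60)"])
exit_code = cancel_run(proc)
```

On POSIX systems `Popen.terminate()` sends SIGTERM and the return code is the negated signal number; on Windows it calls `TerminateProcess` instead, so the exact exit code differs by platform.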
Table Trigger Architecture
Table triggers use a dual-path mechanism. The push path fires immediately when a Catalog Writer overwrites a table — no waiting for the next poll tick. The poll path (every ~30 seconds) acts as a safety net in case the push path fails. Double-firing is prevented by active run checks and timestamp comparison.
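The double-firing guard can be illustrated with a tiny state machine that both paths call. This is a sketch, not Flowfile's code: `TriggerState` and `maybe_fire` are hypothetical names, and it assumes each path reports the table's last-update timestamp.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class TriggerState:
    last_fired_for: Optional[datetime] = None  # table timestamp already handled
    run_active: bool = False


def maybe_fire(state: TriggerState, table_updated_at: datetime) -> bool:
    """Called by both the push path and the poll path; fires at most once per update."""
    if state.run_active:
        return False  # active run check
    if state.last_fired_for is not None and table_updated_at <= state.last_fired_for:
        return False  # timestamp comparison: this update was already handled
    state.last_fired_for = table_updated_at
    state.run_active = True
    return True


state = TriggerState()
first_update = datetime(2024, 1, 1, 9, 0)

push_fired = maybe_fire(state, first_update)    # push path, right after the overwrite
poll_during_run = maybe_fire(state, first_update)
state.run_active = False                        # the run finishes
poll_after_run = maybe_fire(state, first_update)
next_update_fired = maybe_fire(state, datetime(2024, 1, 1, 10, 0))
```

The poll path sees the same update twice here and is ignored both times, first because a run is active and then because the timestamp has already been handled.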
Run Logs
Scheduled, manual, and on-demand runs write output to log files. Click View log in a run's detail panel to see the full execution output.
Changes
Core
- Full scheduling system with interval, table trigger, and table set trigger types
- Scheduler engine with advisory lock, heartbeat, and stale-takeover logic
- `flowfile_scheduler` as a new standalone package: lightweight, no `flowfile_core` dependency
- `flowfile run flowfile_scheduler` CLI command for standalone mode
- `FLOWFILE_SCHEDULER_ENABLED` environment variable for Docker auto-start
- Run Flow from catalog (manual trigger without schedule)
- Cancel Run support — sends SIGTERM, marks run as failed
- Active runs tracking with live polling
- In-place table overwrite — Catalog Writer now preserves the table's ID and all foreign key references (schedules, read links, favorites) instead of delete-and-recreate
- Push-driven table trigger firing on `overwrite_table_data`
- Paginated run history with status counts (total, success, failed, running)
- Run types: `in_designer_run`, `scheduled`, `manual`, `on_demand`
- Run log file access from API
- `get_database_url()` centralized in shared storage config
- Shared lightweight SQLAlchemy models for cross-package database access
- Shared subprocess utilities for spawning flow runs
- Flow handler rekey for Save As operations
- Local execution mode — CLI runs skip worker offloading, write parquet directly, collect analysis data in-memory
UI
- Schedules tab in the Catalog with overview, summary cards, and schedule list
- Create Schedule modal with flow selector and type configuration
- Schedule detail panel with run history filtered by schedule
- Run overview panel with status breakdown
- Run Flow and Cancel Run buttons in flow detail panel
- Run log viewer in run detail panel
- Save As flow identity switching
- Fix background color for unpivot text in dark-mode
Infrastructure
- Docker image now bundles `flowfile_scheduler`, `flowfile`, and `flowfile_frame`
- `flowfile_scheduler` added to pyproject.toml packages and scripts
- Scheduler lock table for single-instance enforcement
Fixes
- Catalog Writer table overwrite now preserves table ID and foreign keys instead of deleting and recreating
- Local (CLI) execution no longer attempts worker offloading for record counts and analysis data
- Duration calculation handles naive vs timezone-aware datetime correctly
- Docker kernel E2E test timing — added delay before polling to avoid false-positive early completion
- Lazy module import in CLI for faster startup
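The duration fix in the list above concerns a familiar Python pitfall: subtracting a naive datetime from a timezone-aware one raises `TypeError`. A minimal sketch of one way to handle it follows. The `as_utc` and `run_duration` helpers are hypothetical, and treating naive values as UTC is an assumption, not necessarily what Flowfile does.

```python
from datetime import datetime, timedelta, timezone


def as_utc(dt: datetime) -> datetime:
    """Normalize to an aware UTC datetime."""
    if dt.tzinfo is None:
        # Assumption: naive timestamps are interpreted as UTC.
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


def run_duration(start: datetime, end: datetime) -> timedelta:
    """Compute end - start even when only one side carries a timezone."""
    return as_utc(end) - as_utc(start)


start = datetime(2024, 1, 1, 12, 0, 0)  # naive
end = datetime(2024, 1, 1, 13, 0, 30, tzinfo=timezone(timedelta(hours=1)))  # aware, UTC+1
duration = run_duration(start, end)
```

The aware end time is 12:00:30 in UTC, so the computed duration is 30 seconds.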
What's Changed
- fixing background color for unpivot text select by @Edwardvaneechoud in #363
- Bump happy-dom from 15.11.7 to 20.8.4 in /flowfile_wasm by @dependabot[bot] in #366
- Bump rollup from 4.56.0 to 4.60.0 in /flowfile_frontend by @dependabot[bot] in #367
- Feature/add scheduling service by @Edwardvaneechoud in #365
Release v0.7.3
What's Changed
- Fix/issue 356 handling excel headers by @Edwardvaneechoud in #359
Full Changelog: v0.7.2...v0.7.3
Release v0.7.2
What's Changed
- Fix routing in unified option by @Edwardvaneechoud in #357
Full Changelog: v0.7.1...v0.7.2
Release v0.7.1
- Hotfix/fix polars code parser vulnerability by @Edwardvaneechoud in #355
Full Changelog: https://github.com/Edwardvaneechoud/Flowfile/compare/v0.7.0..v0.7.1
Release v0.7.0
Flowfile v0.7.0 Release Notes
Highlights
Docker-Based Kernel Execution
Run your custom Python code in isolated Docker containers with your own packages and resource limits. Install any pip package you need — scikit-learn, transformers, custom internal libraries — and use them directly in your flow. Kernels are managed through a dedicated UI with live status monitoring, memory tracking, and auto-restart on failure.
Jupyter-Style Code Editor
Python Script nodes feature a full notebook editor with cell-by-cell execution, CodeMirror 6 with Python syntax highlighting, and flowfile API autocompletions. Variables persist across cells, outputs render inline, and the editor expands to fullscreen for focused work.
Rich Display Outputs
flowfile.display() renders matplotlib figures, plotly charts, PIL images, and HTML directly in the notebook. plt.show() is auto-captured — no explicit display call needed.
Flow Catalog
Unity Catalog-style namespace hierarchy (catalog > schema > flow) for organizing pipelines. Register flows with run history, version snapshots, and node-level results. Favorite flows, track table lineage, and open any historical version directly in the designer.
Artifact System
Publish, consume, and track Python objects across nodes within a flow. The system is DAG-aware — only upstream artifacts are visible. Artifacts persist through container restarts and can be shared globally via the catalog.
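DAG-aware visibility can be sketched as a walk over a node's ancestors: an artifact is visible only if the node that published it lies upstream. The graph representation and the `upstream_nodes` and `visible_artifacts` helpers below are hypothetical and only illustrate the idea.

```python
def upstream_nodes(node: str, parents: dict) -> set:
    """Collect every ancestor of `node` by walking parent links."""
    seen = set()
    stack = list(parents.get(node, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def visible_artifacts(node: str, parents: dict, published: dict) -> set:
    """Artifacts published by upstream nodes only, never siblings or descendants."""
    return {
        name
        for upstream in upstream_nodes(node, parents)
        for name in published.get(upstream, [])
    }


# node -> list of direct parents
parents = {"train": ["load"], "score": ["train"], "report": ["load"]}
# node -> artifacts it publishes
published = {"load": ["schema"], "train": ["model"], "score": ["predictions"]}

for_score = visible_artifacts("score", parents, published)
for_report = visible_artifacts("report", parents, published)
for_load = visible_artifacts("load", parents, published)
```

In this graph `score` sees `schema` and `model`, `report` sees only `schema` because `train` is not one of its ancestors, and `load` sees nothing.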
Named Inputs & Outputs
Python Script nodes support named connections with visual edge labels. read_input("orders") reads a specific named input, publish_output(df, name="cleaned") writes to a named output. Up to 10 inputs per node (#344).
Changes
Core
- Docker kernel system with container lifecycle management (#284)
- Kernel Manager UI with status monitoring and memory tracking (#284)
- Kernel auto-restart for stopped/errored kernels (#284)
- Kernel execution cancellation (#284)
- Flow Catalog with namespace hierarchy and flow registration (#285)
- Catalog table management and lineage tracking (#346)
- Documentation and favorite handling for catalog tables (#349)
- Catalog upload improvements (#348)
- Artifact publishing, consuming, and DAG-aware availability (#284)
- Global artifact sharing tied to catalog (#284)
- Named inputs/outputs for Python Script nodes (#344)
- File Manager for Docker mode file uploads (#326)
- Update functionality for database and cloud storage connections (#351)
- Added psycopg2-binary and pandas/sqlalchemy as production dependencies for database writes
UI
- Jupyter-style notebook editor with cell execution (#284)
- CodeMirror 6 editor with flowfile API autocompletions (#284)
- Rich display outputs — matplotlib, plotly, PIL, HTML (#284)
- Kernel execution in custom node designer (#284)
- Auto-generated node descriptions from config (#313)
- Embeddable FlowfileEditor as Vue component library (#338, #341)
- Z-index overflow fix with bounded constants (#314)
- DraggablePanel layout fix on viewport change (#305)
Infrastructure
- Fixed all ruff linting issues across the codebase (#347)
Fixes
- API calls failing when Docker deployment accessed remotely (#324)
- Parquet corruption in Docker volumes (#284)
Documentation
- Project documentation (#350)
Release feauture/kernel-implementation
Claude/embeddable flowfile wasm q fqi1 (#341)
Release v0.6.3
What's Changed
- Fix sandbox bypass and frontend/backend desync in file explorer by @Edwardvaneechoud in #280
Full Changelog: v0.6.2...v0.6.3
Release v0.6.2
Multiple hotfixes to ensure better stability
- Enforce lazy evaluation of predicted schema by @Edwardvaneechoud in #279
Full Changelog: v0.6.1...V0.6.2
Release v0.6.1
What's Changed
- Add code coverage tracking and reporting to CI pipeline by @Edwardvaneechoud in #274
- Add comprehensive test suite for core modules by @Edwardvaneechoud in #276
- Route lightweight transforms to local samping strategy by @Edwardvaneechoud in #272
Full Changelog: v0.6.0...v0.6.1