
Add kernel persistence and multi-user access control#286

Merged
Edwardvaneechoud merged 4 commits into feauture/kernel-implementation from claude/persist-kernels-database-XJeP9
Feb 1, 2026

Conversation

@Edwardvaneechoud (Owner) commented Feb 1, 2026

Summary

This PR adds database persistence for kernel configurations and implements multi-user access control for the kernel API. Kernels now survive core process restarts, and users can only access their own kernels.

Key Changes

Database & Persistence

  • Added Kernel model to store kernel configurations (id, name, packages, resource limits, user ownership)
  • Created flowfile_core/kernel/persistence.py module with CRUD operations for kernel persistence
  • Kernels are saved to the database when created and restored on startup
  • Only configuration is persisted; runtime state (container_id, port) is ephemeral and reconstructed by reclaiming running Docker containers
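A minimal sketch of the config-only persistence idea, using stdlib `sqlite3` as a stand-in for the project's SQLAlchemy layer (table and column names here are assumptions, not the actual schema):

```python
# Hypothetical sketch: persist kernel *configuration* only; runtime state
# (container_id, port) is ephemeral and never written to the table.
import json
import sqlite3

def init_db(conn: sqlite3.Connection) -> None:
    conn.execute(
        """CREATE TABLE IF NOT EXISTS kernels (
               id TEXT PRIMARY KEY,
               name TEXT NOT NULL,
               packages TEXT NOT NULL,      -- JSON-serialized list of packages
               memory_gb REAL,              -- Float resource limits
               cpu_cores REAL,
               user_id INTEGER NOT NULL     -- ownership for access control
           )"""
    )

def persist_kernel(conn, kernel_id, name, packages, memory_gb, cpu_cores, user_id):
    conn.execute(
        "INSERT OR REPLACE INTO kernels VALUES (?, ?, ?, ?, ?, ?)",
        (kernel_id, name, json.dumps(packages), memory_gb, cpu_cores, user_id),
    )

def load_kernels(conn):
    rows = conn.execute(
        "SELECT id, name, packages, memory_gb, cpu_cores, user_id FROM kernels"
    )
    return [
        {"id": r[0], "name": r[1], "packages": json.loads(r[2]),
         "memory_gb": r[3], "cpu_cores": r[4], "user_id": r[5]}
        for r in rows
    ]

conn = sqlite3.connect(":memory:")
init_db(conn)
persist_kernel(conn, "k1", "analysis", ["polars", "numpy"], 2.0, 1.5, 1)
restored = load_kernels(conn)
```

Restored records carry no port or container id; those are reconstructed (or reallocated) at start time.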

Kernel Manager Updates

  • Added _kernel_owners dict to track which user owns each kernel
  • Implemented _restore_kernels_from_db() to load persisted kernels on startup
  • Added _persist_kernel() and _remove_kernel_from_db() helper methods
  • Updated create_kernel() to accept user_id parameter and persist the kernel
  • Updated delete_kernel() to remove kernel from database
  • Added shutdown_all() method to gracefully stop all running containers during core shutdown
  • Improved _reclaim_running_containers() to handle orphan containers (stops containers with no DB record)
  • Added get_kernel_owner() method to retrieve kernel ownership
  • Updated list_kernels() to optionally filter by user_id
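The startup reconciliation in `_restore_kernels_from_db()` / `_reclaim_running_containers()` boils down to a set comparison between DB records and live containers. A pure-logic sketch (function and variable names are illustrative, not the real API):

```python
# Hypothetical sketch of startup reconciliation: DB configs vs. running
# Docker containers. Orphans (running but no DB record) get stopped.
def reconcile(db_kernel_ids: set, running_container_kernels: set):
    reclaim = db_kernel_ids & running_container_kernels   # reattach runtime state
    cold = db_kernel_ids - running_container_kernels      # restored, not running
    orphans = running_container_kernels - db_kernel_ids   # stop these containers
    return reclaim, cold, orphans

reclaim, cold, orphans = reconcile({"k1", "k2"}, {"k2", "k3"})
```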

API Authorization

  • Added authentication dependency to all kernel routes using get_current_active_user
  • Implemented ownership checks on all kernel endpoints (get, delete, start, stop, execute, artifacts, clear)
  • Returns 403 Forbidden when users attempt to access kernels they don't own
  • Updated list_kernels() endpoint to return only the current user's kernels
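The per-kernel authorization check can be sketched as below. The real code uses a FastAPI dependency (`get_current_active_user`) and raises `HTTPException`; this stand-in just returns the status codes to keep the logic visible:

```python
# Hypothetical sketch of ownership enforcement; the owner map mirrors the
# manager's _kernel_owners dict (sample data is illustrative).
_kernel_owners = {"k1": 1, "k2": 2}  # kernel_id -> owning user_id

def authorize(kernel_id: str, user_id: int) -> int:
    owner = _kernel_owners.get(kernel_id)
    if owner is None:
        return 404          # unknown kernel
    if owner != user_id:
        return 403          # Forbidden: caller does not own this kernel
    return 200

def list_kernels(user_id: int) -> list:
    # the list endpoint only ever returns the caller's kernels
    return [k for k, owner in _kernel_owners.items() if owner == user_id]
```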

Graceful Shutdown

  • Added _shutdown_kernels() function called during application shutdown
  • Ensures all running kernel containers are properly stopped when the core service terminates
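The shape of `shutdown_all()` is roughly: iterate every running kernel, stop its container, and collect per-container errors so one failure doesn't block the rest. A sketch with fake container handles standing in for docker-py objects:

```python
# Hypothetical sketch of graceful shutdown; FakeContainer stands in for a
# docker-py container handle.
class FakeContainer:
    def __init__(self, name, fail=False):
        self.name, self.fail, self.stopped = name, fail, False

    def stop(self):
        if self.fail:
            raise RuntimeError(f"cannot stop {self.name}")
        self.stopped = True

def shutdown_all(containers):
    errors = []
    for c in containers:
        try:
            c.stop()
        except Exception as exc:  # the real code narrows this to DockerException
            errors.append(str(exc))
    return errors

fleet = [FakeContainer("flowfile-kernel-a"), FakeContainer("flowfile-kernel-b", fail=True)]
errs = shutdown_all(fleet)
```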

Testing

  • Updated kernel fixtures to pass user_id=1 when creating test kernels

Implementation Details

  • Kernel packages are stored as JSON-serialized strings in the database
  • Database operations are wrapped in try-except blocks to prevent persistence failures from blocking kernel operations
  • Orphan containers (running but not in database) are automatically cleaned up during startup
  • The Float type was added to SQLAlchemy imports to support CPU and memory resource specifications
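The "persistence failures must not block kernel operations" pattern mentioned above can be sketched as a thin wrapper that logs and swallows DB errors (names are illustrative):

```python
# Hypothetical sketch: a failed DB write is logged, not raised, so kernel
# operations proceed even when persistence is unavailable.
import logging

logger = logging.getLogger("kernel.persistence")

def _persist_kernel_safe(save_fn, kernel) -> bool:
    try:
        save_fn(kernel)
        return True
    except Exception:
        logger.exception("Failed to persist kernel %s", kernel.get("id"))
        return False

def _failing_save(_kernel):
    raise OSError("db down")

ok = _persist_kernel_safe(lambda k: None, {"id": "k1"})
bad = _persist_kernel_safe(_failing_save, {"id": "k1"})
```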

Kernels are now stored in a `kernels` table (tied to user_id) so they
survive core process restarts.  On startup the KernelManager restores
persisted configs from the DB, then reclaims any running Docker
containers that match; orphan containers with no DB record are stopped.

All kernel REST routes now require authentication and enforce per-user
ownership (list returns only the caller's kernels, mutations check
ownership before proceeding).

On core shutdown (lifespan handler, SIGTERM, SIGINT) every running
kernel container is stopped and removed via `shutdown_all()`.

https://claude.ai/code/session_01PcxZsx9KTQvHLDvzgAUjzC
Resolve import conflict in routes.py: combine auth (Depends,
get_current_active_user) with logging and DockerStatus from the
target branch.

start_kernel now explicitly checks for the flowfile-kernel image
before attempting to run a container, giving a clear error message
("Docker image 'flowfile-kernel' not found. Please build or pull...")
instead of a raw Docker API exception.

Kernels restored from the database have port=None since ports are
ephemeral and not persisted. start_kernel now calls _allocate_port()
when kernel.port is None, fixing the "Invalid port: 'None'" error
that occurred when starting a kernel after a core restart.
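The lazy-allocation fix can be sketched with a stdlib port allocator (binding port 0 to let the OS pick a free port is an assumption about how `_allocate_port()` works, not a quote of the real implementation):

```python
# Hypothetical sketch: DB-restored kernels have port=None, so start_kernel
# allocates a port on demand instead of failing with "Invalid port: 'None'".
import socket

def _allocate_port() -> int:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))  # port 0: OS picks a free ephemeral port
        return s.getsockname()[1]

def ensure_port(kernel: dict) -> dict:
    if kernel.get("port") is None:  # restored from DB: port was never persisted
        kernel["port"] = _allocate_port()
    return kernel

k = ensure_port({"id": "k1", "port": None})
```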

Edwardvaneechoud merged commit 9ee26f6 into feauture/kernel-implementation on Feb 1, 2026
Edwardvaneechoud deleted the claude/persist-kernels-database-XJeP9 branch on February 1, 2026 at 09:36
Edwardvaneechoud added a commit that referenced this pull request Feb 23, 2026
* Add kernel runtime management with Docker containerization (#281)
* Add Docker-based kernel system for isolated Python code execution
Introduces two components:
- kernel_runtime/: Standalone FastAPI service that runs inside Docker
  containers, providing code execution with artifact storage and
  parquet-based data I/O via the flowfile client API
- flowfile_core/kernel/: Orchestration layer that manages kernel
  containers (create, start, stop, delete, execute) using docker-py,
  with full REST API routes integrated into the core backend
* Add python_script node type for kernel-based code execution
- PythonScriptInput/NodePythonScript schemas in input_schema.py
- add_python_script method in flow_graph.py that stages input
  parquet to shared volume, executes on kernel, reads output back
- get_kernel_manager singleton in kernel/__init__.py
- python_script node template registered in node_store
* Add integration tests for Docker-based kernel system
- kernel_fixtures.py: builds the flowfile-kernel Docker image, creates
  a KernelManager with a temp shared volume, starts a container, and
  tears everything down via a managed_kernel() context manager
- conftest.py: adds session-scoped kernel_manager fixture
- test_kernel_integration.py: full integration tests covering:
  - TestKernelRuntime: health check, stdout/stderr capture, syntax
    errors, artifact publish/list, parquet read/write round-trip,
    multiple named inputs, execution timing
  - TestPythonScriptNode: python_script node passthrough and transform
    via FlowGraph.run_graph(), plus missing kernel_id error handling
- manager.py: expose shared_volume_path as public property
- flow_graph.py: use public property instead of private attribute
* update poetry version
* Fix kernel system: singleton routing, state machine, sync execution, port resilience
- routes.py: use get_kernel_manager() singleton instead of creating a
  separate KernelManager instance (was causing dual-state bug)
- models.py: replace RUNNING with IDLE/EXECUTING states; store
  memory_gb, cpu_cores, gpu on KernelInfo from KernelConfig
- manager.py: add _reclaim_running_containers() on init to discover
  existing flowfile-kernel-* containers and reclaim their ports;
  port allocation now scans for available ports instead of incrementing;
  add execute_sync() using httpx.Client for clean sync usage;
  state transitions: IDLE -> EXECUTING -> IDLE during execute()
- flow_graph.py: use execute_sync() instead of fragile
  asyncio.run/ThreadPoolExecutor dance
- test: update state assertion from "running" to "idle"
* Fix kernel health check and test fixture resilience
- _wait_for_healthy: catch all httpx errors (including
  RemoteProtocolError) during startup polling, not just
  ConnectError/ReadError/ConnectTimeout
- conftest kernel_manager fixture: wrap managed_kernel() in
  try/except so container start failures produce pytest.skip
  instead of ERROR
* removing breakpoint
* Run kernel integration tests in parallel CI worker
- Add pytest.mark.kernel marker to test_kernel_integration.py
- Register 'kernel' marker in pyproject.toml
- Exclude kernel tests from main backend-tests with -m "not kernel"
  (both Linux and Windows jobs)
- Add dedicated kernel-tests job that runs in parallel:
  builds Docker image, runs -m kernel tests, 15min timeout
- Add kernel_runtime paths to change detection filters
- Include kernel-tests in test-summary aggregation
* Remove --timeout flag from kernel CI step (pytest-timeout not installed)
The job-level timeout-minutes: 15 already handles this.
* Add unit tests for kernel_runtime (artifact_store, flowfile_client, endpoints)
42 tests covering the three kernel_runtime modules:
- artifact_store: publish/get/list/clear, metadata, thread safety
- flowfile_client: context management, parquet I/O, artifacts
- main.py endpoints: /health, /execute, /artifacts, /clear, parquet round-trips
* Add *.egg-info/ to .gitignore
* Add kernel_runtime unit tests to CI kernel-tests job
Installs kernel_runtime with test deps and runs its 42 unit tests
before the Docker-dependent integration tests.
* Add Kernel Manager UI for Python execution environments (#282)
* Add kernel management UI for Python execution environments
Provides a visual interface for managing Docker-based kernel containers
used by Python Script nodes. Users can create kernels with custom packages
and resource limits, monitor status (stopped/starting/idle/executing/error),
and control lifecycle (start/stop/delete) with auto-polling for live updates.
* Update package-lock.json version to match package.json
* Handle Docker unavailable gracefully with 503 and error banner
The kernel routes now catch DockerException during manager init and
return a 503 with a clear message instead of crashing with a 500.
The frontend surfaces this as a red error banner at the top of the
Kernel Manager page so users know Docker needs to be running.
* Add /kernels/docker-status endpoint and proactive UI feedback
New GET /kernels/docker-status endpoint checks Docker daemon reachability
and whether the flowfile-kernel image exists. The UI calls this on page
load and shows targeted banners: red for Docker not running, yellow for
missing kernel image, so users know exactly what to fix before creating
kernels.
* Center kernel manager page with margin auto and padding
Match the layout pattern used by other views (SecretsView, DatabaseView)
with max-width 1200px, margin 0 auto, and standard spacing-5 padding.
* Add artifact context tracking for python_script nodes (#283)
* Add ArtifactContext for tracking artifact metadata across FlowGraph
Introduces an ArtifactContext class that tracks which Python artifacts
are published and consumed by python_script nodes, enabling visibility
into artifact availability based on graph topology and kernel isolation.
- Create artifacts.py with ArtifactRef, NodeArtifactState, ArtifactContext
- Integrate ArtifactContext into FlowGraph.__init__
- Add _get_upstream_node_ids and _get_required_kernel_ids helpers
- Clear artifact context at flow start in run_graph()
- Compute available artifacts before and record published after execution
- Add clear_artifacts_sync to KernelManager for non-async clearing
- Add 32 unit tests for ArtifactContext (test_artifact_context.py)
- Add 7 FlowGraph integration tests (test_flowfile.py)
- Add 5 kernel integration tests (test_kernel_integration.py)
* Add delete_artifact support, duplicate publish prevention, and model training integration test
- ArtifactStore.publish() now raises ValueError if artifact name already exists
- Added ArtifactStore.delete() and flowfile_client.delete_artifact()
- ExecuteResult/ExecuteResponse track artifacts_deleted alongside artifacts_published
- ArtifactContext.record_deleted() removes artifacts from kernel index and published lists
- flow_graph.add_python_script records deletions from execution results
- Integration test: train numpy linear regression in node A, apply predictions in node B
- Integration test: publish -> use & delete -> republish -> access flow
- Integration test: duplicate publish without delete raises error
- Unit tests for all new functionality across kernel_runtime and flowfile_core
* Support N inputs per name in kernel execution with read_first convenience method
- Change input_paths from dict[str, str] to dict[str, list[str]] across
  ExecuteRequest models (kernel_runtime and flowfile_core)
- read_input() now scans all paths for a name and concatenates them (union),
  supporting N upstream inputs under the same key (e.g. "main")
- Add read_first() convenience method that reads only input_paths[name][0]
- read_inputs() updated to handle list-based paths
- add_python_script now accepts *flowfile_tables (varargs) and writes each
  input to main_0.parquet, main_1.parquet, etc.
- All existing tests updated to use list-based input_paths format
- New tests: multi-main union, read_first, read_inputs with N paths
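The list-valued `input_paths` contract above can be sketched with plain lists standing in for parquet-backed LazyFrames (the loader dict and file names are illustrative):

```python
# Hypothetical sketch: read_input unions every path registered under a name;
# read_first reads only the first. Lists stand in for LazyFrames.
frames = {  # path -> rows, standing in for parquet files
    "main_0.parquet": [1, 2],
    "main_1.parquet": [3],
}
input_paths = {"main": ["main_0.parquet", "main_1.parquet"]}

def read_input(name: str) -> list:
    out = []
    for path in input_paths[name]:  # union semantics across N upstream inputs
        out.extend(frames[path])
    return out

def read_first(name: str) -> list:
    # convenience: only the first registered input for this name
    return frames[input_paths[name][0]]
```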
* adding multiple paths
* Fix O(N) deletion, deprecated asyncio, naive datetimes, broad exceptions, global context, and hardcoded timeout
- ArtifactContext: add _publisher_index reverse map (kernel_id, name) → node_ids
  so record_deleted and clear_kernel avoid scanning all node states
- Replace asyncio.get_event_loop() with asyncio.get_running_loop() in
  _wait_for_healthy (deprecated since Python 3.10)
- Use datetime.now(timezone.utc) in artifacts.py and models.py instead of
  naive datetime.now()
- Narrow except Exception to specific types: docker.errors.DockerException,
  httpx.HTTPError, OSError, TimeoutError in manager.py
- Add debug logging for health poll failures instead of silent pass
- Replace global _context dict with contextvars.ContextVar in flowfile_client
  for safe concurrent request handling
- Make health timeout configurable via KernelConfig.health_timeout and
  KernelInfo.health_timeout (default 120s), wired through create/start_kernel
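The global-dict-to-`ContextVar` swap described above follows the standard stdlib pattern; a minimal sketch (context keys are assumptions):

```python
# Hypothetical sketch: a ContextVar gives each request/task its own execution
# context, unlike a module-level _context dict shared by all of them.
import contextvars
from typing import Optional

_context: contextvars.ContextVar[Optional[dict]] = contextvars.ContextVar(
    "flowfile_context", default=None
)

def set_context(ctx: dict):
    return _context.set(ctx)

def get_flow_id():
    ctx = _context.get()
    return ctx["flow_id"] if ctx else None

def run_isolated(flow_id: int):
    # each copied context sees only its own value
    def body():
        set_context({"flow_id": flow_id})
        return get_flow_id()
    return contextvars.copy_context().run(body)

a, b = run_isolated(1), run_isolated(2)
```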
* fix binding to input_id
* remove breakpoint
* Preserve artifact state for cached nodes and add multi-input integration tests
Snapshot artifact context before clear_all() in run_graph() and restore
state for nodes that were cached/skipped (their _func never re-executed so
record_published was never called). Also adds two integration tests
exercising multi-input python_script nodes: one using read_input() for
union and one using read_first() for single-input access.
* Allow python_script node to accept multiple main inputs
Change the python_script NodeTemplate input from 1 to 10, matching
polars_code and union nodes. With input=1, add_node_connection always
replaced main_inputs instead of appending, so only the last connection
was retained.
* adding fix
* Scope artifact restore to graph nodes only
The snapshot/restore logic was restoring artifact state for node IDs
that were not part of the graph (e.g. manually injected via
record_published).
* Add Python Script node with kernel and artifact support (#287)
* Add PythonScript node drawer with kernel selection, code editor, and artifacts panel
Implements the frontend drawer UI for the python_script node type:
- Kernel selection dropdown with state indicators and warnings
- CodeMirror editor with Python syntax highlighting and flowfile API autocompletions
- Artifacts panel showing available/published artifacts from kernel
- Help modal documenting the flowfile.* API with examples
- TypeScript types for PythonScriptInput and NodePythonScript
- KernelApi.getArtifacts() method for fetching kernel artifact metadata
* Fix published artifacts matching by using correct field name from kernel API
The kernel's /artifacts endpoint returns `node_id` (not `source_node_id`)
to identify which node published each artifact. Updated the frontend to
read the correct field so published artifacts display properly.
* add translator
* Split artifacts into available (other nodes) vs published (this node)
Available artifacts should only show artifacts from upstream nodes, not
the current node's own publications. Filter by node_id !== currentNodeId
for available, and node_id === currentNodeId for published.
* Add kernel persistence and multi-user access control (#286)
* Persist kernel configurations in database and clean up on shutdown
Kernels are now stored in a `kernels` table (tied to user_id) so they
survive core process restarts.  On startup the KernelManager restores
persisted configs from the DB, then reclaims any running Docker
containers that match; orphan containers with no DB record are stopped.
All kernel REST routes now require authentication and enforce per-user
ownership (list returns only the caller's kernels, mutations check
ownership before proceeding).
On core shutdown (lifespan handler, SIGTERM, SIGINT) every running
kernel container is stopped and removed via `shutdown_all()`.
* Check Docker image availability before starting a kernel
start_kernel now explicitly checks for the flowfile-kernel image
before attempting to run a container, giving a clear error message
("Docker image 'flowfile-kernel' not found. Please build or pull...")
instead of a raw Docker API exception.
* Allocate port lazily in start_kernel for DB-restored kernels
Kernels restored from the database have port=None since ports are
ephemeral and not persisted. start_kernel now calls _allocate_port()
when kernel.port is None, fixing the "Invalid port: 'None'" error
that occurred when starting a kernel after a core restart.
* Add kernel runtime management with Docker containerization (#281) (#290)
* Add flowfile.log() method for real-time log streaming from kernel to frontend
Enable Python script nodes to stream log messages to the FlowFile log
viewer in real time via flowfile.log(). The kernel container makes HTTP
callbacks to the core's /raw_logs endpoint, which writes to the
FlowLogger file. The existing SSE streamer picks up new lines and
pushes them to the frontend immediately.
Changes:
- Add log(), log_info(), log_warning(), log_error() to flowfile_client
- Pass flow_id and log_callback_url through ExecuteRequest to kernel
- Add host.docker.internal mapping to kernel Docker containers
- Update RawLogInput schema to support node_id and WARNING level
- Forward captured stdout/stderr to FlowLogger after execution
* Add kernel runtime versioning visible in frontend
Add __version__ to the kernel_runtime package (0.2.0) and expose it
through the /health endpoint. The KernelManager reads the version when
the container becomes healthy and stores it on KernelInfo. The frontend
KernelCard displays the version badge next to the kernel ID so users
can verify which image version a running kernel is using.
* Implement selective artifact clearing for incremental flow execution (#291)
* Fix artifact loss in debug mode by implementing selective clearing
Previously, run_graph() cleared ALL artifacts from both the metadata
tracker and kernel memory before every run.  When a node was skipped
(up-to-date), the metadata was restored from a snapshot but the actual
Python objects in kernel memory were already gone.  Downstream nodes
that depended on those artifacts would fail with KeyError.
The fix introduces artifact ownership tracking so that only artifacts
from nodes that will actually re-execute are cleared:
- ArtifactStore: add clear_by_node_ids() and list_by_node_id()
- Kernel runtime: add POST /clear_node_artifacts and GET /artifacts/node/{id}
- KernelManager: add clear_node_artifacts_sync() and get_node_artifacts()
- ArtifactContext: add clear_nodes() for selective metadata clearing
- Kernel routes: add /clear_node_artifacts and /artifacts/node/{id} endpoints
- flow_graph.run_graph(): compute execution plan first, determine which
  python_script nodes will re-run, and only clear those nodes' artifacts.
  Skipped nodes keep their artifacts in both metadata and kernel memory.
* Add integration tests for debug mode artifact persistence
Tests verify that artifacts survive re-runs when producing nodes are
skipped (up-to-date) and only consuming nodes re-execute, covering the
core bug scenario, multiple artifacts, and producer re-run clearing.
* Auto-clear node's own artifacts before re-execution in /execute
When a node re-executes (e.g., forced refresh, performance mode re-run),
its previously published artifacts are now automatically cleared before
the new code runs. This prevents "Artifact already exists" errors without
requiring manual delete_artifact() calls in user code.
The clearing is scoped to the executing node's own artifacts only —
artifacts from other nodes are untouched.
* Scope artifacts by flow_id so multiple flows sharing a kernel are isolated
The artifact store now keys artifacts by (flow_id, name) instead of just
name. Two flows using the same kernel can each publish an artifact called
"model" without colliding. All artifact operations (publish, read, delete,
list, clear) are flow-scoped transparently via the execution context.
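Keying the store by `(flow_id, name)` is the whole trick; a minimal sketch (class and method names are stand-ins for the real ArtifactStore):

```python
# Hypothetical sketch of flow-scoped artifact keys: two flows sharing a
# kernel can each publish "model" without colliding.
class ArtifactStoreSketch:
    def __init__(self):
        self._store = {}  # (flow_id, name) -> value

    def publish(self, flow_id: int, name: str, value) -> None:
        key = (flow_id, name)
        if key in self._store:  # duplicate publish within a flow is rejected
            raise ValueError(f"Artifact already exists: {name}")
        self._store[key] = value

    def get(self, flow_id: int, name: str):
        return self._store[(flow_id, name)]

store = ArtifactStoreSketch()
store.publish(1, "model", "flow-1-model")
store.publish(2, "model", "flow-2-model")  # no collision across flows

duplicate_rejected = False
try:
    store.publish(1, "model", "again")
except ValueError:
    duplicate_rejected = True
```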
* fixing issue in index.ts
* Fix artifact not found on re-run when consumer deletes artifact (#294)
When a python_script node deletes an artifact (via delete_artifact) and
is later re-executed (e.g. after a code change), the upstream producer
node was not being re-run. This meant the deleted artifact was
permanently lost from the kernel's in-memory store, causing a KeyError
on the consumer's read_artifact call.
The fix tracks which node originally published each deleted artifact
(_deletion_origins in ArtifactContext). During the pre-execution phase
in run_graph, if a re-running node previously deleted artifacts, the
original producer nodes are added to the re-run set and their execution
state is marked stale so they actually re-execute and republish.
* Add catalog service layer with repository pattern (#298)
* Implement service layer for Flow Catalog system
Extract business logic from route handlers into a proper layered
architecture:
- catalog/exceptions.py: Domain-specific exceptions (CatalogError
  hierarchy) replacing inline HTTPException raises in service code
- catalog/repository.py: CatalogRepository Protocol + SQLAlchemy
  implementation abstracting all data access
- catalog/service.py: CatalogService class owning all business logic
  (validation, enrichment, authorization checks)
- catalog/__init__.py: Public package interface
Refactor routes/catalog.py into a thin HTTP adapter that injects
CatalogService via FastAPI Depends, delegates to service methods,
and translates domain exceptions to HTTP responses.
All 33 existing catalog API tests pass with no behavior changes.
* Address performance and observability concerns
1. Fix N+1 queries in flow listing (4×N → 3 queries):
   - Add bulk_get_favorite_flow_ids, bulk_get_follow_flow_ids,
     bulk_get_run_stats to CatalogRepository
   - Add _bulk_enrich_flows to CatalogService
   - Update list_flows, get_namespace_tree, list_favorites,
     list_following, get_catalog_stats to use bulk enrichment
2. Add tech debt comment for ArtifactStore memory pattern:
   - Document the in-memory storage limitation for large artifacts
   - Suggest future improvements (spill-to-disk, external store)
3. Promote _auto_register_flow logging from debug to info:
   - Users can now see why flows don't appear in catalog
   - Log success and specific failure reasons
4. Improve _run_and_track error handling:
   - Use ERROR level for DB persistence failures
   - Add tracking_succeeded flag with explicit failure message
   - Log successful tracking with run details
   - Add context about flow status in error messages
* Add artifact visualization with edges and node badges (#288)
* Add synchronous kernel management and auto-restart functionality (#296)
* Auto-restart stopped/errored kernels on execution instead of raising
When a kernel is in STOPPED or ERROR state and an operation (execute,
clear_artifacts, etc.) is attempted, the KernelManager now automatically
restarts the kernel container instead of raising a RuntimeError. This
handles the common case where a kernel was restored from the database
after a server restart but its container is no longer running.
Changes:
- Add start_kernel_sync() and _wait_for_healthy_sync() for sync callers
- Add _ensure_running() / _ensure_running_sync() helpers that restart
  STOPPED/ERROR kernels and wait for STARTING kernels
- Replace RuntimeError raises in execute, execute_sync, clear_artifacts,
  clear_node_artifacts, and get_node_artifacts with auto-restart calls
* adding stream logs
* adding flow logger
* Pass flow_logger through _ensure_running_sync to start_kernel_sync
- _ensure_running_sync now logs restart attempt to flow_logger
- Passes flow_logger to start_kernel_sync so users see kernel
  restart progress in the flow execution log
- Fix bug: error handler was calling logger.error instead of
  flow_logger.error
* Pass flow_logger to clear_node_artifacts_sync in flow_graph
* Add tests for kernel auto-restart on stopped/errored state
New TestKernelAutoRestart class with 4 tests:
- test_execute_sync_restarts_stopped_kernel
- test_execute_async_restarts_stopped_kernel
- test_clear_node_artifacts_restarts_stopped_kernel
- test_python_script_node_with_stopped_kernel
Each test stops the kernel, then verifies that the operation
auto-restarts it instead of raising RuntimeError.
* fix ref to python image
* adding python image
* fixing img
* Add comprehensive README for kernel_runtime (#301)
Document how to build and run the Docker image, API endpoints,
the flowfile module usage for data I/O and artifact management,
and development setup instructions.
* Fix parquet corruption race condition in kernel execution (#302)
Add fsync calls after writing parquet files to ensure they are fully
flushed to disk before being read. This prevents "File must end with
PAR1" errors that occur when the kernel reads input files or the host
reads output files before the PAR1 footer is fully written.
The issue occurs because write_parquet() may leave data in OS buffers,
and when sharing files between host and Docker container via mounted
volumes, the reader can see an incomplete file.
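The fsync fix follows the usual two-step flush: drain Python's userspace buffer, then ask the OS to push its buffers to disk before any other process reads the file. A stdlib sketch (the parquet-specific wiring is omitted):

```python
# Hypothetical sketch: flush + fsync before a host/container reader can see
# a half-written file (e.g. a parquet file missing its PAR1 footer).
import os
import tempfile

def write_fully_flushed(path: str, data: bytes) -> None:
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # drain Python's userspace buffer
        os.fsync(f.fileno())  # force OS buffers to disk

tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.close()
write_fully_flushed(tmp.name, b"PAR1...payload...PAR1")
with open(tmp.name, "rb") as f:
    content = f.read()
```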
* Add artifact persistence and recovery system to kernel runtime (#299)
* Add interactive display outputs for notebook-like cell execution (#303)
* Add display output support for rich notebook-like rendering
Backend changes:
- Add flowfile.display() function to flowfile_client.py that supports
  matplotlib figures, plotly figures, PIL images, HTML strings, and plain text
- Add DisplayOutput model to ExecuteResponse with mime_type, data, and title
- Patch matplotlib.pyplot.show() to auto-capture figures as display outputs
- Add _maybe_wrap_last_expression() for interactive mode auto-display
- Add interactive flag to ExecuteRequest for cell execution mode
- Add /execute_cell endpoint that enables interactive mode
Frontend changes:
- Add DisplayOutput and ExecuteResult interfaces to kernel.types.ts
- Add executeCell() method to KernelApi class
- Add display() completion to flowfileCompletions.ts
Tests:
- Add comprehensive tests for display(), _reset_displays(), _get_displays()
- Add tests for display output in execute endpoint
- Add tests for interactive mode auto-display behavior
* Fix review issues in display output support
- Add _displays.set([]) to _clear_context() for consistent cleanup
- Fix _is_html_string false-positives by using regex to detect actual HTML tags
- Export DisplayOutput from flowfile_core/kernel/__init__.py
- Change base64 decode to use ascii instead of utf-8
- Simplify _maybe_wrap_last_expression using ast.get_source_segment
- Add tests for HTML false-positive cases (e.g., "x < 10 and y > 5")
* Polish PythonScript.vue styling for more compact layout (#306)
- Reduce spacing: settings gap (0.65rem), block gap (0.3rem), label font-size (0.8rem)
- Make kernel selector compact with size="small" and smaller warning box
- Improve editor container with subtle inner shadow and tighter border-radius
- Make artifacts panel lighter with transparent bg and top border only
- Move help button inline with Code label to save vertical space
* Change read_inputs() to return list of LazyFrames per input (#309)
* Fix read_inputs() to return list of LazyFrames for multiple inputs
Previously, when multiple nodes were connected to a python_script node,
read_inputs() would concatenate all inputs into a single LazyFrame,
making it impossible to distinguish between different input sources.
Now read_inputs() returns dict[str, list[pl.LazyFrame]] where each
entry is a list of LazyFrames, one per connected input.
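The revised contract can be sketched as below, with plain lists standing in for polars LazyFrames (file names are illustrative):

```python
# Hypothetical sketch: read_inputs keeps one entry per connected input so
# sources stay distinguishable, instead of concatenating them.
frames = {"a.parquet": [1, 2], "b.parquet": [3, 4]}
input_paths = {"main": ["a.parquet", "b.parquet"]}

def read_inputs() -> dict:
    # dict[str, list[frame]]: one frame per connected upstream input
    return {name: [frames[p] for p in paths] for name, paths in input_paths.items()}

result = read_inputs()
```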
* Update help and autocomplete docs for read_inputs() return type
- FlowfileApiHelp.vue: Updated description and example to show
  that read_inputs() returns dict of LazyFrame lists
- flowfileCompletions.ts: Updated info and detail to reflect
  the new return type signature
* Fix tests for read_inputs() returning list of LazyFrames
Updated tests to expect dict[str, list[LazyFrame]] return type:
- test_read_inputs_returns_dict: check for list with LazyFrame element
- test_multiple_named_inputs: access inputs via [0] index
- test_read_inputs_with_multiple_main_paths: verify list length and values
- test_multiple_inputs: access inputs via [0] index in code string
* Fix test_multiple_inputs in test_kernel_integration.py
Updated code string to access read_inputs() results via [0] index
since it now returns dict[str, list[LazyFrame]].
* Add comprehensive CLAUDE.md for AI-assisted development (#311)
* Global sharing of artifacts (#304)
* Add schema_callback to python_script node to fix slow editor opening (#316)
When clicking a python_script node before running the flow, the editor
took ~17s because get_predicted_schema fell through to execution,
triggering upstream pipeline runs and kernel container startup. Add a
schema_callback that returns the input node schema as a best-effort
prediction, matching the pattern used by other node types like output.
* Link global artifacts to flow registrations (#312)
* Link global artifacts to registered catalog flows
Every global artifact now requires a source_registration_id that ties it
to a registered catalog flow. The artifact inherits the flow's
namespace_id by default, with the option to override explicitly.
Key changes:
- Add source_registration_id FK column to GlobalArtifact model
- Make source_registration_id required in PrepareUploadRequest schema
- Validate registration exists and inherit namespace_id in artifact service
- Block flow deletion when active (non-deleted) artifacts reference it
- Add FlowHasArtifactsError exception for cascade protection
- Pass source_registration_id through kernel execution context
- Hard-delete soft-deleted artifacts when their flow is deleted
- Add comprehensive tests for all new behaviors (38 tests pass)
* Fix kernel_runtime publish_global tests: add source_registration_id context
The publish_global function now requires source_registration_id in the
execution context. Add autouse fixtures to TestPublishGlobal and
TestGlobalArtifactIntegration that set up the flowfile context with
source_registration_id before each test.
* Fix kernel integration tests: pass source_registration_id in ExecuteRequest
Add test_registration fixture that creates a FlowRegistration in the DB,
and pass its ID through ExecuteRequest and _create_graph for all tests
that call publish_global. This satisfies the required source_registration_id
validation in both the kernel context and Core API.
* Add Jupyter-style notebook editor for Python Script nodes (#308)
* Add artifact management and display to catalog system (#317)
* Fix artifact availability to only show upstream DAG ancestors (#320)
The PythonScript node's artifact panel was showing all artifacts from the
kernel regardless of DAG structure. When two independent chains shared a
kernel, artifacts from one chain would incorrectly appear as "available"
in the other chain.
The fix adds a backend endpoint that exposes upstream node IDs via DAG
traversal, and updates the frontend to filter kernel artifacts using
this DAG-aware set instead of showing all non-self artifacts.
- Backend: Add GET /flow/node_upstream_ids endpoint
- Frontend: FlowApi.getNodeUpstreamIds() fetches upstream IDs
- Frontend: PythonScript.vue filters artifacts by upstream set
- Tests: Add chain isolation tests for ArtifactContext
- Tests: Add endpoint test for node_upstream_ids
* Fix cursor styling in CodeMirror editor (#321)
* Refactor path resolution and add Docker integration tests (#323)
* Offload collect() in add_python_script from core to worker (#327)
The add_python_script method was calling collect().write_parquet() directly
on the core process, which is undesirable for performance. This change
offloads the collect and parquet writing to the worker process using the
existing ExternalDfFetcher infrastructure.
Changes:
- Add write_parquet operation to worker funcs.py that deserializes a
  LazyFrame, collects it, and writes to a specified parquet path with fsync
- Add write_parquet to OperationType in both worker and core models
- Add kwargs support to ExternalDfFetcher and trigger_df_operation so
  custom parameters (like output_path) can be passed through both
  WebSocket streaming and REST fallback paths
- Update REST /submit_query/ endpoint to read kwargs from X-Kwargs header
- Replace direct collect().write_parquet() in add_python_script with
  ExternalDfFetcher using the new write_parquet operation type
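In the real worker op the payload is a deserialized LazyFrame that gets collected and written via `write_parquet`; the sketch below shows only the fsync discipline mentioned above, with a plain byte payload standing in for the Polars-specific parts.

```python
import os


def write_with_fsync(path: str, data: bytes) -> None:
    """Write then fsync so the core process never observes a partially
    flushed file when it picks up the worker's output."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # force the page cache to disk before returning
```

Without the fsync, the core could race the worker and read a truncated parquet file from the shared path.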
* Add interactive mode support to publish_global (#328)
* Fix Docker Kernel E2E test collection failure (#330)
Scope pytest collection to tests/integration/ to avoid importing
conftest.py from flowfile_core/tests, flowfile_frame/tests, and
flowfile_worker/tests, which fail with ModuleNotFoundError due to
ambiguous 'tests' package names. Also remove a stray breakpoint()
that would hang CI.
* Add kernel memory monitoring and OOM detection (#331)
* Add kernel memory usage display and OOM detection
Display live container memory usage near the kernel selector in the
Python Script node and in the Kernel Manager cards. The kernel runtime
reads cgroup v1/v2 memory stats, flowfile_core proxies them, and the
frontend polls every 3 seconds with color-coded warnings at 80% (yellow)
and 95% (red). When a container is OOM-killed during execution, the
error is detected via Docker inspect and surfaced as a clear
"Kernel ran out of memory" message instead of a generic connection error.
* Fix 500 errors on kernel memory stats endpoint
Catch httpx/OS errors in get_memory_stats and convert them to
RuntimeError so the route handler returns a proper 400 instead of
an unhandled 500. This prevents console spam when the kernel runtime
container hasn't been rebuilt with the /memory endpoint yet.
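The error-wrapping pattern is roughly the following. The PR catches httpx and OS errors; this sketch substitutes stdlib `urllib` for httpx to stay dependency-free, and the `/memory` route and function name are taken from the description above, not the actual code.

```python
import json
import urllib.error
import urllib.request


def get_memory_stats(base_url: str) -> dict:
    try:
        with urllib.request.urlopen(f"{base_url}/memory", timeout=2.0) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError) as exc:
        # Re-raise as RuntimeError so the route handler can return a clean
        # 400 instead of letting an unhandled exception surface as a 500.
        raise RuntimeError(f"kernel memory stats unavailable: {exc}") from exc
```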
* Improve error handling for missing upstream inputs in flowfile_client (#333)
* Fix source_registration_id lost during flow restore (#334)
When a flow is restored (undo/redo or loaded from YAML/JSON), the
source_registration_id was not included in the FlowfileSettings
serialization schema. This caused publish_global to fail with
"source_registration_id is required" because the kernel received
None instead of the catalog registration ID.
Changes:
- Add source_registration_id to FlowfileSettings schema
- Include source_registration_id in get_flowfile_data() serialization
- Include source_registration_id in _flowfile_data_to_flow_information()
  deserialization
- Preserve source_registration_id in restore_from_snapshot() so undo/redo
  doesn't lose it even when the snapshot predates the registration
* Add Pydantic schemas for artifact metadata responses (#335)
* Fix backward compat for source_registration_id in legacy pickle files (#337)
* Add shared_location() API for accessing shared directory files (#329)
* Display flow run outputs in Python script node editor (#340)
* Add kernel execution cancellation via SIGUSR1 signal (#339)
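One common shape for SIGUSR1-based cancellation is a handler that raises inside the executing code. The PR description doesn't show the runtime's actual handler, so this Unix-only sketch is an assumption about the general mechanism, not the implementation.

```python
import signal


class CancelledExecution(Exception):
    """Raised when the kernel receives a cancellation signal."""


def _on_sigusr1(signum, frame):
    # Raising here interrupts the user code at the next bytecode boundary.
    raise CancelledExecution("execution cancelled via SIGUSR1")


signal.signal(signal.SIGUSR1, _on_sigusr1)
```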
* Add kernel execution support to custom node designer (#342)
* Handle unsafe SQL path injection
* Fix linting issues in Vue code
* Add kernel architecture
* Add kernel docs
* Add improved limitation
* Ensure the SQL source test does not fail due to the higher execution abstraction
* Revert name change in table
* Remove breakpoints
* Remove Docker check from Electron app