
Add kernel persistence and multi-user access control#286

Merged
Edwardvaneechoud merged 4 commits into feauture/kernel-implementation from claude/persist-kernels-database-XJeP9
Feb 1, 2026

Conversation

@Edwardvaneechoud (Owner) commented Feb 1, 2026

Summary

This PR adds database persistence for kernel configurations and implements multi-user access control for the kernel API. Kernels now survive core process restarts, and users can only access their own kernels.

Key Changes

Database & Persistence

  • Added Kernel model to store kernel configurations (id, name, packages, resource limits, user ownership)
  • Created flowfile_core/kernel/persistence.py module with CRUD operations for kernel persistence
  • Kernels are saved to the database when created and restored on startup
  • Only configuration is persisted; runtime state (container_id, port) is ephemeral and reconstructed by reclaiming running Docker containers
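A minimal sketch of the config-only persistence idea, using stdlib `sqlite3` as a stand-in for the project's SQLAlchemy layer (table and column names here are assumptions, not the actual schema):

```python
# Hypothetical sketch: persist kernel *configuration* only; runtime state
# (container_id, port) is ephemeral and never written to the table.
import json
import sqlite3

def init_db(conn: sqlite3.Connection) -> None:
    conn.execute(
        """CREATE TABLE IF NOT EXISTS kernels (
               id TEXT PRIMARY KEY,
               name TEXT NOT NULL,
               packages TEXT NOT NULL,      -- JSON-serialized list of packages
               memory_gb REAL,              -- Float resource limits
               cpu_cores REAL,
               user_id INTEGER NOT NULL     -- ownership for access control
           )"""
    )

def persist_kernel(conn, kernel_id, name, packages, memory_gb, cpu_cores, user_id):
    conn.execute(
        "INSERT OR REPLACE INTO kernels VALUES (?, ?, ?, ?, ?, ?)",
        (kernel_id, name, json.dumps(packages), memory_gb, cpu_cores, user_id),
    )

def load_kernels(conn):
    rows = conn.execute(
        "SELECT id, name, packages, memory_gb, cpu_cores, user_id FROM kernels"
    )
    return [
        {"id": r[0], "name": r[1], "packages": json.loads(r[2]),
         "memory_gb": r[3], "cpu_cores": r[4], "user_id": r[5]}
        for r in rows
    ]

conn = sqlite3.connect(":memory:")
init_db(conn)
persist_kernel(conn, "k1", "analysis", ["polars", "numpy"], 2.0, 1.5, 1)
restored = load_kernels(conn)
```

Restored records carry no port or container id; those are reconstructed (or reallocated) at start time.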

Kernel Manager Updates

  • Added _kernel_owners dict to track which user owns each kernel
  • Implemented _restore_kernels_from_db() to load persisted kernels on startup
  • Added _persist_kernel() and _remove_kernel_from_db() helper methods
  • Updated create_kernel() to accept user_id parameter and persist the kernel
  • Updated delete_kernel() to remove kernel from database
  • Added shutdown_all() method to gracefully stop all running containers during core shutdown
  • Improved _reclaim_running_containers() to handle orphan containers (stops containers with no DB record)
  • Added get_kernel_owner() method to retrieve kernel ownership
  • Updated list_kernels() to optionally filter by user_id
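The startup reconciliation in `_restore_kernels_from_db()` / `_reclaim_running_containers()` boils down to a set comparison between DB records and live containers. A pure-logic sketch (function and variable names are illustrative, not the real API):

```python
# Hypothetical sketch of startup reconciliation: DB configs vs. running
# Docker containers. Orphans (running but no DB record) get stopped.
def reconcile(db_kernel_ids: set, running_container_kernels: set):
    reclaim = db_kernel_ids & running_container_kernels   # reattach runtime state
    cold = db_kernel_ids - running_container_kernels      # restored, not running
    orphans = running_container_kernels - db_kernel_ids   # stop these containers
    return reclaim, cold, orphans

reclaim, cold, orphans = reconcile({"k1", "k2"}, {"k2", "k3"})
```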

API Authorization

  • Added authentication dependency to all kernel routes using get_current_active_user
  • Implemented ownership checks on all kernel endpoints (get, delete, start, stop, execute, artifacts, clear)
  • Returns 403 Forbidden when users attempt to access kernels they don't own
  • Updated list_kernels() endpoint to return only the current user's kernels
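The per-kernel authorization check can be sketched as below. The real code uses a FastAPI dependency (`get_current_active_user`) and raises `HTTPException`; this stand-in just returns the status codes to keep the logic visible:

```python
# Hypothetical sketch of ownership enforcement; the owner map mirrors the
# manager's _kernel_owners dict (sample data is illustrative).
_kernel_owners = {"k1": 1, "k2": 2}  # kernel_id -> owning user_id

def authorize(kernel_id: str, user_id: int) -> int:
    owner = _kernel_owners.get(kernel_id)
    if owner is None:
        return 404          # unknown kernel
    if owner != user_id:
        return 403          # Forbidden: caller does not own this kernel
    return 200

def list_kernels(user_id: int) -> list:
    # the list endpoint only ever returns the caller's kernels
    return [k for k, owner in _kernel_owners.items() if owner == user_id]
```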

Graceful Shutdown

  • Added _shutdown_kernels() function called during application shutdown
  • Ensures all running kernel containers are properly stopped when the core service terminates
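The shape of `shutdown_all()` is roughly: iterate every running kernel, stop its container, and collect per-container errors so one failure doesn't block the rest. A sketch with fake container handles standing in for docker-py objects:

```python
# Hypothetical sketch of graceful shutdown; FakeContainer stands in for a
# docker-py container handle.
class FakeContainer:
    def __init__(self, name, fail=False):
        self.name, self.fail, self.stopped = name, fail, False

    def stop(self):
        if self.fail:
            raise RuntimeError(f"cannot stop {self.name}")
        self.stopped = True

def shutdown_all(containers):
    errors = []
    for c in containers:
        try:
            c.stop()
        except Exception as exc:  # the real code narrows this to DockerException
            errors.append(str(exc))
    return errors

fleet = [FakeContainer("flowfile-kernel-a"), FakeContainer("flowfile-kernel-b", fail=True)]
errs = shutdown_all(fleet)
```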

Testing

  • Updated kernel fixtures to pass user_id=1 when creating test kernels

Implementation Details

  • Kernel packages are stored as JSON-serialized strings in the database
  • Database operations are wrapped in try-except blocks to prevent persistence failures from blocking kernel operations
  • Orphan containers (running but not in database) are automatically cleaned up during startup
  • The Float type was added to SQLAlchemy imports to support CPU and memory resource specifications
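The "persistence failures must not block kernel operations" pattern mentioned above can be sketched as a thin wrapper that logs and swallows DB errors (names are illustrative):

```python
# Hypothetical sketch: a failed DB write is logged, not raised, so kernel
# operations proceed even when persistence is unavailable.
import logging

logger = logging.getLogger("kernel.persistence")

def _persist_kernel_safe(save_fn, kernel) -> bool:
    try:
        save_fn(kernel)
        return True
    except Exception:
        logger.exception("Failed to persist kernel %s", kernel.get("id"))
        return False

def _failing_save(_kernel):
    raise OSError("db down")

ok = _persist_kernel_safe(lambda k: None, {"id": "k1"})
bad = _persist_kernel_safe(_failing_save, {"id": "k1"})
```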

Kernels are now stored in a `kernels` table (tied to user_id) so they
survive core process restarts.  On startup the KernelManager restores
persisted configs from the DB, then reclaims any running Docker
containers that match; orphan containers with no DB record are stopped.

All kernel REST routes now require authentication and enforce per-user
ownership (list returns only the caller's kernels, mutations check
ownership before proceeding).

On core shutdown (lifespan handler, SIGTERM, SIGINT) every running
kernel container is stopped and removed via `shutdown_all()`.

https://claude.ai/code/session_01PcxZsx9KTQvHLDvzgAUjzC
Resolve import conflict in routes.py: combine auth (Depends,
get_current_active_user) with logging and DockerStatus from the
target branch.

start_kernel now explicitly checks for the flowfile-kernel image
before attempting to run a container, giving a clear error message
("Docker image 'flowfile-kernel' not found. Please build or pull...")
instead of a raw Docker API exception.

Kernels restored from the database have port=None since ports are
ephemeral and not persisted. start_kernel now calls _allocate_port()
when kernel.port is None, fixing the "Invalid port: 'None'" error
that occurred when starting a kernel after a core restart.
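The lazy-allocation fix can be sketched with a stdlib port allocator (binding port 0 to let the OS pick a free port is an assumption about how `_allocate_port()` works, not a quote of the real implementation):

```python
# Hypothetical sketch: DB-restored kernels have port=None, so start_kernel
# allocates a port on demand instead of failing with "Invalid port: 'None'".
import socket

def _allocate_port() -> int:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))  # port 0: OS picks a free ephemeral port
        return s.getsockname()[1]

def ensure_port(kernel: dict) -> dict:
    if kernel.get("port") is None:  # restored from DB: port was never persisted
        kernel["port"] = _allocate_port()
    return kernel

k = ensure_port({"id": "k1", "port": None})
```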

Edwardvaneechoud merged commit 9ee26f6 into feauture/kernel-implementation on Feb 1, 2026
Edwardvaneechoud deleted the claude/persist-kernels-database-XJeP9 branch on February 1, 2026 at 09:36
Edwardvaneechoud added a commit that referenced this pull request Feb 23, 2026
* Add kernel runtime management with Docker containerization (#281)
* Add Docker-based kernel system for isolated Python code execution
Introduces two components:
- kernel_runtime/: Standalone FastAPI service that runs inside Docker
  containers, providing code execution with artifact storage and
  parquet-based data I/O via the flowfile client API
- flowfile_core/kernel/: Orchestration layer that manages kernel
  containers (create, start, stop, delete, execute) using docker-py,
  with full REST API routes integrated into the core backend
* Add python_script node type for kernel-based code execution
- PythonScriptInput/NodePythonScript schemas in input_schema.py
- add_python_script method in flow_graph.py that stages input
  parquet to shared volume, executes on kernel, reads output back
- get_kernel_manager singleton in kernel/__init__.py
- python_script node template registered in node_store
* Add integration tests for Docker-based kernel system
- kernel_fixtures.py: builds the flowfile-kernel Docker image, creates
  a KernelManager with a temp shared volume, starts a container, and
  tears everything down via a managed_kernel() context manager
- conftest.py: adds session-scoped kernel_manager fixture
- test_kernel_integration.py: full integration tests covering:
  - TestKernelRuntime: health check, stdout/stderr capture, syntax
    errors, artifact publish/list, parquet read/write round-trip,
    multiple named inputs, execution timing
  - TestPythonScriptNode: python_script node passthrough and transform
    via FlowGraph.run_graph(), plus missing kernel_id error handling
- manager.py: expose shared_volume_path as public property
- flow_graph.py: use public property instead of private attribute
* update poetry version
* Fix kernel system: singleton routing, state machine, sync execution, port resilience
- routes.py: use get_kernel_manager() singleton instead of creating a
  separate KernelManager instance (was causing dual-state bug)
- models.py: replace RUNNING with IDLE/EXECUTING states; store
  memory_gb, cpu_cores, gpu on KernelInfo from KernelConfig
- manager.py: add _reclaim_running_containers() on init to discover
  existing flowfile-kernel-* containers and reclaim their ports;
  port allocation now scans for available ports instead of incrementing;
  add execute_sync() using httpx.Client for clean sync usage;
  state transitions: IDLE -> EXECUTING -> IDLE during execute()
- flow_graph.py: use execute_sync() instead of fragile
  asyncio.run/ThreadPoolExecutor dance
- test: update state assertion from "running" to "idle"
* Fix kernel health check and test fixture resilience
- _wait_for_healthy: catch all httpx errors (including
  RemoteProtocolError) during startup polling, not just
  ConnectError/ReadError/ConnectTimeout
- conftest kernel_manager fixture: wrap managed_kernel() in
  try/except so container start failures produce pytest.skip
  instead of ERROR
* removing breakpoint
* Run kernel integration tests in parallel CI worker
- Add pytest.mark.kernel marker to test_kernel_integration.py
- Register 'kernel' marker in pyproject.toml
- Exclude kernel tests from main backend-tests with -m "not kernel"
  (both Linux and Windows jobs)
- Add dedicated kernel-tests job that runs in parallel:
  builds Docker image, runs -m kernel tests, 15min timeout
- Add kernel_runtime paths to change detection filters
- Include kernel-tests in test-summary aggregation
* Remove --timeout flag from kernel CI step (pytest-timeout not installed)
The job-level timeout-minutes: 15 already handles this.
* Add unit tests for kernel_runtime (artifact_store, flowfile_client, endpoints)
42 tests covering the three kernel_runtime modules:
- artifact_store: publish/get/list/clear, metadata, thread safety
- flowfile_client: context management, parquet I/O, artifacts
- main.py endpoints: /health, /execute, /artifacts, /clear, parquet round-trips
* Add *.egg-info/ to .gitignore
* Add kernel_runtime unit tests to CI kernel-tests job
Installs kernel_runtime with test deps and runs its 42 unit tests
before the Docker-dependent integration tests.
* Add Kernel Manager UI for Python execution environments (#282)
* Add kernel management UI for Python execution environments
Provides a visual interface for managing Docker-based kernel containers
used by Python Script nodes. Users can create kernels with custom packages
and resource limits, monitor status (stopped/starting/idle/executing/error),
and control lifecycle (start/stop/delete) with auto-polling for live updates.
* Update package-lock.json version to match package.json
* Handle Docker unavailable gracefully with 503 and error banner
The kernel routes now catch DockerException during manager init and
return a 503 with a clear message instead of crashing with a 500.
The frontend surfaces this as a red error banner at the top of the
Kernel Manager page so users know Docker needs to be running.
* Add /kernels/docker-status endpoint and proactive UI feedback
New GET /kernels/docker-status endpoint checks Docker daemon reachability
and whether the flowfile-kernel image exists. The UI calls this on page
load and shows targeted banners: red for Docker not running, yellow for
missing kernel image, so users know exactly what to fix before creating
kernels.
* Center kernel manager page with margin auto and padding
Match the layout pattern used by other views (SecretsView, DatabaseView)
with max-width 1200px, margin 0 auto, and standard spacing-5 padding.
* Add artifact context tracking for python_script nodes (#283)
* Add ArtifactContext for tracking artifact metadata across FlowGraph
Introduces an ArtifactContext class that tracks which Python artifacts
are published and consumed by python_script nodes, enabling visibility
into artifact availability based on graph topology and kernel isolation.
- Create artifacts.py with ArtifactRef, NodeArtifactState, ArtifactContext
- Integrate ArtifactContext into FlowGraph.__init__
- Add _get_upstream_node_ids and _get_required_kernel_ids helpers
- Clear artifact context at flow start in run_graph()
- Compute available artifacts before and record published after execution
- Add clear_artifacts_sync to KernelManager for non-async clearing
- Add 32 unit tests for ArtifactContext (test_artifact_context.py)
- Add 7 FlowGraph integration tests (test_flowfile.py)
- Add 5 kernel integration tests (test_kernel_integration.py)
* Add delete_artifact support, duplicate publish prevention, and model training integration test
- ArtifactStore.publish() now raises ValueError if artifact name already exists
- Added ArtifactStore.delete() and flowfile_client.delete_artifact()
- ExecuteResult/ExecuteResponse track artifacts_deleted alongside artifacts_published
- ArtifactContext.record_deleted() removes artifacts from kernel index and published lists
- flow_graph.add_python_script records deletions from execution results
- Integration test: train numpy linear regression in node A, apply predictions in node B
- Integration test: publish -> use & delete -> republish -> access flow
- Integration test: duplicate publish without delete raises error
- Unit tests for all new functionality across kernel_runtime and flowfile_core
* Support N inputs per name in kernel execution with read_first convenience method
- Change input_paths from dict[str, str] to dict[str, list[str]] across
  ExecuteRequest models (kernel_runtime and flowfile_core)
- read_input() now scans all paths for a name and concatenates them (union),
  supporting N upstream inputs under the same key (e.g. "main")
- Add read_first() convenience method that reads only input_paths[name][0]
- read_inputs() updated to handle list-based paths
- add_python_script now accepts *flowfile_tables (varargs) and writes each
  input to main_0.parquet, main_1.parquet, etc.
- All existing tests updated to use list-based input_paths format
- New tests: multi-main union, read_first, read_inputs with N paths
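The list-valued `input_paths` contract above can be sketched with plain lists standing in for parquet-backed LazyFrames (the loader dict and file names are illustrative):

```python
# Hypothetical sketch: read_input unions every path registered under a name;
# read_first reads only the first. Lists stand in for LazyFrames.
frames = {  # path -> rows, standing in for parquet files
    "main_0.parquet": [1, 2],
    "main_1.parquet": [3],
}
input_paths = {"main": ["main_0.parquet", "main_1.parquet"]}

def read_input(name: str) -> list:
    out = []
    for path in input_paths[name]:  # union semantics across N upstream inputs
        out.extend(frames[path])
    return out

def read_first(name: str) -> list:
    # convenience: only the first registered input for this name
    return frames[input_paths[name][0]]
```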
* adding multiple paths
* Fix O(N) deletion, deprecated asyncio, naive datetimes, broad exceptions, global context, and hardcoded timeout
- ArtifactContext: add _publisher_index reverse map (kernel_id, name) → node_ids
  so record_deleted and clear_kernel avoid scanning all node states
- Replace asyncio.get_event_loop() with asyncio.get_running_loop() in
  _wait_for_healthy (deprecated since Python 3.10)
- Use datetime.now(timezone.utc) in artifacts.py and models.py instead of
  naive datetime.now()
- Narrow except Exception to specific types: docker.errors.DockerException,
  httpx.HTTPError, OSError, TimeoutError in manager.py
- Add debug logging for health poll failures instead of silent pass
- Replace global _context dict with contextvars.ContextVar in flowfile_client
  for safe concurrent request handling
- Make health timeout configurable via KernelConfig.health_timeout and
  KernelInfo.health_timeout (default 120s), wired through create/start_kernel
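The global-dict-to-`ContextVar` swap described above follows the standard stdlib pattern; a minimal sketch (context keys are assumptions):

```python
# Hypothetical sketch: a ContextVar gives each request/task its own execution
# context, unlike a module-level _context dict shared by all of them.
import contextvars
from typing import Optional

_context: contextvars.ContextVar[Optional[dict]] = contextvars.ContextVar(
    "flowfile_context", default=None
)

def set_context(ctx: dict):
    return _context.set(ctx)

def get_flow_id():
    ctx = _context.get()
    return ctx["flow_id"] if ctx else None

def run_isolated(flow_id: int):
    # each copied context sees only its own value
    def body():
        set_context({"flow_id": flow_id})
        return get_flow_id()
    return contextvars.copy_context().run(body)

a, b = run_isolated(1), run_isolated(2)
```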
* fix binding to input_id
* remove breakpoint
* Preserve artifact state for cached nodes and add multi-input integration tests
Snapshot artifact context before clear_all() in run_graph() and restore
state for nodes that were cached/skipped (their _func never re-executed so
record_published was never called). Also adds two integration tests
exercising multi-input python_script nodes: one using read_input() for
union and one using read_first() for single-input access.
* Allow python_script node to accept multiple main inputs
Change the python_script NodeTemplate input from 1 to 10, matching
polars_code and union nodes. With input=1, add_node_connection always
replaced main_inputs instead of appending, so only the last connection
was retained.
* adding fix
* Scope artifact restore to graph nodes only
The snapshot/restore logic was restoring artifact state for node IDs
that were not part of the graph (e.g. manually injected via
record_published).
* Add Python Script node with kernel and artifact support (#287)
* Add PythonScript node drawer with kernel selection, code editor, and artifacts panel
Implements the frontend drawer UI for the python_script node type:
- Kernel selection dropdown with state indicators and warnings
- CodeMirror editor with Python syntax highlighting and flowfile API autocompletions
- Artifacts panel showing available/published artifacts from kernel
- Help modal documenting the flowfile.* API with examples
- TypeScript types for PythonScriptInput and NodePythonScript
- KernelApi.getArtifacts() method for fetching kernel artifact metadata
* Fix published artifacts matching by using correct field name from kernel API
The kernel's /artifacts endpoint returns `node_id` (not `source_node_id`)
to identify which node published each artifact. Updated the frontend to
read the correct field so published artifacts display properly.
* add translator
* Split artifacts into available (other nodes) vs published (this node)
Available artifacts should only show artifacts from upstream nodes, not
the current node's own publications. Filter by node_id !== currentNodeId
for available, and node_id === currentNodeId for published.
* Add kernel persistence and multi-user access control (#286)
* Persist kernel configurations in database and clean up on shutdown
Kernels are now stored in a `kernels` table (tied to user_id) so they
survive core process restarts.  On startup the KernelManager restores
persisted configs from the DB, then reclaims any running Docker
containers that match; orphan containers with no DB record are stopped.
All kernel REST routes now require authentication and enforce per-user
ownership (list returns only the caller's kernels, mutations check
ownership before proceeding).
On core shutdown (lifespan handler, SIGTERM, SIGINT) every running
kernel container is stopped and removed via `shutdown_all()`.
* Check Docker image availability before starting a kernel
start_kernel now explicitly checks for the flowfile-kernel image
before attempting to run a container, giving a clear error message
("Docker image 'flowfile-kernel' not found. Please build or pull...")
instead of a raw Docker API exception.
* Allocate port lazily in start_kernel for DB-restored kernels
Kernels restored from the database have port=None since ports are
ephemeral and not persisted. start_kernel now calls _allocate_port()
when kernel.port is None, fixing the "Invalid port: 'None'" error
that occurred when starting a kernel after a core restart.
* Add kernel runtime management with Docker containerization (#281) (#290)
* Add flowfile.log() method for real-time log streaming from kernel to frontend
Enable Python script nodes to stream log messages to the FlowFile log
viewer in real time via flowfile.log(). The kernel container makes HTTP
callbacks to the core's /raw_logs endpoint, which writes to the
FlowLogger file. The existing SSE streamer picks up new lines and
pushes them to the frontend immediately.
Changes:
- Add log(), log_info(), log_warning(), log_error() to flowfile_client
- Pass flow_id and log_callback_url through ExecuteRequest to kernel
- Add host.docker.internal mapping to kernel Docker containers
- Update RawLogInput schema to support node_id and WARNING level
- Forward captured stdout/stderr to FlowLogger after execution
* Add kernel runtime versioning visible in frontend
Add __version__ to the kernel_runtime package (0.2.0) and expose it
through the /health endpoint. The KernelManager reads the version when
the container becomes healthy and stores it on KernelInfo. The frontend
KernelCard displays the version badge next to the kernel ID so users
can verify which image version a running kernel is using.
* Implement selective artifact clearing for incremental flow execution (#291)
* Fix artifact loss in debug mode by implementing selective clearing
Previously, run_graph() cleared ALL artifacts from both the metadata
tracker and kernel memory before every run.  When a node was skipped
(up-to-date), the metadata was restored from a snapshot but the actual
Python objects in kernel memory were already gone.  Downstream nodes
that depended on those artifacts would fail with KeyError.
The fix introduces artifact ownership tracking so that only artifacts
from nodes that will actually re-execute are cleared:
- ArtifactStore: add clear_by_node_ids() and list_by_node_id()
- Kernel runtime: add POST /clear_node_artifacts and GET /artifacts/node/{id}
- KernelManager: add clear_node_artifacts_sync() and get_node_artifacts()
- ArtifactContext: add clear_nodes() for selective metadata clearing
- Kernel routes: add /clear_node_artifacts and /artifacts/node/{id} endpoints
- flow_graph.run_graph(): compute execution plan first, determine which
  python_script nodes will re-run, and only clear those nodes' artifacts.
  Skipped nodes keep their artifacts in both metadata and kernel memory.
* Add integration tests for debug mode artifact persistence
Tests verify that artifacts survive re-runs when producing nodes are
skipped (up-to-date) and only consuming nodes re-execute, covering the
core bug scenario, multiple artifacts, and producer re-run clearing.
* Auto-clear node's own artifacts before re-execution in /execute
When a node re-executes (e.g., forced refresh, performance mode re-run),
its previously published artifacts are now automatically cleared before
the new code runs. This prevents "Artifact already exists" errors without
requiring manual delete_artifact() calls in user code.
The clearing is scoped to the executing node's own artifacts only —
artifacts from other nodes are untouched.
* Scope artifacts by flow_id so multiple flows sharing a kernel are isolated
The artifact store now keys artifacts by (flow_id, name) instead of just
name. Two flows using the same kernel can each publish an artifact called
"model" without colliding. All artifact operations (publish, read, delete,
list, clear) are flow-scoped transparently via the execution context.
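Keying the store by `(flow_id, name)` is the whole trick; a minimal sketch (class and method names are stand-ins for the real ArtifactStore):

```python
# Hypothetical sketch of flow-scoped artifact keys: two flows sharing a
# kernel can each publish "model" without colliding.
class ArtifactStoreSketch:
    def __init__(self):
        self._store = {}  # (flow_id, name) -> value

    def publish(self, flow_id: int, name: str, value) -> None:
        key = (flow_id, name)
        if key in self._store:  # duplicate publish within a flow is rejected
            raise ValueError(f"Artifact already exists: {name}")
        self._store[key] = value

    def get(self, flow_id: int, name: str):
        return self._store[(flow_id, name)]

store = ArtifactStoreSketch()
store.publish(1, "model", "flow-1-model")
store.publish(2, "model", "flow-2-model")  # no collision across flows

duplicate_rejected = False
try:
    store.publish(1, "model", "again")
except ValueError:
    duplicate_rejected = True
```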
* fixing issue in index.ts
* Fix artifact not found on re-run when consumer deletes artifact (#294)
When a python_script node deletes an artifact (via delete_artifact) and
is later re-executed (e.g. after a code change), the upstream producer
node was not being re-run. This meant the deleted artifact was
permanently lost from the kernel's in-memory store, causing a KeyError
on the consumer's read_artifact call.
The fix tracks which node originally published each deleted artifact
(_deletion_origins in ArtifactContext). During the pre-execution phase
in run_graph, if a re-running node previously deleted artifacts, the
original producer nodes are added to the re-run set and their execution
state is marked stale so they actually re-execute and republish.
* Add catalog service layer with repository pattern (#298)
* Implement service layer for Flow Catalog system
Extract business logic from route handlers into a proper layered
architecture:
- catalog/exceptions.py: Domain-specific exceptions (CatalogError
  hierarchy) replacing inline HTTPException raises in service code
- catalog/repository.py: CatalogRepository Protocol + SQLAlchemy
  implementation abstracting all data access
- catalog/service.py: CatalogService class owning all business logic
  (validation, enrichment, authorization checks)
- catalog/__init__.py: Public package interface
Refactor routes/catalog.py into a thin HTTP adapter that injects
CatalogService via FastAPI Depends, delegates to service methods,
and translates domain exceptions to HTTP responses.
All 33 existing catalog API tests pass with no behavior changes.
* Address performance and observability concerns
1. Fix N+1 queries in flow listing (4×N → 3 queries):
   - Add bulk_get_favorite_flow_ids, bulk_get_follow_flow_ids,
     bulk_get_run_stats to CatalogRepository
   - Add _bulk_enrich_flows to CatalogService
   - Update list_flows, get_namespace_tree, list_favorites,
     list_following, get_catalog_stats to use bulk enrichment
2. Add tech debt comment for ArtifactStore memory pattern:
   - Document the in-memory storage limitation for large artifacts
   - Suggest future improvements (spill-to-disk, external store)
3. Promote _auto_register_flow logging from debug to info:
   - Users can now see why flows don't appear in catalog
   - Log success and specific failure reasons
4. Improve _run_and_track error handling:
   - Use ERROR level for DB persistence failures
   - Add tracking_succeeded flag with explicit failure message
   - Log successful tracking with run details
   - Add context about flow status in error messages
* Add artifact visualization with edges and node badges (#288)
* Add synchronous kernel management and auto-restart functionality (#296)
* Auto-restart stopped/errored kernels on execution instead of raising
When a kernel is in STOPPED or ERROR state and an operation (execute,
clear_artifacts, etc.) is attempted, the KernelManager now automatically
restarts the kernel container instead of raising a RuntimeError. This
handles the common case where a kernel was restored from the database
after a server restart but its container is no longer running.
Changes:
- Add start_kernel_sync() and _wait_for_healthy_sync() for sync callers
- Add _ensure_running() / _ensure_running_sync() helpers that restart
  STOPPED/ERROR kernels and wait for STARTING kernels
- Replace RuntimeError raises in execute, execute_sync, clear_artifacts,
  clear_node_artifacts, and get_node_artifacts with auto-restart calls
* adding stream logs
* adding flow logger
* Pass flow_logger through _ensure_running_sync to start_kernel_sync
- _ensure_running_sync now logs restart attempt to flow_logger
- Passes flow_logger to start_kernel_sync so users see kernel
  restart progress in the flow execution log
- Fix bug: error handler was calling logger.error instead of
  flow_logger.error
* Pass flow_logger to clear_node_artifacts_sync in flow_graph
* Add tests for kernel auto-restart on stopped/errored state
New TestKernelAutoRestart class with 4 tests:
- test_execute_sync_restarts_stopped_kernel
- test_execute_async_restarts_stopped_kernel
- test_clear_node_artifacts_restarts_stopped_kernel
- test_python_script_node_with_stopped_kernel
Each test stops the kernel, then verifies that the operation
auto-restarts it instead of raising RuntimeError.
* fix ref to python image
* adding python image
* fixing img
* Add comprehensive README for kernel_runtime (#301)
Document how to build and run the Docker image, API endpoints,
the flowfile module usage for data I/O and artifact management,
and development setup instructions.
* Fix parquet corruption race condition in kernel execution (#302)
Add fsync calls after writing parquet files to ensure they are fully
flushed to disk before being read. This prevents "File must end with
PAR1" errors that occur when the kernel reads input files or the host
reads output files before the PAR1 footer is fully written.
The issue occurs because write_parquet() may leave data in OS buffers,
and when sharing files between host and Docker container via mounted
volumes, the reader can see an incomplete file.
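The fsync fix follows the usual two-step flush: drain Python's userspace buffer, then ask the OS to push its buffers to disk before any other process reads the file. A stdlib sketch (the parquet-specific wiring is omitted):

```python
# Hypothetical sketch: flush + fsync before a host/container reader can see
# a half-written file (e.g. a parquet file missing its PAR1 footer).
import os
import tempfile

def write_fully_flushed(path: str, data: bytes) -> None:
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # drain Python's userspace buffer
        os.fsync(f.fileno())  # force OS buffers to disk

tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.close()
write_fully_flushed(tmp.name, b"PAR1...payload...PAR1")
with open(tmp.name, "rb") as f:
    content = f.read()
```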
* Add artifact persistence and recovery system to kernel runtime (#299)
* Add interactive display outputs for notebook-like cell execution (#303)
* Add display output support for rich notebook-like rendering
Backend changes:
- Add flowfile.display() function to flowfile_client.py that supports
  matplotlib figures, plotly figures, PIL images, HTML strings, and plain text
- Add DisplayOutput model to ExecuteResponse with mime_type, data, and title
- Patch matplotlib.pyplot.show() to auto-capture figures as display outputs
- Add _maybe_wrap_last_expression() for interactive mode auto-display
- Add interactive flag to ExecuteRequest for cell execution mode
- Add /execute_cell endpoint that enables interactive mode
Frontend changes:
- Add DisplayOutput and ExecuteResult interfaces to kernel.types.ts
- Add executeCell() method to KernelApi class
- Add display() completion to flowfileCompletions.ts
Tests:
- Add comprehensive tests for display(), _reset_displays(), _get_displays()
- Add tests for display output in execute endpoint
- Add tests for interactive mode auto-display behavior
* Fix review issues in display output support
- Add _displays.set([]) to _clear_context() for consistent cleanup
- Fix _is_html_string false-positives by using regex to detect actual HTML tags
- Export DisplayOutput from flowfile_core/kernel/__init__.py
- Change base64 decode to use ascii instead of utf-8
- Simplify _maybe_wrap_last_expression using ast.get_source_segment
- Add tests for HTML false-positive cases (e.g., "x < 10 and y > 5")
* Polish PythonScript.vue styling for more compact layout (#306)
- Reduce spacing: settings gap (0.65rem), block gap (0.3rem), label font-size (0.8rem)
- Make kernel selector compact with size="small" and smaller warning box
- Improve editor container with subtle inner shadow and tighter border-radius
- Make artifacts panel lighter with transparent bg and top border only
- Move help button inline with Code label to save vertical space
* Change read_inputs() to return list of LazyFrames per input (#309)
* Fix read_inputs() to return list of LazyFrames for multiple inputs
Previously, when multiple nodes were connected to a python_script node,
read_inputs() would concatenate all inputs into a single LazyFrame,
making it impossible to distinguish between different input sources.
Now read_inputs() returns dict[str, list[pl.LazyFrame]] where each
entry is a list of LazyFrames, one per connected input.
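The revised contract can be sketched as below, with plain lists standing in for polars LazyFrames (file names are illustrative):

```python
# Hypothetical sketch: read_inputs keeps one entry per connected input so
# sources stay distinguishable, instead of concatenating them.
frames = {"a.parquet": [1, 2], "b.parquet": [3, 4]}
input_paths = {"main": ["a.parquet", "b.parquet"]}

def read_inputs() -> dict:
    # dict[str, list[frame]]: one frame per connected upstream input
    return {name: [frames[p] for p in paths] for name, paths in input_paths.items()}

result = read_inputs()
```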
* Update help and autocomplete docs for read_inputs() return type
- FlowfileApiHelp.vue: Updated description and example to show
  that read_inputs() returns dict of LazyFrame lists
- flowfileCompletions.ts: Updated info and detail to reflect
  the new return type signature
* Fix tests for read_inputs() returning list of LazyFrames
Updated tests to expect dict[str, list[LazyFrame]] return type:
- test_read_inputs_returns_dict: check for list with LazyFrame element
- test_multiple_named_inputs: access inputs via [0] index
- test_read_inputs_with_multiple_main_paths: verify list length and values
- test_multiple_inputs: access inputs via [0] index in code string
* Fix test_multiple_inputs in test_kernel_integration.py
Updated code string to access read_inputs() results via [0] index
since it now returns dict[str, list[LazyFrame]].
* Add comprehensive CLAUDE.md for AI-assisted development (#311)
* Global sharing of artifacts (#304)
* Add schema_callback to python_script node to fix slow editor opening (#316)
When clicking a python_script node before running the flow, the editor
took ~17s because get_predicted_schema fell through to execution,
triggering upstream pipeline runs and kernel container startup. Add a
schema_callback that returns the input node schema as a best-effort
prediction, matching the pattern used by other node types like output.
* Link global artifacts to flow registrations (#312)
* Link global artifacts to registered catalog flows
Every global artifact now requires a source_registration_id that ties it
to a registered catalog flow. The artifact inherits the flow's
namespace_id by default, with the option to override explicitly.
Key changes:
- Add source_registration_id FK column to GlobalArtifact model
- Make source_registration_id required in PrepareUploadRequest schema
- Validate registration exists and inherit namespace_id in artifact service
- Block flow deletion when active (non-deleted) artifacts reference it
- Add FlowHasArtifactsError exception for cascade protection
- Pass source_registration_id through kernel execution context
- Hard-delete soft-deleted artifacts when their flow is deleted
- Add comprehensive tests for all new behaviors (38 tests pass)
* Fix kernel_runtime publish_global tests: add source_registration_id context
The publish_global function now requires source_registration_id in the
execution context. Add autouse fixtures to TestPublishGlobal and
TestGlobalArtifactIntegration that set up the flowfile context with
source_registration_id before each test.
* Fix kernel integration tests: pass source_registration_id in ExecuteRequest
Add test_registration fixture that creates a FlowRegistration in the DB,
and pass its ID through ExecuteRequest and _create_graph for all tests
that call publish_global. This satisfies the required source_registration_id
validation in both the kernel context and Core API.
* Add Jupyter-style notebook editor for Python Script nodes (#308)
* Add artifact management and display to catalog system (#317)
* Fix artifact availability to only show upstream DAG ancestors (#320)
The PythonScript node's artifact panel was showing all artifacts from the
kernel regardless of DAG structure. When two independent chains shared a
kernel, artifacts from one chain would incorrectly appear as "available"
in the other chain.
The fix adds a backend endpoint that exposes upstream node IDs via DAG
traversal, and updates the frontend to filter kernel artifacts using
this DAG-aware set instead of showing all non-self artifacts.
- Backend: Add GET /flow/node_upstream_ids endpoint
- Frontend: FlowApi.getNodeUpstreamIds() fetches upstream IDs
- Frontend: PythonScript.vue filters artifacts by upstream set
- Tests: Add chain isolation tests for ArtifactContext
- Tests: Add endpoint test for node_upstream_ids
* Fix cursor styling in CodeMirror editor (#321)
* Refactor path resolution and add Docker integration tests (#323)
* Offload collect() in add_python_script from core to worker (#327)
The add_python_script method was calling collect().write_parquet() directly
on the core process, which is undesirable for performance. This change
offloads the collect and parquet writing to the worker process using the
existing ExternalDfFetcher infrastructure.
Changes:
- Add write_parquet operation to worker funcs.py that deserializes a
  LazyFrame, collects it, and writes to a specified parquet path with fsync
- Add write_parquet to OperationType in both worker and core models
- Add kwargs support to ExternalDfFetcher and trigger_df_operation so
  custom parameters (like output_path) can be passed through both
  WebSocket streaming and REST fallback paths
- Update REST /submit_query/ endpoint to read kwargs from X-Kwargs header
- Replace direct collect().write_parquet() in add_python_script with
  ExternalDfFetcher using the new write_parquet operation type
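In the real worker op the payload is a deserialized LazyFrame that gets collected and written via `write_parquet`; the sketch below shows only the fsync discipline mentioned above, with a plain byte payload standing in for the Polars-specific parts.

```python
import os


def write_with_fsync(path: str, data: bytes) -> None:
    """Write then fsync so the core process never observes a partially
    flushed file when it picks up the worker's output."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # force the page cache to disk before returning
```

Without the fsync, the core could race the worker and read a truncated parquet file from the shared path.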
* Add interactive mode support to publish_global (#328)
* Fix Docker Kernel E2E test collection failure (#330)
Scope pytest collection to tests/integration/ to avoid importing
conftest.py from flowfile_core/tests, flowfile_frame/tests, and
flowfile_worker/tests, which fail with ModuleNotFoundError due to
ambiguous 'tests' package names. Also remove a stray breakpoint()
that would hang CI.
* Add kernel memory monitoring and OOM detection (#331)
* Add kernel memory usage display and OOM detection
Display live container memory usage near the kernel selector in the
Python Script node and in the Kernel Manager cards. The kernel runtime
reads cgroup v1/v2 memory stats, flowfile_core proxies them, and the
frontend polls every 3 seconds with color-coded warnings at 80% (yellow)
and 95% (red). When a container is OOM-killed during execution, the
error is detected via Docker inspect and surfaced as a clear
"Kernel ran out of memory" message instead of a generic connection error.
* Fix 500 errors on kernel memory stats endpoint
Catch httpx/OS errors in get_memory_stats and convert them to
RuntimeError so the route handler returns a proper 400 instead of
an unhandled 500. This prevents console spam when the kernel runtime
container hasn't been rebuilt with the /memory endpoint yet.
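The error-wrapping pattern is roughly the following. The PR catches httpx and OS errors; this sketch substitutes stdlib `urllib` for httpx to stay dependency-free, and the `/memory` route and function name are taken from the description above, not the actual code.

```python
import json
import urllib.error
import urllib.request


def get_memory_stats(base_url: str) -> dict:
    try:
        with urllib.request.urlopen(f"{base_url}/memory", timeout=2.0) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError) as exc:
        # Re-raise as RuntimeError so the route handler can return a clean
        # 400 instead of letting an unhandled exception surface as a 500.
        raise RuntimeError(f"kernel memory stats unavailable: {exc}") from exc
```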
* Improve error handling for missing upstream inputs in flowfile_client (#333)
* Fix source_registration_id lost during flow restore (#334)
When a flow is restored (undo/redo or loaded from YAML/JSON), the
source_registration_id was not included in the FlowfileSettings
serialization schema. This caused publish_global to fail with
"source_registration_id is required" because the kernel received
None instead of the catalog registration ID.
Changes:
- Add source_registration_id to FlowfileSettings schema
- Include source_registration_id in get_flowfile_data() serialization
- Include source_registration_id in _flowfile_data_to_flow_information()
  deserialization
- Preserve source_registration_id in restore_from_snapshot() so undo/redo
  doesn't lose it even when the snapshot predates the registration
* Add Pydantic schemas for artifact metadata responses (#335)
* Fix backward compat for source_registration_id in legacy pickle files (#337)
* Add shared_location() API for accessing shared directory files (#329)
* Display flow run outputs in Python script node editor (#340)
* Add kernel execution cancellation via SIGUSR1 signal (#339)
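One common shape for SIGUSR1-based cancellation is a handler that raises inside the executing code. The PR description doesn't show the runtime's actual handler, so this Unix-only sketch is an assumption about the general mechanism, not the implementation.

```python
import signal


class CancelledExecution(Exception):
    """Raised when the kernel receives a cancellation signal."""


def _on_sigusr1(signum, frame):
    # Raising here interrupts the user code at the next bytecode boundary.
    raise CancelledExecution("execution cancelled via SIGUSR1")


signal.signal(signal.SIGUSR1, _on_sigusr1)
```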
* Add kernel execution support to custom node designer (#342)
* Handle unsafe SQL path injection
* Fix linting issues in Vue code
* Add kernel architecture
* Add kernel docs
* Add improved limitation
* Ensure the SQL source test does not fail due to the higher execution abstraction
* Revert name change in table
* Remove breakpoints
* Remove Docker check from Electron app