
feat: Allow storing snakemake metadata in files or databases#4012

Merged
tedil merged 50 commits into main from persistence-refactor on Mar 13, 2026

Conversation

@tedil
Contributor

@tedil tedil commented Mar 9, 2026

This PR allows choosing whether metadata stored in .snakemake/metadata is stored as files or in a database.
To this end, we expose `--persistence-backend [file|db]` and `--persistence-backend-db-url URL` CLI options to choose between the two.
There is NO conversion logic (between the file and db backends) implemented.
There is also NO db schema migration logic implemented yet (will leave that as future work ;)).

Resolves #2969 (well, at least I hope so)
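For illustration, a hypothetical invocation of the new options; the SQLite DSN shown is an assumed example following SQLAlchemy URL conventions, not a default documented in this PR:

```shell
# Keep the previous behavior (metadata as files under .snakemake/metadata):
snakemake --cores 4 --persistence-backend file

# Store metadata in a SQLite database instead (example URL):
snakemake --cores 4 \
    --persistence-backend db \
    --persistence-backend-db-url sqlite:///.snakemake/metadata.db
```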

Summary by CodeRabbit

  • New Features

    • CLI options to select persistence backend (file or db) and to supply a DB URL.
  • Refactor

    • Persistence redesigned into a pluggable, version-aware backend system with file- and SQLite-backed implementations, migration, locking, backup and cache support; workflow now selects backend dynamically.
  • Tests

    • Tests updated and parameterized to exercise both persistence backends.

@tedil tedil requested a review from cademirch March 9, 2026 15:16
@coderabbitai
Contributor

coderabbitai bot commented Mar 9, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Adds a pluggable, versioned persistence system: introduces PersistenceBase, FilePersistence, and DbPersistence; removes the old monolithic persistence module; wires new CLI options and WorkflowSettings to select backend and DB URL; updates workflow initialization and tests to exercise both backends.

Changes

Cohort / File(s) Summary
CLI & Settings
src/snakemake/cli.py, src/snakemake/settings/types.py
Add CLI options --persistence-backend and --persistence-backend-db-url; add persistence_backend and persistence_backend_db_url to WorkflowSettings; forward options into API/Workflow.
Persistence package (core API)
src/snakemake/persistence/__init__.py
Add PersistenceBase, MetadataRecord, ParamsChange, mixins (EnvironmentMaintenanceMixin, FileBackupMixin), IO-cache, locking, metadata accessors, change-detection, async helpers and housekeeping utilities.
File backend
src/snakemake/persistence/file.py
Add FilePersistence implementing on-disk record layout, atomic read/write/delete, v1→v2 migration, path encoding, LRU cache, incomplete tracking, and lock-file management.
Database backend
src/snakemake/persistence/db.py
Add DbPersistence using SQLAlchemy ORM (MetadataRecordORM, LockORM), SQLite pragmas, table creation, caching, upsert/write/read/delete, incomplete-key queries, and lock storage.
Removed legacy module
src/snakemake/persistence.py
Remove previous single-file Persistence implementation, its Persistence class, ParamsChange, and all accompanying helpers and public API.
Workflow integration
src/snakemake/workflow.py
Replace direct Persistence instantiation with dynamic backend selection using workflow_settings.persistence_backend; import FilePersistence/DbPersistence and adopt PersistenceBase type hints; pass DB URL when required.
Tests & Test helpers
tests/common.py, tests/test_persistence.py, tests/tests.py
Expose persistence backend options in test runner, switch tests to import specific backends, parametrize tests to exercise both file and db backends, and seed DB in tests when using SQLite backend.
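A minimal sketch of the dynamic backend selection described for src/snakemake/workflow.py. The class and settings names mirror the summary above, but the constructor signatures and the helper function are assumptions, not the actual Snakemake API:

```python
from dataclasses import dataclass
from typing import Optional


class PersistenceBase:
    """Common interface both backends implement (sketch)."""


class FilePersistence(PersistenceBase):
    pass


class DbPersistence(PersistenceBase):
    def __init__(self, db_url: Optional[str] = None):
        self.db_url = db_url


@dataclass
class WorkflowSettings:
    persistence_backend: str = "file"
    persistence_backend_db_url: Optional[str] = None


def select_persistence(settings: WorkflowSettings) -> PersistenceBase:
    # Only forward the URL when it is actually set, and reject unknown
    # backend names (as one of the review comments also suggests).
    if settings.persistence_backend == "db":
        kwargs = {}
        if settings.persistence_backend_db_url:
            kwargs["db_url"] = settings.persistence_backend_db_url
        return DbPersistence(**kwargs)
    elif settings.persistence_backend == "file":
        return FilePersistence()
    raise ValueError(f"unknown persistence backend: {settings.persistence_backend}")
```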

Sequence Diagram(s)

sequenceDiagram
    participant User as CLI User
    participant CLI as CLI Parser
    participant Settings as WorkflowSettings
    participant Workflow as Workflow
    participant Backend as Persistence Backend

    User->>CLI: snakemake --persistence-backend db --persistence-backend-db-url <url>
    CLI->>CLI: parse arguments
    CLI->>Settings: construct with persistence_backend & persistence_backend_db_url
    Settings-->>Workflow: passed into Workflow constructor
    Workflow->>Workflow: inspect settings.persistence_backend
    alt backend == "db"
        Workflow->>Backend: instantiate DbPersistence(db_url=<url>)
    else backend == "file"
        Workflow->>Backend: instantiate FilePersistence()
    end
    Backend-->>Workflow: return PersistenceBase instance
    Workflow->>Workflow: store as _persistence

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage ⚠️ Warning: docstring coverage is 12.50%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
  • Description check ❓ Inconclusive: the PR description explains the feature and links to the relevant issue, but the QC checklist for test coverage and documentation updates is incomplete. Resolution: complete the QC checklist or address these items in the description.
✅ Passed checks (3 passed)
  • Title check ✅ Passed: the title accurately describes the main feature: adding support for file-based or database-backed persistence of Snakemake metadata.
  • Linked Issues check ✅ Passed: the PR implements the core requirement from issue #2969 by introducing database-backed persistence as an alternative to file-based metadata storage, reducing IOPS overhead on HPC systems.
  • Out of Scope Changes check ✅ Passed: all changes directly support the persistence backend feature. CLI options, settings, backend implementations, and test updates are in scope; no conversion or schema migration logic is included, as explicitly noted.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



Contributor

@cademirch cademirch left a comment


This all looks reasonable to me. One comment regarding multiple inheritance; disregard if it doesn't make sense. I agree it makes sense to split out the backup and env managing, but I tend to avoid multiple inheritance if possible.

@tedil tedil changed the title feat: Introduce AbstractPersistence to allow separating persistence backends later on feat: Introduce PersistenceBase to allow separate persistence backends Mar 10, 2026
@tedil
Contributor Author

tedil commented Mar 10, 2026

I wonder if we could abstract this even further, given that the methods usually do almost exactly the same, but the db backend just queries the db first and converts to/from representations…

@tedil
Contributor Author

tedil commented Mar 10, 2026

> I wonder if we could abstract this even further, given that the methods usually do almost exactly the same, but the db backend just queries the db first and converts to/from representations…

Yup, so the "backends" basically just do their specific record reading/writing now, i.e. mostly IO-related stuff; everything else is handled in the base class.

Also added some autogenerated docstrings for some of the abstract methods to make it a bit clearer on how to implement them.

@tedil
Contributor Author

tedil commented Mar 10, 2026

Ah, and right now, I just enabled the db backend by default to see if the CI succeeds with that or not.
(there's no option yet to toggle that from the cli/config; technically, may not be needed if we make it a plugin, not sure about that, needs more discussion)

@tedil tedil changed the title feat: Introduce PersistenceBase to allow separate persistence backends feat: Add --persistence-backend [file|db] and --persistence-backend-db-url URL to allow storing snakemake metadata in files or databases Mar 10, 2026
@tedil tedil changed the title feat: Add --persistence-backend [file|db] and --persistence-backend-db-url URL to allow storing snakemake metadata in files or databases feat: Allow storing snakemake metadata in files or databases Mar 10, 2026
@tedil tedil marked this pull request as ready for review March 10, 2026 14:55
@tedil tedil requested a review from johanneskoester as a code owner March 10, 2026 14:55
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

🧹 Nitpick comments (2)
src/snakemake/persistence/file.py (2)

154-172: Redundant path computation.

path is computed twice (lines 155 and 158). The second computation is unnecessary.

♻️ Remove redundant line
     def _io_read(self, subject, id):
         path = self._record_path(subject, id)
         if not os.path.exists(path):
             return dict()
-        path = self._record_path(subject, id)
         with open(path, "r") as f:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/snakemake/persistence/file.py` around lines 154 - 172, In _io_read remove
the redundant recomputation of path: compute path once by calling
self._record_path(subject, id) at the top of the function and reuse that
variable for the os.path.exists check and the subsequent open(), i.e. delete the
second call to self._record_path(subject, id) so the function uses the single
path variable throughout.

178-185: @lru_cache on instance method can cause memory leaks.

The lru_cache decorator holds references to self, preventing garbage collection of the FilePersistence instance. Since FilePersistence is typically long-lived per workflow and _clear_cache() exists, this may be acceptable, but consider using a cache that doesn't reference self if this class might be instantiated multiple times.

♻️ Alternative using explicit cache dictionary
+    def __init__(self, ...):
+        ...
+        self._record_cache = {}
+
-    @lru_cache()
-    def _read_record_cached(self, key: str) -> Optional[MetadataRecord]:
+    def _read_record_cached(self, key: str) -> Optional[MetadataRecord]:
+        if key in self._record_cache:
+            return self._record_cache[key]
         rec = self._io_read(self._metadata_path, key)
-        return (
+        result = (
             MetadataRecord(**{k: v for k, v in rec.items() if k in VALID_METADATA_KEYS})
             if rec
             else None
         )
+        self._record_cache[key] = result
+        return result
+
+    def _clear_cache(self) -> None:
-        self._read_record_cached.cache_clear()
+        self._record_cache.clear()
         self._incomplete_cache = None
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/snakemake/persistence/file.py` around lines 178 - 185, The `@lru_cache` on
the instance method _read_record_cached retains references to self and can leak
FilePersistence instances; replace it with an instance-level cache (e.g.
self._read_record_cache dict) used inside _read_record_cached to store/return
results keyed by the method's key (and include self._metadata_path if records
vary by path), and update/extend the existing _clear_cache() to clear that dict;
locate usages in the FilePersistence class (methods _read_record_cached and
_clear_cache) and ensure thread-safety if needed (e.g., simple locking) when
accessing the cache.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/snakemake/cli.py`:
- Around line 1579-1585: The CLI help for the --persistence-backend-db-url
argument currently includes a password-bearing example DSN; update the help text
in the group_behavior.add_argument call for "--persistence-backend-db-url" to
remove the password from the example (e.g., use "postgresql://user@host/db" or
"postgresql://host/db") and add a short note directing users to provide
credentials via profiles or environment-based configuration instead of passing
secrets on the command line.

In `@src/snakemake/persistence/__init__.py`:
- Around line 173-185: The cleanup code that wraps shutil.rmtree and Path.unlink
currently raises WorkflowError without chaining, losing the original exception;
modify the exception handlers around the rmtree call (in the block handling
removal of conda env at variable address) and the yaml unlink block
(yaml_path.unlink()) to catch OSError (or Exception if broader) and re-raise
WorkflowError using exception chaining (raise WorkflowError(f"...: {e}") from e)
so the original filesystem exception and traceback are preserved.

In `@src/snakemake/persistence/db.py`:
- Around line 142-157: The code eagerly creates a stub MetadataRecordORM in
_mark_incomplete which leaves behind empty rows when metadata retention is
disabled; update _mark_incomplete to avoid creating a new MetadataRecordORM when
only marking incomplete (i.e., only session.get(MetadataRecordORM, key) and set
incomplete/external_jobid on existing record, do not instantiate a new
MetadataRecordORM), and also harden _unmark_incomplete to remove stub records
created solely for the incomplete marker (detect records with
record_format_version == 0 and no provenance/metadata fields and call
session.delete(record) instead of clearing incomplete) so
PersistenceBase.finished() / has_metadata() no longer observes dropped metadata.

In `@src/snakemake/persistence/file.py`:
- Around line 240-259: _write_locks currently overwrites
self._lockfile[lock_type] so repeated calls orphan earlier lock files; change
the storage to keep multiple files per type (e.g., make
self._lockfile[lock_type] a list) and append each new lockfile instead of
assigning, using something like self._lockfile.setdefault(lock_type,
[]).append(lockfile) in _write_locks; update _delete_locks to iterate all stored
paths for each lock_type (handle lists of paths, remove each file while keeping
the same OSError handling) and then clear the mapping.
- Around line 138-152: in _io_delete, recdirs is computed as a path relative
to subject then passed to os.removedirs which treats paths relative to cwd;
change to call os.removedirs with the actual directory path of the removed
record (use os.path.dirname(recpath) or an absolute path derived from recpath)
instead of the relative recdirs, and keep the existing check so you only call
removedirs when the directory isn't "." to avoid removing unexpected
directories.

In `@src/snakemake/workflow.py`:
- Around line 917-924: The persistence selection currently passes a None db_url
and silently accepts unknown backends; update the code in the block that checks
self.workflow_settings.persistence_backend so that if the backend == "db" you
only include "db_url" in persistence_kwargs when
self.workflow_settings.persistence_backend_db_url is truthy (do not forward
None), and if the backend value is neither "db" nor "file" raise WorkflowError;
ensure you set persistence = DbPersistence (with persistence_kwargs possibly
empty) for "db" and persistence = FilePersistence for the explicit "file"
backend.

---

Nitpick comments:
In `@src/snakemake/persistence/file.py`:
- Around line 154-172: In _io_read remove the redundant recomputation of path:
compute path once by calling self._record_path(subject, id) at the top of the
function and reuse that variable for the os.path.exists check and the subsequent
open(), i.e. delete the second call to self._record_path(subject, id) so the
function uses the single path variable throughout.
- Around line 178-185: The `@lru_cache` on the instance method _read_record_cached
retains references to self and can leak FilePersistence instances; replace it
with an instance-level cache (e.g. self._read_record_cache dict) used inside
_read_record_cached to store/return results keyed by the method's key (and
include self._metadata_path if records vary by path), and update/extend the
existing _clear_cache() to clear that dict; locate usages in the FilePersistence
class (methods _read_record_cached and _clear_cache) and ensure thread-safety if
needed (e.g., simple locking) when accessing the cache.
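The exception-chaining fix suggested above for persistence/__init__.py (`raise WorkflowError(...) from e`) preserves the original traceback; a self-contained illustration, where `WorkflowError` is a stand-in for the real Snakemake class:

```python
import shutil


class WorkflowError(Exception):
    """Stand-in for snakemake's WorkflowError."""


def remove_env(path: str) -> None:
    try:
        shutil.rmtree(path)
    except OSError as e:
        # Chain the WorkflowError so the original OSError and its
        # traceback are preserved in __cause__:
        raise WorkflowError(f"Failed to remove conda env {path}: {e}") from e


caught = None
try:
    remove_env("/definitely/not/a/real/path")
except WorkflowError as err:
    caught = err

# The original filesystem error is still attached to the new exception:
assert isinstance(caught.__cause__, OSError)
```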

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 82593f8e-10d6-443e-a3ba-1edaad4bc1e1

📥 Commits

Reviewing files that changed from the base of the PR and between b88171c and e757701.

⛔ Files ignored due to path filters (1)
  • pyproject.toml is excluded by !pyproject.toml
📒 Files selected for processing (10)
  • src/snakemake/cli.py
  • src/snakemake/persistence.py
  • src/snakemake/persistence/__init__.py
  • src/snakemake/persistence/db.py
  • src/snakemake/persistence/file.py
  • src/snakemake/settings/types.py
  • src/snakemake/workflow.py
  • tests/common.py
  • tests/test_persistence.py
  • tests/tests.py
💤 Files with no reviewable changes (1)
  • src/snakemake/persistence.py

@github-actions
Contributor

Please format your code with `pixi run format`

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (1)
src/snakemake/persistence/__init__.py (1)

483-517: Compute input checksums once per job.

These digests depend only on job.input, so the current placement repeats the same checksum I/O once for every output file.

Suggested refactor
+            checksums = {
+                str(infile): await infile.checksum(self.max_checksum_file_size)
+                for infile in job.input
+            }
+            checksums = {k: v for k, v in checksums.items() if v is not None}
+
             for f in job.output:
                 key = self._get_key(f)
                 endtime = (
                     (await f.mtime()).local_or_storage()
                     if await f.exists()
                     else fallback_time
                 )
-
-                checksums = {
-                    str(infile): await infile.checksum(self.max_checksum_file_size)
-                    for infile in job.input
-                }
-                checksums = {k: v for k, v in checksums.items() if v is not None}
 
                 record = self._read_record(key) or MetadataRecord()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/snakemake/persistence/__init__.py` around lines 483 - 517, The loop over
"for f in job.output" recomputes input file checksums for every output; move the
checksum collection out of the loop and compute it once per job by evaluating
the dict comprehension that builds "checksums" (based on job.input and
self.max_checksum_file_size) before the loop, filter None values into
"input_checksums", and then inside the loop assign record.input_checksums =
input_checksums; adjust placement so symbols _get_key, _read_record,
_write_record, MetadataRecord, RECORD_FORMAT_VERSION, and any uses of
endtime/fallback_time remain unchanged.
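The hoisting suggested here is just moving a loop-invariant computation out of the loop; a toy version with a call counter shows the effect (the names are illustrative, and the real checksum call is async I/O):

```python
calls = 0


def checksum(path: str) -> int:
    # Stand-in for the expensive checksum I/O done per input file.
    global calls
    calls += 1
    return hash(path)


inputs = ["a.txt", "b.txt"]
outputs = ["x.out", "y.out", "z.out"]

# Compute input checksums once per job, not once per output file:
input_checksums = {f: checksum(f) for f in inputs}

records = {}
for out in outputs:
    records[out] = {"input_checksums": input_checksums}

# Without hoisting this would be len(inputs) * len(outputs) calls:
assert calls == len(inputs)
```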
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/snakemake/persistence/db.py`:
- Around line 112-164: _read_record_cached is currently unbounded and not
invalidated on mutations; add a bounded cache and ensure mutations clear it.
Change `@lru_cache`(maxsize=None) on _read_record_cached to a bounded size like
`@lru_cache`(maxsize=1024) (or another appropriate maxsize for your workload), and
call self._clear_cache() at the end of each mutating method (_write_record,
_delete_record, _mark_incomplete, _unmark_incomplete) so the cached results are
invalidated after inserts/updates/deletes; keep the existing _clear_cache
implementation that calls self._read_record_cached.cache_clear().
- Around line 76-98: The DbPersistence __init__ currently ignores the provided
path when db_url is omitted; change the logic so if db_url is None (or omitted)
you construct the sqlite URL from the instance path: compute db_root = self.path
or Path.cwd(), db_file = (Path(db_root) / ".snakemake" /
"metadata.db").resolve(), then set self.engine =
create_engine(f"sqlite:///{db_file}") so the metadata DB lives under the
provided persistence root; reference DbPersistence.__init__, the db_url
parameter, and the self.engine creation when making this change.
- Around line 70-73: The current LockORM/table design is fine but _write_locks()
must not use session.merge() (an upsert) because that allows two concurrent
processes to overwrite each other's locks; change _write_locks() to attempt pure
inserts of new LockORM(...) rows (e.g., session.add_all or individual
session.add) and commit, catching sqlalchemy.exc.IntegrityError to detect
existing rows, then rollback and surface a lock-acquisition failure back to
PersistenceBase.lock(); remove/replace session.merge() calls so any duplicate
primary-key insert fails rather than upserting.

---

Nitpick comments:
In `@src/snakemake/persistence/__init__.py`:
- Around line 483-517: The loop over "for f in job.output" recomputes input file
checksums for every output; move the checksum collection out of the loop and
compute it once per job by evaluating the dict comprehension that builds
"checksums" (based on job.input and self.max_checksum_file_size) before the
loop, filter None values into "input_checksums", and then inside the loop assign
record.input_checksums = input_checksums; adjust placement so symbols _get_key,
_read_record, _write_record, MetadataRecord, RECORD_FORMAT_VERSION, and any uses
of endtime/fallback_time remain unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e5300a9b-ac8a-4fe3-84a5-0100357dda73

📥 Commits

Reviewing files that changed from the base of the PR and between e757701 and 61a2bc7.

📒 Files selected for processing (3)
  • src/snakemake/persistence/__init__.py
  • src/snakemake/persistence/db.py
  • src/snakemake/workflow.py

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (2)
src/snakemake/persistence/db.py (2)

195-199: ⚠️ Potential issue | 🔴 Critical

Use plain inserts for lock rows.

With file_path as the primary key, session.merge() turns duplicate lock acquisition into overwrite-or-insert. A concurrent process can therefore commit an already-held lock instead of surfacing contention. Use insert-only writes and fail on duplicates.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/snakemake/persistence/db.py` around lines 195 - 199, The _write_locks
method currently uses session.merge(LockORM(...)) which performs upsert
semantics and can overwrite an existing lock; change it to perform insert-only
writes so a duplicate primary-key insertion raises an IntegrityError and
surfaces contention. Replace session.merge(...) with session.add(...) /
session.add_all(...) (creating LockORM instances for each key) and let
session.commit() raise on duplicates, or explicitly catch
sqlalchemy.exc.IntegrityError around the commit in _write_locks to translate it
into a clear lock-contention error; refer to the _write_locks method, LockORM
model and Session/engine when making the change.

112-164: ⚠️ Potential issue | 🟠 Major

Bound and invalidate the metadata read cache.

_read_record_cached() can retain None or stale rows after _write_record(), _delete_record(), _mark_incomplete(), or _unmark_incomplete() because none of those paths clear it. maxsize=None also lets large workflows keep every touched record in memory.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/snakemake/persistence/db.py` around lines 112 - 164, _read_record_cached
uses lru_cache(maxsize=None) and is never invalidated, so writes/deletes/marks
leave stale or None results cached and unbounded cache can grow unbounded;
change the decorator to a bounded size (e.g., `@lru_cache`(maxsize=1024)) and call
self._clear_cache() at the end of _write_record, _delete_record (before return),
_mark_incomplete, and _unmark_incomplete to invalidate the cache after any
mutation so subsequent _read_record calls see fresh data.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/snakemake/persistence/db.py`:
- Around line 166-186: Both _filter_incomplete_keys and _get_external_jobids
build a single large IN(...) which can exceed SQLite bind limits; change each to
process keys in chunks (e.g., chunk_size = 1000) by iterating over chunks of
keys_list, issuing a separate select per chunk (same where predicates:
MetadataRecordORM.target.in_(chunk) + the existing incomplete/external_jobid
filters), collecting and unioning results into the returned set, and avoid early
returns so empty input still returns set(). Ensure you deduplicate across chunks
and use Session(self.engine) for the batched queries in the same way they are
currently opened.
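Chunking the IN(...) predicate as suggested is straightforward; a backend-agnostic sketch, where the chunk size and function names are assumptions and `query` stands in for one SELECT round-trip per chunk:

```python
from typing import Callable, Iterable, Sequence, Set


def chunked(seq: Sequence[str], size: int = 1000) -> Iterable[Sequence[str]]:
    for i in range(0, len(seq), size):
        yield seq[i:i + size]


def filter_incomplete_keys(
    keys: Sequence[str], query: Callable[[Sequence[str]], Set[str]]
) -> Set[str]:
    # Issue one query per chunk and union the results, so no single
    # statement exceeds the database's bind-parameter limit.
    found: Set[str] = set()
    for chunk in chunked(keys):
        found |= query(chunk)
    return found


# Toy "database" of incomplete targets:
incomplete = {"k5", "k1500"}
result = filter_incomplete_keys(
    [f"k{i}" for i in range(2000)],
    lambda chunk: incomplete & set(chunk),
)
assert result == incomplete
```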

---

Duplicate comments:
In `@src/snakemake/persistence/db.py`:
- Around line 195-199: The _write_locks method currently uses
session.merge(LockORM(...)) which performs upsert semantics and can overwrite an
existing lock; change it to perform insert-only writes so a duplicate
primary-key insertion raises an IntegrityError and surfaces contention. Replace
session.merge(...) with session.add(...) / session.add_all(...) (creating
LockORM instances for each key) and let session.commit() raise on duplicates, or
explicitly catch sqlalchemy.exc.IntegrityError around the commit in _write_locks
to translate it into a clear lock-contention error; refer to the _write_locks
method, LockORM model and Session/engine when making the change.
- Around line 112-164: _read_record_cached uses lru_cache(maxsize=None) and is
never invalidated, so writes/deletes/marks leave stale or None results cached
and unbounded cache can grow unbounded; change the decorator to a bounded size
(e.g., `@lru_cache`(maxsize=1024)) and call self._clear_cache() at the end of
_write_record, _delete_record (before return), _mark_incomplete, and
_unmark_incomplete to invalidate the cache after any mutation so subsequent
_read_record calls see fresh data.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8cc08e0b-c7a9-4e53-bcc8-280320cb9ea3

📥 Commits

Reviewing files that changed from the base of the PR and between 61a2bc7 and 55c0b5b.

📒 Files selected for processing (1)
  • src/snakemake/persistence/db.py

@cademirch
Contributor

One immediate request without yet having checked the code: we should include here a test case that ensures that when running with a remote executor (or when having a spawned local job with a run directive), only the main process writes to persistence.

I tried running two of the suggested tests (and just looked at os.getpid and threading.get_ident at the time when _write_record is called) and they all report the same pid and tid, so not sure that's helpful? Or I didn't run them correctly…

A quick test to just spawn a couple hundred workers with 1000 jobs each seems to be unproblematic though, sqlite doesn't run into any issues with that, though may not be the case on network filesystems, which we cannot reasonably test.

To add, in my old testing I ran into issues with subprocesses writing to persistence. It shouldn't occur now, but any workflow with run directive will use subprocesses so worth running those tests as well

@johanneskoester johanneskoester enabled auto-merge (squash) March 13, 2026 10:31
@tedil tedil disabled auto-merge March 13, 2026 10:37
@tedil
Contributor Author

tedil commented Mar 13, 2026

I'll be adding one more test, sorry ^^

@tedil
Contributor Author

tedil commented Mar 13, 2026

@cademirch added a test that uses run:, not sure it works well in CI (cores=20 will probably get scaled down, right?), but at least covers potential issues a tiny little bit.
@johanneskoester please review most recent changes (which also fixed a nasty "whoops" where I forgot to update the cli default to use the enum… ^^)

@tedil tedil merged commit dd75f31 into main Mar 13, 2026
59 checks passed
@tedil tedil deleted the persistence-refactor branch March 13, 2026 11:43
@github-project-automation github-project-automation bot moved this from In progress to Done in Snakemake Hackathon 2026 Mar 13, 2026
johanneskoester pushed a commit that referenced this pull request Mar 13, 2026
🤖 I have created a release *beep* *boop*
---


##
[9.17.0](v9.16.3...v9.17.0)
(2026-03-13)


### Features

* Allow storing snakemake metadata in files or databases
([#4012](#4012))
([dd75f31](dd75f31))
* Allow to specify comparison command per-unit test
([#3956](#3956))
([b88171c](b88171c))
* job table orderd topological when run is started
([#4018](#4018))
([75cf506](75cf506))
* lambda functions for priority in rules
([#3253](#3253))
([d2aa226](d2aa226))
* Make on... directive of modules accessible
([#4050](#4050))
([e9f2e1c](e9f2e1c))


### Bug Fixes

* adjust conda tests to not fail on apple silicon; fix
[#4040](#4040)
([#4049](#4049))
([f5b0142](f5b0142))
* allow "--containerize apptainer" to output apptainer format instead of
dockerfile ([#4030](#4030))
([f5cac30](f5cac30))
* apptainer command not recognized when singularity is absent
([#4010](#4010))
([b8162e2](b8162e2))
* capture stderr when tests fail
([#3995](#3995))
([97d74ba](97d74ba))
* **docs:** make Data-dependent conditional execution a complete example
([#4043](#4043))
([3a1d7f2](3a1d7f2))
* don't build the DAG when running unlock. Fixes
[#4000](#4000) and
[#198](#198)
([#4007](#4007))
([acf79fd](acf79fd))
* Ensure pixi tasks may be run as advertised
([#4046](#4046))
([88253c2](88253c2))
* fix checkpoint handling corner cases
([#3870](#3870) and
[#3559](#3559))
([#4015](#4015))
([63f4257](63f4257))
* issue 3642
([#4054](#4054))
([76e6fc2](76e6fc2))
* issue 3815
([#4026](#4026))
([b0eec96](b0eec96))
* logging None in shellcmd context causes error
([#4064](#4064))
([d0652cd](d0652cd))
* lookup function returns default value for empty DataFrame queries
([#4056](#4056))
([f71de97](f71de97))
* make `cache: omit-software` a rule specific property
([#4085](#4085))
([034a9e7](034a9e7))
* reduce number of tests leaving temporary files behind
([#4033](#4033))
([a3a1c97](a3a1c97))
* regression in dynamic resource handling
([#4038](#4038))
([f2c554a](f2c554a))
* somewhat shorter announce message
([#4080](#4080))
([57efc71](57efc71))


### Performance Improvements

* switch reretry with tenacity; decouple container classes (with Python
3.7 compat for old scripts) from rest of the codebase (enabling moving
to newer python versions)
([#4032](#4032))
([ffb19e7](ffb19e7))


### Documentation

* Add AI-assisted contributions policy to contributing guidelines
([#4051](#4051))
([dd70526](dd70526))
* **codebase:** Update & simplify plugin architecture section
([#4052](#4052))
([176cf63](176cf63))
* Correct workflow.source_path() description in documentation
([#4036](#4036))
([45883c5](45883c5))
* fixed wrong code example for collect() function
([#4037](#4037))
([5c85ed8](5c85ed8))
* Minor docs improvements
([#4089](#4089))
([29ea226](29ea226))
* switch to sphinx_design for tabs
([#3976](#3976))
([9674614](9674614))
* typo in the migration table breaking a pip install command
([#4024](#4024))
([66f9dda](66f9dda))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Database instead of file-based logging and book keeping? Performance issues on HPC with many Snakemake users and millions of meta files

3 participants