feat: add workflow info to log#4079
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughEmission of WORKFLOW_STARTED was moved from the API to the Workflow: Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/snakemake/workflow.py`:
- Around line 293-297: The function info_conda_env directly indexes os.environ
causing a KeyError when no conda env is active; change info_conda_env to use
os.environ.get("CONDA_DEFAULT_ENV") and os.environ.get("CONDA_PREFIX") (or check
membership) to safely build the string, treat missing/empty values as absent and
return "n/a" when both are missing or result in " ()", and ensure info_header
(where info_conda_env is used) receives a stable string instead of raising.
- Line 314: The header currently references the global `workflow.configfiles`;
change it to use the instance attribute `self.configfiles` instead (update the
code that emits "Config file(s): {workflow.configfiles}" to use
`self.configfiles`) so the message reflects the current Workflow instance rather
than a mutable global; locate the string construction in the Workflow class
(method that generates headers in src/snakemake/workflow.py) and replace the
global reference with the instance property `self.configfiles`.
- Around line 1311-1315: The logger.info call for LogEvent.WORKFLOW_STARTED uses
a misspelled key and the wrong Snakefile value; change the metadata key from
worklow_id to workflow_id and supply the workflow's main Snakefile path instead
of self.snakefile (e.g. use the attribute that holds the workflow’s main
Snakefile such as self.workflow.snakefile or self.main_snakefile) in the extra
payload for the logger.info(self.info_header, extra=...) that emits
LogEvent.WORKFLOW_STARTED.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b753adbf-312a-43af-a886-bc216f378bab
📒 Files selected for processing (3)
src/snakemake/api.pysrc/snakemake/logging.pysrc/snakemake/workflow.py
💤 Files with no reviewable changes (1)
- src/snakemake/api.py
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/snakemake/workflow.py`:
- Line 280: The code uses os.getlogin() to populate "user", which can raise
OSError in headless environments; replace it with a robust fallback such as
using getpass.getuser() (or wrap os.getlogin() in a try/except and fall back to
getpass.getuser() or os.environ.get("USER","unknown")) so the "user" field is
reliably populated; update the site that constructs the dict (the line with
"user": os.getlogin()) to use the new fallback approach.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 60e63a68-e753-4896-9c8c-3e231fa3f8c9
📒 Files selected for processing (2)
src/snakemake/logging.pysrc/snakemake/workflow.py
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
src/snakemake/workflow.py (2)
1294-1296:⚠️ Potential issue | 🟠 MajorUse the workflow’s main Snakefile for the
snakefileevent field.Line 1295 uses
self.snakefile, which resolves to the call-site Python file (workflow.py) here, not the user workflow file. This can mislead downstream log consumers.Proposed fix
- snakefile=self.snakefile, + snakefile=self.main_snakefile,🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/snakemake/workflow.py` around lines 1294 - 1296, The event is populating the snakefile field with self.snakefile (the call-site file) instead of the user's main Snakefile; update the event construction to use self.main_snakefile for the snakefile field (i.e., replace snakefile=self.snakefile with snakefile=self.main_snakefile) so the emitted event (alongside snakefile_main=self.main_snakefile and workflow_id=uuid.uuid4()) correctly reports the user workflow file.
287-288:⚠️ Potential issue | 🔴 CriticalGuard conda env lookups to prevent startup
KeyError.Line 287 and Line 288 directly index
os.environ; this crashes when Snakemake is not running inside a conda environment.Proposed fix
+ conda_default_env = os.environ.get("CONDA_DEFAULT_ENV") + conda_prefix = os.environ.get("CONDA_PREFIX") + return { @@ - "conda_env": os.environ["CONDA_DEFAULT_ENV"], - "conda_prefix": os.environ["CONDA_PREFIX"], + "conda_env": conda_default_env or "n/a", + "conda_prefix": conda_prefix or "n/a",🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/snakemake/workflow.py` around lines 287 - 288, The code currently indexes os.environ for CONDA_DEFAULT_ENV and CONDA_PREFIX (setting "conda_env" and "conda_prefix") which raises KeyError outside a conda environment; change those lookups to safe retrieval (e.g., os.environ.get or os.getenv) and/or default to None/empty string so startup won't crash; update the places that set "conda_env" and "conda_prefix" in the workflow creation to use the safe getter (references: "conda_env", "conda_prefix", and os.environ usage in the workflow code).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/snakemake/workflow.py`:
- Around line 283-285: Replace the shell-based call populating the
"conda_version" metadata (currently using subprocess.getoutput("conda
--version").removeprefix("conda ")) with a non-shell probe: first check for
conda presence using shutil.which("conda"), and if found call
subprocess.run(["conda", "--version"], capture_output=True, text=True,
check=False) to read stdout and strip the "conda " prefix; if not found or the
call fails, set the value to an empty string (or None) to preserve graceful
fallback. Update the code that assigns the "conda_version" key accordingly
(refer to the "conda_version" dict key and the surrounding metadata assembly) to
match the project's existing pattern used in shell.py / singularity.py / cwl.py.
---
Duplicate comments:
In `@src/snakemake/workflow.py`:
- Around line 1294-1296: The event is populating the snakefile field with
self.snakefile (the call-site file) instead of the user's main Snakefile; update
the event construction to use self.main_snakefile for the snakefile field (i.e.,
replace snakefile=self.snakefile with snakefile=self.main_snakefile) so the
emitted event (alongside snakefile_main=self.main_snakefile and
workflow_id=uuid.uuid4()) correctly reports the user workflow file.
- Around line 287-288: The code currently indexes os.environ for
CONDA_DEFAULT_ENV and CONDA_PREFIX (setting "conda_env" and "conda_prefix")
which raises KeyError outside a conda environment; change those lookups to safe
retrieval (e.g., os.environ.get or os.getenv) and/or default to None/empty
string so startup won't crash; update the places that set "conda_env" and
"conda_prefix" in the workflow creation to use the safe getter (references:
"conda_env", "conda_prefix", and os.environ usage in the workflow code).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8718fba6-2b81-4160-8468-25afd188889d
📒 Files selected for processing (2)
src/snakemake/logging.pysrc/snakemake/workflow.py
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/snakemake/logging.py (1)
161-183: Harden workflow-start formatting against missing keys.
"".format(**msg)will raiseKeyErrorif any field is absent, which can break log emission on schema drift. Consider merging with defaults before formatting.Suggested resilience patch
def format_workflow_started(self, msg: dict[str, Any]): """Format the workflow_started log messages.""" + payload = { + "datetime": "<unknown>", + "workflow_id": "<unknown>", + "platform": "<unknown>", + "host": "<unknown>", + "user": "<unknown>", + "snakemake_version": "<unknown>", + "python_version": "<unknown>", + "conda_version": "<unknown>", + "conda_env": "<unknown>", + "conda_prefix": "<unknown>", + "cmd": "<unknown>", + "snakefile_main": "<unknown>", + "basedir": "<unknown>", + "rundir": "<unknown>", + "cwd": "<unknown>", + "configfiles": "<unknown>", + } + payload.update(msg) return """ SNAKEMAKE ========= Date: {datetime} @@ Working directory: {cwd} Config file(s): {configfiles} -""".format(**msg) +""".format(**payload)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/snakemake/logging.py` around lines 161 - 183, format_workflow_started currently calls "".format(**msg) which raises KeyError if any expected key is missing; to fix, merge msg with a dictionary of safe default values (e.g., empty strings or placeholders) before formatting so all keys used in the template are present, then call the format on that merged mapping; update format_workflow_started to copy/merge the incoming msg and setdefaults for keys like datetime, workflow_id, platform, host, user, snakemake_version, python_version, conda_version, conda_env, conda_prefix, cmd, snakefile_main, basedir, rundir, cwd, configfiles so logging remains resilient to schema drift.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/snakemake/logging.py`:
- Around line 161-183: format_workflow_started currently calls "".format(**msg)
which raises KeyError if any expected key is missing; to fix, merge msg with a
dictionary of safe default values (e.g., empty strings or placeholders) before
formatting so all keys used in the template are present, then call the format on
that merged mapping; update format_workflow_started to copy/merge the incoming
msg and setdefaults for keys like datetime, workflow_id, platform, host, user,
snakemake_version, python_version, conda_version, conda_env, conda_prefix, cmd,
snakefile_main, basedir, rundir, cwd, configfiles so logging remains resilient
to schema drift.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: bcfeccfd-ea59-43eb-b86e-8f6931e54d94
📒 Files selected for processing (2)
src/snakemake/logging.pytests/test_logging.py
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/snakemake/workflow.py (1)
298-302:⚠️ Potential issue | 🔴 CriticalPrevent startup
KeyErrorwhen conda is installed but no env is activated.Line 298 and Line 301 directly index
os.environbased onshutil.which("conda"). That condition does not guaranteeCONDA_DEFAULT_ENV/CONDA_PREFIXexist, so startup logging can crash.Proposed fix
`@property` def info_header(self): import os import sys import shutil import getpass from datetime import datetime from snakemake.common import __version__ + conda_default_env = os.environ.get("CONDA_DEFAULT_ENV") + conda_prefix = os.environ.get("CONDA_PREFIX") + return { @@ - "conda_env": ( - os.environ["CONDA_DEFAULT_ENV"] if shutil.which("conda") else "n/a" - ), - "conda_prefix": ( - os.environ["CONDA_PREFIX"] if shutil.which("conda") else "n/a" - ), + "conda_env": conda_default_env or "n/a", + "conda_prefix": conda_prefix or "n/a",🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/snakemake/workflow.py` around lines 298 - 302, The startup logging accesses os.environ["CONDA_DEFAULT_ENV"] and os.environ["CONDA_PREFIX"] guarded only by shutil.which("conda"), which can still raise KeyError if no conda env is activated; change those lookups in the block that builds the dict (the entries labeled "CONDA_DEFAULT_ENV"/"conda_prefix") to use os.environ.get("CONDA_DEFAULT_ENV", "n/a") and os.environ.get("CONDA_PREFIX", "n/a") (or os.getenv) so they safely fall back to "n/a" even when the keys are missing.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/snakemake/workflow.py`:
- Around line 285-287: The subprocess call uses a hardcoded "conda" which can be
ambiguous and trigger security warnings; capture the resolved path from
shutil.which and pass that to subprocess.run instead. Change the code that
checks shutil.which("conda") to assign its result (e.g., via a walrus operator
into conda_path or by setting conda_path before the subprocess call) and then
call subprocess.run with [conda_path, "--version"] (reference symbols:
shutil.which, subprocess.run, conda_path) to avoid hardcoded paths and duplicate
lookups.
---
Duplicate comments:
In `@src/snakemake/workflow.py`:
- Around line 298-302: The startup logging accesses
os.environ["CONDA_DEFAULT_ENV"] and os.environ["CONDA_PREFIX"] guarded only by
shutil.which("conda"), which can still raise KeyError if no conda env is
activated; change those lookups in the block that builds the dict (the entries
labeled "CONDA_DEFAULT_ENV"/"conda_prefix") to use
os.environ.get("CONDA_DEFAULT_ENV", "n/a") and os.environ.get("CONDA_PREFIX",
"n/a") (or os.getenv) so they safely fall back to "n/a" even when the keys are
missing.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 09fac4c1-9810-474d-b77b-7bd79e75892c
📒 Files selected for processing (1)
src/snakemake/workflow.py
this is a temporary bandaid to solve snakemake#4058 since the workflow started logger call now logs the host info. in the near future i will clean this up with some more logging refinements
🤖 I have created a release *beep* *boop* --- ## [9.22.0](v9.21.1...v9.22.0) (2026-06-01) ### Features * add semantic helper functions choose_file/choose_folder/choose_tmp and allow all semantic helper functions to be used when parsing resources ([#3820](#3820)) ([d3c2386](d3c2386)) * add temp to default pathvars ([#4108](#4108)) ([917fb11](917fb11)) * add workflow info to log ([#4079](#4079)) ([e40e15b](e40e15b)) * support python 3.14 ([#3739](#3739)) ([7e3be0c](7e3be0c)) ### Bug Fixes * Adapt for dataclasses._MISSING_TYPE replaced with sentinel in Python 3.15 ([#4211](#4211)) ([b7fb8df](b7fb8df)) * ensure that storage is not actually retrieved with --touch ([#4212](#4212)) ([2726cae](2726cae)) * symlink to directory "has older modification time" than target ([#3782](#3782)) ([#3784](#3784)) ([41903d8](41903d8)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
QC
docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).Summary by CodeRabbit
New Features
Bug Fixes / Behavior