Skip to content

feat: add workflow info to log#4079

Merged
johanneskoester merged 21 commits into
snakemake:mainfrom
fgvieira:extra_info
Jun 1, 2026
Merged

feat: add workflow info to log#4079
johanneskoester merged 21 commits into
snakemake:mainfrom
fgvieira:extra_info

Conversation

@fgvieira

@fgvieira fgvieira commented Mar 12, 2026

Copy link
Copy Markdown
Contributor

QC

  • The PR contains a test case for the changes or the changes are already covered by an existing test case.
  • The documentation (docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).

Summary by CodeRabbit

  • New Features

    • Startup now logs a richer multi-field header (version, date/time, platform, user/environment, Python/conda info, command context, and config paths) and includes a unique workflow identifier.
  • Bug Fixes / Behavior

    • The workflow-start event is now emitted and shown in end-user logs (no longer suppressed), replacing the simpler host-only startup message for clearer launch visibility.

@coderabbitai

coderabbitai Bot commented Mar 12, 2026

Copy link
Copy Markdown
Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Emission of WORKFLOW_STARTED was moved from the API to the Workflow: Workflow.info_header now composes startup metadata and execute() emits a WORKFLOW_STARTED event with a generated workflow_id. Logging gained format_workflow_started and removed host-only formatting; tests adjusted expected emission count.

Changes

Cohort / File(s) Summary
Logging Infrastructure
src/snakemake/logging.py
Added DefaultFormatter.format_workflow_started and mapped LogEvent.WORKFLOW_STARTED to it; removed format_host; updated filter to allow WORKFLOW_STARTED events.
Workflow Initialization
src/snakemake/workflow.py
Added public info_header property assembling startup metadata (datetime, Snakemake version, platform, host, user, conda/Python/environment, command context, config files); execute() now generates a workflow_id (uuid4) and logs a WORKFLOW_STARTED event with info_header.
API Layer
src/snakemake/api.py
Removed emission of the WORKFLOW_STARTED logging event from execute_workflow.
Tests
tests/test_logging.py
Updated test_plugin expectations: LogEvent.WORKFLOW_STARTED count is now always 1 (removed conditional based on has_filter).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Description check ⚠️ Warning The description contains only the empty template with unchecked QC boxes; no actual explanation of changes, objectives, or implementation details are provided. Add a clear description explaining what workflow info is being logged, why it was added, and any relevant implementation details. Check or clarify the status of test coverage and documentation updates.
Docstring Coverage ⚠️ Warning Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately captures the main change: adding workflow information to logs via the new info_header property and WORKFLOW_STARTED event.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/snakemake/workflow.py`:
- Around line 293-297: The function info_conda_env directly indexes os.environ
causing a KeyError when no conda env is active; change info_conda_env to use
os.environ.get("CONDA_DEFAULT_ENV") and os.environ.get("CONDA_PREFIX") (or check
membership) to safely build the string, treat missing/empty values as absent and
return "n/a" when both are missing or result in " ()", and ensure info_header
(where info_conda_env is used) receives a stable string instead of raising.
- Line 314: The header currently references the global `workflow.configfiles`;
change it to use the instance attribute `self.configfiles` instead (update the
code that emits "Config file(s): {workflow.configfiles}" to use
`self.configfiles`) so the message reflects the current Workflow instance rather
than a mutable global; locate the string construction in the Workflow class
(method that generates headers in src/snakemake/workflow.py) and replace the
global reference with the instance property `self.configfiles`.
- Around line 1311-1315: The logger.info call for LogEvent.WORKFLOW_STARTED uses
a misspelled key and the wrong Snakefile value; change the metadata key from
worklow_id to workflow_id and supply the workflow's main Snakefile path instead
of self.snakefile (e.g. use the attribute that holds the workflow’s main
Snakefile such as self.workflow.snakefile or self.main_snakefile) in the extra
payload for the logger.info(self.info_header, extra=...) that emits
LogEvent.WORKFLOW_STARTED.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b753adbf-312a-43af-a886-bc216f378bab

📥 Commits

Reviewing files that changed from the base of the PR and between 176cf63 and a4b2bb2.

📒 Files selected for processing (3)
  • src/snakemake/api.py
  • src/snakemake/logging.py
  • src/snakemake/workflow.py
💤 Files with no reviewable changes (1)
  • src/snakemake/api.py

Comment thread src/snakemake/workflow.py Outdated
Comment thread src/snakemake/workflow.py Outdated
Comment thread src/snakemake/workflow.py Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/snakemake/workflow.py`:
- Line 280: The code uses os.getlogin() to populate "user", which can raise
OSError in headless environments; replace it with a robust fallback such as
using getpass.getuser() (or wrap os.getlogin() in a try/except and fall back to
getpass.getuser() or os.environ.get("USER","unknown")) so the "user" field is
reliably populated; update the site that constructs the dict (the line with
"user": os.getlogin()) to use the new fallback approach.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 60e63a68-e753-4896-9c8c-3e231fa3f8c9

📥 Commits

Reviewing files that changed from the base of the PR and between a4b2bb2 and 0adbd69.

📒 Files selected for processing (2)
  • src/snakemake/logging.py
  • src/snakemake/workflow.py

Comment thread src/snakemake/workflow.py Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
src/snakemake/workflow.py (2)

1294-1296: ⚠️ Potential issue | 🟠 Major

Use the workflow’s main Snakefile for the snakefile event field.

Line 1295 uses self.snakefile, which resolves to the call-site Python file (workflow.py) here, not the user workflow file. This can mislead downstream log consumers.

Proposed fix
-                snakefile=self.snakefile,
+                snakefile=self.main_snakefile,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/snakemake/workflow.py` around lines 1294 - 1296, The event is populating
the snakefile field with self.snakefile (the call-site file) instead of the
user's main Snakefile; update the event construction to use self.main_snakefile
for the snakefile field (i.e., replace snakefile=self.snakefile with
snakefile=self.main_snakefile) so the emitted event (alongside
snakefile_main=self.main_snakefile and workflow_id=uuid.uuid4()) correctly
reports the user workflow file.

287-288: ⚠️ Potential issue | 🔴 Critical

Guard conda env lookups to prevent startup KeyError.

Line 287 and Line 288 directly index os.environ; this crashes when Snakemake is not running inside a conda environment.

Proposed fix
+        conda_default_env = os.environ.get("CONDA_DEFAULT_ENV")
+        conda_prefix = os.environ.get("CONDA_PREFIX")
+
         return {
@@
-            "conda_env": os.environ["CONDA_DEFAULT_ENV"],
-            "conda_prefix": os.environ["CONDA_PREFIX"],
+            "conda_env": conda_default_env or "n/a",
+            "conda_prefix": conda_prefix or "n/a",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/snakemake/workflow.py` around lines 287 - 288, The code currently indexes
os.environ for CONDA_DEFAULT_ENV and CONDA_PREFIX (setting "conda_env" and
"conda_prefix") which raises KeyError outside a conda environment; change those
lookups to safe retrieval (e.g., os.environ.get or os.getenv) and/or default to
None/empty string so startup won't crash; update the places that set "conda_env"
and "conda_prefix" in the workflow creation to use the safe getter (references:
"conda_env", "conda_prefix", and os.environ usage in the workflow code).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/snakemake/workflow.py`:
- Around line 283-285: Replace the shell-based call populating the
"conda_version" metadata (currently using subprocess.getoutput("conda
--version").removeprefix("conda ")) with a non-shell probe: first check for
conda presence using shutil.which("conda"), and if found call
subprocess.run(["conda", "--version"], capture_output=True, text=True,
check=False) to read stdout and strip the "conda " prefix; if not found or the
call fails, set the value to an empty string (or None) to preserve graceful
fallback. Update the code that assigns the "conda_version" key accordingly
(refer to the "conda_version" dict key and the surrounding metadata assembly) to
match the project's existing pattern used in shell.py / singularity.py / cwl.py.

---

Duplicate comments:
In `@src/snakemake/workflow.py`:
- Around line 1294-1296: The event is populating the snakefile field with
self.snakefile (the call-site file) instead of the user's main Snakefile; update
the event construction to use self.main_snakefile for the snakefile field (i.e.,
replace snakefile=self.snakefile with snakefile=self.main_snakefile) so the
emitted event (alongside snakefile_main=self.main_snakefile and
workflow_id=uuid.uuid4()) correctly reports the user workflow file.
- Around line 287-288: The code currently indexes os.environ for
CONDA_DEFAULT_ENV and CONDA_PREFIX (setting "conda_env" and "conda_prefix")
which raises KeyError outside a conda environment; change those lookups to safe
retrieval (e.g., os.environ.get or os.getenv) and/or default to None/empty
string so startup won't crash; update the places that set "conda_env" and
"conda_prefix" in the workflow creation to use the safe getter (references:
"conda_env", "conda_prefix", and os.environ usage in the workflow code).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8718fba6-2b81-4160-8468-25afd188889d

📥 Commits

Reviewing files that changed from the base of the PR and between 0adbd69 and 48ced7c.

📒 Files selected for processing (2)
  • src/snakemake/logging.py
  • src/snakemake/workflow.py

Comment thread src/snakemake/workflow.py Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/snakemake/logging.py (1)

161-183: Harden workflow-start formatting against missing keys.

"".format(**msg) will raise KeyError if any field is absent, which can break log emission on schema drift. Consider merging with defaults before formatting.

Suggested resilience patch
 def format_workflow_started(self, msg: dict[str, Any]):
     """Format the workflow_started log messages."""
+    payload = {
+        "datetime": "<unknown>",
+        "workflow_id": "<unknown>",
+        "platform": "<unknown>",
+        "host": "<unknown>",
+        "user": "<unknown>",
+        "snakemake_version": "<unknown>",
+        "python_version": "<unknown>",
+        "conda_version": "<unknown>",
+        "conda_env": "<unknown>",
+        "conda_prefix": "<unknown>",
+        "cmd": "<unknown>",
+        "snakefile_main": "<unknown>",
+        "basedir": "<unknown>",
+        "rundir": "<unknown>",
+        "cwd": "<unknown>",
+        "configfiles": "<unknown>",
+    }
+    payload.update(msg)
 
     return """
 SNAKEMAKE
 =========
   Date: {datetime}
@@
   Working directory: {cwd}
   Config file(s): {configfiles}
-""".format(**msg)
+""".format(**payload)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/snakemake/logging.py` around lines 161 - 183, format_workflow_started
currently calls "".format(**msg) which raises KeyError if any expected key is
missing; to fix, merge msg with a dictionary of safe default values (e.g., empty
strings or placeholders) before formatting so all keys used in the template are
present, then call the format on that merged mapping; update
format_workflow_started to copy/merge the incoming msg and setdefaults for keys
like datetime, workflow_id, platform, host, user, snakemake_version,
python_version, conda_version, conda_env, conda_prefix, cmd, snakefile_main,
basedir, rundir, cwd, configfiles so logging remains resilient to schema drift.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/snakemake/logging.py`:
- Around line 161-183: format_workflow_started currently calls "".format(**msg)
which raises KeyError if any expected key is missing; to fix, merge msg with a
dictionary of safe default values (e.g., empty strings or placeholders) before
formatting so all keys used in the template are present, then call the format on
that merged mapping; update format_workflow_started to copy/merge the incoming
msg and setdefaults for keys like datetime, workflow_id, platform, host, user,
snakemake_version, python_version, conda_version, conda_env, conda_prefix, cmd,
snakefile_main, basedir, rundir, cwd, configfiles so logging remains resilient
to schema drift.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bcfeccfd-ea59-43eb-b86e-8f6931e54d94

📥 Commits

Reviewing files that changed from the base of the PR and between 48ced7c and 16ffecd.

📒 Files selected for processing (2)
  • src/snakemake/logging.py
  • tests/test_logging.py

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/snakemake/workflow.py (1)

298-302: ⚠️ Potential issue | 🔴 Critical

Prevent startup KeyError when conda is installed but no env is activated.

Line 298 and Line 301 directly index os.environ based on shutil.which("conda"). That condition does not guarantee CONDA_DEFAULT_ENV / CONDA_PREFIX exist, so startup logging can crash.

Proposed fix
     `@property`
     def info_header(self):
         import os
         import sys
         import shutil
         import getpass
         from datetime import datetime
         from snakemake.common import __version__
 
+        conda_default_env = os.environ.get("CONDA_DEFAULT_ENV")
+        conda_prefix = os.environ.get("CONDA_PREFIX")
+
         return {
@@
-            "conda_env": (
-                os.environ["CONDA_DEFAULT_ENV"] if shutil.which("conda") else "n/a"
-            ),
-            "conda_prefix": (
-                os.environ["CONDA_PREFIX"] if shutil.which("conda") else "n/a"
-            ),
+            "conda_env": conda_default_env or "n/a",
+            "conda_prefix": conda_prefix or "n/a",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/snakemake/workflow.py` around lines 298 - 302, The startup logging
accesses os.environ["CONDA_DEFAULT_ENV"] and os.environ["CONDA_PREFIX"] guarded
only by shutil.which("conda"), which can still raise KeyError if no conda env is
activated; change those lookups in the block that builds the dict (the entries
labeled "CONDA_DEFAULT_ENV"/"conda_prefix") to use
os.environ.get("CONDA_DEFAULT_ENV", "n/a") and os.environ.get("CONDA_PREFIX",
"n/a") (or os.getenv) so they safely fall back to "n/a" even when the keys are
missing.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/snakemake/workflow.py`:
- Around line 285-287: The subprocess call uses a hardcoded "conda" which can be
ambiguous and trigger security warnings; capture the resolved path from
shutil.which and pass that to subprocess.run instead. Change the code that
checks shutil.which("conda") to assign its result (e.g., via a walrus operator
into conda_path or by setting conda_path before the subprocess call) and then
call subprocess.run with [conda_path, "--version"] (reference symbols:
shutil.which, subprocess.run, conda_path) to avoid hardcoded paths and duplicate
lookups.

---

Duplicate comments:
In `@src/snakemake/workflow.py`:
- Around line 298-302: The startup logging accesses
os.environ["CONDA_DEFAULT_ENV"] and os.environ["CONDA_PREFIX"] guarded only by
shutil.which("conda"), which can still raise KeyError if no conda env is
activated; change those lookups in the block that builds the dict (the entries
labeled "CONDA_DEFAULT_ENV"/"conda_prefix") to use
os.environ.get("CONDA_DEFAULT_ENV", "n/a") and os.environ.get("CONDA_PREFIX",
"n/a") (or os.getenv) so they safely fall back to "n/a" even when the keys are
missing.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 09fac4c1-9810-474d-b77b-7bd79e75892c

📥 Commits

Reviewing files that changed from the base of the PR and between 16ffecd and d144702.

📒 Files selected for processing (1)
  • src/snakemake/workflow.py

Comment thread src/snakemake/workflow.py Outdated
this is a temporary bandaid to solve snakemake#4058 since the workflow started logger call now logs the host info.

in the near future i will clean this up with some more logging refinements
@fgvieira fgvieira moved this to In review in Snakemake Hackathon 2026 Mar 13, 2026
@fgvieira fgvieira self-assigned this Mar 13, 2026
@fgvieira fgvieira added the enhancement New feature or request label Mar 13, 2026
Comment thread src/snakemake/workflow.py Outdated
Comment thread src/snakemake/logging.py Outdated
Comment thread src/snakemake/logging.py
Comment thread src/snakemake/logging.py
Comment thread src/snakemake/workflow.py Outdated
@johanneskoester johanneskoester merged commit e40e15b into snakemake:main Jun 1, 2026
59 checks passed
@github-project-automation github-project-automation Bot moved this from In review to Done in Snakemake Hackathon 2026 Jun 1, 2026
@fgvieira fgvieira deleted the extra_info branch June 1, 2026 10:50
johanneskoester pushed a commit that referenced this pull request Jun 1, 2026
🤖 I have created a release *beep* *boop*
---


##
[9.22.0](v9.21.1...v9.22.0)
(2026-06-01)


### Features

* add semantic helper functions choose_file/choose_folder/choose_tmp and
allow all semantic helper functions to be used when parsing resources
([#3820](#3820))
([d3c2386](d3c2386))
* add temp to default pathvars
([#4108](#4108))
([917fb11](917fb11))
* add workflow info to log
([#4079](#4079))
([e40e15b](e40e15b))
* support python 3.14
([#3739](#3739))
([7e3be0c](7e3be0c))


### Bug Fixes

* Adapt for dataclasses._MISSING_TYPE replaced with sentinel in Python
3.15 ([#4211](#4211))
([b7fb8df](b7fb8df))
* ensure that storage is not actually retrieved with --touch
([#4212](#4212))
([2726cae](2726cae))
* symlink to directory "has older modification time" than target
([#3782](#3782))
([#3784](#3784))
([41903d8](41903d8))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

3 participants