File Not Found Error when using a protected output that contains at least one file inside a subdirectory #2130
Description
Snakemake version
7.23.1
Describe the bug
When an output is declared as both protected() and directory(), and that directory contains files inside sub-directories, the Snakemake workflow fails at the "Write-protecting output" step with a FileNotFoundError. See below for a minimal example.
Minimal example
rule make_final_output:
    output: "Data/final.output"
    input: "Data/Protected-dir"
    shell: "touch {output} "

rule make_protected_dir:
    output: protected(directory("Data/Protected-dir"))
    input: "Data/original.input"
    shell:
        "mkdir -p Data/Protected-dir/Protected-subdir; "
        "touch Data/Protected-dir/Protected-subdir/intermediate.file "

Logs
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job stats:
job                   count    min threads    max threads
------------------  -------  -------------  -------------
make_final_output         1              1              1
make_protected_dir        1              1              1
total                     2              1              1
Select jobs to execute...
[Wed Feb 22 09:39:37 2023]
rule make_protected_dir:
input: Data/original.input
output: Data/Protected-dir
jobid: 1
reason: Missing output files: Data/Protected-dir; Code has changed since last execution
resources: tmpdir=/var/folders/0m/8qndwm2d7mn2vkldklvbsk040000gq/T
Write-protecting output file Data/Protected-dir.
Traceback (most recent call last):
  File "/lib/python3.11/site-packages/snakemake/__init__.py", line 760, in snakemake
    success = workflow.execute(
              ^^^^^^^^^^^^^^^^^
  File "/lib/python3.11/site-packages/snakemake/workflow.py", line 1095, in execute
    raise e
  File "/lib/python3.11/site-packages/snakemake/workflow.py", line 1091, in execute
    success = self.scheduler.schedule()
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/lib/python3.11/site-packages/snakemake/scheduler.py", line 502, in schedule
    self._finish_jobs()
  File "/lib/python3.11/site-packages/snakemake/scheduler.py", line 606, in _finish_jobs
    self.get_executor(job).handle_job_success(job)
  File "/lib/python3.11/site-packages/snakemake/executors/__init__.py", line 682, in handle_job_success
    super().handle_job_success(job)
  File "/lib/python3.11/site-packages/snakemake/executors/__init__.py", line 270, in handle_job_success
    job.postprocess(
  File "/lib/python3.11/site-packages/snakemake/jobs.py", line 1158, in postprocess
    self.dag.handle_protected(self)
  File "/lib/python3.11/site-packages/snakemake/dag.py", line 676, in handle_protected
    f.protect()
  File "/lib/python3.11/site-packages/snakemake/io.py", line 679, in protect
    lchmod(os.path.join(self.file, f), mode)
  File "/lib/python3.11/site-packages/snakemake/io.py", line 98, in lchmod
    os.chmod(f, mode, follow_symlinks=False)
FileNotFoundError: [Errno 2] No such file or directory: 'Data/Protected-dir/intermediate.file'

Additional context
As you can see, Snakemake fails because it tries to write-protect the file Data/Protected-dir/intermediate.file, which does not exist since it should be Data/Protected-dir/Protected-subdir/intermediate.file.
The code that should be modified is located at https://github.com/snakemake/snakemake/blob/main/snakemake/io.py#L670-L680 . Basically,
for f in files:
    lchmod(os.path.join(self.file, f), mode)
should read instead
for f in files:
    lchmod(os.path.join(root, f), mode)
See os.walk() documentation, especially:
To get a full path (which begins with top) to a file or directory in dirpath, do os.path.join(dirpath, name).
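To illustrate the difference outside of Snakemake, here is a minimal standalone sketch. It is not the Snakemake code itself: the helper name walk_file_paths and its join_with_root flag are made up for this example, and it only builds the paths instead of calling chmod. It recreates the directory layout from the minimal example in a temporary directory and compares joining file names with the top-level directory against joining them with the root yielded by os.walk().

```python
import os
import tempfile


def walk_file_paths(top, join_with_root=True):
    """Collect file paths under `top` the way a recursive chmod loop would.

    Hypothetical helper for illustration only; not part of Snakemake.
    """
    paths = []
    for root, dirs, files in os.walk(top):
        for f in files:
            # Joining with `root` (the directory currently being visited)
            # yields a real path; joining with `top` silently drops any
            # intermediate sub-directory.
            base = root if join_with_root else top
            paths.append(os.path.join(base, f))
    return paths


with tempfile.TemporaryDirectory() as tmp:
    # Same layout as in the minimal example above.
    top = os.path.join(tmp, "Protected-dir")
    os.makedirs(os.path.join(top, "Protected-subdir"))
    nested = os.path.join(top, "Protected-subdir", "intermediate.file")
    open(nested, "w").close()

    buggy = walk_file_paths(top, join_with_root=False)
    fixed = walk_file_paths(top, join_with_root=True)

    # Check existence while the temporary directory is still around.
    buggy_exists = [os.path.exists(p) for p in buggy]
    fixed_exists = [os.path.exists(p) for p in fixed]

    print("joined with top: ", buggy, buggy_exists)
    print("joined with root:", fixed, fixed_exists)
```

The path joined with the top-level directory points to a file that does not exist, which is the situation that makes os.chmod raise FileNotFoundError in the traceback above, while the path joined with root resolves to the nested file.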
I’ll submit a pull request asap.
Cheers,
−Nils