Repository assets which are not a sibling of the executed workflow are not correctly identified. #6604

@DriesSchaumont

Bug report

When executing a workflow that resides in a subdirectory (using -main-script), other repository assets are not recognized as assets unless they are located under that same subdirectory. This invalidates the cache whenever the repository is fetched again (as is often the case on remote executors like batch).

Expected behavior and actual behavior

When an input is not recognized as an asset, the default cache strategy is applied instead of a sha256 checksum or the commit ID (similar to deep mode). The default strategy hashes the file's metadata, but because the repository is often cloned anew with each run, the asset's timestamp changes every time and resume never works.
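The difference between the two strategies can be illustrated with a minimal Python sketch. This is not Nextflow's code: `metadata_hash` and `content_hash` are hypothetical helpers, and the metadata key (path, size, modification time) is a simplified assumption about what a metadata-based strategy looks at. The sketch simulates a fresh clone by changing only the modification time of a file.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def metadata_hash(path: Path) -> str:
    """Hash path, size and modification time (hypothetical metadata-based strategy)."""
    st = path.stat()
    key = f"{path}|{st.st_size}|{st.st_mtime_ns}"
    return hashlib.sha256(key.encode()).hexdigest()


def content_hash(path: Path) -> str:
    """Hash only the bytes of the file."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    asset = Path(tmp) / "template.txt"
    asset.write_text("hello\n")

    meta_before, content_before = metadata_hash(asset), content_hash(asset)

    # Simulate a re-clone: identical bytes, but a newer modification time.
    st = asset.stat()
    os.utime(asset, ns=(st.st_atime_ns, st.st_mtime_ns + 5_000_000_000))

    meta_after, content_after = metadata_hash(asset), content_hash(asset)

metadata_changed = meta_before != meta_after
content_changed = content_before != content_after
print(metadata_changed, content_changed)
```

Only the metadata-based hash changes, which is why a task that depends on such a file would be re-executed after the repository is fetched again, even though the file contents are identical.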

The main switch for determining whether an input is an asset is the isAssetFile function. It checks whether the file path starts with the base directory (session.getBaseDir()). However, this base directory points to the directory that contains the main script, not to the root of the repository. The result decides between two behaviors: non-assets follow the user-specified hashing strategy, while assets are hashed by their file contents regardless of the chosen strategy:

if( (mode==HashMode.STANDARD || mode==HashMode.LENIENT) && isAssetFile(path) ) {

An asset should always be recognized as being part of the repository, regardless of its location. A proposed solution can be found at #6605.
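The effect of the base-directory check described above can be sketched in Python. This is an illustration only, not Nextflow's implementation and not the change proposed in #6605: `is_under` is a hypothetical helper, the clone location is made up, and the sketch assumes that test_assets sits at the repository root next to the pipeline directory.

```python
from pathlib import PurePosixPath


def is_under(path: str, base_dir: str) -> bool:
    """Return True when `path` lies inside `base_dir` (component-wise prefix check)."""
    p = PurePosixPath(path).parts
    b = PurePosixPath(base_dir).parts
    return p[:len(b)] == b


# Hypothetical clone location; only the relative layout matters here.
repo_root = "/home/user/.nextflow/assets/org/repo"
base_dir = repo_root + "/pipeline"              # directory of the main script
asset = repo_root + "/test_assets/template.txt"

seen_with_base_dir = is_under(asset, base_dir)    # check against the script directory
seen_with_repo_root = is_under(asset, repo_root)  # check against the repository root
print(seen_with_base_dir, seen_with_repo_root)
```

With the script directory as the reference, the file in the sibling directory is not treated as an asset; with the repository root as the reference, it is.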

Steps to reproduce the problem

As a side note: when the metadata is used for hashing a file, the hashFileMetadata method is called, which writes Hashing file meta: to the logs. When the file contents are used instead, a log message Hash asset file sha-256 is written by e.g. hashFileSha256Impl0.
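To tell which of the two code paths a run took, the log can be scanned for the two messages quoted above. The following is a minimal sketch: `count_hash_modes` is a hypothetical helper, the exact layout of the log lines is not assumed (only the quoted fragments are matched), and it presumes that trace logging was enabled as in the commands of this report (NXF_TRACE=nextflow.util) and that the log is at .nextflow.log in the launch directory.

```python
from collections import Counter
from pathlib import Path

# Message fragments quoted in this report.
MARKERS = {
    "metadata": "Hashing file meta:",
    "contents": "Hash asset file sha-256",
}


def count_hash_modes(log_text: str) -> Counter:
    """Count log lines that mention metadata hashing or content hashing."""
    counts = Counter()
    for line in log_text.splitlines():
        for mode, marker in MARKERS.items():
            if marker in line:
                counts[mode] += 1
    return counts


log_file = Path(".nextflow.log")
if log_file.exists():
    print(dict(count_hash_modes(log_file.read_text(errors="replace"))))
```

A non-zero "metadata" count for a file that lives in the repository would indicate that it was not recognized as an asset.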

In order to reproduce the problem, I've created a repository here: https://github.com/DriesSchaumont/nextflow-subfolder-basedir-asset-caching

Cleaning the environment

nextflow drop DriesSchaumont/nextflow-subfolder-basedir-asset-caching > /dev/null; rm -rf .nextflow && rm -f .nextflow.log* && rm -rf work

Running the workflow once

NXF_VER=25.10.0 NXF_TRACE=nextflow.util NXF_PATCH_DIRECTORY_HASH=true nextflow run DriesSchaumont/nextflow-subfolder-basedir-asset-caching -main-script pipeline/main.nf
Pulling DriesSchaumont/nextflow-subfolder-basedir-asset-caching ...
 downloaded from https://github.com/DriesSchaumont/nextflow-subfolder-basedir-asset-caching.git
Launching `https://github.com/DriesSchaumont/nextflow-subfolder-basedir-asset-caching` [spontaneous_mirzakhani] DSL2 - revision: 509ed9c06f [main]

executor >  local (2)
[7e/920c02] test_project (foo) [100%] 2 of 2 ✔
8fbf7ab58288ceb1da97c4b561e59c87e111ccf6  test_assets/template.txt

8fbf7ab58288ceb1da97c4b561e59c87e111ccf6  test_assets/template.txt

Testing resume

This works because the timestamp of the assets has not changed.

NXF_VER=25.10.0 NXF_TRACE=nextflow.util NXF_PATCH_DIRECTORY_HASH=true nextflow run DriesSchaumont/nextflow-subfolder-basedir-asset-caching -main-script pipeline/main.nf -resume
 N E X T F L O W   ~  version 25.10.0

Launching `https://github.com/DriesSchaumont/nextflow-subfolder-basedir-asset-caching` [disturbed_varahamihira] DSL2 - revision: 509ed9c06f [main]

[7e/920c02] test_project (foo) [100%] 2 of 2, cached: 2 ✔
8fbf7ab58288ceb1da97c4b561e59c87e111ccf6  test_assets/template.txt

8fbf7ab58288ceb1da97c4b561e59c87e111ccf6  test_assets/template.txt

Dropping the repository

This causes the repository to be fetched again so the files get a new timestamp.

nextflow drop DriesSchaumont/nextflow-subfolder-basedir-asset-caching

Demonstrating that resume no longer works

NXF_VER=25.10.0 NXF_TRACE=nextflow.util NXF_PATCH_DIRECTORY_HASH=true nextflow run DriesSchaumont/nextflow-subfolder-basedir-asset-caching -main-script pipeline/main.nf -resume
 N E X T F L O W   ~  version 25.10.0

Pulling DriesSchaumont/nextflow-subfolder-basedir-asset-caching ...
 downloaded from https://github.com/DriesSchaumont/nextflow-subfolder-basedir-asset-caching.git
Launching `https://github.com/DriesSchaumont/nextflow-subfolder-basedir-asset-caching` [suspicious_sinoussi] DSL2 - revision: 509ed9c06f [main]

executor >  local (2)
[d1/91219f] test_project (bar) [100%] 2 of 2 ✔
8fbf7ab58288ceb1da97c4b561e59c87e111ccf6  test_assets/template.txt

8fbf7ab58288ceb1da97c4b561e59c87e111ccf6  test_assets/template.txt

Program output

nextflow.log
nextflow-second.log
nextflow-third.log

Environment

  • Nextflow version: 25.10.0
  • Java version: openjdk 21.0.9 2025-10-21
  • Operating system: Debian GNU/Linux 13
  • Bash version: GNU bash, version 5.2.37(1)-release (x86_64-pc-linux-gnu)
