listFiles() does not follow symbolic links #6807

@adamrtalbot

Description

The listFiles() method on Path objects does not follow symbolic links, causing it to fail to locate files when:

  1. The directory itself is a symlink
  2. Files within the directory are symlinks

Root Cause

In modules/nf-commons/src/main/nextflow/extension/FilesEx.groovy, the listFiles0 method uses Files.walkFileTree() without specifying FileVisitOption.FOLLOW_LINKS:

private static Collection<Path> listFiles0(Path self, Closure<Boolean> filter=null) {
    if( !self.isDirectory() )
        return null

    final result = []
    Files.walkFileTree(self, new SimpleFileVisitor<Path>() {
        // ...
    })
    return result
}

By default, walkFileTree does not follow symbolic links. This is inconsistent with isDirectory(), which follows symlinks by default, so a symlinked directory passes the isDirectory() check but is then not traversed as a directory.
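
The mismatch can be reproduced outside Nextflow with plain java.nio. The following is a minimal sketch, not the Nextflow code: the class name SymlinkWalkDemo and the helper walk are hypothetical, and it assumes a local filesystem that permits creating symbolic links. It walks a symlinked directory once with no options and once with FOLLOW_LINKS, recording what the visitor is handed.

```java
import java.io.IOException;
import java.nio.file.FileVisitOption;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

public class SymlinkWalkDemo {

    // Hypothetical helper: collect the file names passed to visitFile,
    // descending at most one level below `start`.
    static List<String> walk(Path start, Set<FileVisitOption> options) throws IOException {
        List<String> seen = new ArrayList<>();
        Files.walkFileTree(start, options, 1, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                seen.add(file.getFileName().toString());
                return FileVisitResult.CONTINUE;
            }
        });
        return seen;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("symlink-demo");
        Path real = Files.createDirectory(tmp.resolve("real"));
        Files.createFile(real.resolve("data.txt"));
        Path link = Files.createSymbolicLink(tmp.resolve("link"), real);

        // Files.isDirectory follows the link, so the directory check passes.
        System.out.println("isDirectory(link): " + Files.isDirectory(link));
        // Without FOLLOW_LINKS the walk treats the link itself as a single entry.
        System.out.println("no options:   " + walk(link, EnumSet.noneOf(FileVisitOption.class)));
        // With FOLLOW_LINKS the walk descends into the target directory.
        System.out.println("FOLLOW_LINKS: " + walk(link, EnumSet.of(FileVisitOption.FOLLOW_LINKS)));
    }
}
```

In this sketch the walk without options never reaches data.txt; the visitor only sees the link entry itself, whereas the FOLLOW_LINKS walk reports the directory's contents.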

Workaround Limitation

A suggested workaround is to use toRealPath() to resolve symlinks before calling listFiles():

dir.toRealPath().listFiles()

However, this does not work for cloud storage (S3, Azure, GCS) because:

  • S3: toRealPath() throws UnsupportedOperationException
  • Azure: toRealPath() just returns toAbsolutePath() (no symlink resolution)
  • GCS: Depends on Google's library implementation

Proposed Fix

Add FileVisitOption.FOLLOW_LINKS to the walkFileTree call:

import java.nio.file.FileVisitOption

private static Collection<Path> listFiles0(Path self, Closure<Boolean> filter=null) {
    if( !self.isDirectory() )
        return null

    final result = []
    final walkOptions = EnumSet.of(FileVisitOption.FOLLOW_LINKS)
    
    Files.walkFileTree(self, walkOptions, 1, new SimpleFileVisitor<Path>() {
        // ... same visitor code
    })

    return result
}

This should be safe for cloud filesystems since:

  1. Cloud storage doesn't have symlinks, so FOLLOW_LINKS is effectively a no-op
  2. The S3 client already uses EnumSet.of(FileVisitOption.FOLLOW_LINKS) internally for downloads (S3Client.java:405)
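
The second failure case (symlinked files inside a regular directory) can be checked the same way. This is a sketch under the same assumptions as before (plain java.nio on a local filesystem with symlink support); LinkedEntryDemo and describe are hypothetical names, and the visitor is a stand-in, not the one in FilesEx.groovy. It uses the same walkFileTree arguments as the proposed fix and records how each entry's attributes are reported.

```java
import java.io.IOException;
import java.nio.file.FileVisitOption;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class LinkedEntryDemo {

    // Hypothetical helper: map each direct child of `dir` to the kind
    // reported by the attributes that walkFileTree passes to the visitor.
    static Map<String, String> describe(Path dir, Set<FileVisitOption> options) throws IOException {
        Map<String, String> kinds = new TreeMap<>();
        Files.walkFileTree(dir, options, 1, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                String kind = attrs.isSymbolicLink() ? "symlink"
                        : attrs.isRegularFile() ? "file"
                        : attrs.isDirectory() ? "directory"
                        : "other";
                kinds.put(file.getFileName().toString(), kind);
                return FileVisitResult.CONTINUE;
            }
        });
        return kinds;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("linked-entry-demo");
        Path plain = Files.createFile(dir.resolve("plain.txt"));
        Files.createSymbolicLink(dir.resolve("alias.txt"), plain);

        // Attributes are read without following links: alias.txt is a symlink.
        System.out.println(describe(dir, EnumSet.noneOf(FileVisitOption.class)));
        // Attributes are read from the link target: alias.txt is a regular file.
        System.out.println(describe(dir, EnumSet.of(FileVisitOption.FOLLOW_LINKS)));
    }
}
```

Any visitor or filter that tests attrs.isRegularFile() would therefore skip alias.txt unless FOLLOW_LINKS is set. Note also that with a maximum depth of 1, subdirectories of the listed directory are delivered to visitFile rather than preVisitDirectory, so the existing visitor code may need to account for that.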

Environment

  • Nextflow version: master branch
  • Affects both local and cloud file systems: local because symlinks are not followed, cloud because the toRealPath() workaround is not viable there
