Error with recursive submodule checkout (fatal: transport 'file' not allowed)
**TL;DR:** As a workaround, force a full clone: set `GIT_STRATEGY: clone` (possibly in the UI), see https://gitlab.com/gitlab-org/gitlab-runner/-/issues/38908#note_2696065209

## Summary

Running a CI job with `GIT_SUBMODULE_STRATEGY: recursive` fails with `fatal: transport 'file' not allowed` for nested submodules when the `git fetch` strategy is used and the commit the submodule points to was not fetched before.

## Steps to reproduce

<!-- What do you need to do to reproduce the bug? Please include job definitions or git repository structure if relevant -->

Create 3 repositories:

- `subsubrepo`, with 2 branches:
  - `main`
  - `other`, with a commit added w.r.t. `main`
- `subrepo`, with the same 2 branches and submodule `subsubrepo` on the tip commit of the corresponding branch
- `toprepo`, again with the same 2 branches and submodule `subrepo` on the tip commit of the corresponding branch

Make sure to disable separate caches for protected & non-protected branches in the CI/CD settings if branches in `toprepo` have different protection levels.
Also make sure the git strategy in the CI/CD settings is set to `git fetch` (not `git clone`, which can be used as a **workaround** for affected users, btw).

Now first run a `GIT_SUBMODULE_STRATEGY: recursive` job on a Linux Docker runner in `toprepo` on the `main` branch.
After that job has finished, run another job in `toprepo` on the `other` branch.

If you make a mistake, stop the gitlab-runner service and remove the Docker caching volumes.

<!-- Please add the definition of the job from `.gitlab-ci.yml` that is failing inside of the code blocks (```) below. -->

<details>
<summary> .gitlab-ci.yml </summary>

```yml
variables:
  GIT_SUBMODULE_STRATEGY: recursive

job1:
  tags:
    - linux-docker
  script:
    - echo hi
```

</details>

## Actual behavior

Fetching `subsubrepo` fails with `fatal: transport 'file' not allowed`.

## Expected behavior

Fetching nested submodules should work.

## Relevant logs and/or screenshots

<details>
<summary> job log </summary>

```
Getting source from Git repository
Gitaly correlation ID: [...]
Fetching changes...
Reinitialized existing Git repository in /builds/mygroup/toprepo/.git/
Created fresh repository.
Checking out [...] as detached HEAD (ref is other)...
Updating/initializing submodules recursively...
Submodule 'subrepo' (https://gitlab-ci-token:[MASKED]@[...]/mygroup/subrepo.git) registered for path 'subrepo'
Synchronizing submodule url for 'subrepo'
Entering 'subrepo'
Entering 'subrepo/subsubrepo'
Entering 'subrepo'
HEAD is now at [...]
Entering 'subrepo/subsubrepo'
HEAD is now at [...]
From https://[...]/mygroup/subrepo
 * branch HEAD -> FETCH_HEAD
From https://[...]/mygroup/subrepo
 * branch [...] -> FETCH_HEAD
Fetching submodule subsubrepo
fatal: transport 'file' not allowed
Errors during submodule fetch:
	subsubrepo
fatal: Fetched in submodule path 'subrepo', but it did not contain [...]. Direct fetching of that commit failed.
Updating submodules failed. Retrying...
Synchronizing submodule url for 'subrepo'
From https://[...]/mygroup/subrepo
 * branch HEAD -> FETCH_HEAD
From https://[...]/mygroup/subrepo
 * branch [...] -> FETCH_HEAD
Fetching submodule subsubrepo
fatal: transport 'file' not allowed
Errors during submodule fetch:
	subsubrepo
fatal: Fetched in submodule path 'subrepo', but it did not contain [...]. Direct fetching of that commit failed.
Retrying in 5s
Cleaning up project directory and file based variables
ERROR: Job failed: exit code 1
```

</details>

## Environment description

Self-hosted Linux Docker runner.

<details>
<summary> config.toml contents (irrelevant parts removed) </summary>

```toml
[[runners]]
  executor = "linux-docker"
  [runners.docker]
    privileged = true
    volumes = [ "/cache", "/certs/client", "/etc/localtime:/etc/localtime:ro" ]
```

</details>

### Used GitLab Runner version

```
Version:      18.1.1
Git revision: 2b813ade
Git branch:   18-1-stable
GO version:   go1.24.4 X:cacheprog
Built:        2025-06-26T16:25:31Z
OS/Arch:      linux/amd64
```

## Possible fixes

I spent a lot of time researching the exact problem. Here's what I found:

- The GitLab runner caches the repo in a Docker volume (one per repo, per level of concurrency). From this volume, [Git config is removed](https://docs.gitlab.com/runner/configuration/advanced-configuration/#cleaning-git-configuration).
- When Git doesn't have a remote, as is the case with deleted config, it assumes `origin` and then tries `origin` as the source for fetching. The fetch happens in a call to `git fetch origin <commit>`, which `git submodule update` runs internally. Since `origin` is not an HTTP or SSH source, Git assumes it to be a filename. Because the `fetch` command was not executed directly by the user (`git submodule` sets `GIT_PROTOCOL_FROM_USER=0`, and the `file` transport defaults to the `user` policy, `PROTOCOL_ALLOW_USER_ONLY` in Git's source), file transport is not allowed, leading to the esoteric error message.
- GitLab runner does add the remote for `toprepo` using `git remote add origin https://gitlab-ci-token:[MASKED]@[...]/mygroup/toprepo.git`, adds the URL for `subrepo` to `.git/config` using `git submodule init`, and then propagates it to `.git/modules/subrepo/config` via `git submodule sync --recursive`. However, `git submodule init` is not recursive, so `subsubrepo` does *not* get initialized, meaning that it won't get a remote and won't be affected by `git submodule sync --recursive`.
- The runner then retries the submodule update, preceded by a second `git submodule sync --recursive`, but this usually only works when the first submodule update used `--depth=1`, not with a larger depth.
  (It may also work when `-c submodule.recurse=false` is passed to `git submodule update`, which is not to be confused with the `--recurse` flag.) Note that in these cases the original update command still reports failure.

The issue can be reproduced locally using the following script:

`repro.sh`:

```bash
#!/usr/bin/env bash
set -eu -o pipefail
shopt -s globstar

remote="${1:?Please pass remote for toprepo}"
other_ref="${2:-other}" # branch/commit
update_depth="${3:-2}"

set -x # Trace commands

rm -rf ./toprepo/

echo '>>> Making initial clone'
# Clone `main`
git clone --depth 1 --shallow-submodules --recurse-submodules -- "$remote" ./toprepo
cd ./toprepo/

# Simulate GitLab runner cleaning config
echo '>>> Removing config'
rm ./.git/**/config

echo '>>> Re-adding remote'
# Simulate GitLab runner re-adding remote
git remote add origin -- "$remote"

echo '>>> Fetching & checking out other ref'
# Now we try to check out `other`
other_commit="$(git ls-remote --exit-code origin -- "$other_ref" | cut -f1 || echo "$other_ref")"
git fetch --depth 1 --no-recurse-submodules origin -- "$other_commit"
git checkout --no-recurse-submodules FETCH_HEAD
git submodule init
git submodule sync --recursive

# Fails
git submodule update --depth "$update_depth" --recursive --init && echo 'Initial update unexpectedly succeeded?!' >&2 && exit 1

echo '>>> Trying to fix with auto-retry from GitLab runner...'
git submodule sync --recursive
git submodule update --depth "$update_depth" --recursive --init && echo 'Retried update succeeded!' && exit

# The fix
read -p 'Observe the error above. Then hit enter to fix it... (or interrupt to examine the repo)'
# These commands should be between checkout & update
git submodule init
git submodule foreach --recursive 'git submodule init'
git submodule sync --recursive
git submodule update --depth "$update_depth" --recursive --init || (echo 'Fixed update unexpectedly failed?!' >&2 && exit 1)
```

The script also ends with the **fix** that I would propose: `git submodule foreach --recursive 'git submodule init'`.

Call it like so: `./repro.sh git@[...]:mygroup/toprepo`. Passing depth `1` as the third argument should make the retry succeed as well.

Note that I really spent a *lot* of time trying to figure out all the details, but there are *still* cases where the issue does not pop up even when I would expect it to, and cases where the retry sync does work even when I wouldn't expect it to. I hope the repro works for you.

I don't *think* this is necessarily a Git bug? Or do you think it is, and should it be reported?

I hope the fix also works when we nest yet another level deeper, but I find it hard to test this.
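To check which nested submodules are left without a remote after the config has been cleaned, a small helper along these lines may be useful. This is only a sketch: the function name `list_submodules_without_origin` is hypothetical (it is not part of Git or GitLab Runner), and it assumes the usual layout where submodule git dirs live under `.git/modules/<name>/modules/<name>`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Git or GitLab Runner): print every submodule
# git dir below <top_git_dir>/modules that has no remote.origin.url configured.
list_submodules_without_origin() {
  local top_git_dir="${1:-.git}" head_file module_dir
  # Nothing to report when the repo has no absorbed submodule git dirs.
  [ -d "$top_git_dir/modules" ] || return 0
  find "$top_git_dir/modules" -type f -name HEAD | sort | while IFS= read -r head_file; do
    module_dir="$(dirname "$head_file")"
    # Skip look-alikes such as logs/HEAD: a real git dir also has objects/ and refs/.
    [ -d "$module_dir/objects" ] && [ -d "$module_dir/refs" ] || continue
    # A missing config file or a missing key makes `git config --get` exit non-zero.
    if ! git config --file "$module_dir/config" --get remote.origin.url >/dev/null 2>&1; then
      printf '%s\n' "$module_dir"
    fi
  done
}
```

Run from the top-level work tree, `list_submodules_without_origin .git` should, if the analysis above is right, list `.git/modules/subrepo/modules/subsubrepo` after the first `git submodule sync --recursive` and list nothing once `git submodule foreach --recursive 'git submodule init'` followed by another sync has run.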