Skip to content

Enable multiple revisions for project repos (first attempt 2024)#4659

Closed
marcodelapierre wants to merge 75 commits intomasterfrom
add/mult_revisions
Closed

Enable multiple revisions for project repos (first attempt 2024)#4659
marcodelapierre wants to merge 75 commits intomasterfrom
add/mult_revisions

Conversation

@marcodelapierre
Copy link
Contributor

@marcodelapierre marcodelapierre commented Jan 15, 2024

This PR adds support for handling local copies of multiple revisions of the same pipeline.

Key points:

  • Under NXF_ASSETS, each pipeline is now pulled as <org>/<repo>[:<revision>] ; :<revision> is only appended if the corresponding flag was used on CLI, -r/--revision
  • The key implementation change is adding the revision attribute to the AssetManager class, as both pipeline name and revision are now required to fully identify a pipeline
  • Support has been added and tested for all relevant CLI commands: run, pull, clone, drop, list, view, config, info, inspect, kuberun
  • Unit tests for AssetManagerTest have been updated

Caveats:

  • Jgit does not implement git worktree, so the original idea within Allow the concurrent run of multiple pipeline revisions #2870 could not be applied
  • depth = 1 (shallow clones) was investigated to reduce disk usage, but could not be implemented as it would not allow to checkout branches/tags/commits
  • In the current implementation, pulling without --revision and with --revision <default branch> create two duplicate pulls; this is not optimal, but with very limited known negative impact; [update] this only happens if the default branch is not declared in the manifest and differs from master
  • As a by-product, if a repo has a default branch different from master, pulling and running it is now possible without specifying the branch name explicitly

Closes #2870 .
Also indirectly addresses #3593

Signed-off-by: Dr Marco Claudio De La Pierre <marco.delapierre@gmail.com>
@netlify
Copy link

netlify bot commented Jan 15, 2024

Deploy Preview for nextflow-docs-staging canceled.

Name Link
🔨 Latest commit 61bd4bf
🔍 Latest deploy log https://app.netlify.com/sites/nextflow-docs-staging/deploys/666ff7b4650e42000878ae7a

@marcodelapierre marcodelapierre self-assigned this Jan 15, 2024
@marcodelapierre marcodelapierre changed the title Multiple revisions 1st: AssetManager, Cmd Clone Pull Run Enable multiple revisions for SCMs Jan 15, 2024
@marcodelapierre marcodelapierre marked this pull request as draft January 15, 2024 12:39
@marcodelapierre marcodelapierre changed the title Enable multiple revisions for SCMs Enable multiple revisions for project repos Jan 15, 2024
@bentsherman
Copy link
Member

You could save disk space by cloning with depth = 0 when a revision is specified

@marcodelapierre
Copy link
Contributor Author

marcodelapierre commented Jan 16, 2024

You could save disk space by cloning with depth = 0 when a revision is specified

Thanks @bentsherman , great idea, will do, for nextflow pull but not for clone. Why not for any case, including no revision specified? - I just have to check that a subsequent nextflow pull still works (for the purpose of updating the local copy).

Signed-off-by: Dr Marco Claudio De La Pierre <marco.delapierre@gmail.com>
Signed-off-by: Dr Marco Claudio De La Pierre <marco.delapierre@gmail.com>
Signed-off-by: Dr Marco Claudio De La Pierre <marco.delapierre@gmail.com>
Signed-off-by: Dr Marco Claudio De La Pierre <marco.delapierre@gmail.com>
Signed-off-by: Dr Marco Claudio De La Pierre <marco.delapierre@gmail.com>
@bentsherman
Copy link
Member

yes I suppose we could just clone with depth = 0 in all cases, since no revision just defaults to master. the clone, pull, and run commands all provide a -deep option for this, so users can change it if they want

Signed-off-by: Dr Marco Claudio De La Pierre <marco.delapierre@gmail.com>
Signed-off-by: Dr Marco Claudio De La Pierre <marco.delapierre@gmail.com>
@marcodelapierre
Copy link
Contributor Author

Ah! unfortunately we cannot implement the deep functionality, as with a depth of 1 we cannot checkout to other branches/tags/commits.
This would be supported : git clone -b <branch/tag> --deep <deep> <repo> , but only works for branches and tags, not commits. Git still does not support a way to clone to a specific commit straight away.

Signed-off-by: Dr Marco Claudio De La Pierre <marco.delapierre@gmail.com>
Signed-off-by: Dr Marco Claudio De La Pierre <marco.delapierre@gmail.com>
Signed-off-by: Dr Marco Claudio De La Pierre <marco.delapierre@gmail.com>
Signed-off-by: Dr Marco Claudio De La Pierre <marco.delapierre@gmail.com>
Signed-off-by: Dr Marco Claudio De La Pierre <marco.delapierre@gmail.com>
Signed-off-by: Dr Marco Claudio De La Pierre <marco.delapierre@gmail.com>
… operation

Signed-off-by: Dr Marco Claudio De La Pierre <marco.delapierre@gmail.com>
…f "master"

Signed-off-by: Dr Marco Claudio De La Pierre <marco.delapierre@gmail.com>
@marcodelapierre marcodelapierre marked this pull request as ready for review January 18, 2024 08:11
@marcodelapierre
Copy link
Contributor Author

On point 2. above, wondering whether we should leverage user-provided information more, when it comes to printing out information on repository in use. I.e.:

  • commit ID as de-referenced from user request
  • (if not null) revision as requested by the user
  • (eventually, also infer branch)

Signed-off-by: Dr Marco Claudio De La Pierre <marco.delapierre@gmail.com>
Signed-off-by: Dr Marco Claudio De La Pierre <marco.delapierre@gmail.com>
@marcodelapierre
Copy link
Contributor Author

As dicussed in meeting, on point 1., let me re-implement with the original class signature for AssetManager - no revision attribute. Commit ID and hence local path will be set when the revision is set.

Note on available code so far:

  1. current branch for this PR is add/mult_revisions - impl with local bare repo, and commit id in local path
  2. branch add/mult_revisions_bkp15apr - first impl, uses revision in local path
  3. branch add/mult_revisions_bkp29may - backup of 1.

Signed-off-by: Dr Marco Claudio De La Pierre <marco.delapierre@gmail.com>
Signed-off-by: Dr Marco Claudio De La Pierre <marco.delapierre@gmail.com>
Signed-off-by: Dr Marco Claudio De La Pierre <marco.delapierre@gmail.com>
@marcodelapierre marcodelapierre changed the title Enable multiple revisions for project repos Enable multiple revisions for project repos (first attempt 2024) Jun 24, 2024
@marcodelapierre
Copy link
Contributor Author

marcodelapierre commented Jun 24, 2024

Going forward, let's manage these through 2 separate new PRs:

  1. Enable multiple git revisions (Jan-Apr 2024) : temporary, using revision name in repo localPath #5087 : Jan-April implementation, using revision name in repo localPath
  2. Enable multiple git revisions (Jun 2024) : final, using commit ID in repo localPath #5089 : June implementation, using commit ID in repo localPath

The first one is for the record and for reference, unlikely to go forward.

The second one is that to be considered for revision and merging.

@bentsherman
Copy link
Member

Closing in favor of #6620

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Main branch not recognised when executing an online workflow Allow the concurrent run of multiple pipeline revisions

3 participants