Improve URI recognition#1736
Conversation
- Consolidate a number of separate ad hoc tests for recognising URIs into Alire.URI.URI_Kind - Add support for origin URLs with scheme "ssh://" - Add bitbucket.org to the list of hosts recognised as git only - Treat ".git/" suffix the same as ".git" - Change publish command to raise errors on URLs with "file:" scheme (sidestepping a bug where the whole URL was treated as a relative path) - Change various error messages - Change handling of some obscure edge cases
2764a45 to
4186694
Compare
|
Compilation currently fails with GNAT 10. Versions 11 and later compile fine, so it's presumably a bug with GNAT, but I haven't been able to find a workaround. Any ideas? If not, is it feasible to bump the minimum version up to 11? |
|
Thanks for the comprehensive write-up. I will have to peruse the changes in this one.
IIRC GNAT 10 is the one shipped in Ubuntu LTS 20.04, which is the one we use for releases as the oldest supported by GitHub action runners. So until that changes, bumping to 11 is not very attractive. I will look into finding a workaround, I have faced similar situations in a few occasions. |
|
There's no rush; I appreciate there's a lot to go through. I wasn't sure how important supporting GNAT 10 was, but that makes sense. In that case I'll keep looking for a workaround too (I've only tried a few obvious things so far). |
|
Turned out to be simpler than expected. Not sure what's up with this one, though. |
|
Please mark ready for review once done so I know to check it. |
Will do. #1745 (which is ready for review at your convenience) seemed a higher priority, but I'm back to working on this one now. |
|
Ah, I thought #1745 relied on some changes here. Will review that one then. |
59d8c81 to
090d5ec
Compare
090d5ec to
6732fcd
Compare
Conflicts in: - src/alire/alire-origins.adb - src/alire/alire-publish.adb - src/alire/alire-uri.ads
|
NB: spellcheck reports several problems but can not report it here. |
So it does. Thanks for pointing that out. |
|
Sorry, @Seb-MCaw, I didn't realize this one was marked as ready for review. Whenever you have one ready for merging, it's better that you explicitly request the review, which will mail me so I won't miss it. |
OK, good to know. |
mosteo
left a comment
There was a problem hiding this comment.
Just a couple of very minor changes, and a few questions to verify my assumptions. Great work!
| -- Formats currently not recognized include: | ||
| -- user@host:/path/to/repo.git [returns Unknown] | ||
| -- host.name:/path/to/repo.git [returns Unknown] | ||
| -- git://host/path/to/repo.git [returns Unknown] | ||
| -- ftp://host/path/to/repo.git [returns Unknown] | ||
| -- svn://(something) [returns Unknown] | ||
| -- https://user:pass@host/repo.git [returns Public_Git] | ||
| -- https://user@host/repo.git [returns Public_Git] | ||
| -- https://user:pass@host/path [returns Public_Other] | ||
| -- https://user@host/path [returns Public_Other] |
There was a problem hiding this comment.
Is it on purpose that those Public_Git are not being recognized as Private_Git?
There was a problem hiding this comment.
As a general rule, I have tried to preserve existing behaviour, except where said behaviour was inconsistent. There was previously no distinction between https URLs with or without a userinfo component, so I have kept that the same.
If you want, I could change Alire.URI_Kind to treat http[s] URLs with a non-empty userinfo component as private, but I haven't really looked into whether this would break anything.
There was a problem hiding this comment.
I see. Well, as long as you have this fresh in your mind, I think it's a good opportunity to plug some holes. In particular, the ones returning Public_* could return Private_* and I don't expect it should affect anything. If you find unexpected troubles we could decide on leaving it as-is.
mosteo
left a comment
There was a problem hiding this comment.
To me this is OK for merging once you give a stab at the Public/Private identification I mention in another comment. Thanks!
|
Thanks, @Seb-MCaw. I see the last changes weren't as light as hoped. |
Currently, there are a number of places where Alire tries to guess the nature of a resource based on its URI (or a path or SCP-style remote location, neither of which is technically a URI).
Alire.URI.Schemeprovides some of this functionality, but it's not always used, and there are additional checks (e.g. looking for a.gitsuffix) which are applied subsequently by the calling code (but only sometimes). With a view to adding better support for ssh based private indexes and origins, and making it easier to support other kinds in future (these, for example), this PR attempts to consolidate all of these checks intoAlire.URIso that they are applied consistently.A few other URI-related things have also been changed in hope of improving consistency.
I believe this PR makes the following substantive changes to
alr's behaviour:The scheme
ssh://is now recognised, though will generally raise an error unless there are other indications of what the URL points to (e.g. a.gitsuffix).file:urls are now always:file://relative/path(problematic because officially this is absolute path/pathon the hostrelative), though user inputs should still be recognised as before (i.e. the relative pathrelative/path) and converted into the formfile:relative/path(which is still not technically valid, but at least no longer ambiguous)gitetc. (which results in more efficient cloning, but not 100% sure if hardlinks will ever cause issues)#and?are interpreted literally (#<commit>fragments were never needed/accepted for user input anyway)bitbucket.orgis now recognised as a git-only host (this was previously based on the list inKnown_Transformable_Hosts, but is now implemented byIs_Known_Git_Host)A
.git/suffix is now treated the same as.git.Recognition of SCP-style git remotes (
git@host:/path) is slightly stricter in some places, such thatgit@host/with/slash:/pathis now always treated as a local path (https://git-scm.com/docs/git-clone#URLS).Alire now adds a
git+prefix instead of.gitsuffix whenever it needs to make a URL explicitly Git (on the basis that the former is specific to Alire, while the latter modifies the actual URL).git+file:index URLs are now converted to absolute paths (though in practice I don't think this changes anything other than the output ofalr index --list, sincegit clonedoes the conversion anyway).alr publishnow givesThe origin cannot use a private remoteforssh://URLs and anything else recognised as private (unless--for-private-indexis supplied), not just those of the formgit@host:pathalr publishing an origin which might be a git remote (e.g.https://github.com/mosteo/aaa/archive/v0.1.tar.gz) as a source archive now gives a warning rather than an un---force-able error.alr publish file:/some/path/now always raises an error unless the path ends with a recognised archive file extension (previously there was a bug where the full URL was treated as a relative path, which meant effectively running the publishing procedure for the current working directory, even if it differed from/some/path/)Pins no longer treat
https:URLs with anyxyz+prefix as a git remote, only those withgit+(or other identifying features).There are some other obscure edge cases and behaviours which have changed, so here is my attempt at a complete account of all changes for each subprogram:
Alire.Index_On_Disk.Directory.Index_DirectoryandAlire.Index_On_Disk.Directory.New_Handlerfile:/path[previously only acceptedfile:///path]Alire.Index_On_Disk.Git.AddVCSs.Git.Handler.Clone. I believe this is now the only place where the URL fragment is used for a commit (because the UI doesn't have a separate commit argument).Alire.Index_On_Disk.Git.New_Handlergit+when they have a.gitor.git/suffix or a known git hostAlire.Index_On_Disk.New_Handlerfile:URLs notfile://git+file:URLs [previously onlyfile:] and Windows paths containing@or+as local indexes for path validation purposesgit+when they have a.gitor.git/suffix or a known git hostgit+file:URLsAlire.Origins.From_TOMLgit+when they have a.gitor.git/suffix or a known git host (unlesshashesis specified) [previously always assumedhttps://was a source archive, and error onssh://]ssh://urls which aren't recognised as a VCS now produce an an error message in line with the equivalent for indexes.Alire.Origins.New_VCS/s forfile:schemes now also applies to mixed-case equivalents (schemes are case insensitive).gitsuffix tohttps://github.com/repoorhttps://gitlab.com/repourls (which I'm pretty sure was redundant)ssh://git urls withoutgit+when they have a.gitsuffix or a known git host [previously raisedunknown VCS URL]file:urls to local paths (file:///some/path, not justfile:/some/path)file:URLs, where it is treated as part of the path).git/suffix identically to.gitAlire.Origins.To_TOMLAlire.Origins.Deployers.*.DeployVCS.*.Handler.Clone[previously concatenated into one parameter with#separator, which was separated again byVCS.*.Handler.Clone]Alire.Origins.Tweaks.Fixed_Originfile:andgit+file:URLs to bare pathsgit+file:URLs into absolute paths, as it already does forfile:URLs (though I don't think it is ever called with such a URL)Alire.Publish.Verify_OriginThe origin cannot use a private remoteerror for all URIs that appear to be private (not just those of the formgit@host:path)Alire.URI.Host(e.g. SCP-style git remotes are recognised)Alire.Publish.Local_RepositoryAlire.Publish.Verify_Origin(sogit+httpsis no longer considered private and local URIs can be forced)git+prefix, not.gitsuffixfile:schemesAlire.Publish.Remote_Origin.gitsuffix (Specifically,URL seems to point to a repository, but no commit was provided.)https://github.com/mosteo/aaa/archive/v0.1.tar.gz) as a source archive now gives a warning rather than a non---force-able error.git/suffix identically to.gitAlire.Publish.Prepare_ArchiveNoif the URL looks like a git repo.Alire.Roots.Editable.Add_Remote_PinVCS.Git.Handler.Clone[previously concatenated into one parameter with#separator, which was separated again byVCS.Git.Handler.Clone]Alire.User_Pins.DeployVCS.Git.Handler.Clone[previously concatenated into one parameter with#separator, which was separated again byVCS.Git.Handler.Clone]Alire.VCSs.*.CloneAlire.VCSs.Git.Transform_To_PublicAlire.URI.Hostgit+prefix, not.gitsuffixAlire.VCSs.Repo(now calledAlire.VCSs.Repo_URL)file:scheme (file://andfile:)git+,hg+andsvn+)Alr.Commands.Publish.Executefile:scheme which end with a recognised archive file extension is unchangedfile:scheme and which do not end with a recognised archive file extension are now treated as a remote origin (as is already the case forgit+file:), in line with the docsdebug: Looking for alire metadata at: /current/working/directory/file:/path/to/repo; presumably this is because theAlire.Rootspackage expects only paths, but I haven't investigated further):alr publish file:/path/to/repo(with or without specifying a commit id) performed the normal publishing procedure with that repo's configured remote as the originurlin the index manifest (notfile:/path/to/workspace), or raised an error if no remote was configuredalr publish file:/path/to/repo(with or without specifying a commit id) gaveerror: No Alire workspace found at file:/path/to/repoerror: unknown VCS URL: file:/path/to/repowhen a commit is specified, anderror: Unable to determine archive format from file extensionotherwiseAlr.Commands.Pin.ExecutesshURIs with known git host or a.git/suffix as remote reposhttp[s]URIs as repositories unless they have agit+prefix,.git[/]suffix or known host.gitas remote repos (unless scheme isgit+file:)Alr.Commands.Withing.Add_With_PinsshURIs with known git host or a.git/suffix as remote reposhttp[s]URIs as repositories unless they have agit+prefix,.git[/]suffix or known host.gitas remote repos (unless scheme isgit+file:)