Handle paths with spaces and hashes especially with nested JARs #805

Merged
lukehutch merged 1 commit into classgraph:latest from jwachter:fix/nested-with-space-and-hash-path on Nov 2, 2023
Conversation

@jwachter jwachter commented Nov 2, 2023

Contributor

With nested JARs, the path is not usable as a `java.nio.file.Path` instance, so two different conversion mechanisms are tried instead.

The first tries to convert the resulting nested path - a path like `jar:file:....!/some/nested/path` - to a `URL`. If that fails with a `MalformedURLException`, the second tries to convert the path to a `URI`. If the `URI` fallback also fails, an `IOException` is thrown, which eventually bubbles up and discards the whole classpath entry. With verbose output enabled during scanning, this results in a message like the following:

2023-11-02T12:51:42.719+0100	ClassGraph	-- Skipping invalid classpath entry .../spring-boot-fully-executable-jar.jar!/BOOT-INF/lib/... : java.io.IOException: Malformed URI: ...

Most of the time nothing is discarded, because most paths can be converted to a `URL` in the first step, or at least succeed when converted to a `URI`.
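As a rough illustration of the two-step conversion described above, the following sketch tries `URL` first and falls back to `URI`. The class name, method name, and sample paths are hypothetical, and this is not ClassGraph's actual code; it only mirrors the described control flow.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

class NestedPathConversionSketch {
    /**
     * Hypothetical helper: reports which mechanism accepted the path.
     * Throws IOException when both URL and URI parsing reject it.
     */
    @SuppressWarnings("deprecation") // new URL(String) is deprecated on newer JDKs
    static String convert(String path) throws IOException {
        try {
            new URL(path);           // first mechanism: URL
            return "URL";
        } catch (MalformedURLException urlFailure) {
            try {
                new URI(path);       // second mechanism: URI fallback
                return "URI";
            } catch (URISyntaxException uriFailure) {
                // Mimics the "Malformed URI" failure mentioned in the log line above
                throw new IOException("Malformed URI: " + path, uriFailure);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed example path without spaces or hashes
        System.out.println(convert("jar:file:/opt/app.jar!/BOOT-INF/lib/my-lib.jar"));
    }
}
```

A caller that catches the `IOException` and skips the entry would reproduce the behaviour described in this pull request: the entry is dropped although the underlying files exist.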

However, for paths containing spaces and the hash symbol, we can reach a case where both the `URL` conversion and the `URI` conversion fail, so the classpath entry is discarded even though all paths are valid and usable.

Let us assume a Spring Boot executable JAR located in a directory named `ci-build main #123`, which is a valid directory name on Windows and Linux.

When ClassGraph reaches a nested library here, it constructs the paths to the nested JARs in the form `jar:file:<path>!/<nested-path>`.

So in this case we end up with something like `jar:file:/opt/ci-build main #123!/BOOT-INF/lib/my-lib.jar`.

When ClassGraph reaches the conversion code, it first tries to convert the path to a `URL`. This fails with the following message:

java.net.MalformedURLException: no !/ in spec
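The `URL` failure can be reproduced in isolation. The sketch below is a minimal reproduction under the assumption that the standard JDK `jar:` protocol handler is in use; the class and method names are hypothetical. The `#` starts the fragment, so the part handed to the `jar:` handler no longer contains `!/`.

```java
import java.net.MalformedURLException;
import java.net.URL;

class JarUrlHashDemo {
    /** Returns the MalformedURLException message, or null if the spec parses as a URL. */
    @SuppressWarnings("deprecation") // new URL(String) is deprecated on newer JDKs
    static String urlFailure(String spec) {
        try {
            new URL(spec);
            return null;
        } catch (MalformedURLException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Path from the description: the '#' cuts the spec before "!/"
        System.out.println(urlFailure("jar:file:/opt/ci-build main #123!/BOOT-INF/lib/my-lib.jar"));
        // Assumed comparison path without '#': expected to parse
        System.out.println(urlFailure("jar:file:/opt/ci-build/app.jar!/BOOT-INF/lib/my-lib.jar"));
    }
}
```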

If we then fall back to the `URI` conversion, it is also rejected with an exception, because our path contains spaces:

java.net.URISyntaxException: Illegal character in opaque part at index 66: jar:file:...

The index points to the first space in the path being converted.
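The `URI` side can be checked the same way. This is a small sketch with hypothetical class and method names; it relies only on `java.net.URI` and `URISyntaxException.getIndex()`. Because the string after `jar:` does not start with `/`, it is parsed as an opaque URI, where a raw space is not permitted.

```java
import java.net.URI;
import java.net.URISyntaxException;

class OpaqueUriSpaceDemo {
    /** Returns the index reported by URISyntaxException, or -1 if the string parses. */
    static int illegalIndex(String s) {
        try {
            new URI(s);
            return -1;
        } catch (URISyntaxException e) {
            return e.getIndex();
        }
    }

    public static void main(String[] args) {
        String path = "jar:file:/opt/ci-build main #123!/BOOT-INF/lib/my-lib.jar";
        // Compare the reported index with the position of the first space
        System.out.println(illegalIndex(path) + " vs " + path.indexOf(' '));
    }
}
```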

So we can construct nested paths that are neither valid URL instances nor valid URI instances.

To solve this issue, we encode spaces when the path is handled as a URL or multi-section path, so that the conversion can succeed. This also seems to be what the `java.nio.file.Path` API does when asked for the resulting URI of the same path.
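The comparison with `java.nio.file.Path` can be observed with a few lines. This is only a sketch with a hypothetical class name; the exact URI string is platform dependent (drive letters on Windows, a possible trailing slash if the directory exists), so it only shows that spaces and hashes come out percent-encoded.

```java
import java.nio.file.Paths;

class PathToUriDemo {
    /** Converts a file system path to its URI string via Path.toUri(). */
    static String toUriString(String path) {
        return Paths.get(path).toUri().toString();
    }

    public static void main(String[] args) {
        // Directory name from the description; it does not need to exist
        System.out.println(toUriString("/opt/ci-build main #123"));
    }
}
```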

So this commit encodes spaces as `%20` and hash symbols as `%23` when going into the URL/multi-section branch.
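A minimal sketch of that encoding step follows. The helper name `encodeSpacesAndHashes` is hypothetical and the merged code may be structured differently; the sketch only applies the two replacements named above and then checks that both conversions accept the result.

```java
import java.net.URI;
import java.net.URL;

class NestedPathEncodingSketch {
    /** Hypothetical helper: encodes only spaces and hash symbols, as described above. */
    static String encodeSpacesAndHashes(String path) {
        return path.replace(" ", "%20").replace("#", "%23");
    }

    @SuppressWarnings("deprecation") // new URL(String) is deprecated on newer JDKs
    public static void main(String[] args) throws Exception {
        String raw = "jar:file:/opt/ci-build main #123!/BOOT-INF/lib/my-lib.jar";
        String encoded = encodeSpacesAndHashes(raw);
        System.out.println(encoded);
        // Both constructors throw if the encoded path is still rejected
        new URL(encoded);
        new URI(encoded);
    }
}
```

Note that this sketch leaves any existing `%` characters untouched, so it is not a general-purpose URL encoder.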

Fixes #804

@jwachter jwachter force-pushed the fix/nested-with-space-and-hash-path branch from 2246990 to 703a0d5 Compare November 2, 2023 15:21
@jwachter jwachter force-pushed the fix/nested-with-space-and-hash-path branch from 703a0d5 to 72c52de Compare November 2, 2023 15:32
@lukehutch lukehutch merged commit 07c2f49 into classgraph:latest Nov 2, 2023
@lukehutch

Member

I really appreciate your detailed analysis on this! The change looks good to me. Thank you!

@jwachter jwachter deleted the fix/nested-with-space-and-hash-path branch October 31, 2024 23:09
Development

Successfully merging this pull request may close these issues.

Paths with Spaces and Hash don't work when using Nested JARs

2 participants