fix(url): ignore fragment identifiers in getPath()#4627
fix(url): ignore fragment identifiers in getPath()#4627yusukebe merged 5 commits intohonojs:mainfrom
Conversation
In Service Worker contexts, URLs may contain fragment identifiers (#) which should not be included in route matching. This change modifies getPath() to strip fragment identifiers from the path, consistent with how query strings are already handled. Fixes honojs#4440
When a URL contains a fragment with query-like characters (e.g., #section?foo=bar),
the previous logic would incorrectly include the fragment in the path because it
checked for '?' first and only looked for '#' if '?' was not found.
This fix uses Math.min() to find whichever delimiter ('?' or '#') comes first,
correctly handling edge cases per RFC 3986 where '#' terminates the path regardless
of subsequent '?' characters in the fragment.
Added test case for this edge case.
Fixed: Percent-encoded path with fragment containing query-like charactersThe previous implementation had a bug where it would check for let separator = url.indexOf('?', i)
if (separator === -1) {
separator = url.indexOf('#', i)
}This caused incorrect behavior when a URL contains a fragment with query-like characters (e.g., FixChanged to find whichever delimiter ( const queryIndex = url.indexOf('?', i)
const hashIndex = url.indexOf('#', i)
const separator =
queryIndex === -1 ? hashIndex : hashIndex === -1 ? queryIndex : Math.min(queryIndex, hashIndex)Added Testit('getPath - with percent encoding and fragment containing query-like chars', () => {
const path = getPath(new Request('https://example.com/hello%20world#section?foo=bar'))
expect(path).toBe('/hello world')
})All tests pass. |
Verify that percent-encoded hash (%23) is preserved while real fragment identifiers are correctly stripped. decodeURI preserves reserved characters per Web Standards.
usualoma
left a comment
There was a problem hiding this comment.
Thank you for creating the pull request! I think the direction is good.
With the current code, === -1 is evaluated three times for /path?x=1, so I think the following is better:
const end =
queryIndex === -1
? hashIndex === -1
? undefined
: hashIndex
: hashIndex === -1
? queryIndex
: Math.min(queryIndex, hashIndex)
const path = url.slice(start, end)
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4627 +/- ##
==========================================
- Coverage 91.44% 91.44% -0.01%
==========================================
Files 173 173
Lines 11321 11330 +9
Branches 3280 3286 +6
==========================================
+ Hits 10353 10361 +8
- Misses 967 968 +1
Partials 1 1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Restructure ternary to avoid redundant separator === -1 check. For /path?x=1, reduces evaluations from 3 to 2. Co-authored-by: usualoma <usualoma@users.noreply.github.com>
Cover the undefined branch when both queryIndex and hashIndex are -1.
|
Thank you for the review and the optimization suggestion! I've updated the code as you suggested. The restructured ternary now avoids the redundant I also added a test case to cover the (Sorry if my English is a bit rough - it's not my native language!) |
| const queryIndex = url.indexOf('?', i) | ||
| const path = url.slice(start, queryIndex === -1 ? undefined : queryIndex) | ||
| const hashIndex = url.indexOf('#', i) | ||
| const end = |
There was a problem hiding this comment.
How about the following implementation? This is more short and readable.
const end = Math.min(
queryIndex === -1 ? url.length : queryIndex,
hashIndex === -1 ? url.length : hashIndex
)This runs Math.min every time, but in some cases that contain percent encoding, the path has a query.
@sano-suguru @usualoma What do you think of it?
There was a problem hiding this comment.
@yusukebe
Sorry I missed that. Either one seems fine.
The character count after minification is probably the same, right?
Math.min(a===-1?u.length:a,b===-1?u.length:b)a===-1?b===-1?void 0:b:b===-1?a:Math.min(a,b)Performance-wise, there probably won't be a noticeable difference in production. I prefer the current implementation, but either one is fine.
There was a problem hiding this comment.
Sorry I missed that. Either one seems fine.
No problem!
The character count after minification is probably the same, right?
Ah, yes. So it's just for readability. If you feel both are okay, I think we should keep the current implementation.
|
I've left a small comment, but let's go with this implementation. Looks good. Thanks! |
Summary
getPath()to strip fragment identifiers (#) from URLsFixes #4440
Changes
In Service Worker contexts, URLs may contain fragment identifiers (e.g.,
/users/1#profile-section) which were incorrectly included in route parameters. This change modifiesgetPath()to strip fragment identifiers, consistent with how query strings are already handled.Before
After
Implementation
Based on the approach suggested by @usualoma in #4440:
#(charCode 35) to the loop termination conditionTest plan