-
-
Notifications
You must be signed in to change notification settings - Fork 689
feat: add deduplicate interceptor for request deduplication #4679
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Implements request deduplication for the cache interceptor to prevent multiple identical concurrent requests from being sent to the origin server. When multiple requests for the same resource are made while one is already in-flight, subsequent requests wait for the original request to complete and share its response. This optimization reduces unnecessary network traffic and server load when handling concurrent identical requests, particularly useful in scenarios where the same resource is requested multiple times before the cache is populated. Changes: - Add CacheDeduplicationHandler to manage multiple waiting handlers - Add makeDeduplicationKey utility for generating request keys - Integrate deduplication logic into cache interceptor - Add comprehensive tests for deduplication scenarios Signed-off-by: Matteo Collina <hello@matteocollina.com>
- Add 'undici:cache:pending-requests' diagnostic channel - Publish events when pending requests are added/removed - Add tests for error cleanup and map state verification - Replace callback approach with Node.js diagnostics channel 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Matteo Collina <hello@matteocollina.com>
- Document request deduplication feature in cache interceptor section - Add undici:cache:pending-requests diagnostic channel documentation - Include examples for monitoring deduplication behavior 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Matteo Collina <hello@matteocollina.com>
|
I had claude opus 4.5 prepare this. Looks good enough but a second opionion would be interesting. I opted to not put the dedupe burden on the cache storage. This means that in a distributed cache scenario, you’ll |
Creating the deduplication cache key is relatively expensive, so this feature is now disabled by default and can be enabled with the `deduplication: true` option. - Add `deduplication` option to cache interceptor (default: false) - Update TypeScript types with new option - Update documentation to reflect opt-in behavior - Update all tests to explicitly enable deduplication 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Matteo Collina <hello@matteocollina.com>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4679 +/- ##
==========================================
- Coverage 92.86% 92.85% -0.01%
==========================================
Files 107 109 +2
Lines 33499 33809 +310
==========================================
+ Hits 31108 31395 +287
- Misses 2391 2414 +23 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| if (cacheKey.headers) { | ||
| const sortedHeaders = Object.keys(cacheKey.headers).sort() | ||
| for (const header of sortedHeaders) { | ||
| const value = cacheKey.headers[header] | ||
| key += `:${header}=${Array.isArray(value) ? value.join(',') : value}` | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cannot find the place where we strip sensitive headers (e.g. authorization), it might be good to remove it from plain text to be used for deduplication key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why would you remove them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mostly to avoid keeping it in memory plain text, but happy to keep it as is if we feel no possible harm can be done
lib/interceptor/cache.js
Outdated
| * Map of pending requests for deduplication (null if deduplication is disabled) | ||
| * @type {Map<string, CacheDeduplicationHandler> | null} | ||
| */ | ||
| const pendingRequests = deduplication ? new Map() : null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we worried that it can go out of bounds?
Verify that requests with different Authorization or Cookie headers are NOT deduplicated (since they may return different responses for different users), while requests with the same per-user headers ARE deduplicated. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Matteo Collina <hello@matteocollina.com>
|
I wonder if a better approach would be to have a completely separate interceptor instead. |
|
But doesn't this depends on the cache interceptor? |
|
@metcoder95 not much tbh. |
|
Then yeah, better to have it as a separated interceptor |
- Create new `deduplicate` interceptor at lib/interceptor/deduplicate.js
- Rename CacheDeduplicationHandler to DeduplicationHandler
- Remove deduplication logic from cache interceptor
- Update diagnostic channel name to `undici:request:pending-requests`
- Update documentation and TypeScript types
BREAKING CHANGE: Request deduplication is now a separate interceptor.
Use `interceptors.deduplicate()` instead of `cache({ deduplication: true })`.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Matteo Collina <hello@matteocollina.com>
| const server = createServer({ joinDuplicateHeaders: true }, async (req, res) => { | ||
| requestsToOrigin++ | ||
| await sleep(100) | ||
| res.end(`response for ${req.url}`) |
Check failure
Code scanning / CodeQL
Reflected cross-site scripting High test
user-provided value
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI about 1 month ago
The problem is that user-controlled input (req.url) is reflected directly in the HTTP response. The correct fix is to escape this value for safe inclusion in the response, using a standardized escaping function, before sending it. The most straightforward and robust solution in Node.js is to use the escape-html npm package, as shown in the background material. This package will encode special HTML characters, neutralizing embedded scripts.
Steps to fix:
- Import the
escape-htmlpackage at the top of the file. - Wherever a user input (here,
req.url) is incorporated into the response, wrap it with theescapefunction. - The fix must be applied on line 162 in the function that sends a response using
`response for ${req.url}`. - No other changes are required.
-
Copy modified line R5 -
Copy modified line R163
| @@ -2,6 +2,7 @@ | ||
|
|
||
| const { createServer } = require('node:http') | ||
| const { describe, test, after } = require('node:test') | ||
| const escape = require('escape-html'); | ||
| const { once } = require('node:events') | ||
| const { strictEqual } = require('node:assert') | ||
| const { setTimeout: sleep } = require('node:timers/promises') | ||
| @@ -159,7 +160,7 @@ | ||
| const server = createServer({ joinDuplicateHeaders: true }, async (req, res) => { | ||
| requestsToOrigin++ | ||
| await sleep(100) | ||
| res.end(`response for ${req.url}`) | ||
| res.end(`response for ${escape(req.url)}`) | ||
| }).listen(0) | ||
|
|
||
| const client = new Client(`http://localhost:${server.address().port}`) |
-
Copy modified lines R151-R153 -
Copy modified line R155
| @@ -148,5 +148,8 @@ | ||
| "testMatch": [ | ||
| "<rootDir>/test/jest/**" | ||
| ] | ||
| } | ||
| }, | ||
| "dependencies": { | ||
| "escape-html": "^1.0.3" | ||
| } | ||
| } |
| Package | Version | Security advisories |
| escape-html (npm) | 1.0.3 | None |
metcoder95
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, just left a comment about potentially using sensitive headers as part of the plain text key
|
if we didn't include sensitive headers, then requests with different Authorization would be deduplicated, causing security incidents. I'll add an option to not deduplicate requests with certain headers. |
Add two new options to the deduplicate interceptor: - skipHeaderNames: Header names that, if present in a request, will cause the request to skip deduplication entirely. Useful for headers like idempotency-key where presence indicates unique processing. - excludeHeaderNames: Header names to exclude from the deduplication key. Requests with different values for these headers will still be deduplicated together. Useful for headers like x-request-id that vary per request but shouldn't affect deduplication. Both options use Sets internally for efficient lookups and support case-insensitive header name matching. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> Signed-off-by: Matteo Collina <hello@matteocollina.com>
gurgunday
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Summary
Adds a new
deduplicateinterceptor for request deduplication. When enabled, concurrent identical requests are deduplicated so only one request is sent to the origin server, and all waiting handlers receive the same response.Changes
lib/interceptor/deduplicate.js- Standalone interceptor for request deduplicationlib/handler/deduplication-handler.js- Handler that buffers responses for multiple waiting handlerstest/interceptors/deduplicate.js- Comprehensive test suite (16 tests)lib/util/cache.js- AddedmakeDeduplicationKey()functionindex.js- Export newdeduplicateinterceptortypes/interceptors.d.ts- Added types for deduplicate interceptordocs/docs/api/Dispatcher.md- Documented the featuredocs/docs/api/DiagnosticsChannel.md- Documentedundici:request:pending-requestschannelUsage
Options
methods- The safe HTTP methods to deduplicate. Default['GET'].Observability
A diagnostic channel
undici:request:pending-requestsis available for monitoring:Test plan
🤖 Generated with Claude Code