@mcollina
Member

@mcollina mcollina commented Nov 28, 2025

Summary

Adds a new deduplicate interceptor for request deduplication. When enabled, concurrent identical requests are deduplicated so only one request is sent to the origin server, and all waiting handlers receive the same response.

Changes

  • New file: lib/interceptor/deduplicate.js - Standalone interceptor for request deduplication
  • New file: lib/handler/deduplication-handler.js - Handler that buffers responses for multiple waiting handlers
  • New file: test/interceptors/deduplicate.js - Comprehensive test suite (16 tests)
  • Modified: lib/util/cache.js - Added makeDeduplicationKey() function
  • Modified: index.js - Export new deduplicate interceptor
  • Modified: types/interceptors.d.ts - Added types for deduplicate interceptor
  • Modified: docs/docs/api/Dispatcher.md - Documented the feature
  • Modified: docs/docs/api/DiagnosticsChannel.md - Documented undici:request:pending-requests channel

Usage

const { Client, interceptors } = require("undici");
const { deduplicate, cache } = interceptors;

// Deduplicate only
const client = new Client("http://example.com").compose(
  deduplicate()
);

// Deduplicate with caching
const clientWithCache = new Client("http://example.com").compose(
  deduplicate(),
  cache()
);

Options

  • methods - The safe HTTP methods to deduplicate. Default ['GET'].

Observability

A diagnostic channel undici:request:pending-requests is available for monitoring:

const diagnosticsChannel = require('node:diagnostics_channel');

diagnosticsChannel.channel('undici:request:pending-requests').subscribe(({ type, size, key }) => {
  console.log(type);  // 'added' or 'removed'
  console.log(size);  // current number of pending requests
  console.log(key);   // the deduplication key
});
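
As a rough illustration of how the channel could feed a metric, the sketch below keeps running counts and the peak `size` observed. `createPendingTracker` is a hypothetical helper, not part of undici, and the two `publish` calls are synthetic events standing in for what the interceptor would emit. The message shape (`type`, `size`, `key`) is taken from the example above.

```javascript
const diagnosticsChannel = require('node:diagnostics_channel')

// Hypothetical helper (not part of undici): counts events and remembers
// the largest `size` value reported on the channel.
function createPendingTracker () {
  const stats = { added: 0, removed: 0, peak: 0 }

  function onMessage ({ type, size }) {
    if (type === 'added') stats.added++
    else if (type === 'removed') stats.removed++
    if (size > stats.peak) stats.peak = size
  }

  return { stats, onMessage }
}

const channel = diagnosticsChannel.channel('undici:request:pending-requests')
const tracker = createPendingTracker()
channel.subscribe(tracker.onMessage)

// Synthetic events for illustration only. In real use the deduplicate
// interceptor would publish these, and the values here are made up.
channel.publish({ type: 'added', size: 1, key: 'example-key' })
channel.publish({ type: 'removed', size: 0, key: 'example-key' })

console.log(tracker.stats)
```

A tracker like this could be polled by a metrics exporter; calling `channel.unsubscribe(tracker.onMessage)` stops the collection.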

Test plan

  • All existing cache tests pass (39 tests)
  • New deduplication tests pass (16 tests)
  • Tests cover: basic deduplication, header handling, different paths, error propagation, cache integration, chunked bodies, diagnostic channel events, Authorization/Cookie header differentiation
  • Lint passes

🤖 Generated with Claude Code

Implements request deduplication for the cache interceptor to prevent
multiple identical concurrent requests from being sent to the origin
server. When multiple requests for the same resource are made while
one is already in-flight, subsequent requests wait for the original
request to complete and share its response.

This optimization reduces unnecessary network traffic and server load
when handling concurrent identical requests, particularly useful in
scenarios where the same resource is requested multiple times before
the cache is populated.

Changes:
- Add CacheDeduplicationHandler to manage multiple waiting handlers
- Add makeDeduplicationKey utility for generating request keys
- Integrate deduplication logic into cache interceptor
- Add comprehensive tests for deduplication scenarios

Signed-off-by: Matteo Collina <hello@matteocollina.com>
- Add 'undici:cache:pending-requests' diagnostic channel
- Publish events when pending requests are added/removed
- Add tests for error cleanup and map state verification
- Replace callback approach with Node.js diagnostics channel

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Matteo Collina <hello@matteocollina.com>
- Document request deduplication feature in cache interceptor section
- Add undici:cache:pending-requests diagnostic channel documentation
- Include examples for monitoring deduplication behavior

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Matteo Collina <hello@matteocollina.com>
@mcollina mcollina marked this pull request as ready for review November 28, 2025 22:09
@mcollina
Member Author

I had Claude Opus 4.5 prepare this. It looks good enough, but a second opinion would be interesting.

I opted to not put the dedupe burden on the cache storage. This means that in a distributed cache scenario, you’ll
see n invocations, one per event loop.

Creating the deduplication cache key is relatively expensive, so this
feature is now disabled by default and can be enabled with the
`deduplication: true` option.

- Add `deduplication` option to cache interceptor (default: false)
- Update TypeScript types with new option
- Update documentation to reflect opt-in behavior
- Update all tests to explicitly enable deduplication

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Matteo Collina <hello@matteocollina.com>
@codecov-commenter

codecov-commenter commented Nov 28, 2025

Codecov Report

❌ Patch coverage is 94.94382% with 18 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.85%. Comparing base (fdafc2a) to head (b154b6b).
⚠️ Report is 6 commits behind head on main.

Files with missing lines | Patch % | Lines
lib/handler/deduplication-handler.js | 95.37% | 10 Missing ⚠️
lib/interceptor/deduplicate.js | 92.66% | 8 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4679      +/-   ##
==========================================
- Coverage   92.86%   92.85%   -0.01%     
==========================================
  Files         107      109       +2     
  Lines       33499    33809     +310     
==========================================
+ Hits        31108    31395     +287     
- Misses       2391     2414      +23     

Comment on lines 378 to 384
if (cacheKey.headers) {
  const sortedHeaders = Object.keys(cacheKey.headers).sort()
  for (const header of sortedHeaders) {
    const value = cacheKey.headers[header]
    key += `:${header}=${Array.isArray(value) ? value.join(',') : value}`
  }
}
Member

I cannot find where we strip sensitive headers (e.g. authorization). It might be good to keep them out of the plain-text deduplication key.

Member Author

why would you remove them?

Member

Mostly to avoid keeping them in memory as plain text, but I'm happy to keep it as is if we feel no harm can be done.
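
One way to address the plain-text concern without dropping the headers from the key is to hash the key material. The following is a minimal sketch under stated assumptions, not what the PR implements: `makeHashedKey` is a hypothetical name, and the `origin`, `method`, and `path` fields are assumed. Only the sorted-header loop mirrors the snippet under review.

```javascript
const { createHash } = require('node:crypto')

// Hypothetical variant of the key builder: header values still influence
// the key, but only a SHA-256 digest is retained, not the raw values.
function makeHashedKey ({ origin, method, path, headers }) {
  const hash = createHash('sha256')
  hash.update(`${origin}:${method}:${path}`)

  if (headers) {
    // Sort header names so that insertion order does not change the key.
    for (const name of Object.keys(headers).sort()) {
      const value = headers[name]
      hash.update(`:${name}=${Array.isArray(value) ? value.join(',') : value}`)
    }
  }

  return hash.digest('hex')
}

const exampleKey = makeHashedKey({
  origin: 'http://example.com',
  method: 'GET',
  path: '/profile',
  headers: { authorization: 'Bearer secret-token', accept: 'application/json' }
})
console.log(exampleKey.length) // 64 hex characters
```

The trade-off is extra CPU per request on top of key construction that the author already describes as relatively expensive, and the hashed key is no longer human-readable in diagnostics output.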

* Map of pending requests for deduplication (null if deduplication is disabled)
* @type {Map<string, CacheDeduplicationHandler> | null}
*/
const pendingRequests = deduplication ? new Map() : null
Member

Should we be worried that it can grow without bound?
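
For context on the growth question, here is a simplified promise-based sketch of the in-flight map pattern. It is an illustration only: the PR uses dispatcher handlers rather than promises, and `createDeduplicator` and `send` are hypothetical names. In this pattern each entry is removed when its request settles, whether it succeeds or fails, so the map holds at most one entry per distinct in-flight key.

```javascript
// Hypothetical sketch, not the PR's handler-based implementation.
function createDeduplicator (send) {
  // key -> promise for the request currently in flight
  const pending = new Map()

  function request (key) {
    const existing = pending.get(key)
    if (existing !== undefined) return existing

    // The async wrapper turns a synchronous throw from `send` into a
    // rejection; `finally` removes the entry on success and on error.
    const promise = (async () => send(key))()
      .finally(() => pending.delete(key))

    pending.set(key, promise)
    return promise
  }

  return { request, pending }
}

let originCalls = 0
const deduplicator = createDeduplicator(async (key) => {
  originCalls++
  return `response for ${key}`
})

const first = deduplicator.request('GET:/users')
const second = deduplicator.request('GET:/users')
console.log(first === second, originCalls, deduplicator.pending.size)
```

Under this pattern the map size tracks the number of distinct requests currently in flight rather than the total number of requests seen, although a burst of many distinct keys could still make it large for a while.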

Verify that requests with different Authorization or Cookie headers
are NOT deduplicated (since they may return different responses for
different users), while requests with the same per-user headers ARE
deduplicated.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Matteo Collina <hello@matteocollina.com>
@mcollina
Member Author

I wonder if a better approach would be to have a completely separate interceptor instead.

@metcoder95
Member

But doesn't this depend on the cache interceptor?

@mcollina
Member Author

mcollina commented Dec 4, 2025

@metcoder95 not much tbh.

@metcoder95
Member

Then yeah, better to have it as a separate interceptor.

- Create new `deduplicate` interceptor at lib/interceptor/deduplicate.js
- Rename CacheDeduplicationHandler to DeduplicationHandler
- Remove deduplication logic from cache interceptor
- Update diagnostic channel name to `undici:request:pending-requests`
- Update documentation and TypeScript types

BREAKING CHANGE: Request deduplication is now a separate interceptor.
Use `interceptors.deduplicate()` instead of `cache({ deduplication: true })`.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Matteo Collina <hello@matteocollina.com>
const server = createServer({ joinDuplicateHeaders: true }, async (req, res) => {
  requestsToOrigin++
  await sleep(100)
  res.end(`response for ${req.url}`)

Check failure

Code scanning / CodeQL

Reflected cross-site scripting High test

Cross-site scripting vulnerability due to a user-provided value.

Copilot Autofix

AI about 1 month ago

The problem is that user-controlled input (req.url) is reflected directly in the HTTP response. The correct fix is to escape this value for safe inclusion in the response, using a standardized escaping function, before sending it. The most straightforward and robust solution in Node.js is to use the escape-html npm package, as shown in the background material. This package will encode special HTML characters, neutralizing embedded scripts.

Steps to fix:

  • Import the escape-html package at the top of the file.
  • Wherever a user input (here, req.url) is incorporated into the response, wrap it with the escape function.
  • The fix must be applied on line 162 in the function that sends a response using `response for ${req.url}`.
  • No other changes are required.

Suggested changeset 2
test/interceptors/deduplicate.js

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/test/interceptors/deduplicate.js b/test/interceptors/deduplicate.js
--- a/test/interceptors/deduplicate.js
+++ b/test/interceptors/deduplicate.js
@@ -2,6 +2,7 @@
 
 const { createServer } = require('node:http')
 const { describe, test, after } = require('node:test')
+const escape = require('escape-html');
 const { once } = require('node:events')
 const { strictEqual } = require('node:assert')
 const { setTimeout: sleep } = require('node:timers/promises')
@@ -159,7 +160,7 @@
     const server = createServer({ joinDuplicateHeaders: true }, async (req, res) => {
       requestsToOrigin++
       await sleep(100)
-      res.end(`response for ${req.url}`)
+      res.end(`response for ${escape(req.url)}`)
     }).listen(0)
 
     const client = new Client(`http://localhost:${server.address().port}`)
EOF
package.json
Outside changed files

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/package.json b/package.json
--- a/package.json
+++ b/package.json
@@ -148,5 +148,8 @@
     "testMatch": [
       "<rootDir>/test/jest/**"
     ]
-  }
+  },
+  "dependencies": {
+    "escape-html": "^1.0.3"
 }
+}
EOF
This fix introduces these dependencies
Package Version Security advisories
escape-html (npm) 1.0.3 None
Copilot is powered by AI and may make mistakes. Always verify output.
@mcollina mcollina changed the title feat: add request deduplication to cache interceptor feat: add deduplicate interceptor for request deduplication Dec 6, 2025
Member

@metcoder95 metcoder95 left a comment

lgtm, just left a comment about potentially using sensitive headers as part of the plain text key

@mcollina
Member Author

If we didn't include sensitive headers, requests with different Authorization values would be deduplicated together, causing security incidents.

I'll add an option to not deduplicate requests with certain headers.

Add two new options to the deduplicate interceptor:

- skipHeaderNames: Header names that, if present in a request, will
  cause the request to skip deduplication entirely. Useful for headers
  like idempotency-key where presence indicates unique processing.

- excludeHeaderNames: Header names to exclude from the deduplication
  key. Requests with different values for these headers will still be
  deduplicated together. Useful for headers like x-request-id that
  vary per request but shouldn't affect deduplication.

Both options use Sets internally for efficient lookups and support
case-insensitive header name matching.
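
The effect of the two options on key construction can be sketched as follows. This is a hedged illustration based only on the description above: `sketchDeduplicationKey` is a hypothetical function, the `method` and `path` request fields are assumed, and returning `null` to signal "do not deduplicate" is an assumption of this sketch rather than the interceptor's actual contract.

```javascript
// Hypothetical sketch of how skipHeaderNames and excludeHeaderNames
// could influence the deduplication key.
function sketchDeduplicationKey (request, options = {}) {
  const { skipHeaderNames = [], excludeHeaderNames = [] } = options

  // Sets with lowercased names give cheap, case-insensitive lookups.
  const skip = new Set(skipHeaderNames.map((name) => name.toLowerCase()))
  const exclude = new Set(excludeHeaderNames.map((name) => name.toLowerCase()))

  const entries = Object.entries(request.headers ?? {})
    .map(([name, value]) => [name.toLowerCase(), value])
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))

  let key = `${request.method}:${request.path}`
  for (const [name, value] of entries) {
    // Presence of a skip header means the request bypasses deduplication.
    if (skip.has(name)) return null
    // Excluded headers are left out, so their values cannot split requests.
    if (exclude.has(name)) continue
    key += `:${name}=${Array.isArray(value) ? value.join(',') : value}`
  }
  return key
}

const sketchOptions = {
  skipHeaderNames: ['Idempotency-Key'],
  excludeHeaderNames: ['X-Request-Id']
}

const keyA = sketchDeduplicationKey(
  { method: 'GET', path: '/a', headers: { 'x-request-id': '1', authorization: 'u1' } },
  sketchOptions
)
const keyB = sketchDeduplicationKey(
  { method: 'GET', path: '/a', headers: { 'x-request-id': '2', authorization: 'u1' } },
  sketchOptions
)
console.log(keyA === keyB) // the excluded header does not affect the key
```

Note that headers such as Authorization stay in the key unless explicitly excluded, which matches the point made earlier in the thread that requests from different users must not share a response.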

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Matteo Collina <hello@matteocollina.com>
Member

@gurgunday gurgunday left a comment

lgtm

@mcollina mcollina merged commit b902551 into main Dec 16, 2025
32 of 35 checks passed
@mcollina mcollina deleted the feat/cache-request-deduplication branch December 16, 2025 22:39
@github-actions github-actions bot mentioned this pull request Jan 5, 2026