Skip to content

Add configurable DNS parallelization & round-robin#156

Merged
CorentinB merged 3 commits into
masterfrom
feat/configurable-dns-limits
Nov 13, 2025
Merged

Add configurable DNS parallelization & round-robin#156
CorentinB merged 3 commits into
masterfrom
feat/configurable-dns-limits

Conversation

@CorentinB

@CorentinB CorentinB commented Nov 12, 2025

Copy link
Copy Markdown
Collaborator

This pull request introduces a configurable DNS concurrency feature to the HTTP client, allowing DNS queries to be performed either sequentially or in parallel, improving DNS resolution performance and flexibility. It refactors the DNS resolution logic to support early cancellation when results are found, and updates the client and dialer code to handle the new concurrency option. Additionally, minor improvements and dependency updates are included.

Note: the current behavior is kinda weird, we can add as many DNS servers as we want when configuring the client, but only 4 at most are being used. It felt like a very weird behavior (my fault) so I made this.

How round-robin and concurrency work together

The Key: Round-Robin is BETWEEN requests, not within

What Round-Robin does

Round-robin only determines which server to start with for each new DNS request. It rotates the starting position across different requests.

What happens within a single request

Let's say you have 5 DNS servers [A, B, C, D, E] and DNSConcurrency = 3:

Request 1 (startIdx = 0):

  • Server queue created: [A, B, C, D, E]
  • This queue goes into a channel
  • 3 workers are spawned (line 107)
  • Worker 1 pulls A from channel → queries server A
  • Worker 2 pulls B from channel → queries server B
  • Worker 3 pulls C from channel → queries server C
  • All 3 queries happen at the same time ← This is the concurrency
  • When Worker 1 finishes with A, it pulls D from channel
  • When Worker 2 finishes with B, it pulls E from channel
  • When Worker 3 finishes with C, channel is empty

So within Request 1, you're trying servers A, B, C concurrently (not round-robin).

Request 2 (different request, startIdx = 1):

  • Server queue created: [B, C, D, E, A] ← Starting position rotated
  • 3 workers query B, C, D concurrently

Request 3 (startIdx = 2):

  • Server queue created: [C, D, E, A, B]
  • 3 workers query C, D, E concurrently

The Pattern

  • Within a single request: Servers are tried concurrently (worker pool pattern)
  • Across multiple requests: Starting position rotates (round-robin)

The round-robin ensures that over many requests, all servers get equal load, even though within each individual request you're hitting multiple servers at once.

DNS Concurrency Configuration & Logic

  • Added a new DNSConcurrency field to HTTPClientSettings and passed it through the client and dialer constructors, enabling users to control how many DNS servers are queried in parallel. [1] [2] [3]
  • Implemented the concurrentDNSLookup method in customDialer, which performs DNS queries with the specified concurrency, supports unlimited concurrency, and cancels remaining queries early when results are found. [1] [2]

Code Structure & Refactoring

  • Refactored the DNS resolution logic in archiveDNS to use the new concurrent lookup method, replacing the previous fallback and goroutine approach.
  • Updated network address selection logic to use switch statements for clarity and maintainability in both TCP and UDP cases. [1] [2]
  • Introduced the dnsExchanger interface for DNS clients, improving abstraction and testability.

Dependency Updates

  • Updated Go version to 1.25.4 and added golang.org/x/sys v0.37.0 as a direct dependency. [1] [2] [3]

Testing Improvements

  • Improved error handling in the testFileEarlyEOF test to ensure decompression errors are caught and reported, and added missing error checks for robustness. [1] [2]

@CorentinB CorentinB requested a review from Copilot November 12, 2025 13:46
@CorentinB CorentinB self-assigned this Nov 12, 2025
@CorentinB CorentinB added the enhancement New feature or request label Nov 12, 2025
@CorentinB CorentinB changed the title Add configurable DNS settings & round-robin DNS Add configurable DNS settings & round-robin Nov 12, 2025
@CorentinB CorentinB requested a review from NGTmeaty November 12, 2025 13:46
@CorentinB CorentinB changed the title Add configurable DNS settings & round-robin Add configurable DNS parallelization & round-robin Nov 12, 2025

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This pull request introduces configurable DNS concurrency to improve DNS resolution performance and flexibility. The changes enable users to control whether DNS servers are queried sequentially or in parallel, with support for early cancellation when results are found. The implementation includes round-robin DNS server selection and refactored network address selection logic.

  • Adds DNS concurrency configuration with sequential, parallel (limited), and unlimited modes
  • Implements round-robin DNS server selection using atomic counters
  • Refactors DNS resolution logic with early cancellation support

Reviewed Changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
go.mod Updates Go version and moves golang.org/x/sys to direct dependency
client.go Adds DNSConcurrency field to HTTPClientSettings and passes it to dialer constructor
dialer.go Adds dnsExchanger interface, dnsConcurrency field, and dnsRoundRobinIndex for round-robin server selection; refactors network address selection to use switch statements
dns.go Removes hardcoded DNS server limit and implements configurable concurrent DNS lookup with early cancellation
dns_test.go Adds mock DNS client and comprehensive tests for DNS concurrency modes, round-robin, early cancellation, and IPv4/IPv6 handling
read_test.go Adds missing error check for decompression reader creation and formatting improvements
main_test.go Fixes indentation from spaces to tabs

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread dns.go
Comment thread dns_test.go
The drainer goroutine in setupMock was calling both wg.Go() and defer wg.Done(),
causing Done() to be called twice and resulting in a negative WaitGroup counter panic.
@CorentinB CorentinB requested a review from yzqzss November 12, 2025 14:10

@NGTmeaty NGTmeaty left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial review looks good! I'd like to test it more tomorrow, but otherwise LGTM!

Comment thread dns.go
var ipv4Errors, ipv6Errors []error
for res := range resultChan {
if res.err == nil {
if res.recordType == dns.TypeA && ipv4 == nil {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While we have the records here, do we have any interest in checking if they are the same?

* feat: optimize DNS resolution to eliminate duplicate queries

Problem:
- DNS was resolved twice for every connection
- First by gowarc (archiveDNS) for archiving/caching
- Then by Go stdlib/proxy for actual connection
- Result: wasted resources, duplicate DNS queries

Solution:
- Track if proxy requires hostname (socks5h, http) vs IP (socks5)
- Use gowarc's resolved IP for direct connections and socks5 proxies
- Pass hostname only to socks5h/http proxies (remote DNS)

Changes:
- Add proxyNeedsHostname bool to customDialer struct
- Detect proxy scheme in newCustomDialer() to set flag
- Extract port and construct IP:port in CustomDialContext()
- Extract port and construct IP:port in CustomDialTLSContext()
- Use resolved IP when proxy doesn't need hostname OR no proxy

Benefits:
- Single DNS query per connection (50% reduction)
- Respects proxy DNS semantics (socks5h = remote DNS)
- Preserves WARC DNS archiving functionality
- Improves performance and reduces DNS server load

* feat: skip DNS archiving for remote DNS proxies to prevent privacy leaks

When using remote DNS proxies (socks5h, socks4a, http, https), DNS
archiving is now skipped to avoid privacy leaks and ensure accuracy.

Changes:
- Modified CustomDialContext() and CustomDialTLSContext() to conditionally
  skip archiveDNS() call when using remote DNS proxies
- For remote DNS proxies: hostname is sent directly to proxy (no local DNS)
- For direct/local DNS proxies: DNS is resolved and archived as before
- Added comprehensive documentation in README.md explaining DNS behavior
  for different proxy types

Benefits:
1. Privacy: No local DNS queries when using socks5h (prevents DNS leak)
2. Accuracy: WARC reflects actual connection (no local DNS mismatch)
3. Performance: Eliminates unnecessary DNS query for remote DNS proxies

Trade-off: No DNS WARC records are created for remote DNS proxy connections,
which is acceptable since the proxy handles DNS resolution remotely and
local DNS results may differ from the proxy's resolution.

This change is part of the DNS optimization work to eliminate duplicate
DNS queries and improve privacy when using remote DNS proxies.

@NGTmeaty NGTmeaty left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@CorentinB CorentinB merged commit 7b5340a into master Nov 13, 2025
5 checks passed
@CorentinB CorentinB deleted the feat/configurable-dns-limits branch November 13, 2025 10:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants