This post is a review of my notes on host collision (virtual host enumeration) – what it is, how it works, and why it still matters today.
It also doubles as a “design doc” for my tool HostCollision.
0x00 Motivation: When ports are open but the site is “missing”
Typical recon story:
- You do IP/port scanning, find lots of 80/443/8080/8443.
- You open them in a browser full of hope.
- You get 403, 404, “Welcome to nginx”, Tomcat default page, random WAF splash screens…
Clearly, something is running there, but not necessarily the app you’re after.
In modern environments, this is normal:
- Fronted by load balancers / reverse proxies / CDNs / WAFs.
- Multiple virtual hosts (vhosts) on the same IP.
- Internal or “hidden” apps routed only when the right Host header appears.
This is where host collision comes in:
we abuse how HTTP/1.1 routes requests by Host to discover additional sites behind a single IP.
0x01 Quick recap: Host header and virtual hosts
1.1 Host header in HTTP/1.1
In HTTP/1.1, Host is a mandatory header:
GET / HTTP/1.1
Host: example.com
The TCP connection (IP + port) says “which machine did I connect to”,
the Host header says “which website on this machine do I want”.
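To make the split between "connection target" and "requested site" concrete, here is a minimal sketch that builds the raw bytes of such a request. The function name `build_request` and the added `Connection: close` header are my own choices for illustration, not something any particular tool requires.

```python
def build_request(host: str, path: str = "/") -> bytes:
    """Return a minimal HTTP/1.1 GET request with an explicit Host header."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",          # which site we ask for, independent of the IP we dial
        "Connection: close",      # assumption: one request per connection keeps things simple
        "",
        "",                       # blank line terminates the header block
    ]
    return "\r\n".join(lines).encode("ascii")


request = build_request("example.com")
print(request.decode("ascii"))
```

The same bytes can be sent to any IP and port; only the `Host` line tells the server which site is wanted.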
1.2 How web servers use Host
Web servers (Nginx/Apache/etc.) commonly use name-based virtual hosts:
server {
    listen 80;
    server_name www.aaa.com;
    # ...
}
server {
    listen 80;
    server_name www.bbb.com;
    # ...
}
server {
    listen 80 default_server;
    server_name _;
    # default / fallback vhost
}
Routing logic is roughly:
- Accept connection on IP:80.
- Parse HTTP request → read Host: <something>.
- Match server_name / vhost definition.
- If no match → send traffic to a default vhost (often a boring page).
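The routing steps above can be modelled in a few lines of Python. This is a deliberately simplified sketch: real servers such as Nginx also support wildcard and regex names, and the table `VHOSTS`, the backend labels, and the function `route` are hypothetical names used only for illustration.

```python
# Hypothetical vhost table mirroring the example config above.
VHOSTS = {
    "www.aaa.com": "site-aaa",
    "www.bbb.com": "site-bbb",
}
DEFAULT_VHOST = "default"  # the fallback / "boring page" vhost


def route(host_header: str) -> str:
    """Pick a backend from the Host header, falling back to the default vhost."""
    # Normalise: trim, lowercase, drop an optional ":port" suffix and trailing dot.
    # (IPv6 literals are not handled in this sketch.)
    name = host_header.strip().lower().split(":", 1)[0].rstrip(".")
    return VHOSTS.get(name, DEFAULT_VHOST)


print(route("www.aaa.com"))           # exact match
print(route("WWW.BBB.COM:80"))        # case and port are ignored
print(route("intranet.example.com"))  # unknown name goes to the default vhost
```

The point to notice is that nothing in `route` consults DNS: whatever name arrives in the header is looked up in the server's own table.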
If admins deploy internal apps (e.g. intranet.example.com, admin.example.com) on the same front-end but don’t expose them via public DNS, they may still be reachable as long as the reverse proxy sees the right Host header.
That’s the attack surface host collision abuses.
0x02 So what exactly is “host collision”?
2.1 One-sentence definition
Host collision / virtual host fuzzing is sending HTTP requests to a fixed IP while fuzzing the Host header, in order to discover additional vhosts routed through the same front-end.
Concretely:
- URL: http://<IP>/
- Header: Host: <some-domain>
Instead of doing “DNS brute force” (ask DNS for foo.example.com, bar.example.com…), you:
- Talk directly to the web server / reverse proxy by IP.
- Change only the Host header over HTTP.
- Observe which combinations produce meaningful responses.
You maintain two buckets:
- IP bucket: ip.txt
- Host bucket (domain/subdomain dictionary): host.txt
Process:
for each ip in ip_list:
    for each host in host_list:
        send HTTP request:
            URL  = http://ip/
            Host = host
        record status / length / body fingerprint / similarity
If 10.0.0.5 + Host: intranet.example.com suddenly returns a valid app while all other combos return error/default pages, then:
- 10.0.0.5 likely fronts the vhost intranet.example.com.
- This vhost may be “internal-only” from a DNS perspective, but the HTTP gateway still routes to it.
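The loop above could look roughly like this in Python, using only the standard library. It is a sketch under assumptions, not the HostCollision implementation: the helper names (`load_lines`, `probe`, `collide`), the plain-HTTP port 80 default, and the 5-second timeout are all illustrative choices.

```python
import http.client
from itertools import product
from pathlib import Path


def load_lines(path):
    """Read non-empty, non-comment lines from a wordlist such as ip.txt or host.txt."""
    lines = (ln.strip() for ln in Path(path).read_text().splitlines())
    return [ln for ln in lines if ln and not ln.startswith("#")]


def probe(ip, host, port=80, timeout=5.0):
    """Send GET / to ip:port with a chosen Host header; return (status, body)."""
    conn = http.client.HTTPConnection(ip, port, timeout=timeout)
    try:
        # Passing our own Host header stops http.client from adding one for the IP.
        conn.request("GET", "/", headers={"Host": host, "Connection": "close"})
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()


def collide(ips, hosts, probe_fn=probe):
    """Yield (ip, host, status, length) for every pair; status is None on errors."""
    for ip, host in product(ips, hosts):
        try:
            status, body = probe_fn(ip, host)
        except (OSError, http.client.HTTPException):
            yield ip, host, None, 0
            continue
        yield ip, host, status, len(body)
```

A typical call would be `collide(load_lines("ip.txt"), load_lines("host.txt"))`. The `probe_fn` parameter exists only so the loop can be exercised with a stub instead of real traffic.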
0x03 Normal flow vs. host collision flow
3.1 Normal user flow
When a normal user visits https://app.example.com/:
- Browser resolves app.example.com via DNS.
- Gets IP, say 1.2.3.4.
- Connects to 1.2.3.4:443, does TLS handshake (SNI=app.example.com).
- Sends HTTP request with Host: app.example.com.
- Load balancer / reverse proxy routes to the correct backend based on SNI / Host.
3.2 What host collision changes
Host collision decouples DNS from HTTP routing:
- We no longer care what DNS says.
- We only need an IP that accepts HTTP/HTTPS.
- We send Host values that the operator did not intend to expose externally.
Example:
GET / HTTP/1.1
Host: admin.internal.example.com
sent directly to 203.0.113.10 (a public IP).
If the front-end is misconfigured, it might route this to the internal admin app even though admin.internal.example.com doesn’t resolve in public DNS.
In other words:
DNS says “no such host”. HTTP routing says “sure, come in”.
That gap is exactly what we exploit.
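For HTTPS targets the same decoupling applies to SNI: we connect to the IP but put the candidate name in both the TLS handshake and the Host header. The sketch below assumes certificate verification is switched off (the certificate will often not match during recon) and uses hypothetical helper names `make_context`, `probe_https`, and `parse_status`.

```python
import socket
import ssl


def make_context() -> ssl.SSLContext:
    """TLS context for recon: still sends SNI, but does not validate the certificate."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # must be disabled before verify_mode is relaxed
    ctx.verify_mode = ssl.CERT_NONE  # assumption: we only care about routing, not trust
    return ctx


def parse_status(raw: bytes):
    """Extract the numeric status code from a raw HTTP response, or None."""
    parts = raw.split(b"\r\n", 1)[0].split(b" ", 2)
    if len(parts) < 2 or not parts[1].isdigit():
        return None
    return int(parts[1].decode("ascii"))


def probe_https(ip, host, port=443, timeout=5.0) -> bytes:
    """Connect to ip:port, use `host` as SNI and Host header, return the raw response."""
    request = f"GET / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n".encode("ascii")
    with socket.create_connection((ip, port), timeout=timeout) as raw_sock:
        # server_hostname sets the SNI value; DNS is never consulted for `host`.
        with make_context().wrap_socket(raw_sock, server_hostname=host) as tls:
            tls.sendall(request)
            chunks = []
            while True:
                data = tls.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)
```

For example, `parse_status(probe_https("203.0.113.10", "admin.internal.example.com"))` would report how the front-end answers for a name that public DNS does not know.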
0x04 Why this is a real security issue
In many real environments:
- A single IP / load balancer fronts dozens or hundreds of apps.
- Some apps are meant to be public; some are “internal” or “restricted”.
- “Internal” is often implemented by:
  - Only putting the hostname in internal DNS.
  - Maybe firewalling some sources, but not always consistently.
If all of these apps are still routed based on Host alone, then:
- Anyone who can reach the IP and guess the hostname can hit the app.
- No public DNS record ≠ no exposure.
- Certificate enumeration and DNS scraping may miss those hosts.
- Host collision can reveal a massive number of extra targets in a single IP range (wya.pl).
For a pentester, missing this means:
- You see only one boring site behind an IP.
- Meanwhile there might be tens or hundreds of APIs, admin panels, debug instances behind the same IP, all accessible with the right Host header.
0x05 A practical workflow: from raw IPs to usable hits
5.1 Collect candidate IPs
Typical sources:
- Asset inventory (if you’re internal).
- External: Shodan, FOFA, Censys, etc.
- Your own masscan / nmap sweeps.
From these, keep IPs where:
- 80/443/8080/8443/etc. are open.
- Direct IP access returns:
  - Default pages (Welcome to nginx, Apache test page, etc.).
  - WAF/403/404.
  - Very generic responses.
These are strong candidates for “reverse proxies with multiple vhosts”.
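A rough way to automate this filter is a small heuristic over the response you get when hitting the bare IP. The marker strings below are illustrative assumptions, not an authoritative fingerprint list, and `is_candidate` is a hypothetical helper name.

```python
# Illustrative substrings of common default pages; extend for your own targets.
DEFAULT_MARKERS = (
    "welcome to nginx",
    "apache2 ubuntu default page",
    "it works!",
    "apache tomcat",
)


def is_candidate(status: int, body: str) -> bool:
    """Return True if a direct-IP response looks like a default or blocked page."""
    if status in (403, 404):
        return True
    text = body.lower()
    return any(marker in text for marker in DEFAULT_MARKERS)


print(is_candidate(200, "<h1>Welcome to nginx!</h1>"))
print(is_candidate(200, "<title>Acme Shop</title>"))
```

"Very generic responses" are harder to capture with fixed strings; in practice those tend to fall out of the similarity comparison later in the workflow.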
5.2 Build IP and Host dictionaries
- ip.txt – one IP per line (filtered candidates).
- host.txt – hostnames to try, from:
  - Subdomain enumeration (passive + brute-force).
  - Wordlists (SecLists etc.).
  - Historical data, internal naming conventions, leaked configs (thehacker.recipes).
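Merging these sources into one clean host.txt is mostly normalisation and deduplication. A minimal sketch, assuming you have a list of already-known names plus a prefix wordlist and a set of base domains (the function `build_hosts` and its parameters are hypothetical):

```python
def build_hosts(known, prefixes, base_domains):
    """Combine known hostnames with prefix.domain guesses, deduplicated in order."""
    candidates = list(known) + [f"{p}.{d}" for d in base_domains for p in prefixes]
    seen = set()
    result = []
    for name in candidates:
        # Hostnames are case-insensitive; a trailing dot is just the DNS root.
        name = name.strip().lower().rstrip(".")
        if name and name not in seen:
            seen.add(name)
            result.append(name)
    return result


hosts = build_hosts(
    known=["App.Example.com", "app.example.com."],
    prefixes=["admin", "app"],
    base_domains=["example.com"],
)
print("\n".join(hosts))
```

Known names are kept first so that the most likely candidates are tried before pure guesses.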
5.3 Run the host collision
Tool-agnostic logic:
- For each (ip, host) pair:
  - URL: http://ip/
  - Header: Host: host
- Collect:
  - Status code
  - Response size
  - Response body (for hashing / similarity)
  - Duration (optional, for debugging)
This is what tools like ffuf/gobuster/wfuzz do in vhost mode as well (thehacker.recipes).
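One possible shape for the collected data is a small immutable record per (ip, host) pair, with a hash of the body as a cheap fingerprint. The `Result` fields, the choice of SHA-256, and the helper names are assumptions for this sketch rather than the format of any specific tool.

```python
import hashlib
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    ip: str
    host: str
    status: int
    size: int
    body_sha256: str
    duration_ms: float


def record(ip, host, status, body: bytes, started: float, finished: float) -> Result:
    """Condense one response into the fields worth keeping."""
    digest = hashlib.sha256(body).hexdigest()
    return Result(ip, host, status, len(body), digest, (finished - started) * 1000)


def group_by_fingerprint(results):
    """Map (status, body hash) to the hosts that produced it; big groups are noise."""
    groups = {}
    for r in results:
        groups.setdefault((r.status, r.body_sha256), []).append(r.host)
    return groups


t0 = time.perf_counter()
body = b"<html>default</html>"  # stand-in for a real response body
print(record("10.0.0.5", "intranet.example.com", 200, body, t0, time.perf_counter()))
```

Exact hashes only catch byte-identical pages; responses that embed the requested host or a timestamp need a fuzzier comparison.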
5.4 Similarity filtering: kill the noise
Raw results are noisy:
- Default error pages
- WAF block pages
- Generic “site not configured” responses
Common trick:
- For each IP, pick a baseline response (often the first valid-looking 2xx/3xx).
- For every other (ip, host) response:
  - Compute similarity score vs. baseline (e.g. shingle/Jaccard, fuzzy hash…).
  - If similarity is too high, treat it as “same generic page”.
  - If similarity is low, mark it as interesting.
This is the core idea behind tools like VhostFinder: virtual hosts with distinct content will diverge from the baseline.
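As one concrete instance of the similarity step, here is a small word-shingle Jaccard comparison. The shingle size of 3 words and the 0.8 threshold are arbitrary starting points that would need tuning per target, and the function names are my own.

```python
def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles of a text (whole text if shorter than k)."""
    tokens = text.split()
    if not tokens:
        return set()
    if len(tokens) < k:
        return {" ".join(tokens)}
    return {" ".join(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of two texts' shingle sets, in [0.0, 1.0]."""
    sa, sb = shingles(a), shingles(b)
    if not sa and not sb:
        return 1.0  # two empty bodies count as identical
    return len(sa & sb) / len(sa | sb)


def is_interesting(body: str, baseline: str, threshold: float = 0.8) -> bool:
    """A response is interesting when it is NOT close to the baseline page."""
    return jaccard(body, baseline) < threshold


baseline = "welcome to nginx default page"
print(is_interesting("acme admin console please log in", baseline))
print(is_interesting(baseline, baseline))
```

Working on shingles rather than exact hashes tolerates small per-request differences, such as a reflected hostname, while still separating genuinely different pages.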
5.5 Triage and follow-up
What you end up with after filtering:
- A relatively small set of (ip, host) combos:
  - 2xx/3xx responses
  - Content significantly different from baseline
  - Titles that look like login pages, admin consoles, dashboards, APIs, etc.
Next steps:
- Add ip host mappings to /etc/hosts (for convenience).
- Browse these hosts normally.
- Combine with directory brute forcing, tech fingerprinting, and standard web testing.
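Two small helpers cover most of this triage: pulling the page title out of a hit, and turning confirmed (ip, host) pairs into lines you can paste into /etc/hosts. Both are sketches; the regex is a pragmatic shortcut rather than a real HTML parser, and the function names are hypothetical.

```python
import re

TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_title(html: str) -> str:
    """Return the page title with whitespace collapsed, or '' if there is none."""
    match = TITLE_RE.search(html)
    return " ".join(match.group(1).split()) if match else ""


def hosts_file_lines(hits):
    """Group hostnames by IP and emit one '/etc/hosts'-style line per IP."""
    by_ip = {}
    for ip, host in hits:
        names = by_ip.setdefault(ip, [])
        if host not in names:
            names.append(host)
    return [f"{ip} {' '.join(names)}" for ip, names in by_ip.items()]


hits = [("10.0.0.5", "intranet.example.com"), ("10.0.0.5", "admin.example.com")]
print("\n".join(hosts_file_lines(hits)))
print(extract_title("<html><title> Admin\n Login </title></html>"))
```

The generated lines are printed rather than written to /etc/hosts directly, so you can review them before changing local name resolution.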
0x06 Host collision vs. DNS brute-forcing
They’re related, but not the same thing:
DNS brute force
- Ask DNS for foo.example.com, bar.example.com, …
- If there’s a record, you get an IP.
- No record? DNS says “NXDOMAIN”.
Host collision / vhost fuzzing
- You already have an IP.
- You send HTTP requests to that IP with different Host headers.
- You observe differences in HTTP responses.
Key difference:
- DNS brute forcing enumerates published names (what DNS wants you to know).
- Host collision enumerates routable names (what the HTTP stack will actually route).
Sometimes there’s a perfect overlap. In interesting cases, there isn’t – which is exactly why host collision is valuable.
0x07 Defensive notes: how not to get “collided”
From the blue-team perspective, host collision points to two underlying issues:
- Over-trusting Host without proper scoping.
- Letting internal vhosts ride on public-facing front-ends.
Some practical mitigations:
- Separate public and internal vhosts
  - Don’t put admin/dev/internal vhosts on the same public IP / listener as external apps.
  - At least restrict them at the network level (VPN-only, office IPs, etc.).
- Tight default / fallback behavior
  - For unknown Host, return a minimal error (or drop).
  - Don’t route to real apps as fallback.
  - Avoid verbose default pages leaking server info.
- Host header whitelisting
  - Front-end/WAF only allows known, intended hostnames.
  - Everything else → fixed error / drop.
- Regular self-scanning
  - From the Internet, run your own vhost fuzzing against your IP ranges.
  - Compare caught vhosts vs. intended DNS records.
  - If something is routable but not in DNS, decide if it really should be reachable.
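The self-scanning comparison in the last item boils down to a set difference between what your scan found routable and what you intend to publish. A minimal sketch, assuming both lists are available as plain hostnames (the function name `unexpected_vhosts` is hypothetical):

```python
def normalise(name: str) -> str:
    """Lowercase a hostname and drop surrounding whitespace and a trailing dot."""
    return name.strip().lower().rstrip(".")


def unexpected_vhosts(routable, published):
    """Return hostnames that route over HTTP but are absent from intended DNS records."""
    published_set = {normalise(n) for n in published}
    return sorted({normalise(n) for n in routable} - published_set)


routable = ["www.example.com", "Admin.Example.com", "intranet.example.com."]
published = ["www.example.com"]
for name in unexpected_vhosts(routable, published):
    print(name)
```

Every name this prints is a candidate for review: either it should move behind a restricted listener, or it should be acknowledged as intentionally public.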
0x08 Summary
Host collision is one of those techniques that:
- Is conceptually simple;
- Leverages a very old piece of the web stack (HTTP/1.1 Host);
- Still reveals surprising amounts of attack surface in modern, virtual-host-heavy environments.
The core ideas to remember:
- IP decides who you talk to; Host decides who you *ask* for.
- DNS is one way to map names to IPs, but HTTP routing doesn’t depend on public DNS being “truthful”.
- A single IP / load balancer can hide hundreds of apps; if you don’t check vhosts, you might miss most of your scope.
Whether you use my HostCollision or any other tool, having vhost enumeration in your recon playbook is absolutely worth it – both for offense (finding hidden assets) and for defense (discovering accidental exposures before someone else does).