Skip to content

josh0xA/darkdump

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

73 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

darkdump

About

Darkdump is an open-source OSINT tool for deep web investigation. Given a search query it fetches results from multiple dark web search engines, optionally scrapes each result site for emails, metadata, keywords, documents, and images, and streams everything live to either the terminal or a local browser-based interface. All results are automatically filtered against Ahmia's public abuse blacklist regardless of which engine is used.

Installation

Quick install (Linux & macOS)

git clone https://github.com/josh0xA/darkdump
cd darkdump
chmod +x install.sh
./install.sh
darkdump

Note: ./install.sh might require sudo for macOS.

This installs all dependencies and creates two commands:

Command Description
darkdump Launch the primary web interface
darkdump-cli Run the CLI tool

Manual install

git clone https://github.com/josh0xA/darkdump
cd darkdump
pip3 install -r requirements.txt
pip3 install -r darkdump-web/requirements.txt

To uninstall the launchers: ./uninstall.sh

Tor Configuration

To use Darkdump effectively, you need to configure Tor to allow your script to control it via the Tor control port. Here's how to set up your torrc file and verify that Tor is running:

Step 1: Install Tor

If Tor is not already installed on your system, you need to install it. Here's how you can install Tor on various operating systems:

Debian/Kali/Ubuntu: sudo apt install tor
MacOS: brew install tor

Step 2: Configure the Tor torrc File

Locate your torrc file. This file is usually found at /etc/tor/torrcon Linux and sometimes Mac.

Add the following lines to your torrc to enable the control port and set a control port password:

ControlPort 9051
HashedControlPassword [YourHashedPasswordHere]

Replace [YourHashedPasswordHere] with a hashed password which can be generated using the tor --hash-password command: tor --hash-password "my_password"

Step 3: Start Tor Service

Linux: sudo systemctl start tor.service
MacOS: brew services start tor

Search Engines

Six engines are available via -e / --engine:

Engine Filtered Requires Tor Notes
ahmia Yes No Default. Tor Project-endorsed, strict filtering
notevil Partial Yes Ahmia fork with broader index
tordex No Yes Fully uncensored — confirmation required
tor66 No Yes Crawled index with directory — confirmation required
onionland No Yes Indexes Tor, I2P, and clearnet — confirmation required
excavator No Yes General dark web index — confirmation required

All engines have their results checked against Ahmia's blacklist. Unfiltered engines will prompt for confirmation before executing. This strategy is still not perfect, always use extreme caution and judgement when searching and scraping unfiltered engines.

Darkdump Web Preview (Recommended)

Darkdump example

CLI Usage

darkdump-cli [-h] [-v] [-q QUERY] [-a AMOUNT] [-e ENGINE]
             [-p] [-s] [-i] [-d] [-u] [-o FILE]
             [--breach] [--breach-deep] [--breach-delay SECONDS]
Flag Description
-q, --query Search query
-a, --amount Number of results to retrieve (default: 10)
-e, --engine Engine to use (default: ahmia)
-p, --proxy Route requests through Tor
-s, --scrape Deep scrape each result for metadata, links, emails, documents
-i, --images Also collect images during scrape (requires -s)
-u, --unique Hide results with duplicate title + description
-d, --debug Enable debug output
-o FILE, --output FILE Save results to file — format inferred from extension (.json, .csv, .txt)
--breach Run a breach / credential leak intelligence scan for the given target
--breach-deep Combine breach scan with deep scraping of each result
--breach-delay Seconds between breach queries to avoid rate limits (default: 1.5)
-v, --version Print version

Examples

# Basic search via Ahmia (no Tor required)
darkdump-cli -q "privacy tools" -a 10

# Search and deep scrape each result via Tor
darkdump-cli -q "hacking" -a 10 -s -p

# Search, scrape, and collect images
darkdump-cli -q "marketplaces" -a 15 -s -p -i

# Use Not Evil engine, deduplicate, save to JSON
darkdump-cli -q "security research" -a 20 -e notevil -p -u -o results.json

# Use OnionLand engine, save to CSV
darkdump-cli -q "crypto" -a 10 -e onionland -p -o results.csv

# Breach intelligence scan for an email address
darkdump-cli --breach -q admin@example.com -e ahmia -p

# Breach scan with deep scraping
darkdump-cli --breach --breach-deep -q example.com -e ahmia

Web Interface

Darkdump includes a local browser-based interface.

# If installed via install.sh
darkdump

# Or run directly
python3 darkdump-web/app.py

Then open http://127.0.0.1:50001 in your browser.

Features

  • Live streaming results as they arrive
  • All six engines available via dropdown
  • Tor proxy toggle with live exit IP display
  • Deep scrape with optional image collection
  • Metadata-based deduplication
  • Breach intelligence scan mode
  • Export results as JSON, CSV, or TXT

Ethical Notice

This tool is intended for legitimate security research and OSINT investigations only. The developer, Josh Schiavone, is not responsible for any misuse. Do not use Darkdump to access or investigate sites engaged in illegal activity under the laws and regulations of your jurisdiction. May God bless you all.

License

MIT License — Copyright (c) Josh Schiavone

About

Open Source Intelligence Interface for Deep Web Scraping

Topics

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors