Skip to content

taylorwilsdon/reddacted

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

484 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

πŸ›‘οΈ reddacted

AI-Powered Reddit Privacy Suite

Privacy Shield AI Analysis GitHub License PyPI - Version PyPI Downloads

Local LLM powered, highly performant privacy analysis leveraging AI, sentiment analysis & PII detection
to provide insights into your true privacy with bulk remediation

For aging engineers who want to protect their future political careers πŸ›οΈ

reddacted demo
reddactive_interactive_config.mov

✨ Key Features

πŸ›‘οΈ
PII Detection
Analyze the content of comments to identify anything that might reveal PII that you may not want correlated with your anonymous username
🀫
Sentiment Analysis
Understand the emotional tone of your Reddit history, combined with upvote/downvote counts & privacy risks to choose which posts to reddact
πŸ”’
Zero-Trust Architecture
Client-side execution only, no data leaves your machine unless you choose to use a hosted API. Fully compatible with all OpenAI compatible endpoints
⚑
Self-Host Ready
Use any model via Ollama, llama.cpp, vLLM or other platform capable of exposing an OpenAI-compatible endpoint. LiteLLM works just dandy.
πŸ“Š
Smart Cleanup
Preserve valuable contributions while removing risky content - clean up your online footprint without blowing away everything

πŸ” Can I trust this with my data?

You don't have to - read the code for yourself, only reddit is called

# Run with local LLM - you'll be guided through configuration
reddacted user yourusername
  • βœ… Client-side execution only, no tracking or external calls
  • βœ… Session-based authentication if you choose - it is optional unless you want to delete
  • βœ… Keep your nonsense comments with lots of upvotes and good vibes without unintentionally doxing yourself
  • βœ… All configuration stored locally in config.json
# Quick analysis with custom limit
reddacted user taylorwilsdon --limit 3

πŸ“‹ Table of Contents

πŸ“₯ Installation

# Install from brew (recommended)
brew install taylorwilsdon/tap/reddacted

# Install from PyPI (recommended)
pip install reddacted

# Or install from source
git clone https://github.com/taylorwilsdon/reddacted.git
cd reddacted
pip install -e ".[dev]"  # Installs with development dependencies

πŸš€ Usage

reddacted now features a guided configuration flow that makes setup easy. Simply run any command and you'll be prompted to configure your settings through an interactive interface:

# Most basic possible quick start - launches the guided configuration flow
reddacted user spez

# The guided flow will prompt you to:
# - Choose between OpenAI or local LLM
# - Enter your API key or local LLM URL
# - Select your model from available options
# - Configure authentication settings
# - Set analysis preferences (limit, sort, time filter, etc.)
# - Save your configuration for future use

Configuration Options

The interactive configuration flow includes:

  • LLM Settings: Choose between OpenAI API or local LLM endpoint (like Ollama)
  • Authentication: Enable Reddit API authentication if needed
  • Analysis Options: Set comment limits, sort order, time filters
  • Output Options: Configure file output, PII filtering preferences
  • Advanced Settings: Text matching patterns, batch sizes for bulk operations

Your configuration is automatically saved to config.json for reuse.

Example Commands

Once configured, you can run commands like:

# Analyze a user's recent comments (uses saved config)
reddacted user spez

# Analyze a specific subreddit post
reddacted listing r/privacy abc123

# Bulk comment management
reddacted delete abc123,def456  # Delete comments
reddacted update abc123,def456  # Replace with standard redaction message

Override Configuration

You can still override saved settings with command-line arguments:

# Override the saved limit
reddacted user spez --limit 50

# Use a different model temporarily
reddacted user spez --model "gpt-4-turbo"

# Enable authentication for this run only
reddacted user spez --enable-auth

Available Commands

Command Description
user Analyze a user's comment history
listing Analyze a specific post and its comments
delete Delete comments by their IDs
update Replace comment content with r/reddacted

Common Arguments

Argument Description
--limit N Maximum comments to analyze (default: 100, 0 for unlimited)
--sort Sort method: hot, new, controversial, top (default: new)
--time Time filter: all, day, hour, month, week, year (default: all)
--output-file Save detailed analysis to a file
--enable-auth Enable Reddit API authentication
--disable-pii Skip PII detection
--pii-only Show only comments containing PII
--text-match Search for comments containing specific text
--skip-text Skip comments containing specific text pattern
--batch-size Comments per batch for delete/update (default: 10)
--use-random-string Use random UUID instead of standard message when updating comments

LLM Configuration

The guided configuration flow will help you set up your LLM preferences. You can choose between:

  1. Local LLM (Ollama, vLLM, etc.):

    • Default endpoint: http://localhost:11434
    • Automatically fetches available models
    • No API key required
  2. OpenAI API:

    • Enter your OpenAI API key
    • Select from available OpenAI models
    • Supports custom API base URLs

Configuration values are saved to config.json and can be overridden with command-line flags:

Flag Description
--local-llm URL Override local LLM endpoint
--openai-key KEY Override OpenAI API key
--model NAME Override model selection
Note: Environment variables are also supported:
export OPENAI_API_KEY="your-api-key"
export REDDIT_USERNAME="your-username"
export REDDIT_PASSWORD="your-password"
export REDDIT_CLIENT_ID="your-client-id"
export REDDIT_CLIENT_SECRET="your-client-secret"

These will be automatically loaded if present.

❓ How accurate is the PII detection, really?

Surprisingly good. Good enough that I run it against my own stuff in delete mode. It's basically a defense-in-depth approach combining these methods:

πŸ“Š AI Detection

Doesn't need a crazy smart model, don't waste your money on r1 or o1.

  • Cheap & light models like qwen3:8b, gpt-4.1-nano, qwen2.5:7b, Mistral SSmall or gemma3:14b are all plenty
  • Don't use something too dumb or it will be inconsistent, a 0.5b model will produce unreliable results
  • Works fine with cheap models like qwen2.5:3b (potato can run this) and gpt-4o-mini (~15Β’ per million tokens), but gets better with 7b and up

πŸ” Pattern Matching

50+ regex rules for common PII formats does a first past sweep for the obvious stuff

🧠 Context Analysis

Are you coming off as a dick? Perhaps that factors into your decision to clean up. Who could say, mine are all smiley faces.

πŸ’‘ FAQ

Q: How does the AI handle false positives?

Adjust confidence threshold (default 0.7) per risk tolerance. You're building a repo from source off some random dude's github - don't run this and just delete a bunch of stuff blindly, you're a smart person. Review your results, and if it is doing something crazy, please tell me.

Q: What LLMs are supported?

Local: any model via Ollama, vLLM or other platform capable of exposing an openai-compatible endpoint.
Cloud: OpenAI-compatible endpoints

Q: Is my data sent externally?

If you choose to use a hosted provider, yes - in cloud mode - local analysis stays fully private.

πŸ”§ Troubleshooting

If you get "command not found" after installation:

  1. Check Python scripts directory is in your PATH:
# Typical Linux/Mac location
export PATH="$HOME/.local/bin:$PATH"

# Typical Windows location
set PATH=%APPDATA%\Python\Python311\Scripts;%PATH%
  1. Verify installation location:
pip show reddacted

πŸ”‘ Authentication

Before running any commands that require authentication, you'll need to set up your Reddit API credentials:

Step 1: Create a Reddit Account

If you don't have one, sign up at https://www.reddit.com/account/register/

Step 2: Create a Reddit App

  • Go to https://www.reddit.com/prefs/apps
  • Click "are you a developer? create an app..." at the bottom
  • Choose "script" as the application type
  • Set "reddacted" as both the name and description
  • Use "http://localhost:8080" as the redirect URI
  • Click "create app"

Step 3: Get Your Credentials

After creating the app, note down:

  • Client ID: The string under "personal use script"
  • Client Secret: The string labeled "secret"

Step 4: Set Environment Variables

export REDDIT_USERNAME=your-reddit-username
export REDDIT_PASSWORD=your-reddit-password
export REDDIT_CLIENT_ID=your-client-id
export REDDIT_CLIENT_SECRET=your-client-secret

These credentials are also automatically used if all environment variables are present, even without the --enable-auth flag.

πŸ§™β€β™‚οΈ Advanced Usage

Text Filtering

You can filter comments using these arguments:

Argument Description
--text-match "search phrase" Only analyze comments containing specific text (requires authentication)
--skip-text "skip phrase" Skip comments containing specific text pattern

For example:

# Only analyze comments containing "python"
reddacted user spez --text-match "python"

# Skip comments containing "deleted"
reddacted user spez --skip-text "deleted"

# Combine both filters
reddacted user spez --text-match "python" --skip-text "deleted"

πŸ‘¨β€πŸ’» Development

This project uses UV for building and publishing. Here's how to set up your development environment:

  1. Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  1. Install UV:
pip install uv
  1. Install in development mode with test dependencies:
pip install -e ".[dev]"
  1. Build the package:
uv build --sdist --wheel
  1. Create a new release:
./release.sh

The release script will:

  • Build the package with UV
  • Create and push a git tag
  • Create a GitHub release
  • Update the Homebrew formula
  • Publish to PyPI (optional)

That's it! The package handles all other dependencies automatically, including NLTK data.

πŸ§ͺ Testing

Run the test suite:

pytest tests

Want to contribute? Great! Feel free to:

  • Open an Issue
  • Submit a Pull Request

⚠️ Common Exceptions

too many requests

If you're unauthenticated, reddit has relatively low rate limits for it's API. Either authenticate against your account, or just wait a sec and try again.

the page you requested does not exist

Simply a 404, which means that the provided username does not point to a valid page.

Pro Tip: Always review changes before executing deletions!

🌐 Support & Community

Join our subreddit: r/reddacted

About

reddacted lets you analyze & sanitize your online footprint using LLMs, PII detection & sentiment analysis to identify anything that might reveal personal info you may not want correlated with your anonymous profile

Topics

Resources

License

Stars

Watchers

Forks

Packages