Automatically detect Sex & Nudity content in movies using GPT-4o vision analysis. Uses IMDB-style categories to identify content for filtering.
Sex (any sexual nature):
- Kissing (passionate/romantic kissing, making out)
- Groping/touching (intimate touching, caressing)
- Sexual activity (intercourse, implied or explicit)
- Suggestive behavior (seductive poses, sensual dancing)
Nudity:
- Revealing clothing (low-cut tops, swimwear, lingerie)
- Partial nudity (exposed back, cleavage, shirtless in sexual context)
- Full nudity (breasts, buttocks, genitals)
- Scene Detection - Uses PySceneDetect to find natural scene boundaries
- Frame Extraction - Extracts 8 evenly-spaced frames from each scene
- Vision Analysis - Sends frames to GPT-4o for NSFW content detection
- Segment Output - Generates JSON with timestamps of flagged scenes
ummah/
├── process_movie.py # Main CLI entry point
├── src/
│ ├── __init__.py
│ ├── config.py # Configuration and .env loading
│ ├── utils.py # Time formatting helpers
│ ├── scene_detector.py # PySceneDetect wrapper
│ ├── frame_extractor.py# Video frame extraction
│ ├── vision_analyzer.py# GPT-4o vision API
│ └── aggregator.py # Segment merging
├── requirements.txt
├── env_example.txt # Example .env file
└── README.md
cd ummah
# Install dependencies
pip install -r requirements.txtCreate a .env file with your OpenAI API key:
# Copy the example file
cp env_example.txt .env
# Edit .env and add your key
OPENAI_API_KEY=your-api-key-hereOr set it as an environment variable:
export OPENAI_API_KEY="your-api-key-here"python process_movie.py movie.mp4Creates segments.json with detected NSFW segments.
python process_movie.py movie.mp4 -o my_segments.jsonLower threshold = more sensitive (catches more, may have false positives):
python process_movie.py movie.mp4 --threshold 0.4Higher threshold = less sensitive (fewer false positives, may miss some):
python process_movie.py movie.mp4 --threshold 0.8python process_movie.py movie.mp4 \
-o segments.json \
--threshold 0.6 \
--frames-per-scene 8 \
--min-scene-length 2.0 \
--max-workers 4 \
--frame-width 512| Option | Default | Description |
|---|---|---|
-o, --output |
segments.json | Output JSON file path |
--threshold |
0.6 | NSFW score threshold (0.0-1.0) |
--frames-per-scene |
8 | Frames to extract per scene |
--min-scene-length |
2.0 | Minimum scene length in seconds |
--max-workers |
4 | Parallel API requests |
--frame-width |
512 | Resize frame width (pixels) |
[
{
"start": "00:12:45",
"end": "00:13:22",
"severity": "moderate",
"reason": "passionate kissing, revealing clothing",
"confidence": 0.85
}
]| Severity | Score Range | Examples |
|---|---|---|
| none | 0.0-0.2 | Casual clothing, brief peck |
| mild | 0.3-0.5 | Passionate kissing, revealing clothing, shirtless scenes |
| moderate | 0.6-0.7 | Making out, lingerie/underwear, partial nudity |
| severe | 0.8-1.0 | Full nudity, sexual activity, explicit content |
Approximate costs per movie (GPT-4o):
- Short film (30 min): ~$0.15-0.25
- Standard movie (2 hr): ~$0.35-0.60
- Long movie (3 hr): ~$0.60-1.00
Reduce false positives: Increase --threshold to 0.7-0.8
Improve accuracy: Use --frames-per-scene 12 and --threshold 0.5
Speed up processing: Use --max-workers 8 and --frame-width 384
- Python 3.10+
- OpenAI API key with GPT-4o access
- FFmpeg (for video processing via OpenCV)