EZCut - Video Processing Pipeline

A video processing tool that generates timestamped transcripts and extracts keyframes with AI-generated descriptions.

Features

Automatic Speech Recognition: Extract transcripts with precise timestamps using OpenAI Whisper
Keyframe Extraction: Extract frames at regular intervals from videos
AI-Powered Image Descriptions: Generate detailed descriptions of keyframes using OpenAI's GPT-4 Vision
Batch Processing: Process multiple videos at once
Flexible Output Format: Generate structured output with timestamps

Output Format

The tool generates output in this format:

[transcript:00:00:15] Hello, welcome to our video presentation
[keyframe:00:00:30] A professional speaker standing at a podium in a modern conference room with a large screen displaying charts
[transcript:00:00:45] Today we'll be discussing the latest trends in technology
[keyframe:00:01:00] Close-up shot of hands typing on a laptop keyboard with code visible on the screen
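
If you need to consume this output programmatically, the lines above can be parsed with a small helper. This is a minimal sketch, not part of EZCut itself: it assumes every record fits on one line and that timestamps are always HH:MM:SS, as in the sample. The names `Entry`, `parse_line` and `parse_output` are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional

# Matches lines such as "[transcript:00:00:15] Hello, welcome ..."
# Assumption: timestamps are always zero-padded HH:MM:SS, as in the sample output.
LINE_RE = re.compile(r"^\[(transcript|keyframe):(\d{2}):(\d{2}):(\d{2})\]\s*(.*)$")


class Entry(NamedTuple):
    kind: str     # "transcript" or "keyframe"
    seconds: int  # timestamp converted to seconds from the start of the video
    text: str     # spoken text or image description


def parse_line(line: str) -> Optional[Entry]:
    """Parse one output line; return None if it is not a recognised record."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    kind, hh, mm, ss, text = match.groups()
    return Entry(kind, int(hh) * 3600 + int(mm) * 60 + int(ss), text)


def parse_output(text: str) -> List[Entry]:
    """Parse a whole output file, silently skipping unrecognised lines."""
    parsed = (parse_line(line) for line in text.splitlines())
    return [entry for entry in parsed if entry is not None]
```

Converting timestamps to seconds makes it easy to sort or merge transcript and keyframe records on a single timeline.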

Installation

Install Python dependencies:
```
pip install -r requirements.txt
```
Install FFmpeg (required for audio extraction):
- macOS: brew install ffmpeg
- Ubuntu/Debian: sudo apt install ffmpeg
- Windows: Download from https://ffmpeg.org/download.html

Set up OpenAI API Key:

cp env.example .env
# Edit .env and add your OpenAI API key

Usage

Quick Start - Process all videos in videos/ folder:

python process_videos.py

Process a single video:

python video_processor.py videos/your_video.mp4 -o outputs/output.txt

Process all videos in the videos directory:

python video_processor.py videos/ -o outputs/

Advanced options:

# Custom keyframe interval (every 60 seconds)
python video_processor.py video.mp4 -o output.txt -i 60

# Custom image description prompt
python video_processor.py video.mp4 -o output.txt -p "Describe what products are visible in this frame"

# Use API key directly
python video_processor.py video.mp4 -o output.txt --api-key sk-your-key-here

Configuration Options

-i, --interval: Keyframe extraction interval in seconds (default: 30)
-p, --prompt: Custom prompt for AI image descriptions
--api-key: OpenAI API key (alternatively set OPENAI_API_KEY environment variable)
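
For readers wiring these options into their own wrapper script, the documented flags could be declared with `argparse` roughly as follows. This is a sketch under assumptions, not the actual code in video_processor.py: the function names are hypothetical, and the rule that `--api-key` takes precedence over `OPENAI_API_KEY` is an assumption the README does not state.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical mirror of the documented command-line options."""
    parser = argparse.ArgumentParser(description="Sketch of the documented CLI")
    parser.add_argument("input", help="Video file or directory of videos")
    parser.add_argument("-o", dest="output", help="Output file or directory")
    parser.add_argument("-i", "--interval", type=int, default=30,
                        help="Keyframe extraction interval in seconds")
    parser.add_argument("-p", "--prompt", default=None,
                        help="Custom prompt for AI image descriptions")
    parser.add_argument("--api-key", default=None, help="OpenAI API key")
    return parser


def resolve_api_key(args: argparse.Namespace, env=os.environ):
    """Assumed precedence: explicit flag first, then the environment variable."""
    return args.api_key or env.get("OPENAI_API_KEY")
```

Passing `env` as a parameter keeps the lookup easy to test without touching the real environment.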

Supported Video Formats

MP4, AVI, MOV, MKV, WMV, FLV, WebM
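
When preparing a folder for batch processing, it can help to check which files match the formats listed above. The sketch below is illustrative only: it matches on file extension, which may differ from how EZCut detects formats, and `SUPPORTED_EXTENSIONS`, `is_supported` and `find_videos` are hypothetical names.

```python
from pathlib import Path
from typing import List

# Extensions taken from the list of supported formats above.
SUPPORTED_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv", ".wmv", ".flv", ".webm"}


def is_supported(path) -> bool:
    """Return True if the file extension is a supported format (case-insensitive)."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def find_videos(directory) -> List[Path]:
    """List supported video files directly inside a directory, sorted by name."""
    return sorted(
        p for p in Path(directory).iterdir() if p.is_file() and is_supported(p)
    )
```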

Requirements

Python 3.8+
OpenAI API key with GPT-4 Vision access
FFmpeg for audio processing
Sufficient disk space for temporary audio files

Cost Considerations

Whisper: Free (runs locally)
GPT-4 Vision: ~$0.01-0.03 per image depending on detail level
For a 10-minute video with 30-second keyframe intervals, expect ~20 API calls (roughly $0.20-0.60 at the rates above)
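
The arithmetic above can be reproduced for other durations and intervals. This is a rough sketch: it assumes one GPT-4 Vision call per keyframe, keyframes at every full interval, and the per-image price range quoted above. Actual pricing may differ, and `estimate_vision_cost` is a hypothetical helper.

```python
def estimate_vision_cost(duration_s, interval_s=30,
                         price_low=0.01, price_high=0.03):
    """Return (api_calls, low_cost_usd, high_cost_usd) for one video.

    Assumes one API call per keyframe and one keyframe per full interval.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    calls = int(duration_s // interval_s)
    return calls, calls * price_low, calls * price_high


# 10-minute video, keyframe every 30 seconds
calls, low, high = estimate_vision_cost(600, 30)
print(f"{calls} calls, about ${low:.2f} to ${high:.2f}")
```

Doubling the interval to 60 seconds halves both the number of calls and the estimated cost.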

Troubleshooting

Common Issues:

"FFmpeg not found": Install FFmpeg and ensure it is in your PATH
"OpenAI API key missing": Set your API key in the .env file or pass it with --api-key
"Out of memory": Use a smaller Whisper model in the code (change the model name to 'tiny' or 'base')
"GPU errors": The tool runs on CPU; no GPU is required

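The first two issues can be caught before a long run with a quick preflight check. This is a minimal sketch, not part of EZCut: `preflight` is a hypothetical helper, and it only inspects the process environment, so a key stored solely in .env will be reported as missing unless that file has already been loaded.

```python
import os
import shutil


def preflight(env=os.environ, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    # shutil.which returns None when the executable is not on PATH.
    if which("ffmpeg") is None:
        problems.append("FFmpeg not found: install it and add it to your PATH")
    # Only the environment is checked here, not the .env file.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OpenAI API key missing: set OPENAI_API_KEY or use --api-key")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print(problem)
```
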
Performance Tips:

Use the base Whisper model for a good balance of speed and accuracy
Increase keyframe interval for longer videos to reduce API costs
Process videos in smaller batches if you have many files
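
One way to follow the batching tip is to split the file list into fixed-size groups and process one group at a time. The helper below is a generic sketch; `batches` is a hypothetical name and the batch size is up to you.

```python
from typing import Iterator, List, Sequence


def batches(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


videos = ["a.mp4", "b.mp4", "c.mp4", "d.mp4", "e.mp4"]
for group in batches(videos, 2):
    print(group)
```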

