
Multi-Backend Video Generator

A Python application that converts images to videos using multiple AI providers: OpenAI's Sora-2, Azure AI Foundry Sora, Google's Veo-3, and RunwayML's Gen-4 models.

Features

  • 🎨 Multiple AI Backends - Choose between OpenAI Sora-2, Azure Sora, Google Veo-3, or RunwayML
  • 🖼️ Flexible Image Input - Single files, multiple images, wildcard patterns, or text-only
  • 🔗 Seamless Stitching - Multi-clip video generation with automatic frame transitions (Veo 3.1, RunwayML Veo)
  • 🔄 Automatic Retries - Exponential backoff when APIs are at capacity (sketched after this list)
  • 📝 Comprehensive Logging - DEBUG-level logging to logs/video_gen.log
  • 🏗️ Modular Architecture - Clean, maintainable, and extensible codebase
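
The retry behavior is a standard exponential-backoff loop. Here is a minimal Python sketch of the idea (illustrative only, not the project's actual code; CapacityError and the attempt limits are assumptions):

import random
import time

class CapacityError(Exception):
    """Raised when the API reports it is at capacity (hypothetical)."""

def call_with_backoff(request, max_attempts=5, base_delay=2.0):
    """Retry a callable with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return request()
        except CapacityError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error.
            # Delay doubles each attempt (2s, 4s, 8s, ...) plus jitter.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))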

Quick Start

# Install dependencies
pip install -r requirements.txt

# Set your API key
export OPENAI_API_KEY="your-key"      # For OpenAI Sora
export RUNWAY_API_KEY="your-key"      # For RunwayML
# For Google Veo - see authentication guide

# Generate a video
./image2video.py "A peaceful sunset over mountains"

# With images
./image2video.py -i "photo.jpg" "Animate this scene"

# Choose a provider
./image2video.py --provider runway "Your prompt"

Documentation

📚 Complete Documentation - Full documentation index

Quick Links

  • Provider Guides
  • Advanced Topics

Supported Backends

Backend       Models                       Pricing          Multi-Image   Stitching
OpenAI Sora   sora-2, sora-2-pro           Variable         ✅            —
Azure Sora    sora-2, sora-2-pro           $0.10/sec        ✅            —
Google Veo    veo-3.0, veo-3.1             $0.15-0.40/sec   ✅            ✅ (Veo 3.1)
RunwayML      gen4, gen4_turbo, google.x   Variable         Single only   ✅ (Veo)

See Provider Comparison for detailed feature matrix.

Usage Examples

Basic Text-to-Video

./image2video.py "A serene lake at dawn with mist rising"

Image-to-Video

./image2video.py -i "landscape.jpg" "Time-lapse of this scene at sunset"

Multiple Images

./image2video.py -i "img1.jpg,img2.jpg,img3.jpg" "Tour of these locations"

Wildcard Patterns

./image2video.py -i "photos/*.jpg" "Create a walkthrough video"

Provider Selection

# Use Google Veo
./image2video.py --provider google --model veo-3.1-fast-generate-preview "Your prompt"

# Use RunwayML
./image2video.py --provider runway --model gen4 "Your prompt"

# Use Azure Sora
./image2video.py --provider azure "Your prompt"

Seamless Multi-Clip Stitching (Veo 3.1)

./image2video.py --provider google --model veo-3.1-fast-generate-preview --stitch \
  -i reference_images/*.jpg \
  -p "Camera pans across the foyer" \
     "Dolly forward into the living room" \
     "Pan right to show the kitchen"

💡 Tip: Control which images are used for each clip - see Image Grouping Guide
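
Mechanically, this kind of seamless stitching chains clips by carrying the last frame of one clip forward as the reference image for the next, then joining the results. A rough sketch of those two steps using ffmpeg (an illustration, not the project's actual implementation; the function names are made up):

import subprocess

def last_frame(clip, out_image):
    """Extract (roughly) the final frame of a clip with ffmpeg."""
    # -sseof -1 starts decoding one second before the end; -update 1
    # keeps overwriting out_image, so the last decoded frame survives.
    subprocess.run(
        ["ffmpeg", "-y", "-sseof", "-1", "-i", clip,
         "-update", "1", "-q:v", "2", out_image],
        check=True,
    )
    return out_image

def concat_clips(clips, output, list_file="clips.txt"):
    """Join clips losslessly with ffmpeg's concat demuxer."""
    with open(list_file, "w") as f:
        f.writelines(f"file '{c}'\n" for c in clips)
    subprocess.run(
        ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
         "-i", list_file, "-c", "copy", output],
        check=True,
    )

Feeding the extracted frame in as the opening reference for the next clip is what makes the cut invisible; the concat step then only joins clips whose boundary frames already match.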

Installation

Requirements

  • Python 3.8 or higher
  • ffmpeg (for video processing)

Setup

# Clone or download the repository
git clone <repository-url>
cd image_to_video

# Create virtual environment (recommended)
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Configure API keys
cp .env.example .env
# Edit .env with your API keys

See Installation Guide for detailed instructions.

Authentication

Each provider requires different authentication:

Tip: A fully commented template of all required and optional variables is provided in .env.example. Copy it to .env and edit values as needed.
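
A minimal sketch of what that file might contain, using only the variables documented in this README (all values are placeholders; set only the providers you use):

# OpenAI Sora
OPENAI_API_KEY=your-openai-key

# Azure Sora
AZURE_OPENAI_API_KEY=your-azure-key
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/

# Google Veo
GOOGLE_CLOUD_PROJECT=your-project-id

# RunwayML
RUNWAY_API_KEY=your-runway-key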

OpenAI Sora

export OPENAI_API_KEY="your-api-key"

Azure Sora

export AZURE_OPENAI_API_KEY="your-key"
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"

Google Veo

# Browser OAuth (easiest)
./image2video.py --provider google --google-login

# Or manual with gcloud
gcloud auth application-default login
export GOOGLE_API_KEY="$(gcloud auth application-default print-access-token)"
export GOOGLE_CLOUD_PROJECT="your-project-id"

RunwayML

export RUNWAY_API_KEY="your-api-key"

See Authentication Guide for complete details.

Project Structure

image_to_video/
├── image2video.py              # Main CLI entry point
├── image2video_mono.py         # Legacy monolithic script (Sora-2 only)
├── requirements.txt            # Python dependencies
├── README.md                   # This file
│
├── docs/                       # Documentation
│   ├── README.md              # Documentation index
│   ├── quick-start.md         # Quick start guide
│   ├── user-guide.md          # Complete user guide
│   ├── providers/              # Backend-specific docs
│   ├── advanced/              # Advanced topics
│   ├── technical/             # Technical documentation
│   └── reference/             # Reference materials
│
└── video_gen/                  # Core package
    ├── config.py              # Configuration management
    ├── file_handler.py        # File operations
    ├── arg_parser.py          # Argument parsing
    ├── video_generator.py     # Main orchestration
    ├── logger.py              # Logging infrastructure
    └── providers/             # Provider implementations
        ├── openai_provider/   # OpenAI Sora
        ├── azure_provider/    # Azure Sora
        ├── google_provider/   # Google Veo
        └── runway_provider/   # RunwayML Gen-4 & Veo

Development

Running Tests

# Activate virtual environment
source venv/bin/activate

# Run the full unittest suite
python -m unittest discover -s tests -p "test_*.py" -v

# Test specific provider
./image2video.py --provider openai "Test prompt"

Contributing

We welcome contributions! See the Development Guide for:

  • Setting up a development environment
  • Code style and conventions
  • Testing guidelines
  • Submitting pull requests

Troubleshooting

Common issues and solutions:

"No images provided" error

# Use -i flag before image paths
./image2video.py -i "images/*.jpg" "Your prompt"

API key not found

# Verify environment variables are set
echo $OPENAI_API_KEY
echo $RUNWAY_API_KEY

Google Veo authentication issues

# Use browser OAuth (easiest method)
./image2video.py --provider google --google-login

See Troubleshooting Guide for complete solutions.

Architecture

The application uses a modular, provider-based architecture (sketched in code after this list):

  • Providers - Backend-specific implementations (OpenAI, Azure, Google, RunwayML)
  • Clients - Separate client classes per model family within each provider
  • Configuration - Per-provider config classes with validation
  • Orchestration - Central dispatcher routes requests to appropriate providers
  • Logging - Centralized logging infrastructure with DEBUG-level detail
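
In outline, the pattern looks like the following Python sketch. The class names are illustrative, not the actual classes inside video_gen:

from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class ProviderConfig:
    """Per-provider configuration with validation (illustrative)."""
    api_key: str
    model: str

    def validate(self):
        if not self.api_key:
            raise ValueError("missing API key")

class Provider(ABC):
    """Common interface each backend implements."""

    @abstractmethod
    def generate(self, prompt, images):
        """Generate a video and return the output file path."""

class Dispatcher:
    """Central dispatcher: routes each request to the chosen provider."""

    def __init__(self, providers):
        self.providers = providers  # e.g. {"openai": ..., "runway": ...}

    def generate(self, name, prompt, images=()):
        return self.providers[name].generate(prompt, list(images))

Under this pattern, adding a new backend amounts to implementing the provider interface and registering it with the dispatcher.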

See Architecture Guide for detailed design documentation.

License

This project is provided as-is for educational and research purposes.
