An AI-powered image analysis and metadata management tool that uses state-of-the-art machine learning models to analyze images and generate rich, structured metadata.
CURRENTLY IN ALPHA. USE AT YOUR OWN RISK.
- Version: 0.1.1
- Release Date: 2024-12-25
- Status: Development
- Advanced image analysis using Google's Gemini Vision API and Anthropic Claude
- Rich, structured metadata generation with AI-powered descriptions
- Batch processing with smart compression and parallel processing
- Multiple output formats (CSV, XML) with customizable schemas
- Automatic EXIF metadata writing and management
- AI-powered filename suggestions and organization
- Complete file operation tracking with detailed logs
- Non-destructive processing with backup options
- Progress tracking and detailed statistics
- Highly configurable via environment variables and CLI
- Ensure you have Python 3.8 or higher installed
- Clone the repository:
    git clone https://github.com/nerveband/image_sense.git
    cd image_sense
- Install dependencies:
    pip install -r requirements.txt
- Install the package in development mode:
    pip install -e .
- Copy the example environment file and configure your settings:
    cp .env.example .env
- Edit .env with your API keys and preferences
Image Sense can be configured using environment variables. Create a .env file in the project root with the following options:
Below are the default values used by the application. You can override any of these in your .env file:
# Enable smart compression (recommended for large files)
COMPRESSION_ENABLED=true
# JPEG quality (1-100, higher = better quality but larger size)
COMPRESSION_QUALITY=85
# Maximum dimension in pixels for processing
MAX_DIMENSION=1024

# Number of images to process in parallel
DEFAULT_BATCH_SIZE=8
# Maximum allowed batch size (model-dependent)
MAX_BATCH_SIZE=16

# Default output format (csv or xml)
DEFAULT_OUTPUT_FORMAT=csv
# Directory for output files
OUTPUT_DIRECTORY=output

# Default AI model
DEFAULT_MODEL=gemini-2.0-flash-exp
# Available models:
# - gemini-2.0-flash-exp: Latest experimental model (fastest)
# - gemini-1.5-flash: Production model (balanced)
# - gemini-1.5-pro: More detailed analysis (slower)

# Create backups before modifying metadata
BACKUP_METADATA=true
# Write analysis results to image EXIF data
WRITE_EXIF=true
# Create duplicate files before modifying
DUPLICATE_FILES=false
# Suffix for duplicate files
DUPLICATE_SUFFIX=_modified

# Show progress bars and statistics
SHOW_PROGRESS=true
# Show real-time Gemini model responses
VERBOSE_OUTPUT=false
# Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
LOG_LEVEL=INFO
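As an illustration of how settings like these are typically consumed, here is a minimal sketch that reads a few of the variables above and falls back to the listed defaults. The `Settings` class and the `load_settings` and `_env_bool` helpers are hypothetical names, not part of Image Sense, and the clamping and batch-size capping are assumptions about sensible behaviour rather than documented rules.

```python
import os
from dataclasses import dataclass


def _env_bool(name: str, default: bool) -> bool:
    """Interpret common truthy strings; use the default when the variable is unset."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for a subset of the documented options."""
    compression_enabled: bool
    compression_quality: int
    max_dimension: int
    default_batch_size: int
    max_batch_size: int


def load_settings() -> Settings:
    """Read the documented variables, using the documented defaults when unset."""
    quality = int(os.environ.get("COMPRESSION_QUALITY", "85"))
    batch = int(os.environ.get("DEFAULT_BATCH_SIZE", "8"))
    max_batch = int(os.environ.get("MAX_BATCH_SIZE", "16"))
    return Settings(
        compression_enabled=_env_bool("COMPRESSION_ENABLED", True),
        # Clamping to the documented 1-100 range is an assumption.
        compression_quality=max(1, min(100, quality)),
        max_dimension=int(os.environ.get("MAX_DIMENSION", "1024")),
        # Capping the batch size at MAX_BATCH_SIZE is also an assumption.
        default_batch_size=min(batch, max_batch),
        max_batch_size=max_batch,
    )
```

Values exported in the shell (or loaded from .env by whatever loader the application uses) take effect the next time `load_settings()` is called.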
- Improved Verbose Output (Now Default)
  - Verbose output is now enabled by default for better visibility
  - Added --verboseoff flag to disable verbose output if needed
  - Added detailed progress indicators for:
    - Image optimization/compression
    - Gemini API interactions
    - XML parsing and validation
    - CSV output generation
- Enhanced Image Processing
  - Added automatic image optimization for Gemini API
  - Shows compression statistics (original size, compressed size, reduction percentage)
  - Better error handling for image processing failures
- Improved Data Handling
  - Better XML parsing with proper Unicode support
  - Enhanced CSV output with all fields properly populated
  - Added suggested filename to output
  - Fixed various edge cases in data extraction
- Configuration Updates
  - Environment variables are now properly respected (VERBOSE_OUTPUT, GOOGLE_API_KEY, etc.)
  - Verbose output can be controlled via:
    - Environment variable: VERBOSE_OUTPUT=false
    - CLI flag: --verboseoff
    - Default is verbose on
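The interaction between the VERBOSE_OUTPUT variable and the --verboseoff flag can be sketched as follows. This is not the tool's actual implementation: `resolve_verbose` is a hypothetical helper, and the precedence shown (the flag wins, then the environment variable, then the verbose-on default) is an assumption consistent with the notes above.

```python
import argparse
import os


def resolve_verbose(argv, environ=None):
    """Return True when verbose output should be shown.

    Assumed precedence: --verboseoff, then VERBOSE_OUTPUT, then verbose on.
    """
    environ = os.environ if environ is None else environ
    # allow_abbrev=False stops a flag such as --verbose from being
    # silently treated as a prefix of --verboseoff.
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    parser.add_argument("--verboseoff", action="store_true")
    # Other arguments (subcommand, paths) are ignored by this sketch.
    args, _unknown = parser.parse_known_args(argv)
    if args.verboseoff:
        return False
    raw = environ.get("VERBOSE_OUTPUT", "true")
    return raw.strip().lower() not in {"0", "false", "no", "off"}
```

For example, `resolve_verbose(["process", "image.jpg", "--verboseoff"])` returns `False` regardless of the environment.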
Process a single image with default settings (verbose):
    image_sense process path/to/image.jpg
Process a single image with verbose output disabled:
    image_sense process path/to/image.jpg --verboseoff
Bulk process a directory of images:
    image_sense bulk-process path/to/directory
Bulk process with specific options:
    image_sense bulk-process path/to/directory --recursive --verboseoff --model 2-flash
- VERBOSE_OUTPUT: Control verbose output (default: "true")
- GOOGLE_API_KEY: Your Google API key for Gemini
- GEMINI_MODEL: Default model to use (default: "gemini-2.0-flash-exp")
You'll need a Google API key with Gemini Vision API access enabled:
- Get it from: https://aistudio.google.com/app/apikey
- Add it to your .env file as GOOGLE_API_KEY=your-key-here
- Or pass it directly using the --api-key parameter
- Generate metadata for a directory of images:
    image_sense generate-metadata path/to/photos --api-key YOUR_API_KEY
  This will analyze all images and create a metadata.csv file with detailed descriptions, keywords, and technical details.
- Process a single image:
    image_sense process path/to/image.jpg
- Process multiple images with advanced options:
    image_sense bulk-process path/to/directory --api-key YOUR_API_KEY --output-format xml
The generate-metadata command analyzes images and creates structured metadata files:
    image_sense generate-metadata path/to/directory --api-key YOUR_API_KEY [OPTIONS]
Key features:
- Non-destructive: Original images remain unchanged
- Flexible output: Choose between CSV and XML formats
- Smart compression: Optimized for faster processing
- Batch processing: Handle multiple images efficiently
- Incremental updates: Skip already processed files
- AI-powered filename suggestions
- Complete file operation tracking
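The README does not describe how smart compression and batch processing work internally. As a rough sketch under stated assumptions, the helpers below show one common approach: downscale so the longer side fits MAX_DIMENSION while preserving aspect ratio, report the size reduction as a percentage, and split the file list into fixed-size batches. All three function names are hypothetical.

```python
def fit_within(width, height, max_dimension=1024):
    """Scale (width, height) so the longer side is at most max_dimension."""
    longest = max(width, height)
    if longest <= max_dimension:
        return width, height  # already small enough, leave unchanged
    scale = max_dimension / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def reduction_percent(original_bytes, compressed_bytes):
    """Percentage by which compression reduced the file size."""
    if original_bytes <= 0:
        return 0.0
    return 100.0 * (original_bytes - compressed_bytes) / original_bytes


def batches(items, batch_size=8):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]
```

With the default settings, a 4000x3000 image would be resized to 1024x768 before upload, and 20 images would be processed as batches of 8, 8, and 4.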
Options:
- --output-format, -f: Choose output format (csv/xml)
- --output-file: Specify custom output file path
- --model: Select AI model to use
- --batch-size: Set custom batch size
- --no-compress: Disable image compression
- --skip-existing: Skip files that already have metadata
- --duplicate: Create duplicates before modifying files
- --no-backup: Disable ExifTool backup creation
Example with duplicate files:
# Process images and create duplicates before modification
image_sense generate-metadata photos/ --api-key YOUR_API_KEY --duplicate
# Process without creating duplicates (modify in place)
image_sense generate-metadata photos/ --api-key YOUR_API_KEY
For detailed command documentation, see Commands Documentation.
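To make the --duplicate behaviour concrete, here is a small sketch of how a duplicate's path might be derived from the DUPLICATE_SUFFIX setting. The `duplicate_path` helper is hypothetical, and placing the suffix between the file stem and its extension is an assumption; the actual naming scheme may differ.

```python
from pathlib import Path


def duplicate_path(original, suffix="_modified"):
    """Return the path a duplicate would get, e.g. cat.jpg -> cat_modified.jpg."""
    p = Path(original)
    # Insert the suffix before the extension so the file type is preserved.
    return p.with_name(f"{p.stem}{suffix}{p.suffix}")
```

Under this assumption, `duplicate_path("photos/cat.jpg")` yields `photos/cat_modified.jpg`, leaving the original file untouched.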
The CSV output includes columns for:
- Original file path
- Original filename
- New filename (if renamed)
- Modified file path (if duplicated)
- Suggested filename
- Description
- Keywords
- Technical details
- Visual elements
- Composition
- Mood
- Use cases
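A generated CSV can be consumed with Python's standard csv module. The sketch below is illustrative only: the header names `original_filename`, `description`, and `keywords`, and the comma-separated keyword format, are assumptions, so check the header row of your own metadata.csv before relying on them.

```python
import csv
import io


def load_descriptions(csv_file):
    """Map each original filename to its description and keyword list."""
    result = {}
    for row in csv.DictReader(csv_file):
        # Header names and the comma-separated keywords are assumptions.
        raw_keywords = row.get("keywords") or ""
        keywords = [k.strip() for k in raw_keywords.split(",") if k.strip()]
        result[row["original_filename"]] = {
            "description": row.get("description") or "",
            "keywords": keywords,
        }
    return result


# Usage with an in-memory sample; pass open("metadata.csv", newline="") for a real file.
sample = io.StringIO(
    'original_filename,description,keywords\n'
    'cat.jpg,A tabby cat on a sofa,"cat, pet, indoor"\n'
)
records = load_descriptions(sample)
```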
The XML output provides a structured representation of:
- File information
  - Original path and filename
  - New filename (if renamed)
  - Modified path (if duplicated)
  - Suggested filename
- Image metadata
- Analysis results
- Technical information
- Visual characteristics
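The exact XML schema is not shown here, so the following is only a sketch of how such output could be read with the standard library. Every element name in it (`images`, `image`, `file_info`, `original_path`, `suggested_filename`) is hypothetical; substitute the names found in your generated file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; real output may use different element names.
SAMPLE = """<images>
  <image>
    <file_info>
      <original_path>photos/cat.jpg</original_path>
      <suggested_filename>tabby_cat_on_sofa.jpg</suggested_filename>
    </file_info>
    <description>A tabby cat resting on a sofa.</description>
  </image>
</images>"""


def suggested_names(xml_text):
    """Map each original path to its suggested filename."""
    root = ET.fromstring(xml_text)
    mapping = {}
    for image in root.iter("image"):
        original = image.findtext("file_info/original_path")
        suggested = image.findtext("file_info/suggested_filename")
        if original and suggested:  # skip entries missing either field
            mapping[original] = suggested
    return mapping
```

Calling `suggested_names(SAMPLE)` returns a one-entry dictionary mapping `photos/cat.jpg` to `tabby_cat_on_sofa.jpg`.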
XML output is now saved by default for each processed folder, with the following features:
- Automatic XML file creation named after the input folder
- Original filename tracking in XML output
- Configurable via SAVE_XML_OUTPUT environment variable
- XML files contain complete analysis with original file tracking
To disable XML output, set in your .env:
SAVE_XML_OUTPUT=false
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Google Gemini Vision API for image analysis
- ExifTool for metadata management
- Rich for beautiful terminal output

