Image Sense v0.1.1

An AI-powered image analysis and metadata management tool that uses vision-capable models (Google Gemini by default) to analyze images and generate structured metadata such as descriptions, keywords, and suggested filenames.

Status: Alpha Release

CURRENTLY IN ALPHA. USE AT YOUR OWN RISK.

Version: 0.1.1
Release Date: 2024-12-25
Status: Development

Features

🖼️ Advanced image analysis using Google's Gemini Vision API and Anthropic Claude
📝 Rich, structured metadata generation with AI-powered descriptions
🔄 Batch processing with smart compression and parallel processing
💾 Multiple output formats (CSV, XML) with customizable schemas
🏷️ Automatic EXIF metadata writing and management
📊 AI-powered filename suggestions and organization
📋 Complete file operation tracking with detailed logs
🔒 Non-destructive processing with backup options
📊 Progress tracking and detailed statistics
⚙️ Highly configurable via environment variables and CLI

Installation

Ensure you have Python 3.8 or higher installed
Clone the repository:

git clone https://github.com/nerveband/image_sense.git
cd image_sense

Install dependencies:

pip install -r requirements.txt

Install the package in development mode:

pip install -e .

Copy the example environment file and configure your settings:

cp .env.example .env

Edit .env with your API keys and preferences

Configuration

Image Sense can be configured using environment variables. Create a .env file in the project root with the following options:

Default Values and Configuration

Below are the default values used by the application. You can override any of these in your .env file:

Image Processing

# Enable smart compression (recommended for large files)
COMPRESSION_ENABLED=true
# JPEG quality (1-100, higher = better quality but larger size)
COMPRESSION_QUALITY=85
# Maximum dimension in pixels for processing
MAX_DIMENSION=1024
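
To make the MAX_DIMENSION setting concrete, the sketch below computes the size an image would be scaled to so that its longer side does not exceed the limit. It assumes the limit applies to the longest side, that the aspect ratio is preserved, and that small images are never upscaled; the README does not confirm these details, and the name fit_within is hypothetical rather than part of Image Sense.

```python
from typing import Tuple


def fit_within(width: int, height: int, max_dimension: int = 1024) -> Tuple[int, int]:
    """Return (width, height) scaled so the longer side is at most max_dimension."""
    if width <= 0 or height <= 0 or max_dimension <= 0:
        raise ValueError("width, height and max_dimension must be positive")

    longest = max(width, height)
    if longest <= max_dimension:
        # Already within the limit: keep the original size (assumed: no upscaling).
        return width, height

    scale = max_dimension / longest
    # Round to whole pixels, but never let a side collapse to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

Under these assumptions a 4000x3000 photo would be processed at 1024x768, while an 800x600 image would be left at its original size.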

Batch Processing

# Number of images to process in parallel
DEFAULT_BATCH_SIZE=8
# Maximum allowed batch size (model-dependent)
MAX_BATCH_SIZE=16
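
As an illustration of how the two batch settings could interact, the following sketch splits a list of image paths into batches and caps the requested size at the maximum. Clamping (rather than rejecting) an oversized request is an assumption, and the function name batches is hypothetical.

```python
from typing import Iterator, List, Sequence


def batches(
    paths: Sequence[str],
    batch_size: int = 8,       # mirrors DEFAULT_BATCH_SIZE
    max_batch_size: int = 16,  # mirrors MAX_BATCH_SIZE
) -> Iterator[List[str]]:
    """Yield consecutive groups of paths, each at most max_batch_size long."""
    # Assumed behaviour: clamp the requested size into the range 1..max_batch_size.
    size = max(1, min(batch_size, max_batch_size))
    for start in range(0, len(paths), size):
        yield list(paths[start:start + size])
```

With the defaults, 20 images would be grouped as 8, 8, and 4.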

Output Settings

# Default output format (csv or xml)
DEFAULT_OUTPUT_FORMAT=csv
# Directory for output files
OUTPUT_DIRECTORY=output

Model Settings

# Default AI model
DEFAULT_MODEL=gemini-2.0-flash-exp
# Available models:
# - gemini-2.0-flash-exp: Latest experimental model (fastest)
# - gemini-1.5-flash: Production model (balanced)
# - gemini-1.5-pro: More detailed analysis (slower)

Metadata Settings

# Create backups before modifying metadata
BACKUP_METADATA=true
# Write analysis results to image EXIF data
WRITE_EXIF=true
# Create duplicate files before modifying
DUPLICATE_FILES=false
# Suffix for duplicate files
DUPLICATE_SUFFIX=_modified
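
The README does not say where the suffix is placed in the duplicate's filename. A plausible reading, sketched below as an assumption, is that it goes between the base name and the extension; duplicate_path is a hypothetical helper, not a function exposed by Image Sense.

```python
from pathlib import Path


def duplicate_path(original: str, suffix: str = "_modified") -> Path:
    """Return the assumed path of the duplicate: <stem><suffix><extension>."""
    path = Path(original)
    # Keep the directory and extension; insert the suffix after the stem.
    return path.with_name(f"{path.stem}{suffix}{path.suffix}")
```

Under this assumption, photos/cat.jpg would be duplicated as photos/cat_modified.jpg and the original left untouched.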

Progress and Logging

# Show progress bars and statistics
SHOW_PROGRESS=true
# Show real-time Gemini model responses (on by default; set to false to disable)
VERBOSE_OUTPUT=true
# Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
LOG_LEVEL=INFO
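
The sketch below shows one way the settings above could be read and validated in Python. It is not the project's actual loader: the names DEFAULTS, parse_bool, and load_settings are hypothetical, only a subset of the variables is covered, and the function takes a plain mapping so it does not depend on any particular .env loading library.

```python
import os
from typing import Any, Dict, Mapping, Optional

# Defaults copied from the listing above, kept as strings like real environment values.
DEFAULTS: Dict[str, str] = {
    "COMPRESSION_ENABLED": "true",
    "COMPRESSION_QUALITY": "85",
    "MAX_DIMENSION": "1024",
    "DEFAULT_BATCH_SIZE": "8",
    "MAX_BATCH_SIZE": "16",
    "DEFAULT_OUTPUT_FORMAT": "csv",
    "LOG_LEVEL": "INFO",
}


def parse_bool(value: str) -> bool:
    """Treat common truthy spellings as True; everything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Merge environment values over the documented defaults and validate them."""
    source = os.environ if env is None else env
    raw = {key: source.get(key, default) for key, default in DEFAULTS.items()}

    quality = int(raw["COMPRESSION_QUALITY"])
    if not 1 <= quality <= 100:
        raise ValueError("COMPRESSION_QUALITY must be between 1 and 100")

    output_format = raw["DEFAULT_OUTPUT_FORMAT"].strip().lower()
    if output_format not in {"csv", "xml"}:
        raise ValueError("DEFAULT_OUTPUT_FORMAT must be 'csv' or 'xml'")

    return {
        "compression_enabled": parse_bool(raw["COMPRESSION_ENABLED"]),
        "compression_quality": quality,
        "max_dimension": int(raw["MAX_DIMENSION"]),
        "default_batch_size": int(raw["DEFAULT_BATCH_SIZE"]),
        "max_batch_size": int(raw["MAX_BATCH_SIZE"]),
        "output_format": output_format,
        "log_level": raw["LOG_LEVEL"].strip().upper(),
    }
```

Calling load_settings() with no argument reads from os.environ; passing a dictionary makes the behaviour easy to check without touching the real environment.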

Recent Updates

Enhanced Image Analysis and Feedback (December 25, 2024)

Improved Verbose Output (Now Default)
- Verbose output is now enabled by default for better visibility
- Added --verboseoff flag to disable verbose output if needed
- Added detailed progress indicators for:
  - Image optimization/compression
  - Gemini API interactions
  - XML parsing and validation
  - CSV output generation
Enhanced Image Processing
- Added automatic image optimization for Gemini API
- Shows compression statistics (original size, compressed size, reduction percentage)
- Better error handling for image processing failures
Improved Data Handling
- Better XML parsing with proper Unicode support
- Enhanced CSV output with all fields properly populated
- Added suggested filename to output
- Fixed various edge cases in data extraction
Configuration Updates
- Environment variables are now properly respected (VERBOSE_OUTPUT, GOOGLE_API_KEY, etc.)
- Verbose output can be controlled via:
  - Environment variable: VERBOSE_OUTPUT=false
  - CLI flag: --verboseoff
  - Default is verbose on
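
The three controls listed above can be combined as in the sketch below. The precedence shown (the CLI flag wins over the environment variable, which wins over the default) is an assumption, since the README does not state the order, and resolve_verbose is a hypothetical name.

```python
from typing import Mapping


def resolve_verbose(verbose_off_flag: bool, env: Mapping[str, str]) -> bool:
    """Decide whether verbose output is enabled.

    Assumed precedence: --verboseoff flag, then VERBOSE_OUTPUT, then default (on).
    """
    if verbose_off_flag:
        return False  # --verboseoff was passed on the command line

    value = env.get("VERBOSE_OUTPUT")
    if value is None:
        return True  # nothing configured: verbose is on by default

    # Any common "false" spelling disables verbose output.
    return value.strip().lower() not in {"0", "false", "no", "off"}
```

In a real program the second argument would typically be os.environ.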

Usage Examples

Process a single image with default settings (verbose):

image_sense process path/to/image.jpg

Process a single image with verbose output disabled:

image_sense process path/to/image.jpg --verboseoff

Bulk process a directory of images:

image_sense bulk-process path/to/directory

Bulk process with specific options:

image_sense bulk-process path/to/directory --recursive --verboseoff --model 2-flash

Environment Variables

VERBOSE_OUTPUT: Control verbose output (default: "true")
GOOGLE_API_KEY: Your Google API key for Gemini
GEMINI_MODEL: Default model to use (default: "gemini-2.0-flash-exp")

API Keys

You'll need a Google API key with Gemini Vision API access enabled:

Get it from: https://aistudio.google.com/app/apikey
Add it to your .env file as GOOGLE_API_KEY=your-key-here
Or pass it directly using the --api-key parameter

Usage

Quick Start

Generate metadata for a directory of images:

image_sense generate-metadata path/to/photos --api-key YOUR_API_KEY

This will analyze all images and create a metadata.csv file with detailed descriptions, keywords, and technical details.

Process a single image:

image_sense process path/to/image.jpg

Process multiple images with advanced options:

image_sense bulk-process path/to/directory --api-key YOUR_API_KEY --output-format xml

Command Options

Generate Metadata (Recommended)

The generate-metadata command analyzes images and creates structured metadata files:

image_sense generate-metadata path/to/directory --api-key YOUR_API_KEY [OPTIONS]

Key features:

Non-destructive: Original images remain unchanged
Flexible output: Choose between CSV and XML formats
Smart compression: Optimized for faster processing
Batch processing: Handle multiple images efficiently
Incremental updates: Skip already processed files
AI-powered filename suggestions
Complete file operation tracking

Options:

--output-format, -f: Choose output format (csv/xml)
--output-file: Specify custom output file path
--model: Select AI model to use
--batch-size: Set custom batch size
--no-compress: Disable image compression
--skip-existing: Skip files that already have metadata
--duplicate: Create duplicates before modifying files
--no-backup: Disable ExifTool backup creation
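
When calling the tool from a Python script, the options above can be assembled into an argument list. The sketch below uses only the flags documented in this section; it assumes that --output-format, --output-file, --model, and --batch-size each take a value and that the remaining flags are simple switches. The function name is hypothetical.

```python
from typing import List, Optional


def build_generate_metadata_command(
    directory: str,
    api_key: str,
    output_format: Optional[str] = None,
    output_file: Optional[str] = None,
    model: Optional[str] = None,
    batch_size: Optional[int] = None,
    no_compress: bool = False,
    skip_existing: bool = False,
    duplicate: bool = False,
    no_backup: bool = False,
) -> List[str]:
    """Build the argument list for `image_sense generate-metadata`."""
    command = ["image_sense", "generate-metadata", directory, "--api-key", api_key]

    # Options assumed to take a value.
    valued = [
        ("--output-format", output_format),
        ("--output-file", output_file),
        ("--model", model),
        ("--batch-size", batch_size),
    ]
    for flag, value in valued:
        if value is not None:
            command += [flag, str(value)]

    # Options assumed to be on/off switches.
    switches = [
        ("--no-compress", no_compress),
        ("--skip-existing", skip_existing),
        ("--duplicate", duplicate),
        ("--no-backup", no_backup),
    ]
    command += [flag for flag, enabled in switches if enabled]
    return command
```

The resulting list can be passed to subprocess.run(command, check=True); building a list rather than a single string avoids shell quoting problems with paths that contain spaces.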

Example with duplicate files:

# Process images and create duplicates before modification
image_sense generate-metadata photos/ --api-key YOUR_API_KEY --duplicate

# Process without creating duplicates (modify in place)
image_sense generate-metadata photos/ --api-key YOUR_API_KEY

For detailed command documentation, see Commands Documentation.

Output Formats

CSV Format

The CSV output includes columns for:

Original file path
Original filename
New filename (if renamed)
Modified file path (if duplicated)
Suggested filename
Description
Keywords
Technical details
Visual elements
Composition
Mood
Use cases
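
For readers who want to produce or consume a file with this layout, here is a small sketch using Python's standard csv module. The snake_case header names are assumptions derived from the column list above, not the exact headers Image Sense writes, and joining keywords with "; " is likewise an illustrative choice.

```python
import csv
from typing import Dict, Iterable, TextIO

# Hypothetical header names, one per column described above.
COLUMNS = [
    "original_path", "original_filename", "new_filename", "modified_path",
    "suggested_filename", "description", "keywords", "technical_details",
    "visual_elements", "composition", "mood", "use_cases",
]


def write_metadata_csv(rows: Iterable[Dict[str, str]], stream: TextIO) -> None:
    """Write one CSV row per image; columns missing from a row are left empty."""
    writer = csv.DictWriter(stream, fieldnames=COLUMNS, restval="")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
```

A row could be written like this:

```python
import io

buffer = io.StringIO()
write_metadata_csv(
    [{
        "original_path": "photos/IMG_0001.jpg",
        "original_filename": "IMG_0001.jpg",
        "suggested_filename": "sunset_over_harbor.jpg",
        "keywords": "; ".join(["sunset", "harbor", "boats"]),
    }],
    buffer,
)
```

When writing to a real file, open it with newline="" as the csv module documentation recommends.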

XML Format

The XML output provides a structured representation of:

File information
- Original path and filename
- New filename (if renamed)
- Modified path (if duplicated)
- Suggested filename
Image metadata
Analysis results
Technical information
Visual characteristics

XML output is now saved by default for each processed folder, with the following features:

Automatic XML file creation named after the input folder
Original filename tracking in XML output
Configurable via SAVE_XML_OUTPUT environment variable
XML files contain complete analysis with original file tracking

To disable XML output, set in your .env:

SAVE_XML_OUTPUT=false
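
The behaviour described above can be sketched with the standard library as follows. This is illustrative only: the element names (images, image, and the per-field tags) are hypothetical because the README does not show the real schema, and the helper names are not part of Image Sense.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Dict, Iterable, Mapping, Optional


def xml_output_name(input_folder: str, env: Mapping[str, str]) -> Optional[str]:
    """Return '<folder name>.xml', or None when SAVE_XML_OUTPUT is set to false."""
    if env.get("SAVE_XML_OUTPUT", "true").strip().lower() == "false":
        return None
    return f"{Path(input_folder).name}.xml"


def build_xml(images: Iterable[Dict[str, str]]) -> str:
    """Serialize per-image fields into a simple XML document (assumed layout)."""
    root = ET.Element("images")
    for item in images:
        image = ET.SubElement(root, "image")
        for field, value in item.items():
            # Each dictionary key becomes a child element; keys must be valid tag names.
            ET.SubElement(image, field).text = value
    # encoding="unicode" returns a str and keeps non-ASCII characters readable.
    return ET.tostring(root, encoding="unicode")
```

For an input folder photos/vacation/ this would yield the name vacation.xml, and setting SAVE_XML_OUTPUT=false would skip the file entirely.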

Contributing

Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Google Gemini Vision API for image analysis
ExifTool for metadata management
Rich for beautiful terminal output
