
Universal Spectroscopy Engine (USE)

A framework treating LLM activations as light spectra to measure semantic drift and hallucinations.

Overview

The Universal Spectroscopy Engine (USE) is a spectroscopy-inspired framework that treats Large Language Model (LLM) activations as physical light spectra. This approach enables local, interpretable diagnosis of semantic drift, hallucinations, and model blindness.

Following the "Physics of Meaning" metaphor:

  • Light Source → User Input / Prompt
  • Material → The LLM (e.g., Llama-3-8B)
  • Prism → Sparse Autoencoder (SAE)
  • Spectrum → Feature Activations (Indices & Magnitudes)
  • Spectral Lines → Monosemantic Features (Concepts)
  • Thermal Noise → Polysemantic/Dense Activations

Core Hypotheses

  • H1: Spectral Purity - Hallucinations manifest as low spectral purity (high entropy/noise, few distinct peaks)
  • H2: Doppler Shift - Semantic meaning "redshifts" (generalizes) or "blueshifts" (distorts) through agent chains
  • H3: Absorption - Missing features indicate ignored instructions (model blindness)
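To make H1 concrete, here is one plausible sketch of a spectral-purity metric (an assumption, not the engine's actual implementation): purity is taken as 1 minus the normalized Shannon entropy of the activation magnitudes, so a spectrum dominated by a few sharp peaks scores near 1.0 and a diffuse, noisy spectrum scores near 0.0.

```python
# Hypothetical H1 purity metric: 1 - normalized entropy of magnitudes.
import numpy as np

def spectral_purity(intensities: np.ndarray) -> float:
    """Return a purity score in [0, 1] for a vector of feature magnitudes."""
    mags = np.abs(intensities)
    total = mags.sum()
    if total == 0 or mags.size < 2:
        return 1.0  # an empty or single-line spectrum is trivially pure
    p = mags / total
    p = p[p > 0]                       # drop zero entries before taking logs
    entropy = -(p * np.log(p)).sum()   # Shannon entropy of the distribution
    max_entropy = np.log(mags.size)    # entropy of a perfectly uniform spectrum
    return float(1.0 - entropy / max_entropy)

sharp = spectral_purity(np.array([10.0, 0.1, 0.1, 0.1]))  # few peaks -> high purity
noisy = spectral_purity(np.array([1.0, 1.0, 1.0, 1.0]))   # uniform -> zero purity
```

Under this definition, a confidently "on-topic" activation pattern (a few strong monosemantic features) is high-purity, while the dense, polysemantic "thermal noise" pattern hypothesized for hallucinations scores low.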

Installation

Prerequisites

  • Python 3.10 or higher
  • At least 16GB RAM (32GB recommended for larger models)
  • NVIDIA GPU with CUDA support OR Apple Silicon (M1/M2/M3) for MPS acceleration

Step 1: Clone and Install Dependencies

# Navigate to project directory
cd universal-spectroscopy-engine

# Create virtual environment (recommended)
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install the package in editable mode (includes all dependencies)
pip install -e .

Step 2: Verify Installation

import use
print(f"Universal Spectroscopy Engine v{use.__version__}")

# Check device availability
from use.utils import get_device
device = get_device()
print(f"Using device: {device}")

Getting Started

Step 1: Load Model and SAE

USE builds on two components:

  1. Transformer models via transformer_lens (e.g., Gemma-2-2B)
  2. Sparse Autoencoders via sae_lens (e.g., Gemma-Scope)

Both are downloaded automatically on first use.

Step 2: Basic Usage

from use import UniversalSpectroscopyEngine

# Initialize engine
engine = UniversalSpectroscopyEngine()

# Load model (downloads automatically on first use)
print("Loading model...")
engine.load_model("gemma-2-2b")

# Load SAE from Gemma-Scope (downloads automatically)
print("Loading SAE...")
engine.load_sae(
    model_name="gemma-2-2b",
    layer=5,
    release="gemma-scope-2b-pt-res-canonical",
    sae_id="layer_5/width_16k/canonical"
)

# Or use auto-detection (simpler)
engine.load_sae("gemma-2-2b", layer=5)

# Process input text
print("Processing input...")
input_text = "The cat sat on the mat."
spectrum = engine.process(input_text)

print(f"Spectrum: {spectrum}")
print(f"Number of active features: {len(spectrum)}")
print(f"Top 10 features: {spectrum.get_top_features(k=10)}")

Step 3: Test Hypotheses

# H1: Calculate Spectral Purity (Hallucination Detection)
purity = engine.calculate_purity(spectrum)
print(f"Spectral Purity: {purity:.4f}")
if purity < 0.3:
    print("⚠️  Warning: Low spectral purity - possible hallucination")

# H2: Calculate Semantic Drift (Compare two spectra)
input_spec = engine.process("The cat sat on the mat.")
output_spec = engine.process("The feline rested on the rug.")
drift = engine.calculate_drift(input_spec, output_spec)
print(f"Semantic Drift: {drift:.4f}")
if drift > 0.5:
    print("⚠️  Significant semantic drift detected")

# H3: Detect Absorption (Model Blindness)
absorbed = engine.detect_absorption(input_spec, output_spec)
if absorbed:
    print(f"⚠️  Model ignored {len(absorbed)} features: {absorbed[:10]}...")

# Cleanup
engine.cleanup()

Step 4: Advanced Usage with Context Manager

from use import UniversalSpectroscopyEngine

# Use context manager for automatic cleanup
with UniversalSpectroscopyEngine() as engine:
    engine.load_model("llama-3-8b")
    engine.load_sae("llama-3-8b", layer=5)
    
    spectrum = engine.process("Your text here")
    purity = engine.calculate_purity(spectrum)
    
    # Cleanup happens automatically

Project Structure

universal-spectroscopy-engine/
├── src/use/              # Main package source code
│   ├── __init__.py       # Package exports
│   ├── engine.py         # UniversalSpectroscopyEngine (main class)
│   ├── excitation.py     # ExcitationController (The Slit)
│   ├── sae_adapter.py    # SAE_Adapter (The Prism)
│   ├── interference.py   # InterferenceEngine (The Detector)
│   ├── spectrum.py       # Spectrum data class
│   └── utils.py          # Device detection, helpers
├── tests/                # Unit and integration tests
├── notebooks/            # Jupyter notebooks for experiments
├── .cursor/rules/        # Spectroscope system configuration
├── pyproject.toml        # Project configuration and dependencies
├── Dockerfile            # Docker configuration
├── docker-compose.yml    # Docker Compose configuration
└── README.md             # This file

Components

UniversalSpectroscopyEngine

Main orchestration class that coordinates all components.

ExcitationController (The Slit)

Manages input formatting and feature steering:

  • process(): Extract activations from model layers
  • monochromatic_steering(): Force specific features
  • pulse_train(): Inject noise for robustness testing

SAE_Adapter (The Prism)

Loads SAEs and normalizes outputs:

  • load_sae(): Load SAE for model and layer
  • decompose(): Convert activations to Spectrum objects
  • normalize(): Standardize spectrum format
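As an illustrative sketch (not the actual adapter code), decompose() conceptually applies the SAE encoder to a dense residual-stream activation and keeps only the nonzero feature activations as (index, magnitude) pairs, i.e. the "spectral lines". The ReLU encoder below is a stand-in for the real sae_lens SAE; W_enc and b_enc are hypothetical encoder parameters.

```python
# Conceptual sketch of decompose(): dense activation -> sparse spectral lines.
import numpy as np

def decompose(activation: np.ndarray, W_enc: np.ndarray, b_enc: np.ndarray):
    """Map a dense activation vector to sparse (indices, magnitudes)."""
    feature_acts = np.maximum(activation @ W_enc + b_enc, 0.0)  # ReLU encoder
    indices = np.nonzero(feature_acts)[0]   # which features fired
    magnitudes = feature_acts[indices]      # how strongly they fired
    return indices, magnitudes

# Toy example with random weights; a real SAE has learned W_enc/b_enc.
rng = np.random.default_rng(0)
idx, mag = decompose(rng.normal(size=8),
                     rng.normal(size=(8, 32)),
                     -1.0 * np.ones(32))
```

The negative bias plus ReLU is what makes the output sparse: most of the 32 candidate features stay at zero, and only the surviving ones become lines in the Spectrum.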

InterferenceEngine (The Detector)

Mathematical analysis module:

  • calculate_purity(): H1 - Spectral purity for hallucination detection
  • calculate_drift(): H2 - Semantic drift measurement
  • detect_absorption(): H3 - Missing features detection
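The H2 and H3 metrics above can be sketched as follows (plausible implementations only; the real methods may differ): drift as cosine distance between two sparse spectra aligned on the union of their feature indices, and absorption as the set of input features with no counterpart in the output spectrum.

```python
# Plausible H2/H3 metrics over {feature_index: magnitude} spectra.
import numpy as np

def calculate_drift(spec_a: dict[int, float], spec_b: dict[int, float]) -> float:
    """Cosine distance between two sparse spectra (0 = identical direction)."""
    keys = sorted(set(spec_a) | set(spec_b))            # align feature axes
    a = np.array([spec_a.get(k, 0.0) for k in keys])
    b = np.array([spec_b.get(k, 0.0) for k in keys])
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        return 0.0
    return float(1.0 - a @ b / denom)

def detect_absorption(spec_in: dict[int, float], spec_out: dict[int, float]) -> list[int]:
    """Features present in the input spectrum but missing from the output."""
    return sorted(set(spec_in) - set(spec_out))

same = calculate_drift({1: 2.0, 5: 1.0}, {1: 2.0, 5: 1.0})  # identical -> drift ~ 0
lost = detect_absorption({1: 2.0, 5: 1.0}, {1: 2.0})        # feature 5 absorbed
```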

Spectrum

Data class representing feature activations:

  • wavelengths: Feature indices (which features are active)
  • intensities: Activation magnitudes (how strongly)
  • get_top_features(): Get top k features by intensity
  • to_spec_dict(): Serialize to .spec format
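A minimal sketch of what this data class might look like (the field names follow the README's optics metaphor, but the real class may differ):

```python
# Hypothetical minimal Spectrum data class.
from dataclasses import dataclass

@dataclass
class Spectrum:
    wavelengths: list[int]    # active feature indices
    intensities: list[float]  # corresponding activation magnitudes

    def __len__(self) -> int:
        return len(self.wavelengths)

    def get_top_features(self, k: int = 10) -> list[tuple[int, float]]:
        """Return the k strongest (feature_index, magnitude) pairs."""
        pairs = sorted(zip(self.wavelengths, self.intensities),
                       key=lambda p: p[1], reverse=True)
        return pairs[:k]

    def to_spec_dict(self) -> dict:
        """Serialize to a plain dict suitable for writing as .spec."""
        return {"wavelengths": self.wavelengths, "intensities": self.intensities}

s = Spectrum([3, 17, 42], [0.5, 2.0, 1.1])
top = s.get_top_features(k=2)   # [(17, 2.0), (42, 1.1)]
```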

Dependencies

  • transformer_lens: Loading and hooking transformer models
  • sae_lens: Loading Sparse Autoencoders (e.g., Gemma-Scope)
  • torch: PyTorch for tensor computation
  • matplotlib: Static visualization
  • plotly: Interactive plotting

Troubleshooting

Device Issues

Problem: CUDA not available on Mac
Solution: USE automatically detects MPS (Apple Silicon). If you see CUDA errors, ensure you're using the latest PyTorch with MPS support.

Problem: Out of memory errors
Solution:

  • Use smaller models (e.g., llama-3-8b instead of larger variants)
  • Implement model offloading
  • Use engine.cleanup() or context managers

SAE Loading Issues

Problem: SAE not found or download fails
Solution:

  • Ensure you have an internet connection (SAEs download from Hugging Face on first use)
  • Check model name and layer number are correct
  • For Gemma-2-2B, use layers 0-25
  • Verify sae_lens is installed: pip install sae-lens

Import Errors

Problem: ModuleNotFoundError for transformer_lens or sae_lens
Solution:

# Reinstall package with dependencies
pip install -e .

Next Steps

  1. Run Example Experiments: See experiments/ directory for biopsy and other experiments
  2. Test Hypotheses: Test H1 (Spectral Purity), H2 (Semantic Drift), H3 (Absorption)
  3. Create Visualizations: Implement spectral barcode rendering
  4. Explore Different Layers: Try different layers (0-25 for Gemma-2-2B)
  5. Add Tests: Write unit tests for each component
  6. Performance Tuning: Optimize for your hardware

Development

Running Tests

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/

Code Style

# Format code
black src/

# Lint code
flake8 src/

License

MIT
