# Induction Head Detector

Find and analyze induction heads in any transformer model.
Induction heads implement a key copying mechanism: when they see [A][B]...[A], they predict [B] by attending back to what followed the previous occurrence of [A].
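In plain Python, the behavior an induction head implements looks roughly like this (a toy illustration of the mechanism, not part of the library):

```python
def induction_prediction(tokens: list[str]) -> str | None:
    """Toy model of induction: find the most recent earlier occurrence of
    the current token and predict the token that followed it."""
    current = tokens[-1]
    for j in range(len(tokens) - 2, -1, -1):  # scan earlier positions, newest first
        if tokens[j] == current:
            return tokens[j + 1]  # [A][B]...[A] -> predict [B]
    return None

print(induction_prediction(["A", "B", "C", "A"]))  # -> B
```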
## Installation

```bash
pip install -r requirements.txt
```

## Quick Start

```python
from induction_head_detector import (
    InductionHeadDetector,
    detect_induction_heads,
    PatternGenerator,
    InductionHeadAnalyzer,
)
# Detect induction heads in your model
result = detect_induction_heads(model, threshold=0.4)
print(f"Found {result.num_heads_detected} induction heads")
for layer, head in result.induction_heads:
    print(f"  Layer {layer}, Head {head}")
# Detailed analysis
detector = InductionHeadDetector(model, threshold=0.3)
result = detector.detect(sequence_length=64)
for score in result.get_top_heads(n=5):
    print(f"L{score.layer}H{score.head}: {score.overall_score:.3f}")
```

## Detection Metrics

The detector uses three complementary metrics:
1. Measures attention to the position after previous-token matches:
   - When `tokens[i-1] == tokens[j]`, measure the attention from position `i` to position `j+1`
   - High scores indicate the pattern: "I've seen A before, attend to what followed"
2. Compares attention when prefixes match vs. when they don't:
   - If the current position's prefix matches a previous prefix, is there more attention to it?
   - Captures the "fuzzy matching" aspect of induction
3. Measures whether attention leads to correct predictions:
   - Where does the maximum attention point?
   - Does the next token after that position match what we're predicting?
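As a rough illustration of the first metric, here is a minimal sketch (not the library's actual implementation) that scores a single `[seq_len, seq_len]` attention matrix against its token sequence:

```python
import torch

def previous_token_match_score(attn: torch.Tensor, tokens: torch.Tensor) -> float:
    """Toy version of the first metric: average attention from position i
    to position j+1 whenever tokens[i-1] == tokens[j]. Simplified sketch,
    not the library's exact scoring."""
    seq_len = tokens.shape[0]
    collected = []
    for i in range(1, seq_len):
        for j in range(i - 1):  # keep j+1 strictly before i (causal attention)
            if tokens[i - 1] == tokens[j]:
                collected.append(attn[i, j + 1].item())
    return sum(collected) / len(collected) if collected else 0.0
```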
## Pattern Generation

Generate test sequences designed to elicit induction behavior:

```python
from induction_head_detector import PatternGenerator
gen = PatternGenerator(vocab_size=50, device="cuda")
# Repeated sequence: [A,B,C,D,A,B,C,D,A,B,C,D]
pattern = gen.generate("repeated", seq_len=64, n_repeats=4)
# ABAB pattern: [A,B,A,B,A,B,...]
pattern = gen.generate("abab", seq_len=64)
# Copy task: [content][SEP][content]
pattern = gen.generate("copy", seq_len=64)
# Generate all pattern types
all_patterns = gen.generate_all(seq_len=64)
```
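To get attention matrices to analyze, one assumed workflow (not prescribed by the library) is to run a generated pattern through a HuggingFace model with attention outputs enabled:

```python
import torch
from transformers import GPT2LMHeadModel

# Assumed workflow: `pattern` is a 1-D tensor of token IDs from PatternGenerator.
model = GPT2LMHeadModel.from_pretrained("gpt2")
with torch.no_grad():
    out = model(pattern.unsqueeze(0).to(model.device), output_attentions=True)

# out.attentions is a tuple with one [batch, n_heads, seq_len, seq_len]
# tensor per layer; pick out a single head's pattern, e.g. layer 5, head 7.
attention_pattern = out.attentions[5][0, 7]
```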
## Attention Analysis

Analyze attention patterns in detail:

```python
from induction_head_detector import InductionHeadAnalyzer, analyze_head_attention
# Full model analysis
analyzer = InductionHeadAnalyzer(model)
induction_heads = analyzer.find_induction_heads(threshold=0.4)
# Analyze specific attention matrix
behavior = analyze_head_attention(
    attention_pattern,  # [seq_len, seq_len]
    tokens,             # [seq_len]
    period=16,          # for repeated sequences
)
print(f"Diagonal score: {behavior.diagonal_score:.3f}")
print(f"Induction stripe: {behavior.induction_stripe_score:.3f}")ASCII visualizations work in any terminal:
## Visualization

ASCII visualizations work in any terminal:

```python
from induction_head_detector import (
    plot_attention_pattern,
    plot_head_scores,
    plot_induction_stripe,
    create_analysis_report,
)
# Visualize attention pattern
print(plot_attention_pattern(attention))
# Score comparison
print(plot_head_scores([s.to_dict() for s in result.head_scores]))
# Induction stripe analysis
print(plot_induction_stripe(attention, period=8))
# Full report
print(create_analysis_report(result, detailed=True))
```

## Example Output

```text
INDUCTION HEAD ANALYSIS REPORT
==============================

Model Info:
  n_layers: 12

Detection Summary:
  Threshold: 0.40
  Heads detected: 4

Top Heads by Score:
  L05H07: 0.823 [########--]
  L06H03: 0.756 [########--]
  L05H02: 0.698 [#######---]
  L07H01: 0.612 [######----]
```
## Background

Induction heads are a crucial circuit in transformers that enables in-context learning. They work in two stages:

- **Previous Token Head** (often in early layers): creates Q/K composition where the key encodes the previous token (a quick way to spot such heads is sketched below)
- **Induction Head** (middle layers): uses this to implement [A][B]...[A] -> [B]
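One simple, assumed heuristic for spotting candidate previous-token heads is the average attention each head puts on the immediately preceding position:

```python
import torch

def previous_token_attention(attn: torch.Tensor) -> float:
    """Mean attention from each position i to position i-1. Values near 1
    suggest a previous-token head. Heuristic sketch, not the library's API."""
    seq_len = attn.shape[-1]
    idx = torch.arange(1, seq_len)
    return attn[idx, idx - 1].mean().item()
```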
Key papers:

- Elhage et al., "A Mathematical Framework for Transformer Circuits" (2021), https://transformer-circuits.pub/2021/framework/index.html
- Olsson et al., "In-context Learning and Induction Heads" (2022), https://transformer-circuits.pub/2022/in-context-learning-and-induction-heads/index.html
## Testing

```bash
pytest tests/ -v
```

56 tests covering:
- Detection algorithms
- Pattern generation
- Attention analysis
- Visualization
## License

MIT