Code, data, and paper sources for "The Canary in the Carry Chain: Transformers Know the Schedule Before They Can Execute" (NeurIPS 2026 submission).
When a transformer fails on an iterative algorithmic task, it has often failed to execute, not to schedule. We factor such tasks as
A class-balanced linear probe at encoder Layer 2 decodes
On a unified
A theorem in Section 4 shows the conditions under which a deterministic interface, which adds no Shannon information beyond the input, can still move the achievable frontier of a restricted executor class. A cross-seed Layer 2 MLP comparison shows the high-$A_{k=8}$ basin contains many distinct distributed solutions rather than one shared circuit.
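The "no added Shannon information" premise can be written out in one line. This is a sketch in notation assumed here, not necessarily the paper's: $X$ is the input, $Y$ the target, and $Z = g(X)$ the output of a deterministic interface $g$.

$$
H(Z \mid X) = 0 \quad\Longrightarrow\quad I(Y; Z \mid X) = H(Z \mid X) - H(Z \mid X, Y) = 0 .
$$

The second term vanishes because $0 \le H(Z \mid X, Y) \le H(Z \mid X)$. Any gain from supplying $Z$ therefore has to come from what a restricted executor class can compute from $(X, Z)$, not from new information about $Y$. The precise conditions are those of the Section 4 theorem.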
- The Python and shell files at the repo root are the training, probing, evaluation, and circuit-analysis code. The most relevant entry points are `eci_suite.py`, `eci_placebos.py`, `circuit_basin.py`, `circuit_patch.py`, `probe_balanced.py`, `decoder_only.py`, and the `eci_phase_*.py` scripts.
- `results_final/` contains all raw experiment outputs, organized by experiment family. See `results_final/MANIFEST.md` for the family-level index and `results_final/all_results.csv` for a consolidated table of headline metrics by (variant, seed).
Long Collatz step in base 32, seven oracle_aligned seeds at 1000 epochs, unified
| Seed | |||||
|---|---|---|---|---|---|
| main | 73.39% | 45.44% | 95.62% | 40.69% | 7 |
| 100 | 76.13% | 54.97% | 87.38% | 77.53% | 6 |
| 890 | 73.91% | 47.26% | 90.98% | 50.80% | 6 |
| 789 | 68.83% | 30.37% | 91.12% | 0.00% | 6 |
| 234 | 65.48% | 20.01% | 37.01% | 23.01% | 6 |
| 456 | 60.93% | 15.02% | 43.97% | 1.09% | 4 |
| 567 | 60.58% | 14.68% | 43.93% | 0.10% | 4 |
Mean
pip install torch numpy matplotlib tqdm
# Train the headline 3x+1 base-32 model
python run.py train --base 32 --dev cuda
# Train an ECI variant (one of: strong_baseline, null_slots, iid_marginal,
# shuffled, fixed_permutation, predicted_ss, oracle_aligned, oracle_both,
# eci_baseline, aux_only)
python eci_suite.py oracle_aligned --base 32 --epochs 1000 --out output_eci/oracle_aligned
# Run the per-neuron Layer 2 MLP causal contribution sweep
python circuit_basin.py output_seeds_1k_locks/oracle_aligned_s100 8 100
# Cross-seed activation patching
python circuit_patch.py output_seeds_1k_locks/oracle_aligned_s890 \
output_seeds_1k_locks/oracle_aligned_s567 8 200
# Class-balanced probe selectivity sweep
python probe_balanced.py --base 32 --ckpt output/b32/best.pt
# Decoder-only replication
python decoder_only.py --base 32 --epochs 300
# Regenerate every paper figure
for f in paper/make_*.py; do python "$f"; done

A 1000-epoch oracle_aligned run takes about 6 hours on a single H100. The full ECI sweep at 1000 epochs each takes about 30 H100-hours when run with multiple variants in parallel. The cross-seed circuit comparison takes about an hour per seed.
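The single-variant ECI command above can be repeated for every listed variant. The sketch below only rebuilds that same `eci_suite.py` invocation per variant. The helper names (`eci_command`, `run_sweep`) are hypothetical, and it runs variants one after another rather than in parallel, so it does not reproduce the parallel schedule behind the quoted H100-hours.

```python
import shlex
import subprocess

# Variant names copied from the comment in the commands above.
VARIANTS = [
    "strong_baseline", "null_slots", "iid_marginal", "shuffled",
    "fixed_permutation", "predicted_ss", "oracle_aligned", "oracle_both",
    "eci_baseline", "aux_only",
]


def eci_command(variant, base=32, epochs=1000, out_root="output_eci"):
    """Build the eci_suite.py invocation shown above for one variant."""
    return [
        "python", "eci_suite.py", variant,
        "--base", str(base),
        "--epochs", str(epochs),
        "--out", f"{out_root}/{variant}",
    ]


def run_sweep(variants=VARIANTS, dry_run=True):
    """Print each command; execute sequentially only when dry_run is False."""
    commands = [eci_command(v) for v in variants]
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing run
    return commands
```

Calling `run_sweep()` prints the ten commands without launching anything. Pass `dry_run=False` from the repo root to actually train.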
Local model checkpoints (`*.pt` files for the headline `output_mps/b32`, the decoder-only `output_do`, the controller-only `output_ctrl`, and one `output_eci_seeds/baseline_s123` seed, totaling ~2 GB) are uploaded separately to Zenodo. The raw `metrics.json` and `balanced_stats.json` files needed to reproduce every paper number are in `results_final/` and do not require the weights.
Total compute reported for the experiments in the paper is about 195 H100-equivalent hours, summarized in Appendix I (Table 9) of the paper.
- Charton, F. and Narayanan, A. (2025). Transformers know more than they can tell: Learning the Collatz sequence. arXiv:2511.10811
- Turner, A. et al. (2023). Activation addition: Steering language models without optimization. arXiv:2308.10248
- Nanda, N. et al. (2023). Progress measures for grokking via mechanistic interpretability. ICLR 2023
- Heimersheim, S. and Nanda, N. (2024). How to use and interpret activation patching. arXiv:2404.15255
- Nye, M. et al. (2022). Show your work: Scratchpads for intermediate computation with language models. ICLR 2022
- McLeish, S. et al. (2024). Transformers can do arithmetic with the right embeddings. NeurIPS 2024