FreePaperBanana

Generate publication-ready academic diagrams from methodology text using AI.

License: MIT · Python 3.10+

Open-source reimplementation of PaperBanana (Zhu et al., 2026). Six specialized agents generate methodology diagrams and statistical plots via a two-phase pipeline: planning (Prompt Enhancer → Retriever → Planner → Stylist) and iterative refinement (Visualizer ↔ Critic). Powered by Google Gemini.

Disclaimer: Unofficial, independent reimplementation for research and education. Not affiliated with or endorsed by the original authors or Google/DeepMind.
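
The two-phase flow described above can be pictured with a minimal, self-contained Python sketch. Every function below is a hypothetical stand-in that only manipulates strings; none of these names are FreePaperBanana's actual API, and the real agents call Gemini. The round limit and target score are likewise assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Critique:
    score: float   # hypothetical 0-10 quality score
    feedback: str  # what the next render should change


# --- Phase 1: planning (stubs with hypothetical names) ---
def enhance_prompt(text: str) -> str:
    return text.strip()


def retrieve_references(text: str) -> list[str]:
    return ["reference-figure-1", "reference-figure-2"]


def plan(text: str, references: list[str]) -> str:
    return f"Diagram plan for: {text} (guided by {len(references)} references)"


def stylize(description: str) -> str:
    return description + " [styled]"


# --- Phase 2: iterative refinement (stubs) ---
def visualize(description: str, feedback: str) -> str:
    return f"render({description}; feedback={feedback or 'none'})"


def critique(image: str, round_index: int) -> Critique:
    # Stub critic: pretend the score improves by 2 points each round.
    return Critique(score=6.0 + 2.0 * round_index,
                    feedback=f"fix issues from round {round_index}")


def run_pipeline(text: str, max_rounds: int = 3,
                 target_score: float = 9.0) -> tuple[str, int]:
    """Plan once, then alternate Visualizer and Critic until good enough."""
    enhanced = enhance_prompt(text)
    references = retrieve_references(enhanced)
    description = stylize(plan(enhanced, references))

    feedback = ""
    image = ""
    for round_index in range(max_rounds):
        image = visualize(description, feedback)
        result = critique(image, round_index)
        if result.score >= target_score:
            return image, round_index + 1
        feedback = result.feedback  # feed the critique into the next render
    return image, max_rounds
```

The planning phase runs once, while the refinement loop stops either when the critic is satisfied or when the round budget is spent.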


Demo

Text prompt → iterative refinement → publication-ready diagram:

[Figure: iterative refinement demo, VLM-as-Judge evaluation framework]

Example outputs:

  • VLM-as-Judge Framework (evaluation pipeline). Score: 9.2/10
  • Pipeline Architecture. Score: 9.8/10
  • Segmentation (comparison grid)
  • Quantum (surface code)

Each diagram generated from text in ~3 minutes for ~$0.50. See example prompts for the full scored prompts.


Why This Tool?

  • Free and open source. No $300/year subscription. Bring your own Gemini API key (~$0.50 per diagram).
  • Publication-ready output. Exports to PNG (300 DPI). Designed for NeurIPS/ICML-style figures.

Quick Start

Install

git clone https://github.com/efradeca/freepaperbanana.git
cd freepaperbanana
pip install -e ".[dev]"

Setup API key

freepaperbanana setup
# Opens browser to get a free Gemini API key → saves to .env
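
To confirm that setup wrote a key, the .env file can be inspected with a few lines of standard-library Python. This is a sketch under assumptions: the variable name `GEMINI_API_KEY` is a guess, not something this README states, so check your own .env for the name the tool actually writes.

```python
from pathlib import Path


def read_env_file(path: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_key(path: str = ".env", name: str = "GEMINI_API_KEY") -> bool:
    # `name` is a hypothetical variable name; adjust it to match your .env.
    return bool(read_env_file(path).get(name))
```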

Generate a diagram

import asyncio
from freepaperbanana import FreePaperBananaPipeline, GenerationInput, DiagramType

async def main():
    pipeline = FreePaperBananaPipeline()
    result = await pipeline.generate(
        GenerationInput(
            source_context="We propose a transformer-based encoder-decoder...",
            communicative_intent="Overview of the proposed architecture.",
            diagram_type=DiagramType.METHODOLOGY,
        )
    )
    print(f"Output: {result.image_path}")

asyncio.run(main())

Try it online: HF Space demo (no installation needed)


Examples

1. Methodology Diagram

# Run inside an async function, reusing `pipeline` from the Quick Start example.
result = await pipeline.generate(
    GenerationInput(
        source_context="""
        We propose a multi-agent framework consisting of five specialized agents.
        A Retriever selects reference examples, a Planner generates textual descriptions
        via in-context learning, and a Stylist refines them for aesthetics. In the
        refinement phase, a Visualizer renders images and a Critic evaluates them,
        iterating for T rounds.
        """,
        communicative_intent="Overview of the multi-agent illustration pipeline.",
        diagram_type=DiagramType.METHODOLOGY,
    )
)

2. Model Architecture

# Run inside an async function, reusing `pipeline` from the Quick Start example.
result = await pipeline.generate(
    GenerationInput(
        source_context="""
        Our model uses a U-Net architecture with skip connections. The encoder
        downsamples through 4 stages of conv-batchnorm-relu blocks. The decoder
        mirrors this with transposed convolutions. Skip connections concatenate
        encoder features at each scale.
        """,
        communicative_intent="U-Net encoder-decoder with skip connections.",
        diagram_type=DiagramType.METHODOLOGY,
    )
)
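
Both examples above make a single `pipeline.generate(...)` call. When a paper needs several figures, the calls can be issued together. The helper below is a sketch, not part of FreePaperBanana: `generate_all` and `max_concurrency` are hypothetical names, and it assumes only that `pipeline.generate` is an awaitable that accepts one input, as in the examples above. A small limit keeps the number of simultaneous API requests modest.

```python
import asyncio
from typing import Any, Sequence


async def generate_all(pipeline: Any, inputs: Sequence[Any],
                       max_concurrency: int = 2) -> list[Any]:
    """Run pipeline.generate for every input, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def generate_one(item: Any) -> Any:
        async with semaphore:
            return await pipeline.generate(item)

    # gather returns results in the same order as `inputs`.
    return await asyncio.gather(*(generate_one(item) for item in inputs))
```

Inside `main()`, this would be called as `results = await generate_all(pipeline, [input_a, input_b])`, where each element is a `GenerationInput`.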

3. Statistical Plot

freepaperbanana plot \
  -d results.csv \
  --intent "Grouped bar chart comparing F1 scores across 4 models on 3 benchmarks"
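
The plot command reads its data from a CSV file. This README does not specify the expected column layout, so the sketch below assumes a simple long format (`model`, `benchmark`, `f1`) that matches the intent string above. The model names, benchmark names, and column names are placeholders, not values the tool requires.

```python
import csv
from itertools import product

MODELS = ["model_a", "model_b", "model_c", "model_d"]  # placeholder names
BENCHMARKS = ["bench_1", "bench_2", "bench_3"]          # placeholder names


def write_results_csv(path: str, scores: dict[tuple[str, str], float]) -> int:
    """Write one row per (model, benchmark) pair and return the row count."""
    rows = 0
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["model", "benchmark", "f1"])  # assumed column names
        for model, benchmark in product(MODELS, BENCHMARKS):
            score = scores[(model, benchmark)]
            writer.writerow([model, benchmark, f"{score:.3f}"])
            rows += 1
    return rows
```

Calling `write_results_csv("results.csv", scores)` with your own measured F1 scores produces 12 data rows, one per model and benchmark combination.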

Supported Figure Types

| Type | Status | Description |
| --- | --- | --- |
| Methodology diagrams | Ready | Flowcharts, pipelines, system architectures |
| Model architectures | Ready | Neural network structures, encoder-decoders |
| Process diagrams | Ready | Multi-stage workflows, data flows |
| Training pipelines | Ready | Training loops, loss flows, optimization steps |
| Comparison figures | Ready | Side-by-side method comparisons, ablation visuals |
| Statistical plots | Ready | Bar charts, line plots, scatter plots (via matplotlib) |
| Multi-panel figures | Partial | Limited by aspect ratio constraints |
| Tables / algorithms | Coming soon | Not yet supported |

Export Formats

| Format | Resolution | Recommended Use |
| --- | --- | --- |
| PNG | 300 DPI | Paper submissions, presentations |
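
At a fixed 300 DPI, the pixel size of an exported PNG determines how large it can be printed: pixels = inches × DPI. The helper names below are illustrative and not part of the tool.

```python
def pixels_for_print(width_in: float, height_in: float,
                     dpi: int = 300) -> tuple[int, int]:
    """Pixel dimensions needed to print at the given physical size."""
    return round(width_in * dpi), round(height_in * dpi)


def print_width_inches(width_px: int, dpi: int = 300) -> float:
    """Physical width at which an image of width_px pixels prints at dpi."""
    return width_px / dpi
```

For example, a figure 5.5 inches wide and 3 inches tall needs 1650 × 900 pixels at 300 DPI.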

Roadmap

  • Multi-provider support (OpenAI, Anthropic alongside Gemini)
  • Multi-panel figure generation with sub-figure layout
  • PyPI package (pip install freepaperbanana)
  • Algorithm/pseudocode rendering
  • Interactive editing: click to modify specific diagram elements

Contributing

Contributions welcome! See CONTRIBUTING.md for setup, conventions, and PR guidelines.

Look for issues labeled good first issue to get started.


Testing

pytest tests/ -v          # 179 tests, all mocked (no API key needed)
ruff check src/ tests/    # Lint
ruff format --check src/  # Format check
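
Because the suite is mocked, tests never reach the Gemini API. The sketch below shows the general pattern with the standard library's `unittest.mock.AsyncMock`. It is not taken from the project's test suite, and `describe_output` is a hypothetical function standing in for code under test.

```python
import asyncio
from types import SimpleNamespace
from unittest.mock import AsyncMock


async def describe_output(pipeline, request) -> str:
    """Hypothetical code under test: run the pipeline and report the path."""
    result = await pipeline.generate(request)
    return f"Output: {result.image_path}"


def test_describe_output_uses_mocked_pipeline():
    # The mock replaces the real pipeline, so no API key or network is needed.
    pipeline = SimpleNamespace(
        generate=AsyncMock(
            return_value=SimpleNamespace(image_path="out/diagram.png")
        )
    )
    message = asyncio.run(describe_output(pipeline, request="any input"))
    assert message == "Output: out/diagram.png"
    pipeline.generate.assert_awaited_once_with("any input")
```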

Legal

  • License: MIT (code) / CC BY 4.0 + Public Domain (reference images).
  • Independent implementation: Inspired by the published paper (arXiv:2601.23265). No code was copied from any existing repository. The llmsresearch/paperbanana community project (MIT) was consulted as architectural reference.
  • Generated outputs: Images produced by this pipeline are generated via the Google Gemini API. Users are responsible for compliance with Google's Generative AI Terms of Service.

Citation

If you use FreePaperBanana in your research, please cite the original paper:

@article{zhu2026paperbanana,
  title={PaperBanana: Automating Academic Illustration for AI Scientists},
  author={Zhu, Dawei and Meng, Rui and Song, Yale and Wei, Xiyu and Li, Sujian and Pfister, Tomas and Yoon, Jinsung},
  journal={arXiv preprint arXiv:2601.23265},
  year={2026}
}
@software{freepaperbanana2026,
  title={FreePaperBanana: Open-Source Multi-Agent Academic Illustration Generation},
  author={Deulofeu, Efrain},
  year={2026},
  url={https://github.com/efradeca/freepaperbanana},
  license={MIT}
}

Acknowledgments

  • PaperBanana (Zhu et al., 2026) for the original methodology.
  • 331 reference figures from 80 published papers (all CC BY 4.0 / Public Domain). See THIRD_PARTY_NOTICES.md. Authors may request removal via Issues.
  • llmsresearch/paperbanana community reimplementation (MIT), consulted as architectural reference.
