A tool that splits Markdown files at H2 headings and summarizes each section using an LLM.
The tool is split into three commands (`load`, `process`, and `save`) for flexible workflows, plus an `all` command that runs them in sequence:
`load` splits a Markdown file into sections (at H2 headings) and saves them to a JSON file:
```bash
# Basic usage
python summarize.py load input.md
# Specify output JSON file
python summarize.py load input.md --output_file=sections.json
# Enable verbose output
python summarize.py load input.md --verbose
```
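To illustrate what splitting at H2 headings involves, here is a minimal sketch. It is not the tool's actual implementation: the function name `split_at_h2` and the `title`/`content` keys are assumptions made for this example, and the real JSON layout may differ.

```python
import re


def split_at_h2(markdown_text):
    """Split Markdown into sections, starting a new one at each H2 heading.

    Hypothetical helper: the name and the output keys are assumptions.
    """
    sections = []
    current = {"title": None, "content": []}  # text before the first H2
    in_fence = False
    for line in markdown_text.splitlines():
        if line.startswith("```"):
            in_fence = not in_fence  # ignore "##" lines inside code fences
        match = None if in_fence else re.match(r"^##\s+(.*)$", line)
        if match:
            sections.append(current)
            current = {"title": match.group(1).strip(), "content": []}
        else:
            current["content"].append(line)
    sections.append(current)
    # Drop an empty preamble, and join each section's lines back into text.
    return [
        {"title": s["title"], "content": "\n".join(s["content"]).strip()}
        for s in sections
        if s["title"] is not None or any(l.strip() for l in s["content"])
    ]
```

The pattern requires whitespace after exactly two `#` characters, so H1 and H3 headings stay inside the section they belong to.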
`process` reads the JSON file, generates a summary for each section, and saves progress after each one:
```bash
# Basic usage
python summarize.py process sections.json
# Use a specific LLM model
python summarize.py process sections.json --model=gpt-4o-mini
# Enable verbose output
python summarize.py process sections.json --verbose
```
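Saving progress after each section means an interrupted run can pick up where it left off. The sketch below shows one way this could work, assuming the JSON file holds a list of objects with `content` and `summary` keys (an assumption, not the tool's documented schema). The `summarize` argument is a hypothetical callable standing in for the actual LLM call.

```python
import json


def process_sections(json_path, summarize):
    """Summarize each section that lacks a summary, saving after every one.

    Hypothetical helper: the JSON layout and function name are assumptions.
    """
    with open(json_path, encoding="utf-8") as f:
        sections = json.load(f)
    for section in sections:
        if section.get("summary"):
            continue  # already summarized in an earlier run
        section["summary"] = summarize(section["content"])
        # Rewrite the whole file so an interruption loses at most one section.
        with open(json_path, "w", encoding="utf-8") as f:
            json.dump(sections, f, indent=2, ensure_ascii=False)
    return sections
```

With the `llm` library, `summarize` might wrap something like `llm.get_model("gpt-4o-mini").prompt(text).text()`, though the prompt the tool actually sends is not shown here.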
`save` converts the processed JSON back into a summarized Markdown file:
```bash
# Basic usage
python summarize.py save sections.json
# Specify output Markdown file
python summarize.py save sections.json --output_file=summary.md
# Enable verbose output
python summarize.py save sections.json --verbose
```
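As a rough illustration of the conversion back to Markdown, the sketch below rebuilds a document from a list of sections. The `title`, `content`, and `summary` keys, the function name, and the exact output layout are all assumptions for this example. The tool's real output may be formatted differently.

```python
def sections_to_markdown(sections):
    """Render sections as Markdown: an H2 heading followed by its summary.

    Hypothetical helper: keys and layout are assumptions.
    """
    parts = []
    for section in sections:
        if section.get("title"):
            parts.append(f"## {section['title']}")
        # Fall back to the original text if a section was never summarized.
        body = section.get("summary") or section.get("content", "")
        if body:
            parts.append(body)
    return "\n\n".join(parts) + "\n"
```

Falling back to the original content keeps unsummarized sections in the output instead of silently dropping them. This is a design choice of the sketch, not necessarily the tool's behavior.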
`all` runs all three steps in sequence:
```bash
# Basic usage
python summarize.py all input.md
# Specify output file and model
python summarize.py all input.md --output_file=summary.md --model=gpt-4o-mini
# Enable verbose output
python summarize.py all input.md --verbose
```
Thanks to a built-in shebang line, you can also run the script directly with uv:
```bash
uv run summarize.py load input.md
```
Splitting the process into steps helps with:
Uses the llm library to connect to various providers. Tries these models in order:
1. `gpt-4o-mini` (OpenAI)
2. `openrouter/google/gemini-flash-1.5`
3. `openrouter/openai/gpt-4o-mini`
4. `haiku` (Claude 3 Haiku)

Override the default with the `--model` flag.
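Trying models in order can be sketched as a simple fallback loop. This is an illustration only: `first_available_model` and its `load_model` parameter are hypothetical names, and the tool's actual fallback logic (for example, whether it falls back when loading fails or when a prompt fails) is not documented here.

```python
def first_available_model(candidates, load_model):
    """Return (name, model) for the first candidate that loads successfully.

    Hypothetical helper: `load_model` is any callable that returns a model
    object or raises an exception when the model cannot be used.
    """
    errors = {}
    for name in candidates:
        try:
            return name, load_model(name)
        except Exception as exc:  # e.g. unknown model name or missing plugin
            errors[name] = exc
    raise RuntimeError(f"No usable model among {list(candidates)}: {errors}")
```

With the `llm` library, `load_model` could plausibly be `llm.get_model`, which raises an error for model names it does not recognize. A model passed via `--model` would simply go first in `candidates`.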
Input:
```markdown
# My Document
## Introduction
This is an introduction to my document.
## Section 1
This is the first section with important details.
## Section 2
This is the second section with more information.
```
Output:
```markdown
Summary of the introduction…
Summary of section 1…
##