feat: drastically decrease token use through pre-processing #31

Merged
scode merged 1 commit into main from
distill-filter-jsonl
Feb 27, 2026
@scode scode commented Feb 27, 2026

During distillation, the vast majority of JSONL transcript content is tool
machinery invisible to the user (tool results, invocations, progress events,
thinking blocks, file snapshots). Typically only 2-15% is user/assistant text.

The filter is always fail-useful, even to the extreme of passing the raw input through unchanged if it does not parse as valid JSON.

scode commented Feb 27, 2026

This stack of pull requests is managed by Graphite. Learn more about stacking.

@scode scode changed the title from "Pre-process session logs in distill to filter noise" to "feat: drastically decrease token use through pre-processing" Feb 27, 2026
Filter each log down to the conversational flow (user messages and assistant
text responses) before feeding it to the distilling agent. The filter uses
serde_json::Value for resilience to schema changes; unknown types and non-JSON
lines fall through as raw content (fail-useful).
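
For readers without the diff at hand, the filtering described above can be sketched roughly as follows. This is a minimal Python illustration of the idea, not the code in this PR (which is Rust and uses serde_json::Value). The record layout is an assumption: the field names `type`, `message`, `content`, and `text`, and the entries in `NOISE_TYPES`, are hypothetical and may not match the real transcript schema.

```python
import json

# Hypothetical record types treated as pure tool machinery.
# The real schema is not shown in this PR; these names are assumptions.
NOISE_TYPES = {"progress", "file-history-snapshot"}


def extract_text(content):
    """Return only the plain-text parts of a message's content."""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        # Keep "text" blocks; thinking, tool_use, tool_result etc. are dropped.
        parts = [
            block.get("text", "")
            for block in content
            if isinstance(block, dict) and block.get("type") == "text"
        ]
        return "\n".join(p for p in parts if p)
    return ""


def filter_line(line):
    """Return the text to keep for one JSONL line, or None to drop it."""
    stripped = line.strip()
    if not stripped:
        return None
    try:
        record = json.loads(stripped)
    except ValueError:
        return stripped  # fail-useful: non-JSON lines pass through raw
    if not isinstance(record, dict):
        return stripped  # valid JSON but not an object: keep raw
    kind = record.get("type")
    if kind in NOISE_TYPES:
        return None
    if kind in ("user", "assistant"):
        message = record.get("message")
        content = message.get("content") if isinstance(message, dict) else None
        text = extract_text(content)
        return f"{kind}: {text}" if text else None
    return stripped  # unknown record type: keep raw rather than guess


def filter_transcript(raw):
    """Filter a whole JSONL transcript down to its conversational flow."""
    kept = [out for out in map(filter_line, raw.splitlines()) if out is not None]
    return "\n".join(kept)
```

Working on untyped JSON values rather than a fixed schema means a renamed or newly added field degrades to "kept as raw content" instead of a parse failure, which is the fail-useful behavior described above.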
@scode scode force-pushed the distill-filter-jsonl branch from df09627 to dcf3f31 February 27, 2026 04:28
@scode scode merged commit 6718151 into main Feb 27, 2026
15 checks passed
@scode scode deleted the distill-filter-jsonl branch March 4, 2026 04:05