Skip to content

feat(annotations): --annotations flag preloads annotations from FormatOutput markdown#156

Merged
umputun merged 8 commits intoumputun:masterfrom
tushkanin:feat/preload-annotations
Apr 27, 2026
Merged

feat(annotations): --annotations flag preloads annotations from FormatOutput markdown#156
umputun merged 8 commits intoumputun:masterfrom
tushkanin:feat/preload-annotations

Conversation

@tushkanin
Copy link
Copy Markdown
Contributor

Closes #153.

Summary

Add a --annotations=PATH flag that preloads the annotation store from a markdown file in the same shape Store.FormatOutput emits. The format is bidirectional: any file written by -o can be read back via --annotations, so an LLM-generated review (or hand-authored notes, or a previous session) can be opened directly in the TUI for inline curation.

Per the issue discussion, this lands as a single string flag (not repeatable JSON), regex-strict parser over the existing markdown format, and warn-and-drop policy for orphan records.

What changed

  • app/annotation/parse.go — new Parse for the four header shapes ((+), (-), ( ), (file-level)), including range hunks (:N-M), multi-line bodies, and the inverse of escapeHeaderLines. Symmetric escape/strip rule so any indent depth round-trips losslessly.
  • app/annotations_load.go — preload pipeline. Drops orphans against the resolved diff with an stderr warning (file-level needs file-in-diff, line-scoped needs file + matching DiffLine for the record's change type). Untracked files (which surface in the UI via the show-untracked toggle, not ChangedFiles) are folded in via untrackedFn and validated by reading from disk via diff.ReadFileAsAdded, mirroring Model.resolveEmptyDiff. Last-write-wins via existing Store.Add covers duplicates and preload-then-edit.
  • app/config.go + app/main.go--annotations (no-ini, per-invocation), wired before model construction.
  • Docs — README, site/docs.html, CHANGELOG, and the Claude/codex plugin SKILL/usage references get a Preloading Annotations section and a round-trip example.
  • docs/plans/completed/2026-04-27-preload-annotations.md — implementation plan archived.

Test plan

  • go test ./... (full suite passes; new tests in app/annotation/parse_test.go and app/annotations_load_test.go)
  • golangci-lint run ./app/... (0 issues)
  • Round-trip: revdiff -o review.md HEAD~1, edit externally, revdiff --annotations=review.md HEAD~1 — preloaded state matches saved state
  • Orphan drop: annotations referencing files/lines not in the current diff produce stderr warnings and are skipped
  • Untracked files: annotations saved against an untracked path round-trip through --annotations (regression test TestPreloadAnnotations_UntrackedFile)
  • Hand-authored bodies with indented ## lines round-trip losslessly

Add Parse for the markdown shape Store.FormatOutput emits so annotations
can be reconstructed from previously-saved review files. Make
escapeHeaderLines/parser body-line stripping symmetric on
"left-trimmed prefix is '## '", so pre-indented "## " lines round-trip
losslessly at any indent depth.
Add an opt-in --annotations=PATH flag (no-ini, per-invocation only) that
parses a markdown file in the same shape Store.FormatOutput emits and
preloads the annotation store before the TUI starts. Records whose
file/line are not present in the resolved diff are dropped with a
stderr warning. Untracked files surface in the UI via a separate
loadUntracked callback rather than ChangedFiles, so the preload folds
them in via untrackedFn and reads their line set from disk via
diff.ReadFileAsAdded — mirroring Model.resolveEmptyDiff so any file
written by -o (including untracked) round-trips back through
--annotations.
@tushkanin tushkanin requested a review from umputun as a code owner April 27, 2026 16:26
Add a Preloading Annotations section to README, site docs, and the
Claude/codex plugin usage references; note the bidirectional -o ⇄
--annotations workflow in CHANGELOG and the in-session review preload
path in plugin SKILL files.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new CLI entry point for seeding revdiff’s in-TUI annotation workflow from an existing markdown annotations file, using the same bidirectional format already produced by Store.FormatOutput().

Changes:

  • Introduces --annotations=PATH flag and startup preload pipeline that parses -o-style markdown and populates the annotation store while dropping orphaned records with warnings.
  • Adds a strict markdown parser (annotation.Parse) that round-trips with Store.FormatOutput, including file-level, line-level, and range headers plus escaped ## body lines.
  • Updates documentation and agent skill references to describe round-trip preload workflows; adds tests for parsing and preload/orphan handling.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
app/annotation/parse.go Implements strict parser for FormatOutput markdown records.
app/annotation/parse_test.go Adds comprehensive parser tests including round-trip cases.
app/annotation/store.go Extends header-escaping to cover indented ## lines for symmetric parsing.
app/annotations_load.go Adds preload-and-validate pipeline against resolved diff/untracked files.
app/annotations_load_test.go Adds tests for flag parsing, preload behavior, orphan dropping, and untracked/staged edge cases.
app/config.go Adds --annotations flag to CLI options (no-ini).
app/main.go Wires preload before model construction.
README.md Documents --annotations and round-trip workflow example.
site/docs.html Mirrors README docs: option list and usage example for --annotations.
.claude-plugin/skills/revdiff/SKILL.md Updates skill triggers + guidance for preload workflows.
.claude-plugin/skills/revdiff/references/usage.md Documents “Preloading Annotations” section.
plugins/codex/skills/revdiff/SKILL.md Mirrors skill trigger updates + in-session review guidance.
plugins/codex/skills/revdiff/references/usage.md Mirrors “Preloading Annotations” documentation.
docs/plans/completed/2026-04-27-preload-annotations.md Archives the completed implementation plan.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread app/annotations_load.go Outdated
}

// buildLineSet maps each renderable diff line to its (line-number, change-type)
// key. mirrors Model.diffLineNum: removals key on OldNum, all other change
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc comment sentence casing: "key. mirrors" should start with a capital letter ("Key. Mirrors") to match Go doc style and improve readability.

Suggested change
// key. mirrors Model.diffLineNum: removals key on OldNum, all other change
// key. Mirrors Model.diffLineNum: removals key on OldNum, all other change

Copilot uses AI. Check for mistakes.
Comment thread app/annotations_load_test.go Outdated
func TestPreloadAnnotations_DropsOrphans(t *testing.T) {
body := "## a.go (file-level)\nfile-level note\n\n" +
"## a.go:5 (+)\nline-add note\n\n" +
"## a.go:99 (+)\northerwise valid file but bad line\n\n" +
Copy link

Copilot AI Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in the test annotation body string: "ortherwise" should be "otherwise" (helps keep test fixtures readable/searchable).

Suggested change
"## a.go:99 (+)\northerwise valid file but bad line\n\n" +
"## a.go:99 (+)\notherwise valid file but bad line\n\n" +

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Owner

@umputun umputun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tests / lint / race green. parser tests are well shaped: table-driven, round-trip property test, path-with-colon edge case all covered. before merge a few things worth addressing:

block:

  1. sanitize comment text on parse. Annotation.Comment flows verbatim from file into Store.Add, then through Renderer.AnnotationInline which wraps raw bytes in ANSI italic. The PR's headline use case is LLM-generated review.md, and the new skill flow in SKILL.md ("Opening an In-Session Review") makes Claude itself the agent producing the file from arbitrary conversation context. That context can plausibly contain raw \x1b[...] sequences, CR-overwrite tricks, or stray escapes (terminal output pasted into chat, error messages with color codes, code samples that discuss ANSI rendering). Without sanitization those bytes reach the TUI render unchanged. The codebase already has diff.sanitizeCommitText for exactly this threat model (commit author/subject/body); route every parsed Comment through it before store.Add. About 5 lines. If #155 lands first, just import the now-exported diff.SanitizeCommitText. If this lands first, export the symbol here or duplicate the strip table.

  2. harden file read to match #155. app/annotations_load.go:26 is naked os.Open(path) with nolint:gosec. Mirror app/main.go:resolveDescription from #155: os.Stat first, reject non-regular files via IsRegular() (otherwise opening a FIFO blocks the process before bubbletea initializes), then os.Open, then io.LimitReader(f, max+1) for an upper bound. 1 MiB cap is reasonable here since annotation files aggregate many records. Less critical for the skill-driven temp-file path since the file is skill-controlled, but matters for curl'd review files and edited history files.

should fix:

  1. extract a preloader struct, fix params and standalones in one move. the wide-signature problem and the standalone-helpers problem are the same problem.

    • preloadAnnotations (8 params), lookupLineSet (8 params), resolveKnownFiles (5 params), buildLineSet. Project rule is 4+ → option struct.
    • all four are called only from each other (single caller for each helper). Project rule is "functions called only from methods MUST be methods, prefer methods over standalone helpers."
    • one refactor closes both: type preloader struct { renderer, ref, staged, untrackedFn, workDir, warnOut, lineCache }, then (p *preloader) load, (p *preloader) resolveKnownFiles, (p *preloader) lookupLineSet, (p *preloader) buildLineSet. tests collapse from "", false, nil, "", &bytes.Buffer{} repetition to a single preloader{...} literal.

    parseHeader in app/annotation/parse.go:99 is the same shape (standalone, single caller Parse). If Parse becomes Store.Load per #4, parseHeader naturally becomes a method on whatever parser type that introduces, or stays as an unexported helper local to Load.

discuss:

  1. ParseStore.Load / Store.Merge. Right now Parse returns []Annotation and preloadAnnotations re-feeds each through store.Add. A method on Store would close the round-trip loop with Store.FormatOutput symmetrically on the API surface, not just on the format level, and remove the for-loop in the loader.

  2. skill instruction underdocs the body-escape rule. SKILL.md tells Claude to write the temp file "using the format documented in references/usage.md (Output Format section)". That section has the ## ## body-escape rule (line 245), but it's tucked into a paragraph about output. When Claude generates fresh content for --annotations, it can easily emit a body line starting with ## without escaping, which the strict parser rejects (or worse, silently splits the record). Either add an explicit "escape ## in body lines" callout next to the skill instruction, or relax the parser to tolerate unescaped ## in non-header positions. A regression test covering "body line starts with unescaped ## " would also help.

  3. renamed-file silent drop. If a file was renamed since --annotations was generated, records key on the old path and get orphan-dropped. Known limit of the format keyed on path. A one-line note in resolveKnownFiles doc-comment saves the next reader from chasing it as a bug.

  4. --stdin --annotations is undefined. Preload runs after stdin payload setup so file-level records keyed on the synthetic stdin name probably work; line-scoped records orphan-drop. Either add to the stdin mutual-exclusion list in parseArgs or document as supported.

nits:

  • app/annotations_load.go:48,57,85,99,142, five occurrences of _, _ = fmt.Fprintf(warnOut, ...). Once the option-struct refactor lands, a (p *preloader) warnf(...) helper collapses those.
  • app/annotations_load.go:81-93, doc-comment says "matches ui.loadFiles" but loadFiles only folds untracked when showUntracked is on; preload always folds. Clarify the divergence.

Public wrapper around sanitizeCommitText so callers outside the diff
package (e.g. preloaded annotation comments) can apply the same
ANSI / C1 / control-byte stripping before content reaches a terminal
renderer.
- Move parseHeader onto a parser struct so state (out/current/body/
  seenHeader) lives on the value rather than as closure-captured locals
  in Parse, matching the project rule of methods over standalone helpers.
- Tolerate "## " body lines that don't match the header grammar by
  folding them into the current record. Hand-authored or LLM-generated
  bodies often mention "## something" without the leading-space escape;
  before the first header the grammar error still propagates.
- Add Store.Load(io.Reader) as the symmetric inverse of
  Store.FormatOutput on the API surface. Callers that need to filter
  records (preloader) keep using Parse + Add directly.
- Collapse preloadAnnotations / resolveKnownFiles / lookupLineSet into
  methods on a preloader struct so call sites stop threading 5-8
  parameters through and the warn-printf calls fold into a warnf
  helper. Per project rule: functions called only from each other
  belong on a shared receiver.
- Harden file read: Stat first, reject non-regular files (FIFO would
  block os.Open before bubbletea initializes), reject oversize inputs
  up front against info.Size() with a 1 MiB cap, layer io.LimitReader
  as belt-and-braces during parse.
- Sanitize parsed Comment text via diff.SanitizeCommitText before
  Store.Add. Preload sources are user- or LLM-supplied and may carry
  stray ANSI / CR-overwrite / C1 bytes that the TUI's
  Renderer.AnnotationInline would wrap into the screen verbatim.
- Doc-comment in resolveKnownFiles now calls out the deliberate
  divergence from ui.loadFiles (always-fold-untracked vs UI's
  toggle-gated fold) and the renamed-file orphan-drop limitation.
- Tests for FIFO rejection, oversize file, and ANSI/CR sanitization.
Line-scoped annotations preloaded against a synthetic stdin buffer
have no real diff to validate against and would orphan-drop on every
record. File-level records keyed on the synthetic stdin name happen
to work, but documenting half-support is worse than excluding the
combination outright. Joins the existing mutually-exclusive list with
refs / --staged / --only / --all-files / --include / --exclude.
Copy link
Copy Markdown
Owner

@umputun umputun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all addressed cleanly: sanitization wired through diff.SanitizeCommitText, file-read hardened with Stat→IsRegular→LimitReader and 1 MiB cap, preloader struct holds the state, Store.Load closes the round-trip, parser tolerates unescaped ## after the first header (with regression test), --stdin --annotations rejected, doc-comments updated.

verified locally with the built binary against rejection paths (FIFO, dir, oversize, malformed, --stdin mutex), orphan-drop warning, and the unescaped ## tolerance. tests / lint / race green.

LGTM.

@umputun umputun merged commit 6d4f16a into umputun:master Apr 27, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feature request: --annotation flag to preload annotations (LLM review workflow)

3 participants