bug: frontmatter --fix leaves .bak files in source trees #902

@100yenadmin

TL;DR

gbrain frontmatter validate --fix and gbrain frontmatter generate --fix currently write <file>.bak beside every repaired markdown file, which can flood a user's source tree with hundreds of backup files during one setup run. Bulk frontmatter repair should still be safe and reversible, but the backups should live in GBrain-owned runtime storage, not inside the user's docs/workspace repo.

flowchart TD
  A["Agent runs\nfrontmatter --fix"] --> B["Hundreds of markdown files"]
  B --> C["Current behavior:\nwrite file.md.bak beside source"]
  C --> D["Workspace pollution\n.gitignore surprises\nagent/customer confusion"]

  B --> E["Desired behavior:\ncopy original into ~/.gbrain/backups/frontmatter/<run>/<source-hash>/..."]
  E --> F["Source tree stays clean\nrollback remains possible"]

Why This Matters

Agents often run setup or migration commands across broad document trees. At that scale, sidecar .bak files are not just cosmetic noise:

  • they double the apparent document count for users and agents
  • they can accidentally enter future imports, search scopes, or editor views
  • they create support confusion: users think GBrain duplicated their workspace
  • they make one repair pass look like hundreds of unrelated file changes

The command already has the right safety principle: never rewrite without a backup. The issue is only the backup location.

Related But Separate From #922

PR #922 fixes walker hygiene by skipping .git, node_modules, .obsidian, and similar directories. That is useful and compatible, but it does not solve the backup side effect. Even when the walker visits only legitimate markdown files, --fix can still create hundreds of sidecar .bak files.

Secondary Problem: Catch-All type: note

frontmatter generate --fix can also stamp arbitrary unknown files with the catch-all rule:

---
title: <filename>
type: note
---

If every unknown document becomes type: note, the type field stops carrying information. The safer behavior is:

  • infer meaningful types for known directories like docs/runbooks/, docs/guides/, docs/projects/, support/, etc.
  • skip unknown catch-all files by default
  • allow the old behavior only with an explicit flag such as --include-catch-all
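The rule above could be sketched as follows. This is a minimal illustration in Python, not GBrain's actual code: the directory prefixes come from this issue, but the type names (runbook, guide, project, support), the DIRECTORY_TYPES table, and the infer_type function are assumptions made for the example.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical mapping. The directory prefixes are the ones named in this
# issue; the type names on the right are assumed for illustration.
DIRECTORY_TYPES = {
    "docs/runbooks": "runbook",
    "docs/guides": "guide",
    "docs/projects": "project",
    "support": "support",
}


def infer_type(relative_path: str, include_catch_all: bool = False) -> Optional[str]:
    """Return a frontmatter type for a source-relative path, or None to skip the file."""
    parent_parts = PurePosixPath(relative_path).parent.parts
    # Check longer prefixes first so a more specific directory wins.
    for prefix, doc_type in sorted(DIRECTORY_TYPES.items(), key=lambda kv: -len(kv[0])):
        prefix_parts = PurePosixPath(prefix).parts
        # Compare whole path components, so "docs/runbooks-old" does not match.
        if parent_parts[: len(prefix_parts)] == prefix_parts:
            return doc_type
    # Unknown location: stamp "note" only when the caller opted in.
    return "note" if include_catch_all else None
```

With this shape, a file outside the known directories returns None and is skipped by default; passing include_catch_all=True (standing in for the proposed --include-catch-all flag) restores the old type: note behavior.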

Acceptance Criteria

  • frontmatter validate --fix writes backups under ~/.gbrain/backups/frontmatter/..., not beside source files.
  • frontmatter generate --fix uses the same centralized backup path.
  • The backup path preserves enough source-relative structure to recover the original file.
  • Existing safety remains: no in-place rewrite happens without first copying the original.
  • Unknown catch-all type: note generation is opt-in, not default.
  • CLI output tells users where backups went.
  • Tests assert that <file>.bak is not created in the source tree.
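A centralized backup helper meeting these criteria might look like the sketch below. It is written in Python for illustration only, and several details are assumptions rather than confirmed behavior: that GBRAIN_HOME, when set, replaces ~/.gbrain; that <source-hash> is a truncated SHA-256 of the resolved source root; and the function names backup_root, backup_original, and rewrite_with_backup.

```python
import hashlib
import os
import shutil
from pathlib import Path


def backup_root() -> Path:
    """Return the frontmatter backup directory inside the runtime home."""
    # Assumption: GBRAIN_HOME, when set, replaces ~/.gbrain.
    home = os.environ.get("GBRAIN_HOME")
    base = Path(home) if home else Path.home() / ".gbrain"
    return base / "backups" / "frontmatter"


def backup_original(source_root: Path, file_path: Path, run_id: str) -> Path:
    """Copy file_path into runtime storage, keeping its source-relative path."""
    source_root = source_root.resolve()
    file_path = file_path.resolve()
    # Raises ValueError if the file is not inside the source root.
    relative = file_path.relative_to(source_root)
    # Assumption: the source hash identifies the source tree by its absolute path.
    source_hash = hashlib.sha256(str(source_root).encode("utf-8")).hexdigest()[:12]
    destination = backup_root() / run_id / source_hash / relative
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(file_path, destination)  # copy2 also preserves timestamps
    return destination


def rewrite_with_backup(source_root: Path, file_path: Path, new_text: str, run_id: str) -> Path:
    """Back up the original first, then rewrite in place. Returns the backup path."""
    backup = backup_original(source_root, file_path, run_id)
    file_path.write_text(new_text, encoding="utf-8")
    return backup
```

Because the copy happens before the write and any copy failure raises, the in-place rewrite never runs without a backup. The returned path gives the CLI something concrete to print when telling users where backups went.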

Agent Implementation Notes

A small implementation should touch only the frontmatter/write path:

  • add a centralized backup helper near the existing brain writer/frontmatter utilities
  • call it from frontmatter validate --fix
  • call it from frontmatter generate --fix
  • keep --dry-run write-free
  • add tests that set GBRAIN_HOME to a temp dir and assert backups land under that temp runtime home

Do not remove the backup safety contract. Move it from user source trees into GBrain-managed runtime storage.
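The test described in these notes could take roughly the following shape. This is a hedged Python sketch: check_fix_contract and snapshot are hypothetical helpers, the fix callable stands in for whatever entry point the real command exposes, and fake_fix exists only to exercise the check. It also assumes GBRAIN_HOME points directly at the runtime home that contains backups/frontmatter/.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable, Dict


def snapshot(root: Path) -> Dict[str, bytes]:
    """Map every file under root (by relative POSIX path) to its bytes."""
    return {p.relative_to(root).as_posix(): p.read_bytes()
            for p in sorted(root.rglob("*")) if p.is_file()}


def check_fix_contract(fix: Callable[[Path, bool], None]) -> None:
    """Assert the backup contract for fix(source_root, dry_run)."""
    original = b"# no frontmatter\n"
    with tempfile.TemporaryDirectory() as tmp:
        source, home = Path(tmp) / "docs", Path(tmp) / "runtime-home"
        source.mkdir()
        home.mkdir()
        (source / "a.md").write_bytes(original)
        previous = os.environ.get("GBRAIN_HOME")
        os.environ["GBRAIN_HOME"] = str(home)
        try:
            before = snapshot(source)
            fix(source, True)
            # --dry-run must not write anywhere.
            assert snapshot(source) == before
            assert snapshot(home) == {}
            fix(source, False)
            # No sidecar backups in the source tree.
            assert not list(source.rglob("*.bak"))
            # The original bytes must exist under the temp runtime home.
            backups = snapshot(home)
            assert original in backups.values()
            assert all(name.startswith("backups/frontmatter/") for name in backups)
        finally:
            if previous is None:
                os.environ.pop("GBRAIN_HOME", None)
            else:
                os.environ["GBRAIN_HOME"] = previous


def fake_fix(source_root: Path, dry_run: bool) -> None:
    """Stand-in for the real command, used only to demonstrate the check."""
    home = Path(os.environ["GBRAIN_HOME"])
    for md in sorted(source_root.rglob("*.md")):
        if dry_run:
            continue
        dest = home / "backups" / "frontmatter" / "run-1" / md.relative_to(source_root)
        dest.parent.mkdir(parents=True, exist_ok=True)
        dest.write_bytes(md.read_bytes())  # back up before rewriting
        header = f"---\ntitle: {md.stem}\n---\n".encode("utf-8")
        md.write_bytes(header + md.read_bytes())


check_fix_contract(fake_fix)
```

Passing the command in as a callable keeps the contract check independent of how the fix is implemented, so the same assertions can cover both validate --fix and generate --fix.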
