feat: Build ecosystem documentation audit tool #567

Description

@kcenon

What

Build a Python CLI tool that validates documentation quality across ecosystem
projects. The tool should check frontmatter, links, SSOT registry consistency,
and README structure compliance.

Why

Manual documentation quality checks do not scale across 8 projects with ~578
documents. An automated audit tool keeps documentation standards enforced as
projects evolve and catches drift before it accumulates.

Where

common_system/
└── tools/
    └── doc-audit/
        ├── __init__.py
        ├── __main__.py          # CLI entry point
        ├── config.py            # Constants, patterns, field definitions
        ├── models.py            # Finding, CheckResult dataclasses
        ├── checker_metadata.py  # YAML frontmatter validation
        ├── checker_links.py     # Internal link validity
        ├── checker_ssot.py      # SSOT registry consistency
        ├── checker_structure.py # README standard section compliance
        ├── report.py            # Markdown/JSON report generation
        └── utils.py             # File discovery, path resolution

How

Checker Specifications

1. checker_metadata.py — Frontmatter Validation

  • Verify that every .md file under docs/ has YAML frontmatter
  • Validate required fields: doc_id, doc_title, doc_version, doc_date, doc_status, project, category
  • Validate doc_id format: {PREFIX}-{CATEGORY}-{NNN}
  • Validate semver format for doc_version
  • Validate ISO 8601 for doc_date
  • Validate doc_status enum values
  • Flag duplicate doc_id values
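
The field checks above could be sketched roughly as follows. This is a minimal illustration that operates on frontmatter already parsed into a dict (for example by PyYAML), not a settled design. The `doc_id` regex assumes uppercase letters for the prefix and category and exactly three digits, the semver check covers only `MAJOR.MINOR.PATCH`, and the `ALLOWED_STATUSES` values are placeholders, since this issue does not list the real enum.

```python
import re
from collections import defaultdict
from datetime import date

REQUIRED_FIELDS = (
    "doc_id", "doc_title", "doc_version", "doc_date",
    "doc_status", "project", "category",
)
# Assumed shape of {PREFIX}-{CATEGORY}-{NNN}: uppercase parts, three digits.
DOC_ID_RE = re.compile(r"^[A-Z]+-[A-Z]+-\d{3}$")
# Core semver only (MAJOR.MINOR.PATCH); pre-release/build tags are not handled.
SEMVER_RE = re.compile(r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)$")
# Hypothetical status values; replace with the project's real enum.
ALLOWED_STATUSES = {"Draft", "Released", "Deprecated"}


def validate_frontmatter(meta: dict) -> list[str]:
    """Return a list of human-readable problems for one document's metadata."""
    problems = [f"missing required field: {name}"
                for name in REQUIRED_FIELDS if name not in meta]
    if "doc_id" in meta and not DOC_ID_RE.match(str(meta["doc_id"])):
        problems.append(f"invalid doc_id format: {meta['doc_id']}")
    if "doc_version" in meta and not SEMVER_RE.match(str(meta["doc_version"])):
        problems.append(f"doc_version is not semver: {meta['doc_version']}")
    if "doc_date" in meta:
        try:
            # str() also covers YAML loaders that return a date object.
            date.fromisoformat(str(meta["doc_date"]))
        except ValueError:
            problems.append(f"doc_date is not an ISO 8601 date: {meta['doc_date']}")
    if "doc_status" in meta and meta["doc_status"] not in ALLOWED_STATUSES:
        problems.append(f"unknown doc_status: {meta['doc_status']}")
    return problems


def find_duplicate_ids(ids_by_file: dict[str, str]) -> dict[str, list[str]]:
    """Map each doc_id used by more than one file to the sorted file list."""
    files_by_id = defaultdict(list)
    for path, doc_id in ids_by_file.items():
        files_by_id[doc_id].append(path)
    return {doc_id: sorted(paths)
            for doc_id, paths in files_by_id.items() if len(paths) > 1}
```

Keeping validation separate from YAML parsing means the rules can be unit-tested without touching the filesystem.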

2. checker_links.py — Link Validation

  • Find all markdown links [text](url) in docs/ files
  • Validate relative paths resolve to existing files
  • Validate anchor references (#section-name) exist in target
  • Report broken links with file and line number
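
One possible shape for the link checker, as a sketch under stated assumptions: links are matched with a simple regex (so links inside code fences are not skipped), any URL with a scheme such as `https:` is treated as external and ignored, and `slugify` only approximates GitHub-style heading anchors. The function names are illustrative, not part of the spec.

```python
import re
from pathlib import Path

LINK_RE = re.compile(r"\[([^\]]*)\]\(([^)\s]+)\)")
HEADING_RE = re.compile(r"^#{1,6}\s+(.*?)\s*$")
SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def slugify(heading: str) -> str:
    """Approximate a GitHub-style anchor: lowercase, drop punctuation, hyphenate."""
    text = re.sub(r"[^\w\s-]", "", heading.strip().lower())
    return re.sub(r"\s", "-", text)


def anchors_in(path: Path) -> set[str]:
    """Collect the anchors generated by the ATX headings of a markdown file."""
    anchors = set()
    for line in path.read_text(encoding="utf-8").splitlines():
        match = HEADING_RE.match(line)
        if match:
            anchors.add(slugify(match.group(1)))
    return anchors


def check_links(md_file: Path) -> list[tuple[int, str, str]]:
    """Return (line_number, url, reason) for each broken internal link."""
    broken = []
    lines = md_file.read_text(encoding="utf-8").splitlines()
    for lineno, line in enumerate(lines, start=1):
        for match in LINK_RE.finditer(line):
            url = match.group(2)
            if SCHEME_RE.match(url):
                continue  # external link (http:, mailto:, ...): out of scope
            target, _, anchor = url.partition("#")
            # A bare "#anchor" refers to the current file.
            target_path = md_file.parent / target if target else md_file
            if not target_path.is_file():
                broken.append((lineno, url, "target file not found"))
            elif (anchor and target_path.suffix == ".md"
                  and anchor not in anchors_in(target_path)):
                broken.append((lineno, url, "anchor not found"))
    return broken
```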

3. checker_ssot.py — Registry Consistency

  • Parse SSOT registry table from docs/README.md
  • Verify each listed file exists
  • Verify each doc_id in registry matches file frontmatter
  • Flag documents in docs/ not listed in registry
  • Flag registry entries pointing to non-existent files
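
The registry comparison might look something like the sketch below. The table layout is an assumption: each row is taken to start with a `doc_id` cell and to contain a markdown link to the document somewhere after it. The real registry in docs/README.md may be laid out differently, in which case `ROW_RE` would need adjusting. `actual` stands for a mapping, built elsewhere, from each existing file's relative path to the `doc_id` in its frontmatter.

```python
import re

# Assumed row shape: | THR-API-001 | [Title](relative/path.md) | ... |
ROW_RE = re.compile(r"^\|\s*([A-Z]+-[A-Z]+-\d{3})\s*\|.*?\]\(([^)\s]+)\)")


def parse_registry(readme_text: str) -> dict[str, str]:
    """Map each registered relative path to the doc_id listed beside it."""
    registry = {}
    for line in readme_text.splitlines():
        match = ROW_RE.match(line.strip())
        if match:
            registry[match.group(2)] = match.group(1)
    return registry


def compare_registry(registry: dict[str, str], actual: dict[str, str]) -> list[str]:
    """Compare the registry against {path: frontmatter doc_id} of existing files."""
    findings = []
    for path, doc_id in sorted(registry.items()):
        if path not in actual:
            findings.append(f"registry entry points to missing file: {path}")
        elif actual[path] != doc_id:
            findings.append(
                f"doc_id mismatch for {path}: "
                f"registry {doc_id}, frontmatter {actual[path]}"
            )
    for path in sorted(set(actual) - set(registry)):
        findings.append(f"document not listed in registry: {path}")
    return findings
```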

4. checker_structure.py — README Compliance

CLI Interface

# Full audit on a single project
python -m doc_audit /path/to/thread_system

# Quick pass/fail check (for CI)
python -m doc_audit /path/to/thread_system --quick

# Ecosystem-wide audit
python -m doc_audit /path/to/Sources --ecosystem

# Output formats
python -m doc_audit /path/to/thread_system --format json
python -m doc_audit /path/to/thread_system --format markdown
python -m doc_audit /path/to/thread_system --format both
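
The commands above map naturally onto `argparse` from the standard library. The following is only a sketch: the help strings, the `output_format` destination name, and the `markdown` default are assumptions, since the issue does not say which format applies when `--format` is omitted.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags listed in this issue."""
    parser = argparse.ArgumentParser(
        prog="doc_audit",
        description="Audit documentation quality for one project or an ecosystem.",
    )
    parser.add_argument(
        "path",
        help="project root, or the parent directory of all projects with --ecosystem",
    )
    parser.add_argument(
        "--quick", action="store_true",
        help="pass/fail check intended for CI",
    )
    parser.add_argument(
        "--ecosystem", action="store_true",
        help="treat PATH as a directory containing several projects",
    )
    parser.add_argument(
        "--format", dest="output_format",
        choices=("json", "markdown", "both"),
        default="markdown",  # assumed default, not specified in the issue
        help="report output format",
    )
    return parser
```

Parsing `["/path/to/thread_system", "--quick", "--format", "json"]` with this parser yields `quick=True`, `ecosystem=False`, and `output_format="json"`.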

Report Output

# Documentation Audit Report — thread_system

## Summary
| Checker | Findings | Critical | Warning | Info |
|---------|----------|----------|---------|------|
| Metadata | 3 | 0 | 2 | 1 |
| Links | 1 | 1 | 0 | 0 |
| SSOT | 2 | 0 | 1 | 1 |
| Structure | 0 | 0 | 0 | 0 |
| **Total** | **6** | **1** | **3** | **2** |

## Findings
...
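
A summary table like the one above could be produced from a flat list of findings. In this sketch the `Finding` fields and the lowercase severity names are hypothetical; models.py may define them differently.

```python
from collections import Counter
from dataclasses import dataclass

SEVERITIES = ("critical", "warning", "info")
CHECKERS = ("Metadata", "Links", "SSOT", "Structure")


@dataclass(frozen=True)
class Finding:
    """One audit result. Field names are illustrative, not taken from the spec."""
    checker: str
    severity: str
    message: str
    file: str = ""
    line: int = 0


def render_summary(findings: list[Finding]) -> str:
    """Render the per-checker summary table as markdown."""
    counts = Counter((f.checker, f.severity) for f in findings)
    rows = [
        "| Checker | Findings | Critical | Warning | Info |",
        "|---------|----------|----------|---------|------|",
    ]
    totals = [0, 0, 0]
    for checker in CHECKERS:
        # Counter returns 0 for (checker, severity) pairs with no findings.
        per = [counts[(checker, severity)] for severity in SEVERITIES]
        totals = [t + n for t, n in zip(totals, per)]
        rows.append(f"| {checker} | {sum(per)} | {per[0]} | {per[1]} | {per[2]} |")
    rows.append(
        f"| **Total** | **{sum(totals)}** | **{totals[0]}** "
        f"| **{totals[1]}** | **{totals[2]}** |"
    )
    return "\n".join(rows)
```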

Acceptance Criteria

  • tools/doc-audit/ package exists in common_system
  • All 4 checkers implemented and tested
  • CLI supports --quick, --format, --ecosystem flags
  • Markdown and JSON report generation works
  • Tool runs successfully on all 8 ecosystem projects
  • Exit code 1 on critical findings, 0 otherwise (for CI integration)
  • No dependencies beyond the Python 3.10+ standard library and PyYAML
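
The exit-code criterion can be expressed as a small pure function. This sketch assumes severities are plain lowercase strings; the real tool may well use an enum.

```python
from collections.abc import Iterable


def exit_code(severities: Iterable[str]) -> int:
    """Return 1 if any finding is critical, 0 otherwise (the CI contract)."""
    return 1 if any(severity == "critical" for severity in severities) else 0
```

The CLI entry point would then end with something like `sys.exit(exit_code(f.severity for f in findings))`, where `findings` is whatever collection the checkers produce.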

Related
