[Feature]: Support post-generation adjustment of .overview.md #468

@dawsongzhao0523

Problem Statement

After the model generates .overview.md and .abstract.md for resources, there is no practical way to adjust or correct them. Typical situations:

  • The generated content is inaccurate or suboptimal
  • Chinese output is needed, but the prompts default to English
  • Specific directory/file summaries need fine-grained control

Current limitations:

  1. Manual editing: Editing .overview.md directly on disk (possible when using Local AGFS) leaves the old embeddings in the vectordb, so search results may no longer match the edited content.
  2. Regeneration: There is no dedicated CLI/API for regenerating the overview of a specific URI. The only workaround is re-running ov add-resource on the entire directory, which re-parses everything and is expensive.
  3. No write API: No HTTP or CLI interface is exposed to overwrite .overview.md and trigger re-embedding.
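
To make the first limitation concrete, here is a minimal sketch of how a stale embedding could be detected. It assumes the index keeps a hash of the text each vector was built from. `IndexEntry`, `content_hash`, and `is_stale` are hypothetical names for illustration, not part of OpenViking.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class IndexEntry:
    """Hypothetical vectordb record: a vector plus a hash of its source text."""
    uri: str
    source_hash: str
    vector: list[float]


def content_hash(text: str) -> str:
    """SHA-256 of the UTF-8 encoded text, as a hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def is_stale(entry: IndexEntry, current_text: str) -> bool:
    """True when the file content no longer matches what was embedded."""
    return entry.source_hash != content_hash(current_text)


original = "Overview: API reference for the project."
entry = IndexEntry(
    uri="viking://resources/my_project/doc_dir/.overview.md",
    source_hash=content_hash(original),
    vector=[0.1, 0.2],  # placeholder values, not a real embedding
)

edited = "概览：项目的 API 参考文档。"  # manual edit on disk
print(is_stale(entry, original))  # False
print(is_stale(entry, edited))    # True
```

A check like this would let a re-embed command skip files whose content has not changed.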

Proposed Solution

  1. Manual edit + re-embed sync

    • Provide a way to trigger re-embedding after editing .overview.md / .abstract.md.
    • Example: ov re-embed viking://resources/xxx/.overview.md or ov re-embed viking://resources/xxx/ to re-vectorize the modified content.
  2. Targeted regeneration

    • Add a CLI command such as ov regenerate-overview <uri> that only re-runs semantic generation (L0/L1) and re-embedding for the specified URI, without re-parsing the full resource.
  3. Optional: write API

    • Expose an API to overwrite .overview.md (or .abstract.md) and optionally trigger re-embedding in one step.
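
The manual-edit and write-API options could look roughly like the sketch below: one call that overwrites the file and, optionally, refreshes its vector, plus a separate re-embed call for files edited out of band. `InMemoryStore` and `toy_embed` are stand-ins invented for this example; they do not reflect VikingFS or the real embedding pipeline.

```python
from typing import Callable


class InMemoryStore:
    """Stand-in for file storage plus a vector index, for illustration only."""

    def __init__(self, embed: Callable[[str], list[float]]) -> None:
        self.files: dict[str, str] = {}
        self.vectors: dict[str, list[float]] = {}
        self._embed = embed

    def write(self, uri: str, text: str, reembed: bool = True) -> None:
        """Overwrite the file; refresh its vector too unless reembed is False."""
        self.files[uri] = text
        if reembed:
            self.reembed(uri)

    def reembed(self, uri: str) -> None:
        """Recompute the vector from whatever is currently stored at uri."""
        self.vectors[uri] = self._embed(self.files[uri])


def toy_embed(text: str) -> list[float]:
    # Placeholder: text length and vowel count. A real system would call a model.
    return [float(len(text)), float(sum(text.count(v) for v in "aeiou"))]


store = InMemoryStore(toy_embed)
uri = "viking://resources/my_project/doc_dir/.overview.md"

store.write(uri, "Draft overview")                          # write and embed in one step
store.write(uri, "Corrected overview text", reembed=False)  # like a manual edit: index is stale
store.reembed(uri)                                          # explicit sync, like `ov re-embed`
```

Keeping `reembed` as a flag lets callers batch several edits and sync once at the end.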

Alternatives Considered

  • Re-add via ov add-resource: works but re-processes the entire directory; acceptable for small resources but expensive for large ones.
  • Direct file edit without re-embed: feasible with Local AGFS, but vectordb will be out of sync.

Feature Area

  • Model Integration (regeneration via VLM)
  • Storage/VectorDB (re-embedding)
  • CLI Tools (new commands)
  • Filesystem Operations (write/overview overwrite)

Use Case

  1. User adds a Chinese documentation directory; the system outputs an English overview. User wants to correct or localize the overview.
  2. For critical directories, user wants to manually refine the overview content for better accuracy.
  3. User needs to fix model hallucination or suboptimal summaries without re-importing the entire resource.

Example API (Optional)

# Targeted regeneration (re-run overview generation + re-embed for a directory)
ov regenerate-overview viking://resources/my_project/doc_dir

# Re-embed only (update vectordb after manual edit)
ov re-embed viking://resources/my_project/doc_dir/

Additional Context

  • Overview generation uses prompts in openviking/prompts/templates/semantic/overview_generation.yaml and parsing/context_generation.yaml, which do not specify output language.
  • VikingFS.write_file() exists internally but is not exposed via HTTP or CLI.
  • Semantic generation + vectorization is done by SemanticProcessor via _generate_overview() and _vectorize_directory_simple().
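
As a rough illustration of targeted regeneration, the sketch below rebuilds the overview for one directory from its already-stored children and re-embeds only that overview. It does not mirror SemanticProcessor's internals: `regenerate_overview`, the dict-based storage, and the `summarize`/`embed` callables are all assumptions made for the example.

```python
from typing import Callable


def regenerate_overview(
    dir_uri: str,
    files: dict[str, str],
    index: dict[str, list[float]],
    summarize: Callable[[list[str]], str],
    embed: Callable[[str], list[float]],
) -> str:
    """Rebuild .overview.md for one directory without touching other URIs."""
    prefix = dir_uri.rstrip("/") + "/"
    overview_uri = prefix + ".overview.md"
    # Direct children only; hidden summary files are excluded from the input.
    children = sorted(
        uri for uri in files
        if uri.startswith(prefix)
        and "/" not in uri[len(prefix):]
        and not uri[len(prefix):].startswith(".")
    )
    overview = summarize([files[uri] for uri in children])
    files[overview_uri] = overview          # overwrite the stored overview
    index[overview_uri] = embed(overview)   # re-embed only this file
    return overview_uri


files = {
    "viking://resources/my_project/doc_dir/a.md": "Install guide",
    "viking://resources/my_project/doc_dir/b.md": "API reference",
    "viking://resources/my_project/other/c.md": "Unrelated",
}
index: dict[str, list[float]] = {}

updated = regenerate_overview(
    "viking://resources/my_project/doc_dir",
    files,
    index,
    summarize=lambda texts: " | ".join(texts),  # stand-in for the model call
    embed=lambda text: [float(len(text))],      # stand-in for the embedder
)
print(files[updated])  # Install guide | API reference
```

Passing the summarizer and embedder in as callables keeps the sketch independent of any particular model or vectordb.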

Status: Done
