Skip to content

feat(cli): add ov doctor diagnostic command#851

Merged
qin-ctx merged 3 commits intovolcengine:mainfrom
mvanhorn:osc/feat-ov-doctor-diagnostic-command
Mar 23, 2026
Merged

feat(cli): add ov doctor diagnostic command#851
qin-ctx merged 3 commits intovolcengine:mainfrom
mvanhorn:osc/feat-ov-doctor-diagnostic-command

Conversation

@mvanhorn
Copy link
Copy Markdown
Contributor

@mvanhorn mvanhorn commented Mar 21, 2026

Description

Adds ov doctor, a local diagnostic command that validates all OpenViking subsystems and reports pass/fail with actionable fix suggestions. Unlike ov health (which requires a running server), ov doctor works offline and checks prerequisites before the server even starts.

Why this matters: After installing OpenViking v0.2.8 via pip on Python 3.13, the native vector engine (PersistStore) was missing from the wheel. The only feedback was 50+ ERROR log lines. Filesystem operations (ls, tree, add_resource) all worked fine, but search and find silently returned nothing. ov doctor catches this immediately with a clear fix suggestion.

Related issues:

  • #420 - integration test coverage for all API commands
  • #693 - user couldn't run on VM without AVX instructions, no diagnostic told them why
  • #737 - SSE3 variant contained AVX-512 instructions, crash with no clear error

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)
  • Performance improvement
  • Test update

Changes Made

  • Created openviking_cli/doctor.py with 7 isolated checks: config, Python version, native engine, AGFS, embedding, VLM, disk space
  • Modified openviking_cli/rust_cli.py to intercept doctor as a Python-native subcommand before exec'ing the Rust binary
  • Added 17 unit tests in tests/cli/test_doctor.py

Testing

  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have tested this on the following platforms:
    • Linux
    • macOS
    • Windows

Screenshots

All checks passing (installed from modified branch, ran ov doctor via CLI entry point):

ov doctor - all pass

Failure state (native engine missing + no VLM API key):

ov doctor - failures

Each failing check shows a specific fix suggestion. Checks are isolated so one failure doesn't block others.

Checklist

  • My code follows the project's coding style
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

Additional Notes

This contribution was developed with AI assistance (Claude Code). The feature was proposed after dogfooding OpenViking locally and discovering the PersistStore gap firsthand.

Adds a new `ov doctor` command that validates all OpenViking subsystems
and reports actionable diagnostics without requiring a running server.

Checks: config file, Python version, native vector engine (PersistStore),
AGFS, embedding provider, VLM provider, and disk space. Each check is
isolated so one failure doesn't block others, and every failure includes
a specific fix suggestion.

This addresses a real pain point: when the native engine is missing from
a pip wheel (e.g., Python 3.13), the only feedback is 50+ ERROR log
lines with no actionable guidance. `ov doctor` catches this immediately:

  Native Engine: FAIL  No compatible engine variant
    Fix: pip install openviking --upgrade --force-reinstall
    Alt: Use vectordb.backend = "volcengine" instead of "local"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

Failed to generate code suggestions for PR

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Collaborator

@qin-ctx qin-ctx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ov doctor is a useful addition addressing real user pain points (#693, #737). The isolated check design and test coverage are solid.

Main concern: config resolution logic is reimplemented instead of reusing the existing resolve_config_path() from openviking_cli/utils/config/config_loader.py. See inline comments for details.

_CONFIG_SEARCH_PATHS = [
Path(os.environ.get("OPENVIKING_CONFIG_FILE", "")),
Path.home() / ".openviking" / "ov.conf",
Path("/etc/openviking/ov.conf"),
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Design] (blocking)

_CONFIG_SEARCH_PATHS and _find_config() reimplement the four-level config resolution chain that already exists in openviking_cli/utils/config/config_loader.py:resolve_config_path(). This creates two sources of truth for config file discovery — if the resolution chain changes (e.g., a new search path is added), this code won't be updated.

Additionally, _CONFIG_SEARCH_PATHS is evaluated at module import time, so the OPENVIKING_CONFIG_FILE env var value is captured once at import and won't reflect later changes. The existing resolve_config_path() evaluates the env var at call time, avoiding this issue.

Suggested fix:

from openviking_cli.utils.config.config_loader import resolve_config_path
from openviking_cli.utils.config.consts import OPENVIKING_CONFIG_ENV, DEFAULT_OV_CONF

def _find_config() -> Optional[Path]:
    return resolve_config_path(None, OPENVIKING_CONFIG_ENV, DEFAULT_OV_CONF)

False,
f"{provider}/{model} (no API key)",
"Set embedding.dense.api_key in ov.conf or OPENAI_API_KEY env var",
)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Design] (non-blocking)

Both check_embedding() and check_vlm() (line 193) hardcode OPENAI_API_KEY as the fallback env var for all providers. If a user configures a non-OpenAI provider (e.g., volcengine) and sets the provider-specific env var, ov doctor will incorrectly report "no API key".

Consider either:

  • Looking up provider-specific env vars (a mapping of provider → env var name), or
  • Only checking for the api_key field in the config, without falling back to env vars (simpler and avoids false negatives)

2. Wheel 自带:{package_dir}/openviking/bin/ov
3. PATH 查找:系统全局安装的 ov
"""
# 0. Python-native subcommands (no Rust binary needed)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] (non-blocking)

The docstring above was correctly renumbered (0=doctor, 1=dev, 2=wheel, 3=PATH), but the existing inline comments below (# 0. 检查开发环境, # 1. 检查 Wheel 自带, # 2. 检查 PATH) still use the old numbering. Consider renumbering them to # 1., # 2., # 3. for consistency.

…KEY fallback

Address review feedback from @qin-ctx:
- Replace _CONFIG_SEARCH_PATHS and _find_config() with resolve_config_path()
  from config_loader.py to avoid two sources of truth for config discovery
- Remove hardcoded OPENAI_API_KEY env var fallback from embedding and VLM
  checks - only check the api_key field in config
- Renumber inline comments in rust_cli.py (1/2/3 matching docstring)
@mvanhorn
Copy link
Copy Markdown
Contributor Author

Fixed in a74982f:

  • Replaced _CONFIG_SEARCH_PATHS / _find_config() with resolve_config_path() from config_loader.py - single source of truth for the four-level config chain
  • Dropped the OPENAI_API_KEY env var fallback from both check_embedding() and check_vlm() - now only checks the api_key field in config
  • Renumbered inline comments in rust_cli.py to match the updated docstring

@qin-ctx qin-ctx merged commit 50e1ff9 into volcengine:main Mar 23, 2026
5 of 6 checks passed
@github-project-automation github-project-automation bot moved this from Backlog to Done in OpenViking project Mar 23, 2026
@mvanhorn
Copy link
Copy Markdown
Contributor Author

Thanks for the review and merge!

@mvanhorn
Copy link
Copy Markdown
Contributor Author

Thanks for the thorough review, @qin-ctx - the suggestions on the diagnostic output format improved the UX.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants