feat: validate Codex project_doc_fallback_filenames semantics#569
Pull request overview
Adds a new Codex validation rule (CDX-006) to enforce semantic correctness of .codex/config.toml’s project_doc_fallback_filenames, and propagates the new rule through rule registries, docs, tests, and i18n so parity checks stay consistent.
Changes:
- Implement CDX-006 parsing + validation for project_doc_fallback_filenames (type, non-strings, empties, duplicates, path-like values).
- Add/propagate rule metadata across knowledge-base + agnix-rules rule registries and update global rule counts (230 → 231).
- Add tests and update locale/message catalogs + documentation references to include the new rule/count.
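
The string-level checks listed above (empties, duplicates, path-like values) can be sketched as a small pure function. This is only an illustration under stated assumptions: `FallbackIssues` and `check_fallback_filenames` are hypothetical names, not the PR's actual types, and the type/non-string checks are left out because they depend on the TOML parser.

```rust
use std::collections::HashSet;

/// Hypothetical summary of problems found in a fallback-filename list.
#[derive(Debug, Default, PartialEq)]
pub struct FallbackIssues {
    /// Zero-based positions of empty or whitespace-only entries.
    pub empty_indices: Vec<usize>,
    /// Trimmed values that occur more than once (sorted, deduplicated).
    pub duplicates: Vec<String>,
    /// Trimmed values that contain a path separator (sorted, deduplicated).
    pub path_like: Vec<String>,
}

/// Assumed semantics: entries are trimmed before comparison, and a value is
/// "path-like" when it contains '/' or '\\'.
pub fn check_fallback_filenames(names: &[&str]) -> FallbackIssues {
    let mut issues = FallbackIssues::default();
    let mut seen: HashSet<&str> = HashSet::new();

    for (idx, raw) in names.iter().enumerate() {
        let name = raw.trim();
        if name.is_empty() {
            issues.empty_indices.push(idx);
            continue;
        }
        // `insert` returns false when the value was already present.
        if !seen.insert(name) {
            issues.duplicates.push(name.to_string());
        }
        if name.contains('/') || name.contains('\\') {
            issues.path_like.push(name.to_string());
        }
    }

    // Report each offending value once, in a stable order.
    issues.duplicates.sort();
    issues.duplicates.dedup();
    issues.path_like.sort();
    issues.path_like.dedup();
    issues
}
```

Calling `check_fallback_filenames(&["AGENTS.md", " ", "docs/GUIDE.md", "AGENTS.md"])` would report index 1 as empty, `AGENTS.md` as a duplicate, and `docs/GUIDE.md` as path-like.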
Reviewed changes
Copilot reviewed 26 out of 26 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| locales/zh-CN.yml | Adds zh-CN messages for CDX-006. |
| locales/es.yml | Adds Spanish messages for CDX-006. |
| locales/en.yml | Adds English messages for CDX-006. |
| knowledge-base/rules.json | Adds CDX-006 rule entry, bumps total rule count/date, updates category counts. |
| knowledge-base/VALIDATION-RULES.md | Documents CDX-006 and updates rule-count tables/coverage totals. |
| knowledge-base/README.md | Updates displayed rule count to 231. |
| knowledge-base/INDEX.md | Updates displayed rule count to 231 in multiple index locations. |
| editors/vscode/README.md | Updates VS Code extension docs to reflect 231 rules. |
| crates/agnix-rules/rules.json | Mirrors knowledge-base rule additions/count updates for parity. |
| crates/agnix-mcp/tests/mcp_tests.rs | Updates MCP test asserting total rule count (231). |
| crates/agnix-lsp/locales/zh-CN.yml | Adds zh-CN messages for CDX-006 (LSP). |
| crates/agnix-lsp/locales/es.yml | Adds Spanish messages for CDX-006 (LSP). |
| crates/agnix-lsp/locales/en.yml | Adds English messages for CDX-006 (LSP). |
| crates/agnix-lsp/README.md | Updates LSP README rule count (231). |
| crates/agnix-core/src/schemas/codex.rs | Extends Codex TOML parsing to extract/track project_doc_fallback_filenames issues and adds unit tests. |
| crates/agnix-core/src/rules/codex.rs | Implements CDX-006 diagnostics + tests; adds helpers for “suspicious” entries. |
| crates/agnix-core/locales/zh-CN.yml | Adds core zh-CN messages for CDX-006. |
| crates/agnix-core/locales/es.yml | Adds core Spanish messages for CDX-006. |
| crates/agnix-core/locales/en.yml | Adds core English messages for CDX-006. |
| crates/agnix-cli/locales/zh-CN.yml | Adds CLI zh-CN messages for CDX-006. |
| crates/agnix-cli/locales/es.yml | Adds CLI Spanish messages for CDX-006. |
| crates/agnix-cli/locales/en.yml | Adds CLI English messages for CDX-006. |
| SPEC.md | Updates rule count references (231; Codex CLI 7 rules). |
| CLAUDE.md | Updates rule count references to 231. |
| CHANGELOG.md | Updates rule count references to 231. |
| AGENTS.md | Updates rule count references to 231. |

    bytes.len() >= 3
        && bytes[0].is_ascii_alphabetic()
        && bytes[1] == b':'
        && (bytes[2] == b'\\' || bytes[2] == b'/')

`is_windows_absolute_path()` is currently redundant: any string for which it returns true also contains `/` or `\`, and both are already checked in `is_suspicious_fallback_filename()`. Consider either removing the helper or broadening the Windows detection to catch drive-relative forms like `C:foo`, which contain no separator and so are not flagged by the current checks (`C:foo\bar` is already caught by the backslash check). That way the extra logic is actually meaningful.

Suggested change:

    - bytes.len() >= 3
    -     && bytes[0].is_ascii_alphabetic()
    -     && bytes[1] == b':'
    -     && (bytes[2] == b'\\' || bytes[2] == b'/')
    + bytes.len() >= 2
    +     && bytes[0].is_ascii_alphabetic()
    +     && bytes[1] == b':'

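
A minimal sketch of why the broadened check matters. The function names `has_drive_prefix` and `is_suspicious_sketch` are hypothetical, and the separator check is an assumption based on the review comment's description of `is_suspicious_fallback_filename()`, not the PR's actual code.

```rust
/// Returns true when the value starts with a Windows drive prefix such as
/// `C:`, covering absolute (`C:\foo`) and drive-relative (`C:foo`) forms.
pub fn has_drive_prefix(value: &str) -> bool {
    let bytes = value.as_bytes();
    bytes.len() >= 2 && bytes[0].is_ascii_alphabetic() && bytes[1] == b':'
}

/// Assumed shape of the combined check: a separator anywhere in the value,
/// or a drive prefix at the start.
pub fn is_suspicious_sketch(value: &str) -> bool {
    value.contains('/') || value.contains('\\') || has_drive_prefix(value)
}
```

With the three-byte version, `C:foo` passes unflagged because it has no separator; with the two-byte prefix check it is caught. One trade-off is that any name beginning with a letter followed by a colon (for example `a:notes.md`) is also flagged.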

    /// Zero-based indexes of non-string entries in `project_doc_fallback_filenames`
    pub project_doc_fallback_filename_non_string_indices: Vec<usize>,
    /// Zero-based indexes of empty/whitespace-only entries in `project_doc_fallback_filenames`

project_doc_fallback_filename_*_indices are documented as zero-based, but the diagnostics intentionally display them as 1-based (idx + 1). This mismatch is easy to trip over in future refactors/tests; either store 1-based indices in ParsedCodexConfig or update the field docs to explicitly note that diagnostics are rendered 1-based.

Suggested change:

    - /// Zero-based indexes of non-string entries in `project_doc_fallback_filenames`
    - pub project_doc_fallback_filename_non_string_indices: Vec<usize>,
    - /// Zero-based indexes of empty/whitespace-only entries in `project_doc_fallback_filenames`
    + /// Zero-based indexes of non-string entries in `project_doc_fallback_filenames`.
    + /// Note: these are stored as 0-based internally; user-facing diagnostics render them as 1-based (`idx + 1`).
    + pub project_doc_fallback_filename_non_string_indices: Vec<usize>,
    + /// Zero-based indexes of empty/whitespace-only entries in `project_doc_fallback_filenames`.
    + /// Note: these are stored as 0-based internally; user-facing diagnostics render them as 1-based (`idx + 1`).

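
One way to keep the 0-based/1-based split hard to trip over is to route every user-facing position through a single helper. The sketch below assumes that approach; `display_position`, `describe_non_string_entries`, and the message wording are hypothetical and do not come from the PR.

```rust
/// Converts a stored zero-based index into the one-based position shown to users.
pub fn display_position(idx: usize) -> usize {
    idx + 1
}

/// Builds one illustrative message per offending entry from zero-based indices.
pub fn describe_non_string_entries(indices: &[usize]) -> Vec<String> {
    indices
        .iter()
        .map(|&idx| format!("entry {} is not a string", display_position(idx)))
        .collect()
}
```

Given stored indices `[0, 2]`, the messages refer to entries 1 and 3, so tests can assert on either representation without guessing which one a field holds.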

    duplicates.sort();
    duplicates.dedup();
    for filename in duplicates {
        diagnostics.push(
            Diagnostic::warning(
                path.to_path_buf(),
                line,
                0,
                "CDX-006",
                t!("rules.cdx_006.duplicate", value = filename.as_str()),
            )
            .with_suggestion(t!("rules.cdx_006.suggestion")),
        );
    }

    let mut suspicious: Vec<String> = filenames
        .iter()
        .map(|name| name.trim())
        .filter(|name| is_suspicious_fallback_filename(name))
        .map(|name| name.to_string())
        .collect();
    suspicious.sort();
    suspicious.dedup();

    for filename in suspicious {
        diagnostics.push(
            Diagnostic::warning(
                path.to_path_buf(),
                line,
                0,
                "CDX-006",
                t!("rules.cdx_006.suspicious", value = filename.as_str()),
            )
            .with_suggestion(t!("rules.cdx_006.suggestion")),
        );
    }

CDX-006 is documented as a HIGH-severity rule with a MUST requirement for uniqueness/bare filenames, but duplicates and path-like entries currently emit Diagnostic::warning. Either update the rule documentation/severity to reflect that these are warnings, or upgrade these diagnostics to errors so runtime behavior matches the documented MUST semantics.
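
The mismatch described above could be guarded by a small consistency check between the documented level and the emitted severity. This is a sketch only: the `Severity` enum, the `"HIGH"` to error mapping, and both function names are assumptions for illustration, not agnix's actual types or policy.

```rust
/// Hypothetical severity of an emitted diagnostic.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Warning,
    Error,
}

/// Assumed policy: a rule documented as HIGH (a MUST requirement) is expected
/// to emit errors; every other level is expected to emit warnings.
pub fn expected_severity(documented_level: &str) -> Severity {
    if documented_level.eq_ignore_ascii_case("HIGH") {
        Severity::Error
    } else {
        Severity::Warning
    }
}

/// Returns true when runtime behavior agrees with the documentation.
pub fn severity_matches_docs(documented_level: &str, emitted: Severity) -> bool {
    expected_severity(documented_level) == emitted
}
```

Under this assumed policy, a HIGH rule that emits a warning fails the check, which is the situation the comment describes; either side can be changed to make it pass.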
Summary
ules.json, VALIDATION-RULES.md, locale files, rule counts)
Validation
Closes #564