feat: validate Codex project_doc_fallback_filenames semantics #569

Merged
avifenesh merged 2 commits into main from issue-564-codex-fallback
Feb 26, 2026

@avifenesh
Collaborator

Summary

  • add new Codex rule CDX-006 for .codex/config.toml project_doc_fallback_filenames
  • validate field semantics: array type, non-string entries, empty entries, duplicates, and path-like values
  • keep rule/docs/locales parity in sync (rules.json, VALIDATION-RULES.md, locale files, rule counts)
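
The checks listed above can be pictured with a minimal, self-contained sketch. It is not the agnix implementation: the `Value` enum is a hypothetical stand-in for a parsed TOML value, and `Issue`, `looks_path_like`, and `check_fallback_filenames` are illustrative names with an assumed path-like heuristic (any `/` or `\`).

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a parsed TOML value (not the real `toml` crate type).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Str(String),
    Int(i64),
    Array(Vec<Value>),
}

/// Hypothetical issue kinds mirroring the checks named in the summary.
#[derive(Debug, PartialEq)]
enum Issue {
    NotAnArray,
    NonString(usize),
    Empty(usize),
    Duplicate(String),
    PathLike(String),
}

/// Assumed heuristic: a bare filename contains no path separators.
fn looks_path_like(name: &str) -> bool {
    name.contains('/') || name.contains('\\')
}

/// Walk the field value once and collect every issue found.
fn check_fallback_filenames(value: &Value) -> Vec<Issue> {
    let entries = match value {
        Value::Array(entries) => entries,
        _ => return vec![Issue::NotAnArray],
    };

    let mut issues = Vec::new();
    let mut seen = HashSet::new();
    for (idx, entry) in entries.iter().enumerate() {
        // Non-string entries are reported by their zero-based index.
        let name = match entry {
            Value::Str(s) => s.trim(),
            _ => {
                issues.push(Issue::NonString(idx));
                continue;
            }
        };
        // Whitespace-only entries count as empty.
        if name.is_empty() {
            issues.push(Issue::Empty(idx));
            continue;
        }
        // `insert` returns false when the trimmed name was already seen.
        if !seen.insert(name.to_string()) {
            issues.push(Issue::Duplicate(name.to_string()));
        }
        if looks_path_like(name) {
            issues.push(Issue::PathLike(name.to_string()));
        }
    }
    issues
}
```

In this sketch a name that appears three times is reported twice; the code quoted later in the review avoids that by sorting and deduplicating the collected names before emitting diagnostics.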

Validation

  • python scripts/check-rule-counts.py
  • cargo test -p agnix-core rules::codex
  • cargo test -p agnix-core schemas::codex
  • cargo test -p agnix-mcp test_rules_count
  • cargo test -p agnix-rules parity

Closes #564

Copilot AI review requested due to automatic review settings February 26, 2026 11:46
Contributor

Copilot AI left a comment

Pull request overview

Adds a new Codex validation rule (CDX-006) to enforce semantic correctness of .codex/config.toml’s project_doc_fallback_filenames, and propagates the new rule through rule registries, docs, tests, and i18n so parity checks stay consistent.

Changes:

  • Implement CDX-006 parsing + validation for project_doc_fallback_filenames (type, non-strings, empties, duplicates, path-like values).
  • Add/propagate rule metadata across knowledge-base + agnix-rules rule registries and update global rule counts (230 → 231).
  • Add tests and update locale/message catalogs + documentation references to include the new rule/count.

Reviewed changes

Copilot reviewed 26 out of 26 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| locales/zh-CN.yml | Adds zh-CN messages for CDX-006. |
| locales/es.yml | Adds Spanish messages for CDX-006. |
| locales/en.yml | Adds English messages for CDX-006. |
| knowledge-base/rules.json | Adds CDX-006 rule entry, bumps total rule count/date, updates category counts. |
| knowledge-base/VALIDATION-RULES.md | Documents CDX-006 and updates rule-count tables/coverage totals. |
| knowledge-base/README.md | Updates displayed rule count to 231. |
| knowledge-base/INDEX.md | Updates displayed rule count to 231 in multiple index locations. |
| editors/vscode/README.md | Updates VS Code extension docs to reflect 231 rules. |
| crates/agnix-rules/rules.json | Mirrors knowledge-base rule additions/count updates for parity. |
| crates/agnix-mcp/tests/mcp_tests.rs | Updates MCP test asserting total rule count (231). |
| crates/agnix-lsp/locales/zh-CN.yml | Adds zh-CN messages for CDX-006 (LSP). |
| crates/agnix-lsp/locales/es.yml | Adds Spanish messages for CDX-006 (LSP). |
| crates/agnix-lsp/locales/en.yml | Adds English messages for CDX-006 (LSP). |
| crates/agnix-lsp/README.md | Updates LSP README rule count (231). |
| crates/agnix-core/src/schemas/codex.rs | Extends Codex TOML parsing to extract/track project_doc_fallback_filenames issues and adds unit tests. |
| crates/agnix-core/src/rules/codex.rs | Implements CDX-006 diagnostics + tests; adds helpers for “suspicious” entries. |
| crates/agnix-core/locales/zh-CN.yml | Adds core zh-CN messages for CDX-006. |
| crates/agnix-core/locales/es.yml | Adds core Spanish messages for CDX-006. |
| crates/agnix-core/locales/en.yml | Adds core English messages for CDX-006. |
| crates/agnix-cli/locales/zh-CN.yml | Adds CLI zh-CN messages for CDX-006. |
| crates/agnix-cli/locales/es.yml | Adds CLI Spanish messages for CDX-006. |
| crates/agnix-cli/locales/en.yml | Adds CLI English messages for CDX-006. |
| SPEC.md | Updates rule count references (231; Codex CLI 7 rules). |
| CLAUDE.md | Updates rule count references to 231. |
| CHANGELOG.md | Updates rule count references to 231. |
| AGENTS.md | Updates rule count references to 231. |

Comment on lines +399 to +402
bytes.len() >= 3
    && bytes[0].is_ascii_alphabetic()
    && bytes[1] == b':'
    && (bytes[2] == b'\\' || bytes[2] == b'/')

Copilot AI Feb 26, 2026

is_windows_absolute_path() is currently redundant: any string that returns true from it will also contain '/' or '\', which are already checked in is_suspicious_fallback_filename(). Consider either removing the helper, or broadening the Windows detection to catch drive-relative forms like C:foo/C:foo\\bar (which wouldn't be flagged by the current checks) so the extra logic is actually meaningful.

Suggested change
- bytes.len() >= 3
-     && bytes[0].is_ascii_alphabetic()
-     && bytes[1] == b':'
-     && (bytes[2] == b'\\' || bytes[2] == b'/')
+ bytes.len() >= 2
+     && bytes[0].is_ascii_alphabetic()
+     && bytes[1] == b':'
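
To make the redundancy argument concrete, here is a small sketch that puts the two predicates side by side. Only the boolean expressions come from the review; the function signatures, the name `has_windows_drive_prefix`, and the `contains_separator` helper (standing in for the separator check inside `is_suspicious_fallback_filename`, whose body is not shown) are assumptions.

```rust
/// Check as quoted in the review: drive letter, colon, then a separator.
/// The signature is assumed; only the expression appears in the diff.
fn is_windows_absolute_path(s: &str) -> bool {
    let bytes = s.as_bytes();
    bytes.len() >= 3
        && bytes[0].is_ascii_alphabetic()
        && bytes[1] == b':'
        && (bytes[2] == b'\\' || bytes[2] == b'/')
}

/// Broadened check from the suggested change: any drive prefix such as `C:`.
/// The function name is hypothetical.
fn has_windows_drive_prefix(s: &str) -> bool {
    let bytes = s.as_bytes();
    bytes.len() >= 2 && bytes[0].is_ascii_alphabetic() && bytes[1] == b':'
}

/// Assumed separator check, standing in for the existing '/' and '\' tests.
fn contains_separator(s: &str) -> bool {
    s.contains('/') || s.contains('\\')
}
```

Under these assumptions, every input accepted by `is_windows_absolute_path` has a separator at index 2 and so is already caught by `contains_separator`, whereas a drive-relative value such as `C:foo` is caught only by the broadened prefix check.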

Comment on lines +147 to +149
/// Zero-based indexes of non-string entries in `project_doc_fallback_filenames`
pub project_doc_fallback_filename_non_string_indices: Vec<usize>,
/// Zero-based indexes of empty/whitespace-only entries in `project_doc_fallback_filenames`

Copilot AI Feb 26, 2026

project_doc_fallback_filename_*_indices are documented as zero-based, but the diagnostics intentionally display them as 1-based (idx + 1). This mismatch is easy to trip over in future refactors/tests; either store 1-based indices in ParsedCodexConfig or update the field docs to explicitly note that diagnostics are rendered 1-based.

Suggested change
- /// Zero-based indexes of non-string entries in `project_doc_fallback_filenames`
- pub project_doc_fallback_filename_non_string_indices: Vec<usize>,
- /// Zero-based indexes of empty/whitespace-only entries in `project_doc_fallback_filenames`
+ /// Zero-based indexes of non-string entries in `project_doc_fallback_filenames`.
+ /// Note: these are stored as 0-based internally; user-facing diagnostics render them as 1-based (`idx + 1`).
+ pub project_doc_fallback_filename_non_string_indices: Vec<usize>,
+ /// Zero-based indexes of empty/whitespace-only entries in `project_doc_fallback_filenames`.
+ /// Note: these are stored as 0-based internally; user-facing diagnostics render them as 1-based (`idx + 1`).
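
One way to keep the two conventions from drifting apart is to route every conversion through a single helper. The sketch below assumes a reduced struct (`ParsedFallbacks`) and illustrative message wording; neither is taken from agnix, and only the `idx + 1` conversion comes from the review.

```rust
/// Hypothetical reduced form of the parsed config: indices are stored zero-based.
struct ParsedFallbacks {
    non_string_indices: Vec<usize>,
}

/// Convert a stored zero-based index to the one-based position shown to users.
fn display_position(idx: usize) -> usize {
    idx + 1
}

/// Render one message per offending entry (the wording is illustrative only).
fn render_non_string_messages(parsed: &ParsedFallbacks) -> Vec<String> {
    parsed
        .non_string_indices
        .iter()
        .map(|&idx| format!("entry {} is not a string", display_position(idx)))
        .collect()
}
```

With the conversion in one named function, tests can assert on stored indices and on rendered positions separately, which makes an accidental double increment easier to spot.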

Comment on lines +349 to +384
duplicates.sort();
duplicates.dedup();
for filename in duplicates {
    diagnostics.push(
        Diagnostic::warning(
            path.to_path_buf(),
            line,
            0,
            "CDX-006",
            t!("rules.cdx_006.duplicate", value = filename.as_str()),
        )
        .with_suggestion(t!("rules.cdx_006.suggestion")),
    );
}

let mut suspicious: Vec<String> = filenames
    .iter()
    .map(|name| name.trim())
    .filter(|name| is_suspicious_fallback_filename(name))
    .map(|name| name.to_string())
    .collect();
suspicious.sort();
suspicious.dedup();

for filename in suspicious {
    diagnostics.push(
        Diagnostic::warning(
            path.to_path_buf(),
            line,
            0,
            "CDX-006",
            t!("rules.cdx_006.suspicious", value = filename.as_str()),
        )
        .with_suggestion(t!("rules.cdx_006.suggestion")),
    );
}

Copilot AI Feb 26, 2026

CDX-006 is documented as a HIGH-severity rule with a MUST requirement for uniqueness/bare filenames, but duplicates and path-like entries currently emit Diagnostic::warning. Either update the rule documentation/severity to reflect that these are warnings, or upgrade these diagnostics to errors so runtime behavior matches the documented MUST semantics.
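
A parity check of the kind hinted at here could look like the following sketch. The enums, the `expected_level` policy (HIGH maps to error), and `find_mismatches` are all hypothetical; the review does not say how agnix maps documented severity to diagnostic level.

```rust
/// Documented severity of a rule (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum DocSeverity {
    High,
    Medium,
    Low,
}

/// Level of an emitted diagnostic (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Level {
    Error,
    Warning,
    Info,
}

/// Assumed policy for illustration: HIGH (MUST) rules are expected to emit errors.
fn expected_level(doc: DocSeverity) -> Level {
    match doc {
        DocSeverity::High => Level::Error,
        DocSeverity::Medium => Level::Warning,
        DocSeverity::Low => Level::Info,
    }
}

/// Return the emitted levels that disagree with the documented severity.
fn find_mismatches(doc: DocSeverity, emitted: &[Level]) -> Vec<Level> {
    let expected = expected_level(doc);
    emitted
        .iter()
        .copied()
        .filter(|level| *level != expected)
        .collect()
}
```

Under this assumed policy, a HIGH rule whose duplicate and path-like checks emit warnings would show up as two mismatches, which is the inconsistency the comment describes.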

@avifenesh avifenesh merged commit 7e827ea into main Feb 26, 2026
9 checks passed
Development

Successfully merging this pull request may close these issues.

Codex: validate project_doc_fallback_filenames semantics in .codex/config.toml