feat(codex,kiro): S-tier alignment - 38 new validation rules#632
feat(codex,kiro): S-tier alignment - 38 new validation rules#632
Conversation
) Add 16 Codex CLI rules (CDX-CFG-013 through CDX-CFG-022, CDX-AG-004 through CDX-AG-007, CDX-APP-002, CDX-APP-003) and 22 Kiro rules (KR-AG-008 through KR-AG-013, KR-HK-007 through KR-HK-010, KIRO-010 through KIRO-014, KR-PW-005 through KR-PW-008, KR-MCP-003 through KR-MCP-005). Total rules: 304 -> 342
Cover all new rules added in the S-tier alignment work: - codex.rs: CDX-CFG-013..022, CDX-AG-004..007, CDX-APP-002..003 - kiro_hook.rs: KR-HK-007, KR-HK-009, KR-HK-010 - kiro_mcp.rs: KR-MCP-003, KR-MCP-004, KR-MCP-005 - kiro_power.rs: KR-PW-005..008 - kiro_steering.rs: KIRO-010, KIRO-011, KIRO-013, KIRO-014 - kiro_agent.rs: KR-AG-008..013
…ts, document project-level rules
- Guard validate_codex_config_rules to only run for config files (TOML/JSON/YAML),
not markdown - avoids wasteful TOML/JSON parsing attempts on .md files
- Build key_line_map once in CodexValidator::validate() and pass to
validate_codex_config_rules, eliminating redundant double-build for TOML files
- Fix KR-HK-009 Windows path detection to verify first byte is alphabetic
- Add project-level rule comments for KR-HK-008 and KR-MCP-005 (matching
existing KIRO-012 and CDX-AG-007 patterns)
- Add edge case tests:
- KR-HK-007: timeoutMs key variant, boundary at 300000/300001
- KR-HK-008: metadata-only test
- KR-HK-009: Windows path (C:\lint.exe)
- KR-HK-010: api_key= and password= secret markers
- KR-AG-008/009: blank/whitespace-only name and prompt
- KR-AG-010: case-insensitive dedup (readFiles vs READFILES)
- KR-PW-006: case-insensitive duplicate keywords (foo vs Foo)
- KR-PW-008: template values ${VAR} not flagged as secrets
- KR-MCP-004: sse:// URL accepted
M1: Extract shared seems_plaintext_secret to rules/mod.rs with robust
logic (length >= 8, quote trimming, template exclusion). All 4 files
(kiro_steering, kiro_power, kiro_mcp, kiro_hook) now use it.
M2: Extract shared line_col_at_offset to rules/mod.rs. Both
kiro_steering and kiro_agent now import it.
M3: Broaden KR-HK-010 secret markers to include ghp_, gho_, xoxb-,
xoxp-, AKIA, AIZA, glpat-, sk-ant-, sk-proj-.
M4: Broaden KR-AG-013 regex to match ghp_, gho_, AKIA, xoxb-, glpat-,
sk-ant-, sk-proj- prefixes.
M5: Fixed by M1 (kiro_mcp now uses shared function with length check).
M6: Add word boundary awareness to CDX-AG-002 via contains_word helper.
Also add has_token_prefix for additional secret prefixes.
M7: Replace HashSet with direct .contains() on slices in
schemas/codex.rs detect_unknown_keys.
M8: Combine CDX-006 duplicate and suspicious checks into single pass.
M9: Use contains() check before insert in kiro_agent.rs and remove
unnecessary clone in kiro_power.rs.
M10-M17: Add missing tests for CDX-CFG negative, CDX-AG-002
interpolation, CDX-CFG-007 suppression, CDX-CFG-010 env-var,
KR-PW-005 end-of-body, KIRO-006 all templates, KR-MCP-001 parse
error, KR-MCP-002 env var exclusion.
L1-L2: Add quick pre-check before lowercase allocation in CDX-AG-002
and KR-HK-010.
L3: Add comment noting Vec collect is intentional for look-ahead.
L4: Add negative test for KR-PW-007 valid kebab-case.
L5: Add negative test for KR-AG-011 non-empty tools.
L6: Already addressed (keys created once at function top).
L7: Add ws:// and wss:// tests for KR-MCP-004.
L8: Add separator check after alphabetic + colon for Windows paths.
L9: Add comment documenting ## heading convention.
L10: Add comment noting sse:// is an MCP transport convention.
L11: Change fake secret to sk-test_PLACEHOLDER_NOT_REAL_KEY_000.
line.contains('$') was too broad - suppressed secret detection for any
line with a dollar sign (e.g., "costs $5"). Now only suppresses for
actual variable interpolation patterns.
- SPEC.md: update rule count 286 -> 342 (2 locations) - CHANGELOG.md: add entry for 38 new Kiro+Codex rules - cargo fmt formatting pass
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a significant expansion of validation rules for Codex and Kiro, bringing them to a comparable level of validation depth with other S-tier platforms. The changes include deeper checks for configuration files, agent lifecycles, hook safety, and steering quality, along with new utilities and extensive unit tests to ensure reliability and security. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Pull request overview
This PR adds 38 new validation rules (16 Codex CLI + 22 Kiro) to bring S-tier platforms to comparable validation depth, increasing total rules from 304 to 342. It extracts shared utilities (seems_plaintext_secret and line_col_at_offset) to rules/mod.rs, updates rule counts across all documentation files, and includes 64+ new unit tests.
Changes:
- Added 22 Kiro rules across steering (KIRO-010..014), agents (KR-AG-008..013), hooks (KR-HK-007..010), MCP (KR-MCP-003..005), and powers (KR-PW-005..008) validators, plus 16 Codex rules for config validation (CDX-CFG-013..022), AGENTS.md quality (CDX-AG-004..007), and app config (CDX-APP-002..003).
- Extracted
seems_plaintext_secretandline_col_at_offsethelper functions from individual validators torules/mod.rsas shared utilities, and replaced HashSet with slice.contains()in Codex schema parsing. - Updated rule counts, documentation pages, and generated rule reference files across website, knowledge-base, README, SPEC, CHANGELOG, CLAUDE.md, and AGENTS.md.
Reviewed changes
Copilot reviewed 22 out of 80 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
crates/agnix-core/src/rules/mod.rs |
Extracted shared seems_plaintext_secret and line_col_at_offset utilities |
crates/agnix-core/src/rules/kiro_steering.rs |
Added KIRO-010..014 rules, removed local helper functions in favor of shared ones |
crates/agnix-core/src/rules/kiro_power.rs |
Added KR-PW-005..008 rules (step desc, dup keywords, name format, secrets) |
crates/agnix-core/src/rules/kiro_mcp.rs |
Added KR-MCP-003..005 rules (missing args, invalid URL, duplicate names) |
crates/agnix-core/src/rules/kiro_hook.rs |
Added KR-HK-007..010 rules (timeout, duplicates, abs path, secrets) |
crates/agnix-core/src/rules/kiro_agent.rs |
Added KR-AG-008..013 rules (missing name/prompt, dup tools, empty tools, aliases, secrets) |
crates/agnix-core/src/schemas/codex.rs |
Added new Codex constants and replaced HashSet with slice .contains() |
crates/agnix-cli/locales/en.yml |
Added i18n strings for all 38 new rules |
crates/agnix-cli/tests/kiro_fixture_inventory.rs |
Updated expected diagnostic count for kiro-agents fixtures |
website/docs/rules/index.md |
Updated rules table with 38 new rule entries |
website/docs/rules/generated/*.md |
38 new generated rule documentation pages + "Opencode" → "OpenCode" fixes |
website/src/data/siteData.json |
Updated totalRules to 342 |
knowledge-base/VALIDATION-RULES.md |
Added rule specifications and updated summary tables |
knowledge-base/INDEX.md |
Updated rule counts in summary tables |
knowledge-base/README.md |
Updated rule count reference |
README.md, CLAUDE.md, AGENTS.md, SPEC.md |
Updated rule counts |
CHANGELOG.md |
Added changelog entry for 38 new rules |
editors/vscode/README.md, crates/agnix-lsp/README.md |
Updated rule count references |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| .or_else(|| hook.extra.get("timeoutMs")) | ||
| { | ||
| if let Some(timeout) = timeout_val.as_u64() { | ||
| if timeout > MAX_HOOK_TIMEOUT_MS { |
There was a problem hiding this comment.
KR-HK-007 does not flag timeout: 0. The u64 path only checks timeout > MAX_HOOK_TIMEOUT_MS, but the rule spec says the timeout should be between 1 and 300,000ms. Since 0 is parsed as u64 by serde_json (not i64), the as_i64() branch that catches <= 0 is never reached for zero. The u64 condition should be timeout == 0 || timeout > MAX_HOOK_TIMEOUT_MS to also catch zero timeouts.
| if timeout > MAX_HOOK_TIMEOUT_MS { | |
| if timeout == 0 || timeout > MAX_HOOK_TIMEOUT_MS { |
There was a problem hiding this comment.
Fixed in dd582f5 - added timeout == 0 check.
knowledge-base/VALIDATION-RULES.md
Outdated
| **Total Coverage**: 342 validation rules across 36 categories | ||
|
|
||
| **Knowledge Base**: 11,036 lines, 320KB, 75+ sources | ||
| **Certainty**: 180 HIGH, 115 MEDIUM, 9 LOW |
There was a problem hiding this comment.
The "Certainty" line at the bottom of the file has not been updated to match the summary table. The table (line 2827) shows "190 HIGH, 138 MEDIUM, 14 LOW" but this line still says "180 HIGH, 115 MEDIUM, 9 LOW". These should be consistent.
There was a problem hiding this comment.
Fixed in dd582f5 - updated to 190 HIGH, 138 MEDIUM, 14 LOW.
- KR-HK-007: also flag timeout=0 (was only checking > max) - VALIDATION-RULES.md: update certainty line to 190 HIGH, 138 MEDIUM, 14 LOW
There was a problem hiding this comment.
Code Review
This pull request introduces 38 new validation rules for Codex and Kiro, significantly expanding the linter's capabilities. The changes include adding new rule definitions, implementing their logic in the respective validator modules, and updating documentation and tests accordingly. The code quality is generally high, with good use of Rust's features and a clear structure. However, I've identified a few areas for improvement, mainly around consistency in how secrets are detected across different validators. By centralizing the secret detection logic, we can reduce code duplication and improve the accuracy of these important security checks. Overall, this is a substantial and valuable contribution to the project.
Note: Security Review did not run due to the size of the PR.
| // KR-AG-013: Secrets in agent prompt | ||
| if config.is_rule_enabled("KR-AG-013") | ||
| && let Some(prompt) = obj.get("prompt").and_then(Value::as_str) | ||
| { | ||
| let secret_re = agent_secret_pattern(); | ||
| if secret_re.is_match(prompt) { | ||
| diagnostics.push( | ||
| Diagnostic::error( | ||
| path.to_path_buf(), | ||
| 1, | ||
| 0, | ||
| "KR-AG-013", | ||
| t!("rules.kr_ag_013.message"), | ||
| ) | ||
| .with_suggestion(t!("rules.kr_ag_013.suggestion")), | ||
| ); | ||
| } | ||
| } |
There was a problem hiding this comment.
The current implementation for KR-AG-013 uses a regex that might not correctly handle cases where secrets are passed via environment variables (e.g., API_KEY=${MY_SECRET}). This could lead to false positives. A more robust approach would be to use a regex that captures the potential secret value and then use the seems_plaintext_secret helper function to check if it's a hardcoded secret, similar to how KR-PW-008 is implemented. This would improve consistency and accuracy across the secret detection rules.
There was a problem hiding this comment.
Fixed in 8328c8f - added env: prefix to shared helper, improved KR-HK-010 pattern matching, and confirmed KR-AG-013 regex naturally excludes env vars. Added negative test.
| // KR-HK-010: Secrets in hook command | ||
| if config.is_rule_enabled("KR-HK-010") { | ||
| if let Some(cmd) = hook.effective_run_command() { | ||
| // Quick pre-check: only lowercase if cmd might contain a marker. | ||
| // All markers contain at least one of these chars (case-insensitive). | ||
| let might_have_marker = cmd.bytes().any(|b| { | ||
| matches!( | ||
| b.to_ascii_lowercase(), | ||
| b's' | b'a' | b't' | b'p' | b'b' | b'g' | b'x' | ||
| ) | ||
| }); | ||
| let has_secret_marker = might_have_marker && { | ||
| let lower = cmd.to_ascii_lowercase(); | ||
| [ | ||
| "sk-", | ||
| "sk-ant-", | ||
| "sk-proj-", | ||
| "api_key=", | ||
| "apikey=", | ||
| "token=", | ||
| "password=", | ||
| "bearer ", | ||
| "ghp_", | ||
| "gho_", | ||
| "xoxb-", | ||
| "xoxp-", | ||
| "akia", | ||
| "aiza", | ||
| "glpat-", | ||
| ] | ||
| .iter() | ||
| .any(|marker| lower.contains(marker)) | ||
| }; | ||
| if has_secret_marker { | ||
| diagnostics.push( | ||
| Diagnostic::error( | ||
| path.to_path_buf(), | ||
| 1, | ||
| 0, | ||
| "KR-HK-010", | ||
| t!("rules.kr_hk_010.message"), | ||
| ) | ||
| .with_suggestion(t!("rules.kr_hk_010.suggestion")), | ||
| ); | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
The secret detection logic for KR-HK-010 is based on a simple substring search, which is prone to false positives (e.g., it could flag the word 'token' in a regular sentence). It also doesn't account for secrets passed as environment variables. To improve accuracy and align with other secret detection rules, consider using a regular expression to identify key-value patterns and then applying the seems_plaintext_secret helper function to the value. This approach is used in kiro_power.rs for KR-PW-008 and would be a good model to follow here.
There was a problem hiding this comment.
Fixed in 8328c8f - restructured to use key=value patterns + seems_plaintext_secret for bearer.
| pub(crate) fn seems_plaintext_secret(value: &str) -> bool { | ||
| let trimmed = value.trim_matches(|ch| ch == '"' || ch == '\'').trim(); | ||
| !trimmed.is_empty() | ||
| && !trimmed.starts_with("${") | ||
| && !trimmed.starts_with("$(") | ||
| && !trimmed.starts_with("{{") | ||
| && !trimmed.starts_with('<') | ||
| && trimmed.len() >= 8 | ||
| } |
There was a problem hiding this comment.
It's great that you've centralized the seems_plaintext_secret function. However, the implementation in codex.rs (seems_hardcoded_secret) includes a check for env:, which is missing here. To ensure consistent behavior across all secret detection rules, it would be beneficial to consolidate all such logic into this single function. Consider adding the env: check to this shared function.
There was a problem hiding this comment.
Fixed in 8328c8f - added env: prefix exclusion to shared seems_plaintext_secret.
- mod.rs: add env: prefix exclusion to shared seems_plaintext_secret
- kiro_hook.rs: KR-HK-010 now uses key=value patterns for assignment-style
markers and applies seems_plaintext_secret for bearer values
- kiro_agent.rs: KR-AG-013 documented that regex naturally excludes env vars;
added negative test for ${VAR} and $(cmd) patterns
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 22 out of 80 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| | Kiro Steering | 9 | 3 | 6 | 0 | 1 | | ||
| | **TOTAL** | **286** | **170** | **107** | **9** | **96** | | ||
| | Kiro Steering | 14 | 3 | 9 | 2 | 1 | | ||
| | **TOTAL** | **342** | **190** | **138** | **14** | **96** | |
There was a problem hiding this comment.
The Auto-Fix column for OpenCode in this table shows 2, but the corresponding table in knowledge-base/VALIDATION-RULES.md (line 2801) shows 8 auto-fixable OpenCode rules. This propagates to the TOTAL row: this table says 96 auto-fixable rules while VALIDATION-RULES.md says 102. The OpenCode Auto-Fix count should be 8 and the total should be 102 to match VALIDATION-RULES.md (and the siteData.json autofixCount of 102).
| | **TOTAL** | **342** | **190** | **138** | **14** | **96** | | |
| | **TOTAL** | **342** | **190** | **138** | **14** | **102** | |
Summary
Adds 38 new validation rules (16 Codex + 22 Kiro) to bring all S-tier platforms to comparable validation depth.
Highlights
seems_plaintext_secretandline_col_at_offsetutilities extracted to rules/mod.rsTest Plan
cargo test --workspace- 4130+ tests passcargo clippy -- -D warnings- cleancargo fmt -- --check- cleanpython3 scripts/check-rule-counts.py- parity verifiedCloses #603