Skip to content

feat(policy): propose non-Rust allowlist entries (#205, rollout PR 6/12)#218

Merged
EffortlessSteven merged 1 commit into
mainfrom
feat/xtask-non-rust-propose-20260511
May 11, 2026
Merged

feat(policy): propose non-Rust allowlist entries (#205, rollout PR 6/12)#218
EffortlessSteven merged 1 commit into
mainfrom
feat/xtask-non-rust-propose-20260511

Conversation

@EffortlessSteven

Copy link
Copy Markdown
Member

Summary

Sixth PR in the 12-PR file-policy rollout. Adds cargo xtask non-rust propose — finds tracked non-Rust files not yet receipted in policy/non-rust-allowlist.toml and writes draft entries to target/policy/. Never mutates the real ledger.

Issue

Closes #205. Depends on #202 (ledger, merged), #212 (xtask scaffold, merged), #204 (file-policy checker, merged). Refines #180. Tracks #109.

Behavior

  • Walks git ls-files -z minus *.rs (same source as inventory + check-file-policy).
  • Loads policy/non-rust-allowlist.toml's [[file]] and [[glob]] selectors.
  • The remainder is the unreceipted set.
  • For each unreceipted path, emits one [[file]] entry with TODO fields and created = today, review_after = today + 90d.
  • Writes:
    • target/policy/non-rust-proposed-allowlist.toml — draft TOML.
    • target/policy/non-rust-proposal.md — review workflow + entries grouped by top-level directory.

Decisions

  • Default review_after window: 90 days. Matches the hand-seeded receipts in chore(policy): add non-Rust policy allowlist ledgers (#202, rollout PR 2/12) #214. Reviewer can adjust per entry.
  • TOML rendered by hand. Keeps comment blocks at the top and the entry order deterministic. Reviewers see the schema explanation inline.
  • Duplicated allowlist deserialization + tracked-file enumeration from check_file_policy.rs. Each helper is ~30 lines. A DRY refactor is a small follow-up rather than scope creep here.

Live result on this branch

wrote 945 proposed entries to target/policy/non-rust-proposed-allowlist.toml (and matching non-rust-proposal.md)

945 matches the unreceipted count check-file-policy --mode advisory surfaced in #217. The proposal is broken down by top-level dir for scannability — .github and crates dominate.

Acceptance

  • cargo check --workspace --locked passes.
  • cargo clippy -p xtask --all-targets --locked -- -D warnings clean.
  • cargo fmt --all -- --check clean.
  • cargo xtask non-rust propose produces both artifacts.
  • Generated TOML parses as valid TOML.
  • policy/non-rust-allowlist.toml is not touched by git status after the run.

Follow-ups

Sixth PR in the 12-PR file-policy rollout. Adds
`cargo xtask non-rust propose`, which finds tracked non-Rust files
that are not yet covered by `policy/non-rust-allowlist.toml` and
writes draft entries to `target/policy/`:

  target/policy/non-rust-proposed-allowlist.toml
  target/policy/non-rust-proposal.md

The proposer NEVER mutates `policy/non-rust-allowlist.toml`. The
draft TOML is a starting point; a human (or follow-up agent) reviews,
edits, and intentionally copies entries into the real ledger.

## Behavior

- Walks `git ls-files -z` minus `*.rs` for the tracked non-Rust set.
- Loads `policy/non-rust-allowlist.toml` and the `[[file]]` / `[[glob]]`
  selectors.
- Subtracts the covered set; the remainder is the unreceipted list.
- Writes one `[[file]]` entry per unreceipted path with TODO fields:

      kind = "TODO"
      surface = "TODO"
      classification = "TODO"
      owner = "TODO"
      reason = "TODO: explain why this non-Rust surface remains. If
                scheduled for conversion to Rust/xtask, say so and add
                an `expires` date."
      created = "<today>"
      review_after = "<today + 90 days>"

- Writes a Markdown proposal that groups entries by top-level directory
  and explains the review workflow.

## Decisions

- **Default review window: 90 days.** Honest enough for a first-pass
  receipt; a reviewer can shorten or extend per entry. Matches what the
  hand-seeded receipts in #214 used.
- **TOML rendered by hand, not via `toml::to_string`.** Keeps comment
  blocks at the top and the order deterministic. The proposed-allowlist
  schema is documented inline so reviewers see the explanation in the
  file itself.
- **Duplicated allowlist deserialization + tracked-file enumeration
  from check_file_policy.rs.** The shared helpers fit in ~30 lines
  each; a DRY refactor is a small follow-up PR rather than scope creep
  here.

## Acceptance

- `cargo check --workspace --locked` passes.
- `cargo clippy -p xtask --all-targets --locked -- -D warnings` clean.
- `cargo fmt --all -- --check` clean.
- `cargo xtask non-rust propose` ran on this branch:
    `wrote 945 proposed entries to target/policy/non-rust-proposed-
     allowlist.toml (and matching non-rust-proposal.md)`
- Both artifacts produced; the TOML parses as valid TOML.
- `policy/non-rust-allowlist.toml` is unchanged (proposer never
  touches the real ledger).

Closes #205.
@gemini-code-assist

Copy link
Copy Markdown

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@chatgpt-codex-connector

Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, add credits to your account and enable them for code reviews in your settings.

@coderabbitai

coderabbitai Bot commented May 11, 2026

Copy link
Copy Markdown

Review Change Stack

Summary by CodeRabbit

  • New Features
    • Added cargo xtask non-rust propose command that scans workspace non-Rust files and generates a policy proposal with review checklists for updating the non-Rust allowlist.

Walkthrough

This PR introduces cargo xtask non-rust propose, a CLI subcommand that discovers all tracked non-Rust files, loads the existing allowlist, identifies unreceipted files, and generates draft TOML and markdown proposal documents in target/policy/ for human review.

Changes

Non-Rust Allowlist Proposal

Layer / File(s) Summary
CLI Entry Point and Dispatch
xtask/src/main.rs
Declares propose module, registers NonRustCommand::Propose enum variant with CLI help text, and dispatches to propose::propose()? in main.
Proposal Orchestration
xtask/src/propose.rs
Main propose() entry point that coordinates file discovery, allowlist loading, unreceipted detection, output directory creation, and writes both TOML and markdown proposal files.
File Discovery and Workspace Root
xtask/src/propose.rs
Identifies workspace root, enumerates git-tracked non-Rust files via git ls-files -z, filters .rs files, and sorts the results.
Allowlist Schema and Loading
xtask/src/propose.rs
Defines AllowlistDoc and RawEntry TOML schema structures; load_allowlist() parses the existing allowlist into exact-path and glob-pattern selectors.
Selector Matching and Unreceipted Detection
xtask/src/propose.rs
Selector enum and find_unreceipted() logic match candidate paths against exact paths and compiled glob patterns to identify files not covered by the allowlist.
Date and Timestamp Utilities
xtask/src/propose.rs
today_iso() returns current UTC date as YYYY-MM-DD; review_after_iso() computes 90-day future date with fallback on error.
Output Rendering and Escaping
xtask/src/propose.rs
render_toml() generates proposed allowlist TOML with TODO placeholders and metadata; render_markdown() generates human-readable proposal document with usage instructions and directory-grouped entries; toml_escape() escapes paths for safe TOML inclusion.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A propose command hops into view,
Tracking files both old and new,
Allowlists matched with care,
Draft entries everywhere,
TODO placeholders—your move! 🎯

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: adding a cargo xtask non-rust propose command for proposing non-Rust allowlist entries, and correctly references the related issue #205.
Description check ✅ Passed The description is comprehensive and directly related to the changeset, covering behavior, decisions, acceptance criteria, and follow-ups relevant to the new propose command.
Linked Issues check ✅ Passed The PR fully implements the objectives from issue #205: enumerates non-Rust files via git ls-files, loads the allowlist, identifies unreceipted files, generates draft entries with TODO placeholders and 90-day review window, and writes only to target/policy/ without mutating the real ledger.
Out of Scope Changes check ✅ Passed All changes are in scope: xtask/src/main.rs registers the new propose subcommand, and xtask/src/propose.rs implements the required functionality with no extraneous modifications.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/xtask-non-rust-propose-20260511

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@factory-droid

factory-droid Bot commented May 11, 2026

Copy link
Copy Markdown

Droid finished @EffortlessSteven's task —— View job


Droid validation complete for PR #218.

Review summary:
Validated one P1 actionable finding: the render_markdown function passes {PROPOSED_TOML} in a plain string to push_str, which does not perform format interpolation, causing a broken Markdown link in the generated proposal file. The suggested one-line fix is correct. No other candidates require rejection.

Findings:

  • [P1] xtask/src/propose.rs:232 — {PROPOSED_TOML} literal in push_str instead of interpolated value; broken Markdown link in generated proposal file

Validation signal:
Observed: diff confirms {PROPOSED_TOML} is a const string used literally in render_markdown, not passed through format!
Reported: PR description confirms the command produces artifacts in target/policy/
Not verified: whether the generated .md was inspected end-to-end by the author before submission

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@xtask/src/propose.rs`:
- Around line 207-213: The loop that generates [[file]] entries currently writes
kind = "TODO" for every path; change it to compute a kind by inspecting the file
extension (e.g., call or implement an extension heuristic helper like
infer_kind_from_extension(path) or match on Path::extension()) and write that
value instead of the hardcoded "TODO"; update the block around the for path in
unreceipted loop (the out.push_str calls that write kind = ...) to use the
computed kind string (escaped/quoted like the other fields) so generated entries
are prefilled by extension-based heuristics.
- Around line 98-101: Change the allowlist loading to validate and compile glob
patterns once in load_allowlist instead of storing raw strings: in the loop that
currently pushes Selector::Pattern(p) (the code around load_allowlist/for raw in
doc.glob), call Glob::new(&p).with_context(|| format!("invalid glob pattern in
allowlist: {}", p))? and push a Selector::Pattern with the resulting GlobMatcher
(adjust the Selector::Pattern variant to hold a GlobMatcher instead of String).
Update any users of Selector::Pattern (e.g., any_match()) to use the precompiled
matcher rather than re-compiling, so invalid patterns surface as errors at load
time rather than being treated as no-match.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: c78a0828-3dd1-414f-8c0e-4bcc71486f09

📥 Commits

Reviewing files that changed from the base of the PR and between e660fc0 and 93a1fcb.

📒 Files selected for processing (2)
  • xtask/src/main.rs
  • xtask/src/propose.rs

Comment thread xtask/src/propose.rs
Comment on lines +98 to +101
for raw in doc.glob {
if let Some(p) = raw.pattern {
entries.push(Selector::Pattern(p));
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

In the Rust globsetcrate, doesGlob::new(...)return an error for malformed patterns, and wouldGlob::new(pat).map(...).unwrap_or(false) effectively suppress that error and treat it as a non-match?

💡 Result:

Yes. In globset, Glob::new(&str) returns a Result, so it can fail when parsing a glob pattern (e.g., malformed patterns). The docs show Glob::new returns Result<Glob, Error> [1]. The source confirms Glob::new(glob: &str) -> Result<Glob, Error> and delegates to GlobBuilder::build [2]. Given that, Glob::new(pat).map(...).unwrap_or(false) does suppress the error, because unwrap_or(false) converts an Err into the provided default value false. In other words, a malformed pattern causes the Result to be Err, and the expression evaluates to false rather than propagating/panicking. Note: this only covers errors from Glob::new(pat). If you later do compile_matcher or matching, those are separate operations (and compile_matcher is not fallible per the API) [1].

Citations:


🏁 Script executed:

# First, let's locate and examine the file
find . -name "propose.rs" -type f

Repository: EffortlessMetrics/shipper

Length of output: 92


🏁 Script executed:

# Read the propose.rs file to see the code structure
cat -n ./xtask/src/propose.rs | head -150

Repository: EffortlessMetrics/shipper

Length of output: 5999


🏁 Script executed:

# Check for all uses of the Selector enum
rg "Selector::" ./xtask/src/propose.rs -n

Repository: EffortlessMetrics/shipper

Length of output: 266


🏁 Script executed:

# Check what's imported from globset
rg "use globset" ./xtask/src/propose.rs -n

Repository: EffortlessMetrics/shipper

Length of output: 91


Compile glob selectors at load time; fail early on invalid patterns instead of silently treating them as non-matches.

Malformed glob patterns in the allowlist are currently stored as strings and re-compiled on every any_match() call via Glob::new(pat).map(...).unwrap_or(false). This suppresses compilation errors, silently treating invalid patterns as no-match, which can incorrectly inflate the unreceipted set.

Validate and compile globs once in load_allowlist (lines 98–102), storing GlobMatcher instead of String in the Selector::Pattern variant. Add error context via .with_context() to surface which pattern is malformed.

Suggested fix
-use globset::Glob;
+use globset::{Glob, GlobMatcher};
@@
 enum Selector {
     Path(String),
-    Pattern(String),
+    Pattern(GlobMatcher),
 }
@@
     for raw in doc.glob {
         if let Some(p) = raw.pattern {
-            entries.push(Selector::Pattern(p));
+            let matcher = Glob::new(&p)
+                .with_context(|| format!("invalid glob pattern in {}: {}", path.display(), p))?
+                .compile_matcher();
+            entries.push(Selector::Pattern(matcher));
         }
     }
@@
 fn any_match(entries: &[Selector], path: &str) -> bool {
     entries.iter().any(|sel| match sel {
         Selector::Path(p) => p == path,
-        Selector::Pattern(pat) => Glob::new(pat)
-            .map(|g| g.compile_matcher().is_match(path))
-            .unwrap_or(false),
+        Selector::Pattern(matcher) => matcher.is_match(path),
     })
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@xtask/src/propose.rs` around lines 98 - 101, Change the allowlist loading to
validate and compile glob patterns once in load_allowlist instead of storing raw
strings: in the loop that currently pushes Selector::Pattern(p) (the code around
load_allowlist/for raw in doc.glob), call Glob::new(&p).with_context(||
format!("invalid glob pattern in allowlist: {}", p))? and push a
Selector::Pattern with the resulting GlobMatcher (adjust the Selector::Pattern
variant to hold a GlobMatcher instead of String). Update any users of
Selector::Pattern (e.g., any_match()) to use the precompiled matcher rather than
re-compiling, so invalid patterns surface as errors at load time rather than
being treated as no-match.

Comment thread xtask/src/propose.rs
Comment on lines +207 to +213
for path in unreceipted {
out.push_str("[[file]]\n");
out.push_str(&format!("path = \"{}\"\n", toml_escape(path)));
out.push_str("kind = \"TODO\"\n");
out.push_str("surface = \"TODO\"\n");
out.push_str("classification = \"TODO\"\n");
out.push_str("owner = \"TODO\"\n");

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Populate kind using extension heuristics in generated entries.

Each proposed [[file]] entry currently hardcodes kind = "TODO", which misses the proposal objective to prefill kind from file-extension heuristics and leaves most of the triage burden manual.

Suggested direction
 fn render_toml(unreceipted: &[String], today: &str, review_after: &str) -> String {
@@
     for path in unreceipted {
+        let kind = infer_kind(path);
         out.push_str("[[file]]\n");
         out.push_str(&format!("path = \"{}\"\n", toml_escape(path)));
-        out.push_str("kind = \"TODO\"\n");
+        out.push_str(&format!("kind = \"{}\"\n", kind));
         out.push_str("surface = \"TODO\"\n");
@@
 }
+
+fn infer_kind(path: &str) -> &'static str {
+    let ext = Path::new(path).extension().and_then(|s| s.to_str());
+    match ext {
+        Some("md" | "rst" | "txt") => "documentation",
+        Some("toml" | "yaml" | "yml" | "json") => "configuration",
+        Some("sh" | "ps1" | "bat") => "script",
+        _ => "TODO",
+    }
+}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
for path in unreceipted {
out.push_str("[[file]]\n");
out.push_str(&format!("path = \"{}\"\n", toml_escape(path)));
out.push_str("kind = \"TODO\"\n");
out.push_str("surface = \"TODO\"\n");
out.push_str("classification = \"TODO\"\n");
out.push_str("owner = \"TODO\"\n");
for path in unreceipted {
let kind = infer_kind(path);
out.push_str("[[file]]\n");
out.push_str(&format!("path = \"{}\"\n", toml_escape(path)));
out.push_str(&format!("kind = \"{}\"\n", kind));
out.push_str("surface = \"TODO\"\n");
out.push_str("classification = \"TODO\"\n");
out.push_str("owner = \"TODO\"\n");
}
fn infer_kind(path: &str) -> &'static str {
let ext = Path::new(path).extension().and_then(|s| s.to_str());
match ext {
Some("md" | "rst" | "txt") => "documentation",
Some("toml" | "yaml" | "yml" | "json") => "configuration",
Some("sh" | "ps1" | "bat") => "script",
_ => "TODO",
}
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@xtask/src/propose.rs` around lines 207 - 213, The loop that generates
[[file]] entries currently writes kind = "TODO" for every path; change it to
compute a kind by inspecting the file extension (e.g., call or implement an
extension heuristic helper like infer_kind_from_extension(path) or match on
Path::extension()) and write that value instead of the hardcoded "TODO"; update
the block around the for path in unreceipted loop (the out.push_str calls that
write kind = ...) to use the computed kind string (escaped/quoted like the other
fields) so generated entries are prefilled by extension-based heuristics.

Comment thread xtask/src/propose.rs
Comment on lines +227 to +232
out.push_str(&format!(
"Generated by `cargo xtask non-rust propose` on {}.\n\n",
today
));
out.push_str("## How to use\n\n");
out.push_str(&format!(

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P1] Malformed Markdown link — {PROPOSED_TOML} is a literal string, not a variable interpolation

The format argument passed to push_str uses {PROPOSED_TOML} inside the string literal, but push_str does not perform format! interpolation. This means the generated non-rust-proposal.md will contain the literal text {PROPOSED_TOML} instead of non-rust-proposed-allowlist.toml, producing a broken Markdown reference like [non-rust-proposed-allowlist.toml](./{PROPOSED_TOML}). This is a one-line fix: either remove the braces entirely (since the string is a literal, not a format string) or change push_str to push_str(&format!(...)) with proper {} interpolation.

Suggested change
out.push_str(&format!(
"Generated by `cargo xtask non-rust propose` on {}.\n\n",
today
));
out.push_str("## How to use\n\n");
out.push_str(&format!(
out.push_str("1. Review each draft entry in [`non-rust-proposed-allowlist.toml`](./non-rust-proposed-allowlist.toml).\n");

@EffortlessSteven EffortlessSteven merged commit 72ba257 into main May 11, 2026
25 checks passed
@EffortlessSteven EffortlessSteven deleted the feat/xtask-non-rust-propose-20260511 branch May 11, 2026 23:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(policy): propose non-Rust allowlist entries (rollout PR 6/12)

1 participant