feat(policy): propose non-Rust allowlist entries (#205, rollout PR 6/12) by EffortlessSteven · Pull Request #218 · EffortlessMetrics/shipper

EffortlessSteven · 2026-05-11T23:09:40Z

Summary

Sixth PR in the 12-PR file-policy rollout. Adds cargo xtask non-rust propose — finds tracked non-Rust files not yet receipted in policy/non-rust-allowlist.toml and writes draft entries to target/policy/. Never mutates the real ledger.

Issue

Closes #205. Depends on #202 (ledger, merged), #212 (xtask scaffold, merged), #204 (file-policy checker, merged). Refines #180. Tracks #109.

Behavior

Walks git ls-files -z minus *.rs (same source as inventory + check-file-policy).
Loads policy/non-rust-allowlist.toml's [[file]] and [[glob]] selectors.
The remainder is the unreceipted set.
For each unreceipted path, emits one [[file]] entry with TODO fields and created = today, review_after = today + 90d.
Writes:
- target/policy/non-rust-proposed-allowlist.toml — draft TOML.
- target/policy/non-rust-proposal.md — review workflow + entries grouped by top-level directory.

Decisions

Default review_after window: 90 days. Matches the hand-seeded receipts in chore(policy): add non-Rust policy allowlist ledgers (#202, rollout PR 2/12) #214. Reviewer can adjust per entry.
TOML rendered by hand. Keeps comment blocks at the top and the entry order deterministic. Reviewers see the schema explanation inline.
Duplicated allowlist deserialization + tracked-file enumeration from check_file_policy.rs. Each helper is ~30 lines. A DRY refactor is a small follow-up rather than scope creep here.

Live result on this branch

wrote 945 proposed entries to target/policy/non-rust-proposed-allowlist.toml (and matching non-rust-proposal.md)

945 matches the unreceipted count check-file-policy --mode advisory surfaced in #217. The proposal is broken down by top-level dir for scannability — .github and crates dominate.

Acceptance

cargo check --workspace --locked passes.
cargo clippy -p xtask --all-targets --locked -- -D warnings clean.
cargo fmt --all -- --check clean.
cargo xtask non-rust propose produces both artifacts.
Generated TOML parses as valid TOML.
policy/non-rust-allowlist.toml is not touched by git status after the run.

Follow-ups

PR 7 (feat(policy): check generated, executable, and dependency surfaces (rollout PR 7/12) #206) — check-generated, check-executable-files, check-dependency-surfaces.
PR 8 (feat(policy): check workflow, process, and network surfaces (rollout PR 8/12) #207) — check-workflow-surfaces, check-process-policy, check-network-policy.
PR 9 (feat(policy): add unified policy report (rollout PR 9/12) #208) — unified policy-report aggregates inventory + all checkers + this proposal status.
Tiny DRY follow-up: move shared workspace_root / enumerate_non_rust helpers into a common.rs module after this PR lands.

Sixth PR in the 12-PR file-policy rollout. Adds `cargo xtask non-rust propose`, which finds tracked non-Rust files that are not yet covered by `policy/non-rust-allowlist.toml` and writes draft entries to `target/policy/`: target/policy/non-rust-proposed-allowlist.toml target/policy/non-rust-proposal.md The proposer NEVER mutates `policy/non-rust-allowlist.toml`. The draft TOML is a starting point; a human (or follow-up agent) reviews, edits, and intentionally copies entries into the real ledger. ## Behavior - Walks `git ls-files -z` minus `*.rs` for the tracked non-Rust set. - Loads `policy/non-rust-allowlist.toml` and the `[[file]]` / `[[glob]]` selectors. - Subtracts the covered set; the remainder is the unreceipted list. - Writes one `[[file]]` entry per unreceipted path with TODO fields: kind = "TODO" surface = "TODO" classification = "TODO" owner = "TODO" reason = "TODO: explain why this non-Rust surface remains. If scheduled for conversion to Rust/xtask, say so and add an `expires` date." created = "<today>" review_after = "<today + 90 days>" - Writes a Markdown proposal that groups entries by top-level directory and explains the review workflow. ## Decisions - **Default review window: 90 days.** Honest enough for a first-pass receipt; a reviewer can shorten or extend per entry. Matches what the hand-seeded receipts in #214 used. - **TOML rendered by hand, not via `toml::to_string`.** Keeps comment blocks at the top and the order deterministic. The proposed-allowlist schema is documented inline so reviewers see the explanation in the file itself. - **Duplicated allowlist deserialization + tracked-file enumeration from check_file_policy.rs.** The shared helpers fit in ~30 lines each; a DRY refactor is a small follow-up PR rather than scope creep here. ## Acceptance - `cargo check --workspace --locked` passes. - `cargo clippy -p xtask --all-targets --locked -- -D warnings` clean. - `cargo fmt --all -- --check` clean. - `cargo xtask non-rust propose` ran on this branch: `wrote 945 proposed entries to target/policy/non-rust-proposed- allowlist.toml (and matching non-rust-proposal.md)` - Both artifacts produced; the TOML parses as valid TOML. - `policy/non-rust-allowlist.toml` is unchanged (proposer never touches the real ledger). Closes #205.

gemini-code-assist · 2026-05-11T23:09:43Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

chatgpt-codex-connector · 2026-05-11T23:09:45Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, add credits to your account and enable them for code reviews in your settings.

coderabbitai · 2026-05-11T23:09:52Z

Summary by CodeRabbit

New Features
- Added cargo xtask non-rust propose command that scans workspace non-Rust files and generates a policy proposal with review checklists for updating the non-Rust allowlist.

Walkthrough

This PR introduces cargo xtask non-rust propose, a CLI subcommand that discovers all tracked non-Rust files, loads the existing allowlist, identifies unreceipted files, and generates draft TOML and markdown proposal documents in target/policy/ for human review.

Changes

Non-Rust Allowlist Proposal

Layer / File(s)	Summary
CLI Entry Point and Dispatch `xtask/src/main.rs`	Declares `propose` module, registers `NonRustCommand::Propose` enum variant with CLI help text, and dispatches to `propose::propose()?` in main.
Proposal Orchestration `xtask/src/propose.rs`	Main `propose()` entry point that coordinates file discovery, allowlist loading, unreceipted detection, output directory creation, and writes both TOML and markdown proposal files.
File Discovery and Workspace Root `xtask/src/propose.rs`	Identifies workspace root, enumerates git-tracked non-Rust files via `git ls-files -z`, filters `.rs` files, and sorts the results.
Allowlist Schema and Loading `xtask/src/propose.rs`	Defines `AllowlistDoc` and `RawEntry` TOML schema structures; `load_allowlist()` parses the existing allowlist into exact-path and glob-pattern selectors.
Selector Matching and Unreceipted Detection `xtask/src/propose.rs`	`Selector` enum and `find_unreceipted()` logic match candidate paths against exact paths and compiled glob patterns to identify files not covered by the allowlist.
Date and Timestamp Utilities `xtask/src/propose.rs`	`today_iso()` returns current UTC date as `YYYY-MM-DD`; `review_after_iso()` computes 90-day future date with fallback on error.
Output Rendering and Escaping `xtask/src/propose.rs`	`render_toml()` generates proposed allowlist TOML with TODO placeholders and metadata; `render_markdown()` generates human-readable proposal document with usage instructions and directory-grouped entries; `toml_escape()` escapes paths for safe TOML inclusion.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A propose command hops into view,
Tracking files both old and new,
Allowlists matched with care,
Draft entries everywhere,
TODO placeholders—your move! 🎯

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the main change: adding a `cargo xtask non-rust propose` command for proposing non-Rust allowlist entries, and correctly references the related issue `#205`.
Description check	✅ Passed	The description is comprehensive and directly related to the changeset, covering behavior, decisions, acceptance criteria, and follow-ups relevant to the new propose command.
Linked Issues check	✅ Passed	The PR fully implements the objectives from issue `#205`: enumerates non-Rust files via git ls-files, loads the allowlist, identifies unreceipted files, generates draft entries with TODO placeholders and 90-day review window, and writes only to target/policy/ without mutating the real ledger.
Out of Scope Changes check	✅ Passed	All changes are in scope: xtask/src/main.rs registers the new propose subcommand, and xtask/src/propose.rs implements the required functionality with no extraneous modifications.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/xtask-non-rust-propose-20260511

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

factory-droid · 2026-05-11T23:09:55Z

Droid finished @EffortlessSteven's task —— View job

Droid validation complete for PR #218.

Review summary:
Validated one P1 actionable finding: the render_markdown function passes {PROPOSED_TOML} in a plain string to push_str, which does not perform format interpolation, causing a broken Markdown link in the generated proposal file. The suggested one-line fix is correct. No other candidates require rejection.

Findings:

[P1] xtask/src/propose.rs:232 — {PROPOSED_TOML} literal in push_str instead of interpolated value; broken Markdown link in generated proposal file

Validation signal:
Observed: diff confirms {PROPOSED_TOML} is a const string used literally in render_markdown, not passed through format!
Reported: PR description confirms the command produces artifacts in target/policy/
Not verified: whether the generated .md was inspected end-to-end by the author before submission

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@xtask/src/propose.rs`:
- Around line 207-213: The loop that generates [[file]] entries currently writes
kind = "TODO" for every path; change it to compute a kind by inspecting the file
extension (e.g., call or implement an extension heuristic helper like
infer_kind_from_extension(path) or match on Path::extension()) and write that
value instead of the hardcoded "TODO"; update the block around the for path in
unreceipted loop (the out.push_str calls that write kind = ...) to use the
computed kind string (escaped/quoted like the other fields) so generated entries
are prefilled by extension-based heuristics.
- Around line 98-101: Change the allowlist loading to validate and compile glob
patterns once in load_allowlist instead of storing raw strings: in the loop that
currently pushes Selector::Pattern(p) (the code around load_allowlist/for raw in
doc.glob), call Glob::new(&p).with_context(|| format!("invalid glob pattern in
allowlist: {}", p))? and push a Selector::Pattern with the resulting GlobMatcher
(adjust the Selector::Pattern variant to hold a GlobMatcher instead of String).
Update any users of Selector::Pattern (e.g., any_match()) to use the precompiled
matcher rather than re-compiling, so invalid patterns surface as errors at load
time rather than being treated as no-match.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: c78a0828-3dd1-414f-8c0e-4bcc71486f09

📥 Commits

Reviewing files that changed from the base of the PR and between e660fc0 and 93a1fcb.

📒 Files selected for processing (2)

xtask/src/main.rs
xtask/src/propose.rs

coderabbitai · 2026-05-11T23:13:24Z

+    for raw in doc.glob {
+        if let Some(p) = raw.pattern {
+            entries.push(Selector::Pattern(p));
+        }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

In the Rust globsetcrate, doesGlob::new(...)return an error for malformed patterns, and wouldGlob::new(pat).map(...).unwrap_or(false) effectively suppress that error and treat it as a non-match?

💡 Result:

Yes. In globset, Glob::new(&str) returns a Result, so it can fail when parsing a glob pattern (e.g., malformed patterns). The docs show Glob::new returns Result<Glob, Error> [1]. The source confirms Glob::new(glob: &str) -> Result<Glob, Error> and delegates to GlobBuilder::build [2]. Given that, Glob::new(pat).map(...).unwrap_or(false) does suppress the error, because unwrap_or(false) converts an Err into the provided default value false. In other words, a malformed pattern causes the Result to be Err, and the expression evaluates to false rather than propagating/panicking. Note: this only covers errors from Glob::new(pat). If you later do compile_matcher or matching, those are separate operations (and compile_matcher is not fallible per the API) [1].

Citations:

1: https://docs.rs/globset/latest/globset/struct.Glob.html

2: https://docs.rs/globset/0.4.6/src/globset/glob.rs.html

🏁 Script executed:

# First, let's locate and examine the file find . -name "propose.rs" -type f

Repository: EffortlessMetrics/shipper

Length of output: 92

🏁 Script executed:

# Read the propose.rs file to see the code structure cat -n ./xtask/src/propose.rs | head -150

Repository: EffortlessMetrics/shipper

Length of output: 5999

🏁 Script executed:

# Check for all uses of the Selector enum rg "Selector::" ./xtask/src/propose.rs -n

Repository: EffortlessMetrics/shipper

Length of output: 266

🏁 Script executed:

# Check what's imported from globset rg "use globset" ./xtask/src/propose.rs -n

Repository: EffortlessMetrics/shipper

Length of output: 91

Compile glob selectors at load time; fail early on invalid patterns instead of silently treating them as non-matches.

Malformed glob patterns in the allowlist are currently stored as strings and re-compiled on every any_match() call via Glob::new(pat).map(...).unwrap_or(false). This suppresses compilation errors, silently treating invalid patterns as no-match, which can incorrectly inflate the unreceipted set.

Validate and compile globs once in load_allowlist (lines 98–102), storing GlobMatcher instead of String in the Selector::Pattern variant. Add error context via .with_context() to surface which pattern is malformed.

Suggested fix

-use globset::Glob; +use globset::{Glob, GlobMatcher}; @@ enum Selector { Path(String), - Pattern(String), + Pattern(GlobMatcher), } @@ for raw in doc.glob { if let Some(p) = raw.pattern { - entries.push(Selector::Pattern(p)); + let matcher = Glob::new(&p) + .with_context(|| format!("invalid glob pattern in {}: {}", path.display(), p))? + .compile_matcher(); + entries.push(Selector::Pattern(matcher)); } } @@ fn any_match(entries: &[Selector], path: &str) -> bool { entries.iter().any(|sel| match sel { Selector::Path(p) => p == path, - Selector::Pattern(pat) => Glob::new(pat) - .map(|g| g.compile_matcher().is_match(path)) - .unwrap_or(false), + Selector::Pattern(matcher) => matcher.is_match(path), }) }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@xtask/src/propose.rs` around lines 98 - 101, Change the allowlist loading to validate and compile glob patterns once in load_allowlist instead of storing raw strings: in the loop that currently pushes Selector::Pattern(p) (the code around load_allowlist/for raw in doc.glob), call Glob::new(&p).with_context(|| format!("invalid glob pattern in allowlist: {}", p))? and push a Selector::Pattern with the resulting GlobMatcher (adjust the Selector::Pattern variant to hold a GlobMatcher instead of String). Update any users of Selector::Pattern (e.g., any_match()) to use the precompiled matcher rather than re-compiling, so invalid patterns surface as errors at load time rather than being treated as no-match.

coderabbitai · 2026-05-11T23:13:24Z

+    for path in unreceipted {
+        out.push_str("[[file]]\n");
+        out.push_str(&format!("path = \"{}\"\n", toml_escape(path)));
+        out.push_str("kind = \"TODO\"\n");
+        out.push_str("surface = \"TODO\"\n");
+        out.push_str("classification = \"TODO\"\n");
+        out.push_str("owner = \"TODO\"\n");


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Populate kind using extension heuristics in generated entries.

Each proposed [[file]] entry currently hardcodes kind = "TODO", which misses the proposal objective to prefill kind from file-extension heuristics and leaves most of the triage burden manual.

Suggested direction

fn render_toml(unreceipted: &[String], today: &str, review_after: &str) -> String { @@ for path in unreceipted { + let kind = infer_kind(path); out.push_str("[[file]]\n"); out.push_str(&format!("path = \"{}\"\n", toml_escape(path))); - out.push_str("kind = \"TODO\"\n"); + out.push_str(&format!("kind = \"{}\"\n", kind)); out.push_str("surface = \"TODO\"\n"); @@ } + +fn infer_kind(path: &str) -> &'static str { + let ext = Path::new(path).extension().and_then(|s| s.to_str()); + match ext { + Some("md" | "rst" | "txt") => "documentation", + Some("toml" | "yaml" | "yml" | "json") => "configuration", + Some("sh" | "ps1" | "bat") => "script", + _ => "TODO", + } +}

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

for path in unreceipted {

out.push_str("[[file]]\n");

out.push_str(&format!("path = \"{}\"\n", toml_escape(path)));

out.push_str("kind = \"TODO\"\n");

out.push_str("surface = \"TODO\"\n");

out.push_str("classification = \"TODO\"\n");

out.push_str("owner = \"TODO\"\n");

for path in unreceipted {

let kind = infer_kind(path);

out.push_str("[[file]]\n");

out.push_str(&format!("path = \"{}\"\n", toml_escape(path)));

out.push_str(&format!("kind = \"{}\"\n", kind));

out.push_str("surface = \"TODO\"\n");

out.push_str("classification = \"TODO\"\n");

out.push_str("owner = \"TODO\"\n");

}

fn infer_kind(path: &str) -> &'static str {

let ext = Path::new(path).extension().and_then(|s| s.to_str());

match ext {

Some("md" | "rst" | "txt") => "documentation",

Some("toml" | "yaml" | "yml" | "json") => "configuration",

Some("sh" | "ps1" | "bat") => "script",

_ => "TODO",

}

}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@xtask/src/propose.rs` around lines 207 - 213, The loop that generates [[file]] entries currently writes kind = "TODO" for every path; change it to compute a kind by inspecting the file extension (e.g., call or implement an extension heuristic helper like infer_kind_from_extension(path) or match on Path::extension()) and write that value instead of the hardcoded "TODO"; update the block around the for path in unreceipted loop (the out.push_str calls that write kind = ...) to use the computed kind string (escaped/quoted like the other fields) so generated entries are prefilled by extension-based heuristics.

factory-droid · 2026-05-11T23:18:32Z

+    out.push_str(&format!(
+        "Generated by `cargo xtask non-rust propose` on {}.\n\n",
+        today
+    ));
+    out.push_str("## How to use\n\n");
+    out.push_str(&format!(


[P1] Malformed Markdown link — {PROPOSED_TOML} is a literal string, not a variable interpolation

The format argument passed to push_str uses {PROPOSED_TOML} inside the string literal, but push_str does not perform format! interpolation. This means the generated non-rust-proposal.md will contain the literal text {PROPOSED_TOML} instead of non-rust-proposed-allowlist.toml, producing a broken Markdown reference like [non-rust-proposed-allowlist.toml](./{PROPOSED_TOML}). This is a one-line fix: either remove the braces entirely (since the string is a literal, not a format string) or change push_str to push_str(&format!(...)) with proper {} interpolation.

Suggested change

out.push_str(&format!(

"Generated by `cargo xtask non-rust propose` on {}.\n\n",

today

));

out.push_str("## How to use\n\n");

out.push_str(&format!(

out.push_str("1. Review each draft entry in [`non-rust-proposed-allowlist.toml`](./non-rust-proposed-allowlist.toml).\n");

coderabbitai Bot reviewed May 11, 2026

View reviewed changes

factory-droid Bot reviewed May 11, 2026

View reviewed changes

EffortlessSteven merged commit 72ba257 into main May 11, 2026
25 checks passed

EffortlessSteven deleted the feat/xtask-non-rust-propose-20260511 branch May 11, 2026 23:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(policy): propose non-Rust allowlist entries (#205, rollout PR 6/12)#218

feat(policy): propose non-Rust allowlist entries (#205, rollout PR 6/12)#218
EffortlessSteven merged 1 commit into
mainfrom
feat/xtask-non-rust-propose-20260511

EffortlessSteven commented May 11, 2026

Uh oh!

gemini-code-assist Bot commented May 11, 2026

Uh oh!

chatgpt-codex-connector Bot commented May 11, 2026

Uh oh!

coderabbitai Bot commented May 11, 2026 •

edited

Loading

❌ Failed checks (1 warning)

Uh oh!

factory-droid Bot commented May 11, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 11, 2026

Uh oh!

coderabbitai Bot May 11, 2026

Uh oh!

factory-droid Bot May 11, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

EffortlessSteven commented May 11, 2026

Summary

Issue

Behavior

Decisions

Live result on this branch

Acceptance

Follow-ups

Uh oh!

gemini-code-assist Bot commented May 11, 2026

Uh oh!

chatgpt-codex-connector Bot commented May 11, 2026

Uh oh!

coderabbitai Bot commented May 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

factory-droid Bot commented May 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 11, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 11, 2026

Choose a reason for hiding this comment

Uh oh!

factory-droid Bot May 11, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

coderabbitai Bot commented May 11, 2026 •

edited

Loading

factory-droid Bot commented May 11, 2026 •

edited

Loading