feat(policy): check workflow, process, and network surfaces (#207, rollout PR 8/12)#220
Conversation
Eighth PR in the 12-PR file-policy rollout. Three new subcommands that
together cover the high-risk half of file policy: which workflows are
authorized, what commands they may invoke, and what endpoints they may
contact.
## Commands
- `cargo xtask check-workflow-surfaces`
Universe: tracked `.github/workflows/*.yml`. Each must have a
`[[workflow]]` entry in `policy/workflow-allowlist.toml`. Each entry
must name a `process_policy` and `network_policy` that exist in
their respective ledgers. Entries with `kind = "dependabot_config"`
are catalog-only and exempt from the workflow-file reconciliation;
their receipts are still validated (missing fields, expired, stale).
- `cargo xtask check-process-policy`
For each receipted workflow (excluding `kind = "dependabot_config"`),
scan its file content for known shell-command tokens; flag tokens
not present in the workflow's declared `process` profile's
`allowed_processes`.
- `cargo xtask check-network-policy`
For each receipted workflow, regex-scan for `https?://<host>` URLs
and reconcile hostnames against the declared `network` profile's
`allowed_endpoints`. Subdomain coverage is recognized (e.g.,
`raw.githubusercontent.com` covers `objects.githubusercontent.com`).
All three accept `--mode advisory|blocking-allowlist|blocking-strict`.
## Decisions
- **Grep heuristics, not YAML/AST parsing.** The PR spec explicitly
said "start simple". The recognition vocabulary for `check-process-
policy` is a fixed list of well-known shell commands; tokens outside
that list are silently ignored. This produces false positives where
a command name appears as a build target or argument (e.g., the word
"shipper" in `cargo build -p shipper`). Advisory mode reports these;
CI does not block on them. Refining the detector (e.g., requiring
the token to appear inside a `run:` block) is a follow-up.
- **`dependabot_config` skip** in workflow-surfaces / process / network
checks. The `.github/dependabot.yml` entry rides along in the
workflow allowlist for catalog purposes (see #215 PR body), but it
is not a workflow file and has no shell commands or URLs to scan.
- **Subdomain endpoint matching.** `allowed_endpoints = ["github.com"]`
covers `api.github.com`, `objects.githubusercontent.com`, etc.,
rather than requiring every host be enumerated. Tighter modes
remain available via the explicit-only `allowed_endpoints`.
- **New direct dep: `regex = "1"`.** Used only by `check-network-policy`
for the host extraction pattern. Could be replaced by hand-rolled
parsing if dep weight becomes an issue.
## Acceptance
- `cargo check --workspace --locked` passes.
- `cargo clippy -p xtask --all-targets --locked -- -D warnings` clean.
- `cargo fmt --all -- --check` clean.
- Live results on current main:
check-workflow-surfaces: workflows=9 entries=10 unreceipted=0 missing=0 expired=0 stale=0 unused=0 invalid_refs=0
check-process-policy: workflows=9 unknown_total=10 (all false positives — see PR body)
check-network-policy: workflows=9 unknown_total=0
## Out of scope
- Unified policy-report (PR 9 / #208)
- CI advisory wiring (PR 10 / #209)
- Promotion to blocking-allowlist (PR 11 / #210, PR 12 / #211)
- Refining the process-detector to suppress build-target false positives.
- DRY-up of the shared `workspace_root`, `today_iso`, `date_is_past`,
`tracked_workflow_files` helpers across xtask modules.
Closes #207.
|
Warning You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again! |
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
Summary by CodeRabbit
WalkthroughThis PR introduces workflow policy checking to the xtask CLI with three validations: reconciling tracked workflows against an allowlist, scanning for undeclared process commands, and scanning for undeclared network endpoints. Each check loads YAML workflows and TOML policies, generates reports, and conditionally fails in blocking modes. ChangesWorkflow Policy Checking Feature
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Droid finished @EffortlessSteven's task —— View job Droid review complete — PR 220 validated. No actionable findings emitted. Inspected surfaces:
Checks performed:
Why no comments: Residual risk: Validation signal: Review summary: approved — no issues found. |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@xtask/src/workflow_checks.rs`:
- Around line 546-572: Detected hostnames are compared case-sensitively
currently; normalize to lowercase at the boundary so matching works (e.g., when
host_re captures in the loop and when building the allowed set). Change the
insertion in the capture loop (where detected.insert(host.as_str().to_string())
is called) to insert host.as_str().to_lowercase(), and when constructing the
allowed BTreeSet from profiles_by_name (allowed_endpoints), map each endpoint to
.to_lowercase() so endpoint_covered and subsequent comparisons operate on
lowercased strings; the same normalization should be applied in the other
capture loop referenced around lines 617-625.
- Around line 156-160: The Vec dependabot_entries is allocated from all_entries
via filter+cloned but never used; either remove its creation entirely or
actually consume it—e.g., compute dependabot_entries.len() and store that count
in WorkflowSummary (update the WorkflowSummary struct and where it’s
constructed) or emit the count somewhere relevant; to fix, delete the
dependabot_entries binding and any let _ = dependabot_entries; lines if you
choose removal, or replace the unused Vec with a usize count (use
is_dependabot_config on all_entries to compute the count) and propagate that
value into WorkflowSummary construction and types where necessary.
- Around line 5-9: The module doc promises `.github/dependabot.yml` is included
but the collection predicate in tracked_workflow_files only matches paths
starting with ".github/workflows/" and ending with ".yml", so dependabot is
never added to the workflows set; update the predicate used where
tracked_workflow_files is built (the p.starts_with(...) && p.ends_with(...)
check) to also include the literal ".github/dependabot.yml" (e.g. OR p ==
".github/dependabot.yml" or equivalent), and apply the same change to the other
occurrence of the same filter so the unreceipted reconciliation will flag
missing dependabot receipts; alternatively, if you prefer behavior change
instead of code, edit the module doc string to remove the claim about dependabot
instead of changing tracked_workflow_files.
- Around line 617-625: The function endpoint_covered currently allows an allowed
entry to be a subdomain of the detected host (via the condition
a.ends_with(&format!(".{}", host))), which reverses the intended policy; change
endpoint_covered to only return true when allowed contains host or when host is
a subdomain of an allowed entry (i.e., keep allowed.contains(host) and the
host.ends_with(&format!(".{}", a)) check) and remove the a.ends_with(...) clause
so allowed child subdomains cannot incorrectly cover parent domains.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 2dd5e4eb-857e-45a4-a525-24576bbc12fb
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (3)
xtask/Cargo.tomlxtask/src/main.rsxtask/src/workflow_checks.rs
| //! - `cargo xtask check-workflow-surfaces` — every `.github/workflows/*.yml` | ||
| //! (and `.github/dependabot.yml`) must be receipted in | ||
| //! `policy/workflow-allowlist.toml`. Each entry must name a | ||
| //! `process_policy` and `network_policy` that exist in their respective | ||
| //! ledgers. |
There was a problem hiding this comment.
Module doc claims .github/dependabot.yml is part of receipted workflows, but it is never collected.
The header doc says every .github/workflows/*.yml and .github/dependabot.yml must be receipted, but tracked_workflow_files filters to p.starts_with(".github/workflows/") && p.ends_with(".yml"), so .github/dependabot.yml is never in the workflows set and the unreceipted reconciliation cannot flag a missing dependabot receipt. Either tighten the filter to also include .github/dependabot.yml, or amend the doc so it describes the actual behavior (catalog-only validation via the dependabot_config entry).
Also applies to: 629-654
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@xtask/src/workflow_checks.rs` around lines 5 - 9, The module doc promises
`.github/dependabot.yml` is included but the collection predicate in
tracked_workflow_files only matches paths starting with ".github/workflows/" and
ending with ".yml", so dependabot is never added to the workflows set; update
the predicate used where tracked_workflow_files is built (the p.starts_with(...)
&& p.ends_with(...) check) to also include the literal ".github/dependabot.yml"
(e.g. OR p == ".github/dependabot.yml" or equivalent), and apply the same change
to the other occurrence of the same filter so the unreceipted reconciliation
will flag missing dependabot receipts; alternatively, if you prefer behavior
change instead of code, edit the module doc string to remove the claim about
dependabot instead of changing tracked_workflow_files.
| let dependabot_entries: Vec<RawWorkflowEntry> = all_entries | ||
| .iter() | ||
| .filter(|e| is_dependabot_config(e)) | ||
| .cloned() | ||
| .collect(); |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial | 💤 Low value
Drop dependabot_entries or actually consume it; today it is allocated only to be discarded.
dependabot_entries is built by a full filter+clone over all_entries and is then explicitly silenced via let _ = dependabot_entries;. The comment says it is preserved "for future per-kind audits", but until that audit exists this is dead allocation and obscures intent — every reader has to figure out why a populated Vec is being thrown away. Either remove it now and reintroduce when the audit lands, or use it (e.g., emit a dependabot_entries count in WorkflowSummary).
Also applies to: 271-271
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@xtask/src/workflow_checks.rs` around lines 156 - 160, The Vec
dependabot_entries is allocated from all_entries via filter+cloned but never
used; either remove its creation entirely or actually consume it—e.g., compute
dependabot_entries.len() and store that count in WorkflowSummary (update the
WorkflowSummary struct and where it’s constructed) or emit the count somewhere
relevant; to fix, delete the dependabot_entries binding and any let _ =
dependabot_entries; lines if you choose removal, or replace the unused Vec with
a usize count (use is_dependabot_config on all_entries to compute the count) and
propagate that value into WorkflowSummary construction and types where
necessary.
| let host_re = | ||
| Regex::new(r"https?://([A-Za-z0-9.\-]+)").context("compiling network hostname regex")?; | ||
|
|
||
| let mut per_workflow = Vec::new(); | ||
| let mut unknown_total = 0usize; | ||
| for e in &entries { | ||
| if is_dependabot_config(e) { | ||
| // dependabot.yml is configuration, not a script — no URLs to scan. | ||
| continue; | ||
| } | ||
| let path = match &e.path { | ||
| Some(p) => p, | ||
| None => continue, | ||
| }; | ||
| let profile = e.network_policy.clone().unwrap_or_default(); | ||
| let allowed: BTreeSet<String> = profiles_by_name | ||
| .get(&profile) | ||
| .map(|p| p.allowed_endpoints.iter().cloned().collect()) | ||
| .unwrap_or_default(); | ||
|
|
||
| let content = read_workflow_content(&workspace_root, path).unwrap_or_default(); | ||
| let mut detected: BTreeSet<String> = BTreeSet::new(); | ||
| for caps in host_re.captures_iter(&content) { | ||
| if let Some(host) = caps.get(1) { | ||
| detected.insert(host.as_str().to_string()); | ||
| } | ||
| } |
There was a problem hiding this comment.
Hostname comparison is case-sensitive; normalize to lowercase.
DNS hostnames are case-insensitive, but detected.insert(host.as_str().to_string()) and the comparisons inside endpoint_covered are byte-exact. A workflow that contains https://GitHub.com/... will be flagged as an unknown endpoint even if github.com is in the profile. Normalize both sides at the boundary so the check matches operator expectations.
🛡️ Proposed fix
let allowed: BTreeSet<String> = profiles_by_name
.get(&profile)
- .map(|p| p.allowed_endpoints.iter().cloned().collect())
+ .map(|p| {
+ p.allowed_endpoints
+ .iter()
+ .map(|s| s.to_ascii_lowercase())
+ .collect()
+ })
.unwrap_or_default();
let content = read_workflow_content(&workspace_root, path).unwrap_or_default();
let mut detected: BTreeSet<String> = BTreeSet::new();
for caps in host_re.captures_iter(&content) {
if let Some(host) = caps.get(1) {
- detected.insert(host.as_str().to_string());
+ detected.insert(host.as_str().to_ascii_lowercase());
}
}Also applies to: 617-625
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@xtask/src/workflow_checks.rs` around lines 546 - 572, Detected hostnames are
compared case-sensitively currently; normalize to lowercase at the boundary so
matching works (e.g., when host_re captures in the loop and when building the
allowed set). Change the insertion in the capture loop (where
detected.insert(host.as_str().to_string()) is called) to insert
host.as_str().to_lowercase(), and when constructing the allowed BTreeSet from
profiles_by_name (allowed_endpoints), map each endpoint to .to_lowercase() so
endpoint_covered and subsequent comparisons operate on lowercased strings; the
same normalization should be applied in the other capture loop referenced around
lines 617-625.
| fn endpoint_covered(host: &str, allowed: &BTreeSet<String>) -> bool { | ||
| // Exact match, or `host` is a subdomain of an allowed entry. | ||
| if allowed.contains(host) { | ||
| return true; | ||
| } | ||
| allowed.iter().any(|a| { | ||
| host == a || host.ends_with(&format!(".{}", a)) || a.ends_with(&format!(".{}", host)) | ||
| }) | ||
| } |
There was a problem hiding this comment.
endpoint_covered lets allowed child subdomains cover detected parent domains, weakening the policy.
The third condition — a.ends_with(&format!(".{}", host)) — treats an allowed entry as covering a detected host whenever the allowed entry is a subdomain of the detected host. So if the network profile lists api.github.com, a workflow that talks to bare github.com is silently accepted, even though authorization for api.github.com clearly does not authorize the parent. This contradicts the documented intent ("github.com covers api.github.com") and is asymmetric in the wrong direction for an allowlist.
🔒 Proposed fix
fn endpoint_covered(host: &str, allowed: &BTreeSet<String>) -> bool {
- // Exact match, or `host` is a subdomain of an allowed entry.
+ // Exact match, or `host` is a subdomain of an allowed entry.
+ // Note: an allowed *child* must NOT cover a detected *parent*; coverage
+ // only flows from broader (allowed) to narrower (detected).
if allowed.contains(host) {
return true;
}
allowed
.iter()
- .any(|a| {
- host == a || host.ends_with(&format!(".{}", a)) || a.ends_with(&format!(".{}", host))
- })
+ .any(|a| host == a || host.ends_with(&format!(".{}", a)))
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@xtask/src/workflow_checks.rs` around lines 617 - 625, The function
endpoint_covered currently allows an allowed entry to be a subdomain of the
detected host (via the condition a.ends_with(&format!(".{}", host))), which
reverses the intended policy; change endpoint_covered to only return true when
allowed contains host or when host is a subdomain of an allowed entry (i.e.,
keep allowed.contains(host) and the host.ends_with(&format!(".{}", a)) check)
and remove the a.ends_with(...) clause so allowed child subdomains cannot
incorrectly cover parent domains.
Ninth PR in the 12-PR file-policy rollout. Adds `cargo xtask
policy-report` which runs every advisory check and aggregates the
results into target/policy/policy-report.{md,json}.
## Behavior
policy-report invokes each of the seven check functions in advisory
mode:
check_file_policy::check
checks::check_generated
checks::check_executable_files
checks::check_dependency_surfaces
workflow_checks::check_workflow_surfaces
workflow_checks::check_process_policy
workflow_checks::check_network_policy
Each sub-check writes its own target/policy/*-report.json (as in
previous PRs). policy-report then reads back the seven JSON artifacts,
lifts each one's `summary` block, picks a headline metric per area
(unreceipted | invalid_policy_refs | unknown_total | missing_fields |
expired | stale | unused — first non-zero wins; falls back to a
universe-size-keyed "clean" row), and emits a unified MD + JSON.
## Decisions
- **Advisory only.** No --mode flag. Promotion to blocking is the job
of PRs 10/11/12; they tighten CI's invocation of the individual
checks. Adding modes here would duplicate state.
- **Re-run vs read-only.** policy-report always re-runs the sub-checks
so the aggregated report reflects current state, not stale artifacts
from a prior session. Cost is small — each sub-check is a few
hundred ms.
- **Untyped JSON re-read for aggregation.** Each sub-report has a
different schema. Lifting the `summary` block as `serde_json::Value`
is simpler and more robust than maintaining seven typed deserializers
in xtask. The headline metric is computed from the same Value.
## Acceptance
- cargo check --workspace --locked passes.
- cargo clippy -p xtask --all-targets --locked -- -D warnings clean.
- cargo fmt --all -- --check clean.
- `cargo xtask policy-report` ran on this branch:
wrote unified policy-report (7 areas, 7 headline rows)
- target/policy/policy-report.md and target/policy/policy-report.json
both produced.
Headline on current main:
| Non-Rust file policy | unreceipted | 945 |
| Generated files | _clean_ | 0 (universe) |
| Executable files | _clean_ | 0 (universe) |
| Dependency surfaces | _clean_ | 16 (universe) |
| Workflow surfaces | _clean_ | 9 (universe) |
| Process policy | unknown_total | 10 |
| Network policy | _clean_ | 9 (universe) |
The two non-clean rows match the documented expected state: 945
unreceipted files because PR 2's seeded receipts intentionally cover
only a slice of the workspace, and 10 false-positive process tokens
because the PR 8 detector is grep-style (see #220 PR body OPEN
question).
## Out of scope
- PR 10 (#209) — wire the advisory checks into CI as a job and upload
target/policy/ as an artifact.
- PR 11 (#210) — promote file/generated/executable/dependency/workflow
checks to blocking.
- PR 12 (#211) — promote process/network checks to blocking.
Closes #208.
Summary
Eighth PR in the 12-PR file-policy rollout. High-risk half. Three new xtask subcommands cover which workflows are authorized, what commands they may invoke, and what endpoints they may contact.
Issue
Closes #207. Depends on #203 (workflow/process/network ledgers, merged), #212 (xtask scaffold), #206 (gen/exec/dep checks for the model). Refines #180. Tracks #109.
Commands
cargo xtask check-workflow-surfaces.github/workflows/*.ymlpolicy/workflow-allowlist.tomlcargo xtask check-process-policypolicy/process-allowlist.tomlprofilescargo xtask check-network-policypolicy/network-allowlist.tomlprofilesAll three accept
--mode advisory|blocking-allowlist|blocking-strict. Reports land attarget/policy/<name>-report.{md,json}.Decisions
check-process-policyis a fixed list of well-known shell commands; tokens outside that list are silently ignored. Refining is a follow-up.kind = "dependabot_config"skip in all three workflow-content checks. The.github/dependabot.ymlentry rides along in the workflow allowlist for catalog purposes (see chore(policy): receipt high-risk non-Rust surfaces (#203, rollout PR 3/12) #215 PR body) but it is not a workflow file and has no shell commands or URLs to scan. Its receipt is still validated (missing fields, expired, stale).allowed_endpoints = ["github.com"]coversapi.github.com,objects.githubusercontent.com, etc. Tighter posture remains available via explicit-only lists.regex = "1"— used only bycheck-network-policyfor host extraction.On the 10 "unknown commands"
All ten are false positives at the recognition layer. Examples:
ci.ymlciinstall,shipper,sudosudoappears in a comment or echoed string.coverage.ymlcimkdir,python3,shippercargo-invoked subcommands or output, not as direct commands.release.ymlreleasemkdirreleaseprofile lists the right commands;mkdirappears as a build-step name.This is exactly the OPEN question the user flagged for PR 8:
Advisory mode is the right home for these. A follow-up can refine the detector (require token to appear in a
run:step's content, not inname:/description:/ comment lines).Acceptance
cargo check --workspace --lockedpasses.cargo clippy -p xtask --all-targets --locked -- -D warningsclean.cargo fmt --all -- --checkclean.target/policy/<name>-report.{md,json}.Follow-ups
policy-reportaggregates all 7 checker reports.target/policy/as an artifact.run:block content).workspace_root,today_iso,date_is_past,tracked_workflow_fileshelpers across xtask modules.