Integration/v1 testing by bglusman · Pull Request #1 · bglusman/calciforge

bglusman · 2026-04-09T11:13:21Z

Description

Brief description of the changes in this PR.

Type of Change

Testing

cargo test passes for all affected crates
cargo clippy passes with no warnings
cargo fmt is clean
New tests added for new functionality

Checklist

Code follows the project style guidelines
Self-review completed
Comments added for complex logic
Documentation updated (if applicable)
No new warnings introduced

Related Issues

Fixes # (issue number)

Breaking Changes

List any breaking changes and migration steps:

Additional Notes

Any additional context or screenshots.

Remove unused imports, dead code, and fix lint issues: - Remove genuinely unused functions, methods, types, and modules - Gate test-only code behind #[cfg(test)] - Prefix unused struct fields with _ (with serde rename where needed) - Fix derivable_impls, needless_borrows, collapsible_if, manual_strip - Fix deprecated DateTime::from_utc and telegram msg.from() calls - Fix unused_doc_comments on proptest macros All 329+ tests pass. Zero clippy warnings with -D warnings. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Adds an integration-testing/policy-enforcement stack around OpenClaw tool calls by introducing a new clashd policy sidecar, an OpenClaw policy plugin, supporting scripts, and related e2e test cleanups.

Changes:

Introduce crates/clashd (Axum service) with /evaluate and /health endpoints plus Docker packaging.
Add crates/zeroclawed-policy-plugin TypeScript OpenClaw plugin + local compose setup and requirement-check script.
Add helper scripts for local mock LLM testing, syncing OpenClaw policy config, and deploying builds to a fixed host; minor formatting updates across existing Rust e2e tests and onecli-client.

Reviewed changes

Copilot reviewed 30 out of 32 changed files in this pull request and generated 10 comments.

Show a summary per file

File	Description
scripts/sync-openclaw-policy.sh	Installs the OpenClaw policy plugin and additively patches OpenClaw config + policy file sync
scripts/mock-llm.py	Flask-based mock LLM server for integration tests
scripts/docker-compose.yml	Switch mock-llm compose service to mounting `mock-llm.py` instead of inline heredoc
scripts/deploy-210.sh	Deployment helper script targeting 192.168.1.210 with systemd setup and policy deploy
PR_STACK.md	Documentation describing intended PR stacking/merge order and feature contents
crates/zeroclawed/tests/e2e/property_tests.rs	Formatting-only changes to property tests
crates/zeroclawed/tests/e2e/onecli_proxy.rs	Formatting + minor readability tweaks in proxy e2e tests
crates/zeroclawed/tests/e2e/mod.rs	Reorder module declarations
crates/zeroclawed/tests/e2e/config_sanity.rs	Formatting/readability tweaks in config sanity tests
crates/zeroclawed/Cargo.toml	Add `sled` dependency
crates/zeroclawed-policy-plugin/tsconfig.json	TypeScript compiler configuration for the policy plugin
crates/zeroclawed-policy-plugin/src/index.ts	OpenClaw plugin entry: calls `clashd` in `before_tool_call` and blocks/requires approval
crates/zeroclawed-policy-plugin/README.md	Plugin documentation (requirements, install, usage)
crates/zeroclawed-policy-plugin/package.json	Plugin packaging metadata + OpenClaw extension/hook declarations
crates/zeroclawed-policy-plugin/openclaw.plugin.json	Plugin manifest and config schema
crates/zeroclawed-policy-plugin/docker-compose.yml	Compose stack for testing clashd + a mock OpenClaw service
crates/zeroclawed-policy-plugin/check-requirements.sh	Checks OpenClaw version and clashd availability
crates/zeroclawed-policy-plugin/before_tool_call/tsconfig.json	Separate TS config for standalone hook handler variant
crates/zeroclawed-policy-plugin/before_tool_call/index.ts	Standalone before_tool_call hook handler implementation
crates/zeroclawed-policy-plugin/before_tool_call/index.js	Committed compiled JS for the standalone hook handler
crates/zeroclawed-policy-plugin/before_tool_call/HOOK.md	Documentation for the standalone hook handler
crates/onecli-client/src/vault.rs	Minor formatting changes
crates/onecli-client/src/policy.rs	Minor formatting changes
crates/onecli-client/src/main.rs	Minor formatting/refactor for readability (imports/log formatting)
crates/onecli-client/src/config.rs	Minor formatting changes
crates/clashd/src/main.rs	New clashd server implementation with hardcoded policy rules
crates/clashd/README.md	Documentation for clashd API and behavior
crates/clashd/Dockerfile	Multi-stage build for clashd container image
crates/clashd/Cargo.toml	New crate manifest for clashd
config/policy.star	New Starlark policy file (intended for future/advanced policy engine)
Cargo.toml	Add `crates/clashd` to the workspace members
Cargo.lock	Lockfile updates for new crate/dependencies

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-09T11:17:39Z

+# Backup config
+cp "$OPENCLAW_CONFIG" "$OPENCLAW_CONFIG.bak.$(date +%Y%m%d-%H%M%S)"
+


The script backs up $OPENCLAW_CONFIG unconditionally, but if the config file doesn’t exist yet cp will fail and (with set -e) the script aborts before the Python block can emit the intended "config not found" error. Add an existence check (or move backup into the Python block after successfully reading the config) to keep the script idempotent.

Copilot · 2026-04-09T11:17:39Z

+    data = request.get_json()
+    last_request = data
+
+    # Check if tools were sent (this is what we want to verify)
+    has_tools = 'tools' in data
+


request.get_json() can return None (e.g., invalid/missing JSON or wrong Content-Type). The subsequent 'tools' in data will then raise a TypeError and the mock server will 500. Handle None by defaulting to {} or returning a 400 with a clear message.

Copilot · 2026-04-09T11:17:39Z

+# Build locally first
+echo "📦 Building locally..."
+cargo build --release --workspace 2>&1 | tail -5


cargo build ... | tail -5 masks build failures unless pipefail is enabled (the pipeline exit code is from tail). This can lead to deploying stale or missing binaries. Either remove the pipe, or add set -o pipefail (or set -euo pipefail) and avoid truncating error output.

Copilot · 2026-04-09T11:17:40Z

+# deploy-210.sh - Deploy ZeroClawed integration build to 192.168.1.210
+# Usage: ./scripts/deploy-210.sh [--branch BRANCH] [--build]
+
+set -e
+
+HOST="192.168.1.210"
+USER="root"
+SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519}"
+BRANCH="${1:-integration/v1-testing}"


The usage header advertises --branch / --build, but the script currently treats $1 as the branch name and doesn’t parse flags at all. This is misleading and will break callers who pass --branch. Either implement basic flag parsing or update the usage comment + variable initialization to match actual behavior.

Copilot · 2026-04-09T11:17:40Z

+    agent = context.get("agent", "unknown")
+
+    # Get target from args (for tools like gateway config edits)
+    target = args.get("path", "") or args.get("config", {{}}).get("agentId", "")


This Starlark policy has args.get("config", {{}}) which is not valid syntax (double-braced dict literal). It will prevent the policy from loading/executing. Replace {{}} with {} (and consider a minimal smoke test/CI check that the policy file parses).

Suggested change

target = args.get("path", "") or args.get("config", {{}}).get("agentId", "")

target = args.get("path", "") or args.get("config", {}).get("agentId", "")

Copilot · 2026-04-09T11:17:40Z

+  register(api) {
+    const config: PolicyConfig = {
+      ...DEFAULT_CONFIG,
+      // Could load from plugin config store in future
+    };
+


This plugin defines a config schema (clashdEndpoint/timeoutMs/fallbackOnError) but the runtime code only reads environment variables and ignores any per-plugin configuration provided by OpenClaw. Either wire the OpenClaw plugin config into config here, or remove/update the schema/docs to avoid a non-functional configuration surface.

Copilot · 2026-04-09T11:17:41Z

+"use strict";
+// before_tool_call hook handler
+// This is invoked by OpenClaw before each tool execution
+var __awaiter = (this && this.__awaiter) || function (thisArg, _arguments, P, generator) {
+    function adopt(value) { return value instanceof P ? value : new P(function (resolve) { resolve(value); }); }


This before_tool_call/index.js appears to be a compiled artifact (TS helper boilerplate) and the package manifest doesn’t reference it (the plugin loads ./dist/index.js). Keeping generated JS in-repo alongside a second TS implementation is confusing and risks divergence. Consider removing this folder, or clearly wiring it into the build/package and adding the generated output to .gitignore instead.

Copilot · 2026-04-09T11:17:41Z

 harness = true

 [dependencies]
+sled = "0.34"


sled is added as a dependency but there are no uses of it in the repository. If it’s not needed for this PR, please remove it to avoid pulling in an extra storage engine (compile time, audit surface). If it is needed, include the corresponding code changes so the dependency is justified.

Suggested change

sled = "0.34"

Copilot · 2026-04-09T11:17:42Z

+    // Check for destructive operations
+    let destructive_patterns = ["rm -rf", "rm -fr", "mkfs", "wipefs", "dd if=/dev/"];
+    for pattern in &destructive_patterns {
+        if args_str.contains(pattern) {
+            warn!(pattern, "Destructive command detected");
+            return Ok(Json(EvaluateResponse {
+                verdict: "deny".to_string(),
+                reason: Some(format!(
+                    "Destructive command '{}' blocked by policy",
+                    pattern
+                )),
+            }));


The destructive-command detection denies any request whose serialized args contain patterns like rm -rf, regardless of tool. The README describes these as "destructive shell commands"; as written, non-shell tools could be denied if their args happen to include these substrings. If the intent is shell-only blocking, gate this block on the tool (e.g., tool == "shell").

Suggested change

// Check for destructive operations

let destructive_patterns = ["rm -rf", "rm -fr", "mkfs", "wipefs", "dd if=/dev/"];

for pattern in &destructive_patterns {

if args_str.contains(pattern) {

warn!(pattern, "Destructive command detected");

return Ok(Json(EvaluateResponse {

verdict: "deny".to_string(),

reason: Some(format!(

"Destructive command '{}' blocked by policy",

pattern

)),

}));

// Check for destructive shell operations

if tool_lower == "shell" {

let destructive_patterns = ["rm -rf", "rm -fr", "mkfs", "wipefs", "dd if=/dev/"];

for pattern in &destructive_patterns {

if args_str.contains(pattern) {

warn!(tool = %req.tool, pattern, "Destructive command detected");

return Ok(Json(EvaluateResponse {

verdict: "deny".to_string(),

reason: Some(format!(

"Destructive command '{}' blocked by policy",

pattern

)),

}));

}

Copilot · 2026-04-09T11:17:42Z

+## Version Check During Installation
+
+The plugin checks OpenClaw version on load and logs a warning if requirements aren't met:
+
+```
+[zeroclawed-policy] WARNING: OpenClaw version 2026.3.20 detected.
+[zeroclawed-policy] This plugin requires >=2026.3.24-beta.2 for requireApproval support.
+[zeroclawed-policy] Policy enforcement will not work correctly.
+```


This section claims the plugin checks the OpenClaw version on load and logs a warning when requirements aren’t met, but the current implementation doesn’t perform any version detection/checking. Either implement the runtime version check or update/remove this documentation to match actual behavior.

Suggested change

## Version Check During Installation

The plugin checks OpenClaw version on load and logs a warning if requirements aren't met:

```

[zeroclawed-policy] WARNING: OpenClaw version 2026.3.20 detected.

[zeroclawed-policy] This plugin requires >=2026.3.24-beta.2 for requireApproval support.

[zeroclawed-policy] Policy enforcement will not work correctly.

```

## Version Compatibility

This plugin requires OpenClaw `>=2026.3.24-beta.2` because the approval flow depends on `requireApproval` support in hook results.

The current plugin implementation does not automatically detect or warn about the OpenClaw version at load time, so compatibility must be verified during installation or troubleshooting.

Security-proxy now shares onecli-client's vault resolver rather than its own parallel startup-time env scan. Per-request, on cache miss, it consults env → fnox → vaultwarden and caches the result. This fixes the silently-broken rotation behaviour documented in `docs/rfcs/consolidation-findings.md` finding #5: previously, a rotated key added to env or fnox after security-proxy started was invisible until restart. Changes: - Add `onecli-client = { path = "../onecli-client" }` as a path dep. - New `CredentialInjector::ensure_cached(provider) -> bool` that returns fast when the DashMap already has the provider, else calls `onecli_client::vault::get_secret(provider)` and inserts on success. - Public wrapper `detect_provider_pub(host)` so the proxy hot path can determine the provider name without duplicating the pattern table. - Proxy.rs resolves + caches before calling inject() on each request. - Two behavior tests covering cache-hit (no resolver call) and nothing-resolves (cache stays clean, no `Bearer ""` footgun). Written in given/when/then doc form as per today's test-quality review discussion. Left for follow-up commits on this branch: - Merge onecli-client's `/vault/:secret` + `/proxy/:provider` routes into security-proxy (move the handlers, then delete main.rs). - Reconcile env-var conventions (ZEROGATE_KEY_* vs <NAME>_API_KEY — finding #1). - Per-provider auth header schemes (finding #3 — onecli only does Bearer; security-proxy has anthropic's x-api-key path). - Remove hardcoded `vault.enjyn.com` fallback (finding #6). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…icy docs Two small cleanups driven by the test audit and consolidation findings: 1. Finding #1: `load_from_env` now emits a tracing::warn log whenever a `ZEROGATE_KEY_*` env var is encountered, nudging operators to the standard `<NAME>_API_KEY` convention that the shared resolver understands. The old path still populates the cache so existing deployments don't break; a future PR removes it. 2. Audit finding: `ensure_cached`'s doc-comment claimed "first-write-wins" but the covered `test_add_overwrites` test proves `add()` is last-write-wins. Both are true — `ensure_cached` itself never overwrites, `add` always does. The doc is rewritten to state the contract precisely: rotation via direct `add()` takes effect immediately, rotation via env/fnox/vault does NOT until the cache entry is cleared or the service restarts. TTL/invalidation flagged as follow-up. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Librarian and others added 2 commits April 8, 2026 23:06

fix(plugin): add hook handler index.ts/js for before_tool_call

65df39a

Copilot AI review requested due to automatic review settings April 9, 2026 11:13

Copilot started reviewing on behalf of bglusman April 9, 2026 11:14 View session

Copilot AI reviewed Apr 9, 2026

View reviewed changes

Librarian added 4 commits April 9, 2026 15:56

docs: add ROADMAP.md

55a5654

chore: add node_modules to .gitignore

de57e23

fix(outpost): resolve test compilation errors

2778352

test(outpost): ignore tests with unvalidated scanner logic

b8af38a

bglusman force-pushed the integration/v1-testing branch from 77c14bc to b8af38a Compare April 9, 2026 15:57

style: apply cargo fmt fixes

fdab319

bglusman merged commit a6baf69 into main Apr 9, 2026
9 checks passed

This was referenced Apr 24, 2026

refactor: consolidate onecli binary into security-proxy (#28) #17

Closed

rename: zeroclawed → calciforge + drop dead OneCLI HTTP client #48

Merged

bglusman added a commit that referenced this pull request Apr 25, 2026

Merge pull request #1 from bglusman/integration/v1-testing

e8c1de0

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Integration/v1 testing#1

Integration/v1 testing#1
bglusman merged 7 commits intomainfrom
integration/v1-testing

bglusman commented Apr 9, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 9, 2026

Uh oh!

Copilot AI Apr 9, 2026

Uh oh!

Copilot AI Apr 9, 2026

Uh oh!

Copilot AI Apr 9, 2026

Uh oh!

Copilot AI Apr 9, 2026

Uh oh!

Copilot AI Apr 9, 2026

Uh oh!

Copilot AI Apr 9, 2026

Uh oh!

Copilot AI Apr 9, 2026

Uh oh!

Copilot AI Apr 9, 2026

Uh oh!

Copilot AI Apr 9, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		# Backup config
		cp "$OPENCLAW_CONFIG" "$OPENCLAW_CONFIG.bak.$(date +%Y%m%d-%H%M%S)"

	target = args.get("path", "") or args.get("config", {{}}).get("agentId", "")
	target = args.get("path", "") or args.get("config", {}).get("agentId", "")

Conversation

bglusman commented Apr 9, 2026

Description

Type of Change

Testing

Checklist

Related Issues

Breaking Changes

Additional Notes

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants