feat: AI model proxy, local model lifecycle, multi-provider routing, voice pipeline by bglusman · Pull Request #10 · bglusman/calciforge

bglusman · 2026-04-20T18:34:58Z

Summary

Adds substantial new capabilities on top of main with a clean single commit rebased directly onto main HEAD:

OpenAI-compatible proxy server — [proxy] config block, POST /v1/chat/completions, GET /v1/models, GET /health
Multi-provider routing — [[proxy.providers]] and [[proxy.model_routes]] pattern-matched to different backend URLs/keys/timeouts
Local model lifecycle — [local_models] manages mlx_lm.server or llama-server processes; POST /control/local/switch hot-swaps models at runtime
Voice pipeline passthrough — [proxy.voice.stt/tts] routes audio to any OpenAI-compatible server; GET /v1/tools/manifest surfaces capabilities as tool definitions; optional shell hooks for pre/post processing
Model alloys — [[alloys]] weighted/round_robin blending across multiple providers
Traceloop and Helicone gateway integrations (feature-gated)
Matrix channel rewritten with raw HTTP, removing the matrix-sdk compile-time dependency conflict
Config validator — --validate flag for config file checking at deploy time
Mock channel for integration testing
Persistent context (optional sqlx feature)

Replaces PR #9 (which had git history conflicts from the filter-branch operation).

Test plan

cargo test — all 18 test suites pass
cargo build --release -p zeroclawed succeeds
--validate validates config and exits cleanly
Proxy routes to named providers based on model pattern
POST /control/local/switch switches local model
GET /v1/tools/manifest reflects configured capabilities
Voice endpoints return 501 when not configured

…ng, and voice pipeline

Adds the following new capabilities on top of main:

- OpenAI-compatible HTTP proxy server ([proxy] config)
- Multi-provider routing via [[proxy.providers]] and [[proxy.model_routes]]
- Local model lifecycle management ([local_models] config) with mlx_lm/llama.cpp support and hot-swap via POST /control/local/switch
- Voice pipeline passthrough: POST /v1/audio/transcriptions, /v1/audio/speech, GET /v1/tools/manifest with optional shell hooks
- Model alloy blending ([[alloys]]) with weighted/round_robin strategies
- Traceloop and Helicone gateway integrations
- Matrix channel rewritten with raw HTTP (removes matrix-sdk dependency)
- Config validator (--validate flag)
- Mock channel for testing
- Persistent context store (optional feature)
- Agent delegation and slash command interception scaffold
- Exclude loom-tests from default cargo test via workspace default-members

Copilot

Pull request overview

Adds a substantial OpenAI-compatible proxy/voice surface to zeroclawed, plus infrastructure for multi-provider routing, local model lifecycle management, and (feature-gated) persistent conversation context.

Changes:

Introduces a new proxy server (/v1/chat/completions, /v1/models, /health) with provider routing, retries, and optional Helicone/Traceloop integrations.
Adds local model lifecycle management + a voice passthrough layer (STT/TTS forwarding + tool manifest endpoint).
Adds a unified context abstraction and a new persistent SQLite-backed context store (feature-gated).

Reviewed changes

Copilot reviewed 57 out of 58 changed files in this pull request and generated 8 comments.


File	Description
crates/zeroclawed/src/voice/tools.rs	Builds tool manifest JSON based on configured capabilities
crates/zeroclawed/src/voice/mod.rs	Voice config schema for STT/TTS + hooks
crates/zeroclawed/src/voice/forward.rs	Generic HTTP forwarder + hook runner for voice endpoints
crates/zeroclawed/src/unified_context.rs	Unifies in-memory and persistent context store behind a trait/enum
crates/zeroclawed/src/sync.rs	Loom-aware reexports for sync primitives
crates/zeroclawed/src/router.rs	Adds dispatch API supporting optional model override
crates/zeroclawed/src/proxy/voice_handlers.rs	Axum handlers for STT/TTS passthrough + tools manifest
crates/zeroclawed/src/proxy/traceloop/test.rs	Adds Traceloop router unit tests (partial/placeholder)
crates/zeroclawed/src/proxy/traceloop/openai.rs	Traceloop OpenAI-provider implementation
crates/zeroclawed/src/proxy/traceloop/kimi.rs	Kimi provider wrapper via OpenAI-compatible API
crates/zeroclawed/src/proxy/traceloop/deepseek.rs	DeepSeek provider implementation
crates/zeroclawed/src/proxy/traceloop/anthropic.rs	Anthropic provider implementation + response conversion
crates/zeroclawed/src/proxy/streaming.rs	Stub for SSE stream transformation (currently empty)
crates/zeroclawed/src/proxy/routing.rs	Builds provider routing table from config patterns
crates/zeroclawed/src/proxy/retry_gateway.rs	Adds a retrying gateway wrapper (configurable policy)
crates/zeroclawed/src/proxy/retry.rs	Defines retry configuration + backoff iterator
crates/zeroclawed/src/proxy/openai.rs	Adds OpenAI-compatible request/response types
crates/zeroclawed/src/proxy/mod.rs	Proxy server bootstrapping + routes registration
crates/zeroclawed/src/proxy/helicone_router.rs	HTTP-based Helicone gateway integration
crates/zeroclawed/src/proxy/handlers.rs	Core proxy handlers: chat completions, models, health, local switch
crates/zeroclawed/src/proxy/auth_test.rs	Adds auth tests (currently API-mismatched)
crates/zeroclawed/src/proxy/auth.rs	Model access policy evaluation (auth scaffolding partially commented out)
crates/zeroclawed/src/proxy/alloy_router.rs	Back-compat alloy router wrapper around Helicone router
crates/zeroclawed/src/providers/mod.rs	Introduces provider registry + loom tests
crates/zeroclawed/src/providers/alloy.rs	Implements alloy selection strategies and stats
crates/zeroclawed/src/persistent_context.rs	Adds SQLite persistent context store (sqlx)
crates/zeroclawed/src/main.rs	Adds `--validate`, `--proxy-only`, proxy startup, local model autoload, unified context init
crates/zeroclawed/src/local_model/mod.rs	Adds local model manager and switch hooks
crates/zeroclawed/src/local_model/mlx_lm.rs	Spawns/stops `mlx_lm.server` and waits for readiness
crates/zeroclawed/src/install/ssh.rs	Switches mock client mutexes to loom-aware mutex
crates/zeroclawed/src/install/health.rs	Switches mock health checker mutexes to loom-aware mutex
crates/zeroclawed/src/install/executor.rs	Switches to loom-aware Arc
crates/zeroclawed/src/delegation.rs	Adds delegation marker parsing + delegation engine (currently incomplete)
crates/zeroclawed/src/context.rs	Switches to loom-aware sync primitives; adjusts exchange_count internals
crates/zeroclawed/src/config/validator.rs	Adds config validation support + `--validate` plumbing
crates/zeroclawed/src/commands.rs	Extends `!model` to list/activate alloys and trigger local/provider switches
crates/zeroclawed/src/channels/whatsapp.rs	Adds post-auth `!model` handling
crates/zeroclawed/src/channels/telegram.rs	Adds post-auth `!model` handling
crates/zeroclawed/src/channels/signal.rs	Adds post-auth `!model` handling
crates/zeroclawed/src/channels/mod.rs	Registers new mock channel module
crates/zeroclawed/src/channels/mock.rs	Adds mock channel + control API server
crates/zeroclawed/src/auth.rs	Updates tests/config construction for new config fields
crates/zeroclawed/src/adapters/openclaw_native.rs	Plumbs model_override through DispatchContext in tests
crates/zeroclawed/src/adapters/openclaw_channel.rs	Switches to loom-aware sync primitives; passes model_override through
crates/zeroclawed/src/adapters/openclaw.rs	Adds model override support in OpenClaw + NZC webhook payload
crates/zeroclawed/src/adapters/nzc_native.rs	Plumbs model_override through to inner adapter
crates/zeroclawed/src/adapters/mod.rs	Adds `model_override` field to DispatchContext
crates/zeroclawed/src/adapters/cli.rs	Makes `/bin/false` test OS-path tolerant
crates/zeroclawed/src/adapters/acp.rs	Switches to loom-aware Arc (test)
crates/zeroclawed/Cargo.toml	Adds features for proxy integrations + sqlx; adds deps (rand/uuid/backon/sqlx)
README.md	Documents proxy, routing, local models, voice pipeline, tools manifest
Cargo.toml	Sets workspace default-members to exclude loom-tests by default


Copilot · 2026-04-20T18:40:09Z

+curl -X POST http://localhost:8080/control/local/switch \
+  -H "Content-Type: application/json" \
+  -d '{"model_id": "gemma4-26b"}'
+```


The README’s example request body uses {"model_id": "..."}, but the handler expects a model field. Either update the handler to accept model_id (or both for compatibility) or fix the README example so users can successfully switch local models.

Copilot · 2026-04-20T18:40:10Z

+    // Use default control port (9090)
+    // TODO: Add control_port field to ChannelConfig
+    let control_port = 9090;
+


This channel docstring advertises a configurable control_port, but the implementation ignores config and always binds to 9090. Either add control_port (and test_users, if intended) to the channel config schema and read it here, or remove it from the docs to avoid a misleading configuration surface.

Copilot · 2026-04-20T18:40:10Z

+    use super::*;
+    use crate::config::PolyConfig;
+    use crate::proxy::auth::{check_model_access, model_matches, ProxyAccessPolicy};
+


This new test module does not compile against the current config/auth APIs: PolyConfig.proxy is an Option<ProxyConfig> (so config.proxy.default_policy won’t compile), check_model_access returns bool (not Result), and model_matches is private to proxy::auth. Update the tests to use ProxyConfig directly (like proxy/auth.rs tests) or expose the necessary APIs in proxy::auth.

Copilot · 2026-04-20T18:40:11Z

+        // Build context for delegation based on mode
+        let delegate_text = match delegation.context {
+            ContextMode::None => &delegation.message,
+            ContextMode::Recent => {
+                // Get recent context and prepend
+                let recent = context_store.get_recent(chat_key, 5).await;
+                if recent.is_empty() {
+                    &delegation.message
+                } else {
+                    // Build message with context preamble
+                    let preamble = recent.join("\n\n");
+                    let combined = format!("{}\n\n{}", preamble, delegation.message);
+                    // Store combined for this call
+                    // TODO: Properly handle lifetime here
+                    &delegation.message
+                }
+            }
+            ContextMode::Fork => {
+                // Fork: isolated context, just the message for now
+                &delegation.message
+            }
+        };
+
+        // Recursive call for delegation chain
+        self.dispatch_with_delegation(
+            delegate_text,


ContextStore has no get_recent(..) method (and this file currently builds a combined string but then discards it), so this won’t compile and delegation won’t include recent context as intended. Either add the needed API to the context store / unified context abstraction, or change this to use the existing context augmentation/preamble mechanism.

Suggested change

-        // Build context for delegation based on mode
-        let delegate_text = match delegation.context {
-            ContextMode::None => &delegation.message,
-            ContextMode::Recent => {
-                // Get recent context and prepend
-                let recent = context_store.get_recent(chat_key, 5).await;
-                if recent.is_empty() {
-                    &delegation.message
-                } else {
-                    // Build message with context preamble
-                    let preamble = recent.join("\n\n");
-                    let combined = format!("{}\n\n{}", preamble, delegation.message);
-                    // Store combined for this call
-                    // TODO: Properly handle lifetime here
-                    &delegation.message
-                }
-            }
-            ContextMode::Fork => {
-                // Fork: isolated context, just the message for now
-                &delegation.message
-            }
-        };
-
-        // Recursive call for delegation chain
-        self.dispatch_with_delegation(
-            delegate_text,
+        // Build context for delegation based on mode.
+        //
+        // NOTE: Recent-context augmentation is not assembled here because the
+        // context store API used previously (`get_recent`) is not available in
+        // this file's abstraction. Until recent context is plumbed through the
+        // existing context augmentation/preamble mechanism, delegate the
+        // message payload directly.
+        let delegate_text = match delegation.context {
+            ContextMode::None => delegation.message.clone(),
+            ContextMode::Recent => delegation.message.clone(),
+            ContextMode::Fork => {
+                // Fork: isolated context, just the message for now
+                delegation.message.clone()
+            }
+        };
+
+        // Recursive call for delegation chain
+        self.dispatch_with_delegation(
+            &delegate_text,
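
The lifetime problem behind the `TODO` in the original hunk (a combined string is built inside a match arm and cannot be returned by reference) can also be handled without cloning in every arm. The following is a minimal sketch, not the PR's code: `build_delegate_text` is a hypothetical helper, and `recent` stands in for whatever the context store would return, since `get_recent` does not exist in this codebase.

```rust
use std::borrow::Cow;

/// Hypothetical helper: borrow the message when there is no recent context,
/// and allocate a combined string only when a preamble has to be prepended.
fn build_delegate_text<'a>(message: &'a str, recent: &[String]) -> Cow<'a, str> {
    if recent.is_empty() {
        // No context to add: hand back the original message without copying.
        Cow::Borrowed(message)
    } else {
        // Join prior exchanges and prepend them; the Cow owns the new String,
        // so it outlives the branch that created it.
        let preamble = recent.join("\n\n");
        Cow::Owned(format!("{preamble}\n\n{message}"))
    }
}
```

A caller could then pass `&*build_delegate_text(&delegation.message, &recent)` wherever a `&str` is expected, assuming the recursive dispatch takes a string slice as in the hunk above.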

Copilot · 2026-04-20T18:40:12Z

+    // Extract agent ID from header
+    let agent_id = headers
+        .get("x-agent-id")
+        .and_then(|h| h.to_str().ok())
+        .unwrap_or("anonymous");
+
+    // Check model access for this agent
+    if !crate::proxy::auth::check_model_access(&state.config, agent_id, &req.model) {
+        warn!(agent_id = %agent_id, model = %req.model, "Model access denied");
+        return api_error(
+            StatusCode::FORBIDDEN,
+            "model_access_denied",
+            &format!(
+                "Agent '{}' does not have access to model '{}'",
+                agent_id, req.model
+            ),
+            None,
+        );
+    }


/v1/chat/completions trusts the caller-provided x-agent-id header but does not enforce proxy.api_key (or any agent-specific API key) for authentication. As a result, a client can impersonate any agent and bypass model access controls. Add an Authorization check (global key and/or agent-specific keys) and derive agent_id from the authenticated key rather than from an untrusted header.
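
As a rough illustration of deriving `agent_id` from the credential rather than from a header, the sketch below resolves the agent from an `Authorization: Bearer <key>` value. The `agent_keys` map (API key to agent id) and the function name are assumptions made for this example; the PR does not define such a config shape, and a production check would likely also want constant-time key comparison.

```rust
use std::collections::HashMap;

/// Hypothetical lookup: map the bearer token to an agent id.
/// Returns None when the header is missing, malformed, or the key is unknown,
/// so the caller can reject the request before any model access check runs.
fn agent_for_request<'a>(
    authorization: Option<&str>,
    agent_keys: &'a HashMap<String, String>,
) -> Option<&'a str> {
    // Expect exactly "Bearer <key>"; anything else is unauthenticated.
    let key = authorization?.strip_prefix("Bearer ")?.trim();
    if key.is_empty() {
        return None;
    }
    // The agent identity comes from the key, never from x-agent-id.
    agent_keys.get(key).map(String::as_str)
}
```

With this shape, the handler would return 401 on `None` and only then call `check_model_access` with the resolved id.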

Adds /hooks/claude-code endpoint that speaks Claude Code's PreToolUse hookSpecificOutput format, a Starlark policy tuned for Claude Code tools, and scripts/setup-claude-hooks.sh to wire clashd as the policy engine for Claude Code (builds, installs, launchd service, settings.json update).

- retry_gateway: use retry_if predicate so 4xx/non-retryable errors are not retried (was computing should_retry but ignoring it)
- persistent_context: remove unused Arc and Mutex imports
- mock channel: remove advertised but unimplemented control_port config
- README: fix model switch example (model_id -> model)
- style: cargo fmt --all (clashd missed in previous fmt commit)
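
The retry fix in the first item can be pictured with a hand-rolled loop. This is a sketch under stated assumptions, not the PR's implementation (which uses the backon crate): `BackendError`, `is_retryable`, and `retry_if` here are illustrative stand-ins, and the choice to retry only network failures and 5xx statuses is an assumption consistent with "4xx/non-retryable errors are not retried".

```rust
use std::thread;
use std::time::Duration;

/// Illustrative error type; the real gateway's error enum may differ.
#[derive(Debug, PartialEq)]
enum BackendError {
    Status(u16),
    Network,
}

/// Assumed policy: transport failures and 5xx are transient, 4xx is final.
fn is_retryable(err: &BackendError) -> bool {
    match err {
        BackendError::Network => true,
        BackendError::Status(code) => *code >= 500,
    }
}

/// Run `op` up to `max_attempts` times with exponential backoff,
/// consulting the predicate before every retry instead of ignoring it.
fn retry_if<T>(
    max_attempts: u32,
    base_delay: Duration,
    mut op: impl FnMut() -> Result<T, BackendError>,
) -> Result<T, BackendError> {
    let mut attempt = 0;
    loop {
        attempt += 1;
        match op() {
            Ok(value) => return Ok(value),
            // Retry only if attempts remain and the error is transient.
            Err(e) if attempt < max_attempts && is_retryable(&e) => {
                thread::sleep(base_delay * 2u32.pow(attempt - 1));
            }
            // Non-retryable error, or attempts exhausted: surface it.
            Err(e) => return Err(e),
        }
    }
}
```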

Copilot

Pull request overview

Copilot reviewed 60 out of 61 changed files in this pull request and generated 9 comments.


Copilot · 2026-04-20T19:01:13Z

+            let choices = response_json["choices"]
+                .as_array()
+                .unwrap_or(&vec![])
+                .iter()
+                .map(|choice| {


unwrap_or(&vec![]) takes a reference to a temporary Vec, which will not compile (borrowed value does not live long enough). Use unwrap_or(&[]) (slice) or bind a local empty Vec/&[] fallback before iterating.
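
A slice fallback can be sketched as follows. Two caveats apply. First, a temporary lives until the end of its enclosing statement, so whether the original chain is actually rejected depends on whether the iterator is consumed within that same statement. Second, `unwrap_or(&[])` only type-checks on an `Option<&[T]>`, so an `Option<&Vec<T>>` (the shape `as_array()` returns) has to be converted first. In this sketch `String` stands in for `serde_json::Value`, and `names_or_empty` is a hypothetical name.

```rust
/// Treat a missing array as empty without allocating a fallback Vec.
fn names_or_empty(items: Option<&Vec<String>>) -> Vec<String> {
    items
        // Option<&Vec<T>> -> Option<&[T]> so the fallback can be a slice.
        .map(|v| v.as_slice())
        // `&[]` is a constant empty slice; no temporary Vec is borrowed.
        .unwrap_or(&[])
        .iter()
        .map(|s| s.to_uppercase())
        .collect()
}
```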

Copilot · 2026-04-20T19:01:13Z

+            let choices = response_json["choices"]
+                .as_array()
+                .unwrap_or(&vec![])
+                .iter()
+                .map(|choice| {


unwrap_or(&vec![]) takes a reference to a temporary Vec, which will not compile. Use unwrap_or(&[]) / unwrap_or_default() on an owned collection, or store an empty Vec in a local variable and reference that.

Copilot · 2026-04-20T19:01:14Z

+            let content = response_json["content"]
+                .as_array()
+                .unwrap_or(&vec![])
+                .iter()
+                .filter_map(|item| {


.unwrap_or(&vec![]) references a temporary Vec, which will not compile (borrowed value does not live long enough). Use a slice fallback (unwrap_or(&[])) or bind an empty Vec in a local variable before iterating.

Copilot · 2026-04-20T19:01:14Z

+    // TODO: Actually route the message through the system
+    // For now, just log it
+    debug!("Would route message from {}: {}", req.sender, req.text);
+
+    // Simulate a response
+    let response_text = format!("Mock response to: {}", req.text);


/send doesn’t route the message through the real system yet (it just logs and returns a hard-coded "Mock response"). That makes the mock channel unsuitable for integration testing as described; either implement dispatch via router/command_handler/context_store, or clearly scope this as a stub.

Copilot · 2026-04-20T19:01:14Z

+            let tool_calls = response_json["content"]
+                .as_array()
+                .unwrap_or(&vec![])
+                .iter()


.unwrap_or(&vec![]) again borrows a temporary Vec and won’t compile. Prefer unwrap_or(&[]) / a local empty collection to avoid referencing a temporary.

Copilot · 2026-04-20T19:01:15Z

+            // Thread 1: Register providers
+            let registry1 = registry.clone();
+            let p1 = provider1.clone();


Under cfg(all(test, loom)) these tests call registry.clone(), but ProviderRegistry does not implement Clone, so this won’t compile. Wrap ProviderRegistry in crate::sync::Arc (or implement Clone) for sharing across loom threads.

Copilot · 2026-04-20T19:01:15Z

+use std::sync::Arc;
+
+use async_trait::async_trait;
+use backon::{ExponentialBuilder, Retryable};
+use tracing::{info, warn};
+
+use crate::proxy::gateway::{BackendError, ChatCompletionRequest, ChatCompletionResponse, Gateway, ModelInfo};
+


This module is not referenced from proxy/mod.rs (no mod retry_gateway;), so it will never be compiled. As written, it also imports symbols that don’t exist in crate::proxy::gateway (e.g., Gateway, BackendError), so it’s likely stale/incomplete—either wire it up properly or remove it to avoid bitrot.

Copilot · 2026-04-20T19:01:15Z

+// Unit tests for auth module
+// These are the kinds of tests mutation testing works against
+
+#[cfg(test)]
+mod tests {


This file isn’t referenced by the module tree (no mod auth_test;), so it won’t be compiled or run. Consider moving these tests to crates/zeroclawed/tests/ (integration tests) or including the module from proxy/mod.rs under #[cfg(test)] so auth behavior is actually exercised.

Adds POST /hooks/zeroclaw-audit to receive zeroclaw webhook_audit fire-and-forget payloads — evaluates against policy and logs/warns on deny verdicts (monitoring only, zeroclaw does not read the response). Adds scripts/setup-agents.sh: detect-or-install opencode (brew), openclaw (npm), zeroclaw (brew) and wire clashd policy integration for each — zeroclaw webhook_audit, openclaw exec-approvals (restricted+ask), opencode plugin stub. Supports --configure-only, --install-only, --agents.

…ay naming

security-gateway crate renamed to security-proxy (binary, package name, workspace deps, crate path). Default port changed 8080→8888 to avoid conflict with the model gateway (8080). Adds SECURITY_PROXY_PORT env var override consistent with CLASHD_PORT. Removes misleading "Alloy proxy/Alloy Model Proxy Server" labels from the model gateway (proxy/mod.rs, handlers.rs, backend.rs) — alloys are one routing feature; the gateway also includes Traceloop observability, retries, Helicone, multi-provider routing, and local model management.

Single entrypoint replacing setup-claude-hooks.sh + setup-agents.sh. Builds zeroclawed, clashd, security-proxy (release), installs to ~/.local/bin/, creates launchd services for both clashd and security-proxy. Wires clashd policy hooks for all four agents: Claude Code (PreToolUse hook), opencode (plugin stub), openclaw (exec-approvals), zeroclaw (webhook_audit + autonomy). Prompts before installing any missing tool (--yes skips prompts; --configure-only skips installs entirely). Correctly detects zeroclaw needing onboard before attempting service start.

…ters

install.sh gains --nodes-file <path> and --nodes-only flags. For each node in the JSON config: cross-compile locally (cross/zigbuild) or fall back to building on the remote via SSH; rsync binary + policy files; install systemd service (Linux) or launchd plist (macOS). Adds deploy/nodes.example.json documenting the node config format (host, user, ssh_key, arch, os, services, install_dir, config_dir). Supports x86_64/aarch64 Linux and macOS nodes from one build machine.

- security-proxy: security_gateway → security_proxy in integration tests
- adversary-detector: sort_by → sort_by_key
- host-agent/pct: collapse nested if into match guard
- zeroclawed: redundant field names (shorthand), manual checked_div, strip_suffix, collapsible if let in matrix.rs, useless vec!, assert! for bool, let-else for single-variant match in main.rs (extracted into_in_memory() method on UnifiedContextStore to avoid irrefutable pattern lint across feature configurations)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 70 out of 78 changed files in this pull request and generated 13 comments.


Copilot · 2026-04-20T20:53:28Z

+[[proxy.providers]]
+id = "local"
+models = ["local/*", "llama*", "qwen*", "gemma*"]
+url = "http://localhost:8888/v1"
+
+[[proxy.providers]]
+id = "fast-provider"
+models = ["fast/*"]
+url = "https://api.fast-provider.example.com/v1"
+api_key_file = "/etc/zeroclawed/secrets/fast-key"


The README example uses model patterns like "llama*", "qwen*", and "gemma*", but the routing implementation only supports exact match, *, and prefix/* globs. As written, these patterns will never match and routing will fall back to the default backend. Either update the docs to use supported patterns (e.g. qwen/*) or extend model_matches_pattern to support * suffix wildcards.
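
For reference, the second option (extending the matcher) might look like the sketch below. It is an illustration only: the body of the real `model_matches_pattern` is not shown in this PR excerpt, the argument order is assumed, and whether `prefix/*` should also match the bare `prefix` is a guess (this sketch says no).

```rust
/// Sketch of a matcher supporting exact names, "*", "prefix/*",
/// and (the hypothetical extension) a bare trailing "*" such as "qwen*".
fn model_matches_pattern(pattern: &str, model: &str) -> bool {
    // "*" matches every model.
    if pattern == "*" {
        return true;
    }
    // "prefix/*" matches "prefix/<anything>" but not "prefixed-name".
    if let Some(prefix) = pattern.strip_suffix("/*") {
        return model
            .strip_prefix(prefix)
            .map_or(false, |rest| rest.starts_with('/'));
    }
    // Extension: "qwen*" is a plain prefix match.
    if let Some(prefix) = pattern.strip_suffix('*') {
        return model.starts_with(prefix);
    }
    // Otherwise require an exact match.
    pattern == model
}
```

The trade-off is that bare-suffix wildcards are easier to write but also easier to over-match (for example `gemma*` would also capture a remote model whose name merely starts with `gemma`).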

Copilot · 2026-04-20T20:53:28Z

+[[local_models.models]]
+id = "qwen3-35b"
+path = "~/.cache/huggingface/hub/models--unsloth--Qwen3.6-35B-A3B-MLX-8bit"
+backend = "mlx_lm"
+port = 8888
+
+[[local_models.models]]
+id = "gemma4-26b"
+path = "~/.cache/huggingface/hub/models--unsloth--gemma-4-26b-a4b-it-MLX-8bit"
+backend = "mlx_lm"
+port = 8888


The [local_models] README example doesn’t match the actual config schema: [[local_models.models]] expects hf_id (and optional provider_type/display_name), while port/host are configured under [local_models.mlx_lm]. The example currently uses path, backend, and per-model port, which won’t deserialize. Please align the documentation with LocalModelsConfig/LocalModelDef.

Suggested change

-[[local_models.models]]
-id = "qwen3-35b"
-path = "~/.cache/huggingface/hub/models--unsloth--Qwen3.6-35B-A3B-MLX-8bit"
-backend = "mlx_lm"
-port = 8888
-
-[[local_models.models]]
-id = "gemma4-26b"
-path = "~/.cache/huggingface/hub/models--unsloth--gemma-4-26b-a4b-it-MLX-8bit"
-backend = "mlx_lm"
-port = 8888
+[local_models.mlx_lm]
+host = "127.0.0.1"
+port = 8888
+
+[[local_models.models]]
+id = "qwen3-35b"
+hf_id = "unsloth/Qwen3.6-35B-A3B-MLX-8bit"
+provider_type = "mlx_lm"
+display_name = "Qwen3 35B"
+
+[[local_models.models]]
+id = "gemma4-26b"
+hf_id = "unsloth/gemma-4-26b-a4b-it-MLX-8bit"
+provider_type = "mlx_lm"
+display_name = "Gemma 4 26B"

Copilot · 2026-04-20T20:53:28Z

+    #[tokio::test]
+    async fn test_caching_behavior() {
+        // Create a mock router with a single provider
+        let _router = TraceloopRouter::new(vec![ProviderConfig {
+            id: "test".to_string(),
+            r#type: ProviderType::OpenAI,
+            api_key: "test-key".to_string(),
+            base_url: Some("http://localhost:9999".to_string()),
+            default_model: "test-model".to_string(),
+        }])
+        .unwrap();
+
+        let _messages = [ChatMessage {
+            role: "user".to_string(),
+            content: Some(MessageContent::Text("Test message".to_string())),
+            name: None,
+            tool_calls: None,
+            tool_call_id: None,
+            reasoning: None,
+            reasoning_content: None,
+        }]; // test only verifies code compiles (mock provider not running)
+
+        // First request should miss cache
+        // Note: This will fail because the mock provider can't connect to localhost:9999
+        // In a real test, we'd use a mock HTTP server
+        // For now, we just verify the code compiles and the caching logic is sound
+        println!("Caching test would run here with proper mock setup");
+    }


This test module doesn’t currently test caching behavior: test_caching_behavior contains no assertions and just prints a message. Either convert it into a real test using a mock HTTP server (e.g. mockito) to assert cache hits/misses, or remove/mark it clearly as a placeholder so it doesn’t give a false sense of coverage.

Copilot · 2026-04-20T20:53:29Z

+// Unit tests for auth module
+// These are the kinds of tests mutation testing works against
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use crate::config::PolyConfig;
+    use crate::proxy::auth::{check_model_access, model_matches, ProxyAccessPolicy};
+
+    #[test]
+    fn test_model_matches_exact() {
+        // Exact match should work
+        assert!(model_matches("deepseek-chat", "deepseek-chat"));
+        assert!(model_matches("kimi/kimi-for-coding", "kimi/kimi-for-coding"));


crates/zeroclawed/src/proxy/auth_test.rs appears to be an orphaned test file: it isn’t referenced from the module tree, and it uses an old check_model_access(...).is_ok()/is_err() API that doesn’t match the current bool return type. Consider either wiring it in via #[cfg(test)] mod auth_test; (after updating it to compile) or deleting it to avoid confusion and bitrot.

Copilot · 2026-04-20T20:53:29Z

+    let context = serde_json::json!({
+        "agent_id": "claude-code",
+        "cwd": req.cwd.unwrap_or_default(),
+        "session_id": req.session_id.unwrap_or_default(),
+    });
+
+    let result = state
+        .engine
+        .evaluate(&req.tool_name, &req.tool_input, Some("claude-code"))
+        .await;
+
+    let (decision, reason) = match result.verdict.to_string().as_str() {
+        "deny" => ("deny", result.reason),
+        "review" => ("ask", result.reason),
+        _ => ("allow", None),
+    };
+
+    info!(
+        tool = %req.tool_name,
+        decision = %decision,
+        reason = ?reason,
+        "claude-code hook evaluated"
+    );
+
+    // Suppress unused variable warning — context available for future use
+    let _ = context;
+


This handler builds a context object with cwd/session_id, but it isn’t passed into PolicyEngine::evaluate (the engine constructs its own context), so cwd will always be empty in Starlark policies. Either remove the unused context construction, or extend PolicyEngine::evaluate to accept and merge additional context fields so policies like claude-code-policy.star can actually use cwd/session_id.

Copilot · 2026-04-20T20:53:30Z

+
+            // Thread 1: Register providers
+            let registry1 = registry.clone();
+            let p1 = provider1.clone();
+            let p2 = provider2.clone();
+            let t1 = thread::spawn(move || {
+                registry1.register(p1);
+                registry1.register(p2);
+            });
+
+            // Thread 2: Read while writing
+            let registry2 = registry.clone();
+            let t2 = thread::spawn(move || {
+                // This might read empty or partial state
+                let _ids = registry2.list_ids();
+                let _provider = registry2.get("provider1");
+            });


The Loom tests call registry.clone(), but ProviderRegistry doesn’t implement Clone, so this will not compile under cfg(loom). Consider wrapping the registry in crate::sync::Arc inside the Loom test (and cloning the Arc), or implement Clone for ProviderRegistry by storing the inner map behind an Arc<RwLock<...>>.
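
The first option (share the registry through an `Arc` and clone the handle) can be sketched with standard-library primitives. This is not the PR's test: `Registry` is a hypothetical stand-in for `ProviderRegistry`, providers are reduced to strings, and `std::sync`/`std::thread` are used where a Loom test would use the `loom` equivalents inside `loom::model`.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// Stand-in registry: deliberately not Clone, mutated through &self.
struct Registry {
    providers: RwLock<HashMap<String, String>>,
}

impl Registry {
    fn new() -> Self {
        Registry { providers: RwLock::new(HashMap::new()) }
    }

    fn register(&self, id: &str, url: &str) {
        self.providers.write().unwrap().insert(id.to_string(), url.to_string());
    }

    fn list_ids(&self) -> Vec<String> {
        let mut ids: Vec<String> = self.providers.read().unwrap().keys().cloned().collect();
        ids.sort();
        ids
    }
}

/// One writer and one reader share the same registry via Arc handles.
fn register_concurrently() -> Vec<String> {
    let registry = Arc::new(Registry::new());

    let writer_handle = Arc::clone(&registry);
    let writer = thread::spawn(move || {
        writer_handle.register("provider1", "http://localhost:1");
        writer_handle.register("provider2", "http://localhost:2");
    });

    let reader_handle = Arc::clone(&registry);
    // The reader may observe 0, 1, or 2 entries depending on interleaving.
    let reader = thread::spawn(move || reader_handle.list_ids().len());

    writer.join().unwrap();
    let seen = reader.join().unwrap();
    assert!(seen <= 2);

    registry.list_ids()
}
```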

Copilot · 2026-04-20T20:53:30Z

+    let result = tokio::task::spawn_blocking({
+        let path = path.to_string();
+        let input = input.clone();
+        move || -> Result<Vec<u8>> {
+            use std::io::Write;
+            use std::process::{Command, Stdio};
+
+            let mut child = Command::new(&path)
+                .stdin(Stdio::piped())
+                .stdout(Stdio::piped())
+                .stderr(Stdio::piped())
+                .spawn()
+                .with_context(|| format!("failed to spawn hook {path}"))?;
+
+            child
+                .stdin
+                .take()
+                .unwrap()
+                .write_all(&input)
+                .context("writing hook stdin")?;
+
+            let out = child.wait_with_output().context("waiting for hook")?;
+
+            if !out.stderr.is_empty() {
+                let msg = String::from_utf8_lossy(&out.stderr);
+                warn!(hook = %path, stderr = %msg.trim(), "hook stderr");
+            }
+
+            if out.status.success() {
+                Ok(out.stdout)
+            } else {
+                anyhow::bail!("hook exited with status {}", out.status);
+            }
+        }
+    })
+    .await;


run_hook spawns an external process and waits for it to finish, but there’s no timeout/kill logic. A hung hook will stall the request indefinitely even though the comment implies bounded time. Consider enforcing a timeout (e.g., tokio::time::timeout around spawn_blocking and/or killing the child after a deadline) so voice requests can fail fast or degrade gracefully.
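
One way to bound the wait inside the blocking closure is to poll the child against a deadline and kill it when the deadline passes. The sketch below uses only the standard library; `wait_with_deadline` is a hypothetical helper, the 10 ms poll interval is arbitrary, and it does not address draining piped stdout/stderr, which the real `run_hook` would still need to handle.

```rust
use std::io;
use std::process::{Child, ExitStatus};
use std::thread;
use std::time::{Duration, Instant};

/// Wait for `child` for at most `timeout`.
/// Returns Ok(Some(status)) if it exited in time, Ok(None) if it was killed.
fn wait_with_deadline(child: &mut Child, timeout: Duration) -> io::Result<Option<ExitStatus>> {
    let deadline = Instant::now() + timeout;
    loop {
        // try_wait is non-blocking: Some(status) once the child has exited.
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status));
        }
        if Instant::now() >= deadline {
            // Ignore kill errors: the child may have exited in the meantime.
            let _ = child.kill();
            // Reap the process so it does not linger as a zombie.
            child.wait()?;
            return Ok(None);
        }
        thread::sleep(Duration::from_millis(10));
    }
}
```

On `Ok(None)` the caller could return an error or fall back to the unprocessed input, depending on whether hooks are meant to be mandatory.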

Copilot · 2026-04-20T20:53:30Z

+    let client = reqwest::Client::builder()
+        .timeout(Duration::from_secs(config.timeout_seconds))
+        .build()
+        .context("building reqwest client")?;


A new reqwest::Client is constructed for every STT/TTS request. Creating clients is relatively expensive (connection pools, TLS config) and can become a throughput bottleneck. Consider reusing a single client (e.g., store it in ProxyState or cache per-upstream config) rather than rebuilding per call.
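
The reuse pattern can be sketched with `std::sync::OnceLock`. To keep the example free of external crates, `HttpClient` below is a hypothetical stand-in for `reqwest::Client`, and the `BUILDS` counter exists only to make the single construction observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Stand-in for an expensive-to-build client (connection pool, TLS setup).
struct HttpClient {
    build_number: usize,
}

/// Counts how many times a client was actually constructed.
static BUILDS: AtomicUsize = AtomicUsize::new(0);

/// Process-wide client, initialised lazily on first use.
static SHARED_CLIENT: OnceLock<HttpClient> = OnceLock::new();

/// Return the shared client, building it at most once even under contention.
fn shared_client() -> &'static HttpClient {
    SHARED_CLIENT.get_or_init(|| {
        let n = BUILDS.fetch_add(1, Ordering::SeqCst) + 1;
        HttpClient { build_number: n }
    })
}
```

Because the timeout can no longer be baked into the builder per upstream config, it would then be applied per request instead.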

Copilot · 2026-04-20T20:53:31Z

+"$CARGO" build --release -p clashd -p zeroclawed 2>&1 \
+    | grep -E "^error|Compiling|Finished" || true
+ok "Build complete"


The cargo build pipeline masks build failures (... | grep ... || true) even though set -euo pipefail is enabled. If the build fails, the script will still print "Build complete" and then cp non-existent binaries, producing confusing errors. Consider removing the || true (and/or using grep ... || { cat; exit 1; }) so the script aborts on build failure while still filtering output.

Copilot · 2026-04-20T20:53:31Z

+    let unified_context_store = UnifiedContextStore::new(
+        config.context.buffer_size,
+        config.context.inject_depth,
+        config.context.persistent.as_ref(),
+    )
+    .await?;
+
+    let context_store_arc = unified_context_store.into_in_memory()?;
+
+    // Clone the inner ContextStore for channel functions
+    let context_store = (*context_store_arc).clone();


UnifiedContextStore::new(...) can return a persistent store when persistent-context is enabled and [context.persistent] is configured, but main immediately calls into_in_memory()?, which will bail for the persistent variant. This makes the persistent-context feature unusable in practice. Consider plumbing UnifiedContextStore (or a dyn ContextStoreTrait) through channels/CommandHandler instead of forcing an in-memory ContextStore clone.

- Delete three orphaned files never wired into the module tree (retry_gateway.rs, auth_test.rs, delegation.rs)
- Read mock channel control_port from config (was hardcoded 9090)
- Document mock /send as an intentional stub with clear scope note
- Remove unused context object in clashd claude-code hook; log cwd/session_id directly in the tracing fields instead
- Fix setup-claude-hooks.sh: propagate cargo build failures instead of masking with || true; gate launchctl section on macOS
- Share a single reqwest::Client across voice forward calls (OnceLock) with per-request timeout; add 30s timeout to run_hook subprocess
- Stop sending "Bearer no-key" to unauthenticated HTTP backends: pass None through BackendConfig and skip Authorization header when api_key is empty
- Fix README: model routing patterns (llama* → llama/*), local_models schema (path/backend/port → hf_id/provider_type + mlx_lm section)
- Add comment explaining persistent-context feature is intentionally deferred until plumbed through all channel/handler call sites

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…voice pipeline (#10)

* feat: add AI model proxy, local model lifecycle, multi-provider routing, and voice pipeline

  Adds the following new capabilities on top of main:
  - OpenAI-compatible HTTP proxy server ([proxy] config)
  - Multi-provider routing via [[proxy.providers]] and [[proxy.model_routes]]
  - Local model lifecycle management ([local_models] config) with mlx_lm/llama.cpp support and hot-swap via POST /control/local/switch
  - Voice pipeline passthrough: POST /v1/audio/transcriptions, /v1/audio/speech, GET /v1/tools/manifest with optional shell hooks
  - Model alloy blending ([[alloys]]) with weighted/round_robin strategies
  - Traceloop and Helicone gateway integrations
  - Matrix channel rewritten with raw HTTP (removes matrix-sdk dependency)
  - Config validator (--validate flag)
  - Mock channel for testing
  - Persistent context store (optional feature)
  - Agent delegation and slash command interception scaffold
  - exclude loom-tests from default cargo test via workspace default-members

* style: apply cargo fmt across zeroclawed and clashd

* feat(clashd): add Claude Code hook endpoint, policy, and setup script

  Adds /hooks/claude-code endpoint that speaks Claude Code's PreToolUse hookSpecificOutput format, a Starlark policy tuned for Claude Code tools, and scripts/setup-claude-hooks.sh to wire clashd as the policy engine for Claude Code (builds, installs, launchd service, settings.json update).

* fix: address AI review feedback

  - retry_gateway: use retry_if predicate so 4xx/non-retryable errors are not retried (was computing should_retry but ignoring it)
  - persistent_context: remove unused Arc and Mutex imports
  - mock channel: remove advertised but unimplemented control_port config
  - README: fix model switch example (model_id -> model)
  - style: cargo fmt --all (clashd missed in previous fmt commit)

* feat(clashd): add zeroclaw audit hook endpoint and agent setup script

  Adds POST /hooks/zeroclaw-audit to receive zeroclaw webhook_audit fire-and-forget payloads — evaluates against policy and logs/warns on deny verdicts (monitoring only, zeroclaw does not read the response).

  Adds scripts/setup-agents.sh: detect-or-install opencode (brew), openclaw (npm), zeroclaw (brew) and wire clashd policy integration for each — zeroclaw webhook_audit, openclaw exec-approvals (restricted+ask), opencode plugin stub. Supports --configure-only, --install-only, --agents.

* refactor: rename security-gateway→security-proxy, clarify model gateway naming

  security-gateway crate renamed to security-proxy (binary, package name, workspace deps, crate path). Default port changed 8080→8888 to avoid conflict with the model gateway (8080). Adds SECURITY_PROXY_PORT env var override consistent with CLASHD_PORT.

  Removes misleading "Alloy proxy/Alloy Model Proxy Server" labels from the model gateway (proxy/mod.rs, handlers.rs, backend.rs) — alloys are one routing feature; the gateway also includes Traceloop observability, retries, Helicone, multi-provider routing, and local model management.

* feat: add unified install.sh — builds all binaries, wires all agents

  Single entrypoint replacing setup-claude-hooks.sh + setup-agents.sh. Builds zeroclawed, clashd, security-proxy (release), installs to ~/.local/bin/, creates launchd services for both clashd and security-proxy.

  Wires clashd policy hooks for all four agents: Claude Code (PreToolUse hook), opencode (plugin stub), openclaw (exec-approvals), zeroclaw (webhook_audit + autonomy). Prompts before installing any missing tool (--yes skips prompts; --configure-only skips installs entirely). Correctly detects zeroclaw needing onboard before attempting service start.

* feat(install): add multi-node SSH deployment for Proxmox/homelab clusters

  install.sh gains --nodes-file <path> and --nodes-only flags. For each node in the JSON config: cross-compile locally (cross/zigbuild) or fall back to building on the remote via SSH; rsync binary + policy files; install systemd service (Linux) or launchd plist (macOS).

  Adds deploy/nodes.example.json documenting the node config format (host, user, ssh_key, arch, os, services, install_dir, config_dir). Supports x86_64/aarch64 Linux and macOS nodes from one build machine.

* fix: resolve all clippy and fmt failures from CI

  - security-proxy: security_gateway → security_proxy in integration tests
  - adversary-detector: sort_by → sort_by_key
  - host-agent/pct: collapse nested if into match guard
  - zeroclawed: redundant field names (shorthand), manual checked_div, strip_suffix, collapsible if let in matrix.rs, useless vec!, assert! for bool, let-else for single-variant match in main.rs (extracted into_in_memory() method on UnifiedContextStore to avoid irrefutable pattern lint across feature configurations)

* fix: rename security-gateway to security-proxy in CI workflow

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address all active Copilot review comments

  - Delete three orphaned files never wired into the module tree (retry_gateway.rs, auth_test.rs, delegation.rs)
  - Read mock channel control_port from config (was hardcoded 9090)
  - Document mock /send as an intentional stub with clear scope note
  - Remove unused context object in clashd claude-code hook; log cwd/session_id directly in the tracing fields instead
  - Fix setup-claude-hooks.sh: propagate cargo build failures instead of masking with || true; gate launchctl section on macOS
  - Share a single reqwest::Client across voice forward calls (OnceLock) with per-request timeout; add 30s timeout to run_hook subprocess
  - Stop sending "Bearer no-key" to unauthenticated HTTP backends: pass None through BackendConfig and skip Authorization header when api_key is empty
  - Fix README: model routing patterns (llama* → llama/*), local_models schema (path/backend/port → hf_id/provider_type + mlx_lm section)
  - Add comment explaining persistent-context feature is intentionally deferred until plumbed through all channel/handler call sites

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings April 20, 2026 18:34

Copilot started reviewing on behalf of bglusman April 20, 2026 18:35

Copilot AI reviewed Apr 20, 2026

bglusman added 3 commits April 20, 2026 14:51

style: apply cargo fmt across zeroclawed and clashd

8fb01bf

Copilot AI review requested due to automatic review settings April 20, 2026 18:56

Copilot started reviewing on behalf of bglusman April 20, 2026 18:56

Copilot AI reviewed Apr 20, 2026

bglusman and others added 6 commits April 20, 2026 15:19

fix: rename security-gateway to security-proxy in CI workflow

a4fb90b

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings April 20, 2026 20:46

Copilot started reviewing on behalf of bglusman April 20, 2026 20:46

Copilot AI reviewed Apr 20, 2026

bglusman merged commit 64c4fd2 into main Apr 21, 2026
13 checks passed

bglusman mentioned this pull request Apr 21, 2026

feat: agent delegation design + slash command interception #7

Closed

15 tasks

bglusman deleted the rebase/proxy-voice-onto-main branch May 1, 2026 17:21
