feat: AI model proxy, local model lifecycle, multi-provider routing, voice pipeline #10
…ng, and voice pipeline

Adds the following new capabilities on top of main:
- OpenAI-compatible HTTP proxy server ([proxy] config)
- Multi-provider routing via [[proxy.providers]] and [[proxy.model_routes]]
- Local model lifecycle management ([local_models] config) with mlx_lm/llama.cpp support and hot-swap via POST /control/local/switch
- Voice pipeline passthrough: POST /v1/audio/transcriptions, /v1/audio/speech, GET /v1/tools/manifest with optional shell hooks
- Model alloy blending ([[alloys]]) with weighted/round_robin strategies
- Traceloop and Helicone gateway integrations
- Matrix channel rewritten with raw HTTP (removes matrix-sdk dependency)
- Config validator (--validate flag)
- Mock channel for testing
- Persistent context store (optional feature)
- Agent delegation and slash command interception scaffold
- Exclude loom-tests from default cargo test via workspace default-members
Pull request overview
Adds a substantial OpenAI-compatible proxy/voice surface to zeroclawed, plus infrastructure for multi-provider routing, local model lifecycle management, and (feature-gated) persistent conversation context.
Changes:
- Introduces a new proxy server (/v1/chat/completions, /v1/models, /health) with provider routing, retries, and optional Helicone/Traceloop integrations.
- Adds local model lifecycle management + a voice passthrough layer (STT/TTS forwarding + tool manifest endpoint).
- Adds a unified context abstraction and a new persistent SQLite-backed context store (feature-gated).
Reviewed changes
Copilot reviewed 57 out of 58 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| crates/zeroclawed/src/voice/tools.rs | Builds tool manifest JSON based on configured capabilities |
| crates/zeroclawed/src/voice/mod.rs | Voice config schema for STT/TTS + hooks |
| crates/zeroclawed/src/voice/forward.rs | Generic HTTP forwarder + hook runner for voice endpoints |
| crates/zeroclawed/src/unified_context.rs | Unifies in-memory and persistent context store behind a trait/enum |
| crates/zeroclawed/src/sync.rs | Loom-aware reexports for sync primitives |
| crates/zeroclawed/src/router.rs | Adds dispatch API supporting optional model override |
| crates/zeroclawed/src/proxy/voice_handlers.rs | Axum handlers for STT/TTS passthrough + tools manifest |
| crates/zeroclawed/src/proxy/traceloop/test.rs | Adds Traceloop router unit tests (partial/placeholder) |
| crates/zeroclawed/src/proxy/traceloop/openai.rs | Traceloop OpenAI-provider implementation |
| crates/zeroclawed/src/proxy/traceloop/kimi.rs | Kimi provider wrapper via OpenAI-compatible API |
| crates/zeroclawed/src/proxy/traceloop/deepseek.rs | DeepSeek provider implementation |
| crates/zeroclawed/src/proxy/traceloop/anthropic.rs | Anthropic provider implementation + response conversion |
| crates/zeroclawed/src/proxy/streaming.rs | Stub for SSE stream transformation (currently empty) |
| crates/zeroclawed/src/proxy/routing.rs | Builds provider routing table from config patterns |
| crates/zeroclawed/src/proxy/retry_gateway.rs | Adds a retrying gateway wrapper (configurable policy) |
| crates/zeroclawed/src/proxy/retry.rs | Defines retry configuration + backoff iterator |
| crates/zeroclawed/src/proxy/openai.rs | Adds OpenAI-compatible request/response types |
| crates/zeroclawed/src/proxy/mod.rs | Proxy server bootstrapping + routes registration |
| crates/zeroclawed/src/proxy/helicone_router.rs | HTTP-based Helicone gateway integration |
| crates/zeroclawed/src/proxy/handlers.rs | Core proxy handlers: chat completions, models, health, local switch |
| crates/zeroclawed/src/proxy/auth_test.rs | Adds auth tests (currently API-mismatched) |
| crates/zeroclawed/src/proxy/auth.rs | Model access policy evaluation (auth scaffolding partially commented out) |
| crates/zeroclawed/src/proxy/alloy_router.rs | Back-compat alloy router wrapper around Helicone router |
| crates/zeroclawed/src/providers/mod.rs | Introduces provider registry + loom tests |
| crates/zeroclawed/src/providers/alloy.rs | Implements alloy selection strategies and stats |
| crates/zeroclawed/src/persistent_context.rs | Adds SQLite persistent context store (sqlx) |
| crates/zeroclawed/src/main.rs | Adds --validate, --proxy-only, proxy startup, local model autoload, unified context init |
| crates/zeroclawed/src/local_model/mod.rs | Adds local model manager and switch hooks |
| crates/zeroclawed/src/local_model/mlx_lm.rs | Spawns/stops mlx_lm.server and waits for readiness |
| crates/zeroclawed/src/install/ssh.rs | Switches mock client mutexes to loom-aware mutex |
| crates/zeroclawed/src/install/health.rs | Switches mock health checker mutexes to loom-aware mutex |
| crates/zeroclawed/src/install/executor.rs | Switches to loom-aware Arc |
| crates/zeroclawed/src/delegation.rs | Adds delegation marker parsing + delegation engine (currently incomplete) |
| crates/zeroclawed/src/context.rs | Switches to loom-aware sync primitives; adjusts exchange_count internals |
| crates/zeroclawed/src/config/validator.rs | Adds config validation support + --validate plumbing |
| crates/zeroclawed/src/commands.rs | Extends !model to list/activate alloys and trigger local/provider switches |
| crates/zeroclawed/src/channels/whatsapp.rs | Adds post-auth !model handling |
| crates/zeroclawed/src/channels/telegram.rs | Adds post-auth !model handling |
| crates/zeroclawed/src/channels/signal.rs | Adds post-auth !model handling |
| crates/zeroclawed/src/channels/mod.rs | Registers new mock channel module |
| crates/zeroclawed/src/channels/mock.rs | Adds mock channel + control API server |
| crates/zeroclawed/src/auth.rs | Updates tests/config construction for new config fields |
| crates/zeroclawed/src/adapters/openclaw_native.rs | Plumbs model_override through DispatchContext in tests |
| crates/zeroclawed/src/adapters/openclaw_channel.rs | Switches to loom-aware sync primitives; passes model_override through |
| crates/zeroclawed/src/adapters/openclaw.rs | Adds model override support in OpenClaw + NZC webhook payload |
| crates/zeroclawed/src/adapters/nzc_native.rs | Plumbs model_override through to inner adapter |
| crates/zeroclawed/src/adapters/mod.rs | Adds model_override field to DispatchContext |
| crates/zeroclawed/src/adapters/cli.rs | Makes /bin/false test OS-path tolerant |
| crates/zeroclawed/src/adapters/acp.rs | Switches to loom-aware Arc (test) |
| crates/zeroclawed/Cargo.toml | Adds features for proxy integrations + sqlx; adds deps (rand/uuid/backon/sqlx) |
| README.md | Documents proxy, routing, local models, voice pipeline, tools manifest |
| Cargo.toml | Sets workspace default-members to exclude loom-tests by default |
| curl -X POST http://localhost:8080/control/local/switch \ | ||
| -H "Content-Type: application/json" \ | ||
| -d '{"model_id": "gemma4-26b"}' | ||
| ``` |
The README’s example request body uses {"model_id": "..."}, but the handler expects a model field. Either update the handler to accept model_id (or both for compatibility) or fix the README example so users can successfully switch local models.
| // Use default control port (9090) | ||
| // TODO: Add control_port field to ChannelConfig | ||
| let control_port = 9090; | ||
|
|
This channel docstring advertises a configurable control_port, but the implementation ignores config and always binds to 9090. Either add control_port (and test_users, if intended) to the channel config schema and read it here, or remove it from the docs to avoid a misleading configuration surface.
| use super::*; | ||
| use crate::config::PolyConfig; | ||
| use crate::proxy::auth::{check_model_access, model_matches, ProxyAccessPolicy}; | ||
|
|
This new test module does not compile against the current config/auth APIs: PolyConfig.proxy is an Option<ProxyConfig> (so config.proxy.default_policy won’t compile), check_model_access returns bool (not Result), and model_matches is private to proxy::auth. Update the tests to use ProxyConfig directly (like proxy/auth.rs tests) or expose the necessary APIs in proxy::auth.
| // Build context for delegation based on mode | ||
| let delegate_text = match delegation.context { | ||
| ContextMode::None => &delegation.message, | ||
| ContextMode::Recent => { | ||
| // Get recent context and prepend | ||
| let recent = context_store.get_recent(chat_key, 5).await; | ||
| if recent.is_empty() { | ||
| &delegation.message | ||
| } else { | ||
| // Build message with context preamble | ||
| let preamble = recent.join("\n\n"); | ||
| let combined = format!("{}\n\n{}", preamble, delegation.message); | ||
| // Store combined for this call | ||
| // TODO: Properly handle lifetime here | ||
| &delegation.message | ||
| } | ||
| } | ||
| ContextMode::Fork => { | ||
| // Fork: isolated context, just the message for now | ||
| &delegation.message | ||
| } | ||
| }; | ||
|
|
||
| // Recursive call for delegation chain | ||
| self.dispatch_with_delegation( | ||
| delegate_text, |
ContextStore has no get_recent(..) method (and this file currently builds a combined string but then discards it), so this won’t compile and delegation won’t include recent context as intended. Either add the needed API to the context store / unified context abstraction, or change this to use the existing context augmentation/preamble mechanism.
| // Build context for delegation based on mode | |
| let delegate_text = match delegation.context { | |
| ContextMode::None => &delegation.message, | |
| ContextMode::Recent => { | |
| // Get recent context and prepend | |
| let recent = context_store.get_recent(chat_key, 5).await; | |
| if recent.is_empty() { | |
| &delegation.message | |
| } else { | |
| // Build message with context preamble | |
| let preamble = recent.join("\n\n"); | |
| let combined = format!("{}\n\n{}", preamble, delegation.message); | |
| // Store combined for this call | |
| // TODO: Properly handle lifetime here | |
| &delegation.message | |
| } | |
| } | |
| ContextMode::Fork => { | |
| // Fork: isolated context, just the message for now | |
| &delegation.message | |
| } | |
| }; | |
| // Recursive call for delegation chain | |
| self.dispatch_with_delegation( | |
| delegate_text, | |
| // Build context for delegation based on mode. | |
| // | |
| // NOTE: Recent-context augmentation is not assembled here because the | |
| // context store API used previously (`get_recent`) is not available in | |
| // this file's abstraction. Until recent context is plumbed through the | |
| // existing context augmentation/preamble mechanism, delegate the | |
| // message payload directly. | |
| let delegate_text = match delegation.context { | |
| ContextMode::None => delegation.message.clone(), | |
| ContextMode::Recent => delegation.message.clone(), | |
| ContextMode::Fork => { | |
| // Fork: isolated context, just the message for now | |
| delegation.message.clone() | |
| } | |
| }; | |
| // Recursive call for delegation chain | |
| self.dispatch_with_delegation( | |
| &delegate_text, |
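The suggested change above sidesteps the lifetime problem by cloning the message, but it also drops the recent-context preamble. A minimal sketch of how both could be kept, assuming the caller can supply the recent exchanges as a slice: return an owned `String` from a helper so no reference to a short-lived local has to leave a `match` arm. The `ContextMode` enum here is a stand-in mirroring the variants in the snippet, and `build_delegate_text` is a hypothetical helper, not code from this PR.

```rust
/// Stand-in for the `ContextMode` enum referenced in the snippet above.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ContextMode {
    None,
    Recent,
    Fork,
}

/// Build the text to delegate as an owned `String`.
///
/// `recent` is assumed to be provided by the caller (oldest first); how it is
/// fetched from the context store is deliberately out of scope here.
pub fn build_delegate_text(mode: ContextMode, message: &str, recent: &[String]) -> String {
    match mode {
        // Prepend the recent exchanges only when there is something to prepend.
        ContextMode::Recent if !recent.is_empty() => {
            let preamble = recent.join("\n\n");
            format!("{}\n\n{}", preamble, message)
        }
        // None, Fork, or Recent with no history: pass the message through.
        ContextMode::None | ContextMode::Recent | ContextMode::Fork => message.to_string(),
    }
}
```

The call site would then pass `&delegate_text` to `dispatch_with_delegation`, as in the suggestion.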
| // Extract agent ID from header | ||
| let agent_id = headers | ||
| .get("x-agent-id") | ||
| .and_then(|h| h.to_str().ok()) | ||
| .unwrap_or("anonymous"); | ||
|
|
||
| // Check model access for this agent | ||
| if !crate::proxy::auth::check_model_access(&state.config, agent_id, &req.model) { | ||
| warn!(agent_id = %agent_id, model = %req.model, "Model access denied"); | ||
| return api_error( | ||
| StatusCode::FORBIDDEN, | ||
| "model_access_denied", | ||
| &format!( | ||
| "Agent '{}' does not have access to model '{}'", | ||
| agent_id, req.model | ||
| ), | ||
| None, | ||
| ); | ||
| } |
/v1/chat/completions trusts the caller-provided x-agent-id header but does not enforce proxy.api_key (or any agent-specific API key) for authentication. As a result, a client can impersonate any agent and bypass model access controls. Add an Authorization check (global key and/or agent-specific keys) and derive agent_id from the authenticated key rather than from an untrusted header.
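A rough sketch of deriving the agent from the credential instead of from `x-agent-id`, assuming a key-to-agent map is available from config. The `authenticate` function and the map shape are hypothetical; the real handler would read the `Authorization` header from the request and return 401 on `None`.

```rust
use std::collections::HashMap;

/// Resolve the caller's agent id from an `Authorization: Bearer <key>` header value.
///
/// `keys` maps API key -> agent id (an assumed config shape, not the PR's schema).
/// Returns `None` when the header is missing, malformed, or the key is unknown.
pub fn authenticate<'a>(
    authorization: Option<&str>,
    keys: &'a HashMap<String, String>,
) -> Option<&'a str> {
    // Only the "Bearer " scheme is accepted in this sketch.
    let token = authorization?.strip_prefix("Bearer ")?.trim();
    if token.is_empty() {
        return None;
    }
    // A hash lookup is not constant-time; a hardened version may want a
    // constant-time comparison against each configured key instead.
    keys.get(token).map(String::as_str)
}
```

The resulting agent id, rather than the header value, would then be passed to `check_model_access`.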
Adds /hooks/claude-code endpoint that speaks Claude Code's PreToolUse hookSpecificOutput format, a Starlark policy tuned for Claude Code tools, and scripts/setup-claude-hooks.sh to wire clashd as the policy engine for Claude Code (builds, installs, launchd service, settings.json update).
- retry_gateway: use retry_if predicate so 4xx/non-retryable errors are not retried (was computing should_retry but ignoring it)
- persistent_context: remove unused Arc and Mutex imports
- mock channel: remove advertised but unimplemented control_port config
- README: fix model switch example (model_id -> model)
- style: cargo fmt --all (clashd missed in previous fmt commit)
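The retry_gateway fix can be illustrated with a std-only sketch in which the predicate actually gates the retry. This is not the backon-based code from the PR: `BackendError`, `is_retryable`, and this `retry_if` function are hypothetical, the classification (network errors and 5xx retry, everything else does not) is an assumption, and backoff between attempts is omitted.

```rust
/// Hypothetical error type for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum BackendError {
    Status(u16),
    Network(String),
}

/// Assumed policy: retry network failures and 5xx, never 4xx.
pub fn is_retryable(err: &BackendError) -> bool {
    match err {
        BackendError::Network(_) => true,
        BackendError::Status(code) => (500..=599).contains(code),
    }
}

/// Call `op` until it succeeds, the predicate rejects the error, or
/// `max_attempts` is reached. `op` receives the 1-based attempt number.
pub fn retry_if<T, F, P>(max_attempts: u32, mut op: F, should_retry: P) -> Result<T, BackendError>
where
    F: FnMut(u32) -> Result<T, BackendError>,
    P: Fn(&BackendError) -> bool,
{
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // The predicate result is consulted here, not just computed.
            Err(err) if attempt < max_attempts && should_retry(&err) => attempt += 1,
            Err(err) => return Err(err),
        }
    }
}
```

With this shape a 404 returns after one call, while a 503 is retried up to the attempt limit.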
Pull request overview
Copilot reviewed 60 out of 61 changed files in this pull request and generated 9 comments.
| let choices = response_json["choices"] | ||
| .as_array() | ||
| .unwrap_or(&vec![]) | ||
| .iter() | ||
| .map(|choice| { |
unwrap_or(&vec![]) takes a reference to a temporary Vec, which will not compile (borrowed value does not live long enough). Use unwrap_or(&[]) (slice) or bind a local empty Vec/&[] fallback before iterating.
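A sketch of the slice-fallback form, using `Option<&Vec<String>>` as a std-only stand-in for what `serde_json::Value::as_array` returns. Because `as_array` yields `Option<&Vec<Value>>`, a bare `&[]` has a different type from the `Some` payload, so the sketch first views the `Vec` as a slice with `map(Vec::as_slice)`. `total_len` is a made-up helper for illustration.

```rust
/// Sum the lengths of the strings in an optional array, treating `None` as empty.
pub fn total_len(items: Option<&Vec<String>>) -> usize {
    items
        .map(Vec::as_slice) // Option<&Vec<T>> -> Option<&[T]>
        .unwrap_or(&[]) // empty-slice fallback, no temporary Vec allocated
        .iter()
        .map(|s| s.len())
        .sum()
}
```

The same `map(Vec::as_slice).unwrap_or(&[])` chain should apply to the `choices` and `content` lookups flagged in the other provider files.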
| let choices = response_json["choices"] | ||
| .as_array() | ||
| .unwrap_or(&vec![]) | ||
| .iter() | ||
| .map(|choice| { |
unwrap_or(&vec![]) takes a reference to a temporary Vec, which will not compile. Use unwrap_or(&[]) / unwrap_or_default() on an owned collection, or store an empty Vec in a local variable and reference that.
| let content = response_json["content"] | ||
| .as_array() | ||
| .unwrap_or(&vec![]) | ||
| .iter() | ||
| .filter_map(|item| { |
.unwrap_or(&vec![]) references a temporary Vec, which will not compile (borrowed value does not live long enough). Use a slice fallback (unwrap_or(&[])) or bind an empty Vec in a local variable before iterating.
| // TODO: Actually route the message through the system | ||
| // For now, just log it | ||
| debug!("Would route message from {}: {}", req.sender, req.text); | ||
|
|
||
| // Simulate a response | ||
| let response_text = format!("Mock response to: {}", req.text); |
/send doesn’t route the message through the real system yet (it just logs and returns a hard-coded "Mock response"). That makes the mock channel unsuitable for integration testing as described; either implement dispatch via router/command_handler/context_store, or clearly scope this as a stub.
| let tool_calls = response_json["content"] | ||
| .as_array() | ||
| .unwrap_or(&vec![]) | ||
| .iter() |
.unwrap_or(&vec![]) again borrows a temporary Vec and won’t compile. Prefer unwrap_or(&[]) / a local empty collection to avoid referencing a temporary.
| // Thread 1: Register providers | ||
| let registry1 = registry.clone(); | ||
| let p1 = provider1.clone(); |
Under cfg(all(test, loom)) these tests call registry.clone(), but ProviderRegistry does not implement Clone, so this won’t compile. Wrap ProviderRegistry in crate::sync::Arc (or implement Clone) for sharing across loom threads.
| use std::sync::Arc; | ||
|
|
||
| use async_trait::async_trait; | ||
| use backon::{ExponentialBuilder, Retryable}; | ||
| use tracing::{info, warn}; | ||
|
|
||
| use crate::proxy::gateway::{BackendError, ChatCompletionRequest, ChatCompletionResponse, Gateway, ModelInfo}; | ||
|
|
This module is not referenced from proxy/mod.rs (no mod retry_gateway;), so it will never be compiled. As written, it also imports symbols that don’t exist in crate::proxy::gateway (e.g., Gateway, BackendError), so it’s likely stale/incomplete—either wire it up properly or remove it to avoid bitrot.
| // Unit tests for auth module | ||
| // These are the kinds of tests mutation testing works against | ||
|
|
||
| #[cfg(test)] | ||
| mod tests { |
This file isn’t referenced by the module tree (no mod auth_test;), so it won’t be compiled or run. Consider moving these tests to crates/zeroclawed/tests/ (integration tests) or including the module from proxy/mod.rs under #[cfg(test)] so auth behavior is actually exercised.
Adds POST /hooks/zeroclaw-audit to receive zeroclaw webhook_audit fire-and-forget payloads — evaluates against policy and logs/warns on deny verdicts (monitoring only, zeroclaw does not read the response). Adds scripts/setup-agents.sh: detect-or-install opencode (brew), openclaw (npm), zeroclaw (brew) and wire clashd policy integration for each — zeroclaw webhook_audit, openclaw exec-approvals (restricted+ask), opencode plugin stub. Supports --configure-only, --install-only, --agents.
…ay naming security-gateway crate renamed to security-proxy (binary, package name, workspace deps, crate path). Default port changed 8080→8888 to avoid conflict with the model gateway (8080). Adds SECURITY_PROXY_PORT env var override consistent with CLASHD_PORT. Removes misleading "Alloy proxy/Alloy Model Proxy Server" labels from the model gateway (proxy/mod.rs, handlers.rs, backend.rs) — alloys are one routing feature; the gateway also includes Traceloop observability, retries, Helicone, multi-provider routing, and local model management.
Single entrypoint replacing setup-claude-hooks.sh + setup-agents.sh. Builds zeroclawed, clashd, security-proxy (release), installs to ~/.local/bin/, creates launchd services for both clashd and security-proxy. Wires clashd policy hooks for all four agents: Claude Code (PreToolUse hook), opencode (plugin stub), openclaw (exec-approvals), zeroclaw (webhook_audit + autonomy). Prompts before installing any missing tool (--yes skips prompts; --configure-only skips installs entirely). Correctly detects zeroclaw needing onboard before attempting service start.
…ters install.sh gains --nodes-file <path> and --nodes-only flags. For each node in the JSON config: cross-compile locally (cross/zigbuild) or fall back to building on the remote via SSH; rsync binary + policy files; install systemd service (Linux) or launchd plist (macOS). Adds deploy/nodes.example.json documenting the node config format (host, user, ssh_key, arch, os, services, install_dir, config_dir). Supports x86_64/aarch64 Linux and macOS nodes from one build machine.
- security-proxy: security_gateway → security_proxy in integration tests
- adversary-detector: sort_by → sort_by_key
- host-agent/pct: collapse nested if into match guard
- zeroclawed: redundant field names (shorthand), manual checked_div, strip_suffix, collapsible if let in matrix.rs, useless vec!, assert! for bool, let-else for single-variant match in main.rs (extracted into_in_memory() method on UnifiedContextStore to avoid irrefutable pattern lint across feature configurations)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pull request overview
Copilot reviewed 70 out of 78 changed files in this pull request and generated 13 comments.
| [[proxy.providers]] | ||
| id = "local" | ||
| models = ["local/*", "llama*", "qwen*", "gemma*"] | ||
| url = "http://localhost:8888/v1" | ||
|
|
||
| [[proxy.providers]] | ||
| id = "fast-provider" | ||
| models = ["fast/*"] | ||
| url = "https://api.fast-provider.example.com/v1" | ||
| api_key_file = "/etc/zeroclawed/secrets/fast-key" |
The README example uses model patterns like "llama*", "qwen*", and "gemma*", but the routing implementation only supports exact match, *, and prefix/* globs. As written, these patterns will never match and routing will fall back to the default backend. Either update the docs to use supported patterns (e.g. qwen/*) or extend model_matches_pattern to support * suffix wildcards.
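If the second option is preferred, a trailing-`*` rule covers both the existing `prefix/*` form and bare suffix wildcards such as `qwen*`. The sketch below is an assumption about what an extended matcher could look like; the actual signature and argument order of `model_matches_pattern` in proxy/routing.rs are not shown in this thread.

```rust
/// Hypothetical matcher supporting exact match, bare `*`, and any trailing-`*` glob.
pub fn model_matches_pattern(pattern: &str, model: &str) -> bool {
    if pattern == "*" {
        return true;
    }
    match pattern.strip_suffix('*') {
        // "fast/*" -> prefix "fast/", "qwen*" -> prefix "qwen"
        Some(prefix) => model.starts_with(prefix),
        // No wildcard: require an exact match.
        None => pattern == model,
    }
}
```

Note that `qwen*` would also match names like `qwenish-model`, which is broader than `qwen/*`; whether that is acceptable is a routing-policy decision.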
| [[local_models.models]] | ||
| id = "qwen3-35b" | ||
| path = "~/.cache/huggingface/hub/models--unsloth--Qwen3.6-35B-A3B-MLX-8bit" | ||
| backend = "mlx_lm" | ||
| port = 8888 | ||
|
|
||
| [[local_models.models]] | ||
| id = "gemma4-26b" | ||
| path = "~/.cache/huggingface/hub/models--unsloth--gemma-4-26b-a4b-it-MLX-8bit" | ||
| backend = "mlx_lm" | ||
| port = 8888 |
The [local_models] README example doesn’t match the actual config schema: [[local_models.models]] expects hf_id (and optional provider_type/display_name), while port/host are configured under [local_models.mlx_lm]. The example currently uses path, backend, and per-model port, which won’t deserialize. Please align the documentation with LocalModelsConfig/LocalModelDef.
| [[local_models.models]] | |
| id = "qwen3-35b" | |
| path = "~/.cache/huggingface/hub/models--unsloth--Qwen3.6-35B-A3B-MLX-8bit" | |
| backend = "mlx_lm" | |
| port = 8888 | |
| [[local_models.models]] | |
| id = "gemma4-26b" | |
| path = "~/.cache/huggingface/hub/models--unsloth--gemma-4-26b-a4b-it-MLX-8bit" | |
| backend = "mlx_lm" | |
| port = 8888 | |
| [local_models.mlx_lm] | |
| host = "127.0.0.1" | |
| port = 8888 | |
| [[local_models.models]] | |
| id = "qwen3-35b" | |
| hf_id = "unsloth/Qwen3.6-35B-A3B-MLX-8bit" | |
| provider_type = "mlx_lm" | |
| display_name = "Qwen3 35B" | |
| [[local_models.models]] | |
| id = "gemma4-26b" | |
| hf_id = "unsloth/gemma-4-26b-a4b-it-MLX-8bit" | |
| provider_type = "mlx_lm" | |
| display_name = "Gemma 4 26B" |
| #[tokio::test] | ||
| async fn test_caching_behavior() { | ||
| // Create a mock router with a single provider | ||
| let _router = TraceloopRouter::new(vec![ProviderConfig { | ||
| id: "test".to_string(), | ||
| r#type: ProviderType::OpenAI, | ||
| api_key: "test-key".to_string(), | ||
| base_url: Some("http://localhost:9999".to_string()), | ||
| default_model: "test-model".to_string(), | ||
| }]) | ||
| .unwrap(); | ||
|
|
||
| let _messages = [ChatMessage { | ||
| role: "user".to_string(), | ||
| content: Some(MessageContent::Text("Test message".to_string())), | ||
| name: None, | ||
| tool_calls: None, | ||
| tool_call_id: None, | ||
| reasoning: None, | ||
| reasoning_content: None, | ||
| }]; // test only verifies code compiles (mock provider not running) | ||
|
|
||
| // First request should miss cache | ||
| // Note: This will fail because the mock provider can't connect to localhost:9999 | ||
| // In a real test, we'd use a mock HTTP server | ||
| // For now, we just verify the code compiles and the caching logic is sound | ||
| println!("Caching test would run here with proper mock setup"); | ||
| } |
This test module doesn’t currently test caching behavior: test_caching_behavior contains no assertions and just prints a message. Either convert it into a real test using a mock HTTP server (e.g. mockito) to assert cache hits/misses, or remove/mark it clearly as a placeholder so it doesn’t give a false sense of coverage.
| // Unit tests for auth module | ||
| // These are the kinds of tests mutation testing works against | ||
|
|
||
| #[cfg(test)] | ||
| mod tests { | ||
| use super::*; | ||
| use crate::config::PolyConfig; | ||
| use crate::proxy::auth::{check_model_access, model_matches, ProxyAccessPolicy}; | ||
|
|
||
| #[test] | ||
| fn test_model_matches_exact() { | ||
| // Exact match should work | ||
| assert!(model_matches("deepseek-chat", "deepseek-chat")); | ||
| assert!(model_matches("kimi/kimi-for-coding", "kimi/kimi-for-coding")); |
crates/zeroclawed/src/proxy/auth_test.rs appears to be an orphaned test file: it isn’t referenced from the module tree, and it uses an old check_model_access(...).is_ok()/is_err() API that doesn’t match the current bool return type. Consider either wiring it in via #[cfg(test)] mod auth_test; (after updating it to compile) or deleting it to avoid confusion and bitrot.
| let context = serde_json::json!({ | ||
| "agent_id": "claude-code", | ||
| "cwd": req.cwd.unwrap_or_default(), | ||
| "session_id": req.session_id.unwrap_or_default(), | ||
| }); | ||
|
|
||
| let result = state | ||
| .engine | ||
| .evaluate(&req.tool_name, &req.tool_input, Some("claude-code")) | ||
| .await; | ||
|
|
||
| let (decision, reason) = match result.verdict.to_string().as_str() { | ||
| "deny" => ("deny", result.reason), | ||
| "review" => ("ask", result.reason), | ||
| _ => ("allow", None), | ||
| }; | ||
|
|
||
| info!( | ||
| tool = %req.tool_name, | ||
| decision = %decision, | ||
| reason = ?reason, | ||
| "claude-code hook evaluated" | ||
| ); | ||
|
|
||
| // Suppress unused variable warning — context available for future use | ||
| let _ = context; | ||
|
|
This handler builds a context object with cwd/session_id, but it isn’t passed into PolicyEngine::evaluate (the engine constructs its own context), so cwd will always be empty in Starlark policies. Either remove the unused context construction, or extend PolicyEngine::evaluate to accept and merge additional context fields so policies like claude-code-policy.star can actually use cwd/session_id.
|
|
||
| // Thread 1: Register providers | ||
| let registry1 = registry.clone(); | ||
| let p1 = provider1.clone(); | ||
| let p2 = provider2.clone(); | ||
| let t1 = thread::spawn(move || { | ||
| registry1.register(p1); | ||
| registry1.register(p2); | ||
| }); | ||
|
|
||
| // Thread 2: Read while writing | ||
| let registry2 = registry.clone(); | ||
| let t2 = thread::spawn(move || { | ||
| // This might read empty or partial state | ||
| let _ids = registry2.list_ids(); | ||
| let _provider = registry2.get("provider1"); | ||
| }); |
The Loom tests call registry.clone(), but ProviderRegistry doesn’t implement Clone, so this will not compile under cfg(loom). Consider wrapping the registry in crate::sync::Arc inside the Loom test (and cloning the Arc), or implement Clone for ProviderRegistry by storing the inner map behind an Arc<RwLock<...>>.
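The first option, sharing via `Arc`, might look roughly like this. The sketch uses `std::sync` and `std::thread` so it stands alone; under Loom the same shape would use the crate's loom-aware re-exports. The `ProviderRegistry` fields and method signatures here are simplified assumptions, not the PR's actual type.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// Simplified stand-in: provider id -> base URL behind a lock, no `Clone` impl.
#[derive(Default)]
pub struct ProviderRegistry {
    providers: RwLock<HashMap<String, String>>,
}

impl ProviderRegistry {
    pub fn register(&self, id: &str, url: &str) {
        self.providers.write().unwrap().insert(id.to_string(), url.to_string());
    }

    pub fn list_ids(&self) -> Vec<String> {
        let mut ids: Vec<String> = self.providers.read().unwrap().keys().cloned().collect();
        ids.sort();
        ids
    }
}

/// Register from one thread while another reads; returns the final id list.
pub fn register_concurrently() -> Vec<String> {
    let registry = Arc::new(ProviderRegistry::default());

    // Clone the Arc handle, not the registry itself.
    let writer_handle = Arc::clone(&registry);
    let writer = thread::spawn(move || {
        writer_handle.register("provider1", "http://one.example");
        writer_handle.register("provider2", "http://two.example");
    });

    let reader_handle = Arc::clone(&registry);
    let reader = thread::spawn(move || {
        // May observe an empty, partial, or complete registry.
        let _ids = reader_handle.list_ids();
    });

    writer.join().unwrap();
    reader.join().unwrap();
    registry.list_ids()
}
```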
| let result = tokio::task::spawn_blocking({ | ||
| let path = path.to_string(); | ||
| let input = input.clone(); | ||
| move || -> Result<Vec<u8>> { | ||
| use std::io::Write; | ||
| use std::process::{Command, Stdio}; | ||
|
|
||
| let mut child = Command::new(&path) | ||
| .stdin(Stdio::piped()) | ||
| .stdout(Stdio::piped()) | ||
| .stderr(Stdio::piped()) | ||
| .spawn() | ||
| .with_context(|| format!("failed to spawn hook {path}"))?; | ||
|
|
||
| child | ||
| .stdin | ||
| .take() | ||
| .unwrap() | ||
| .write_all(&input) | ||
| .context("writing hook stdin")?; | ||
|
|
||
| let out = child.wait_with_output().context("waiting for hook")?; | ||
|
|
||
| if !out.stderr.is_empty() { | ||
| let msg = String::from_utf8_lossy(&out.stderr); | ||
| warn!(hook = %path, stderr = %msg.trim(), "hook stderr"); | ||
| } | ||
|
|
||
| if out.status.success() { | ||
| Ok(out.stdout) | ||
| } else { | ||
| anyhow::bail!("hook exited with status {}", out.status); | ||
| } | ||
| } | ||
| }) | ||
| .await; |
run_hook spawns an external process and waits for it to finish, but there’s no timeout/kill logic. A hung hook will stall the request indefinitely even though the comment implies bounded time. Consider enforcing a timeout (e.g., tokio::time::timeout around spawn_blocking and/or killing the child after a deadline) so voice requests can fail fast or degrade gracefully.
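A std-only sketch of the kill-after-deadline idea, assuming a blocking context such as the body of `spawn_blocking`. It polls `try_wait` and kills the child once the deadline passes. To stay short it discards the child's stdin/stdout/stderr; a real hook runner that pipes data would need to drain those pipes on separate threads to avoid blocking on a full pipe buffer. `run_with_deadline` is a hypothetical helper, not the PR's `run_hook`.

```rust
use std::io;
use std::process::{Child, Command, ExitStatus, Stdio};
use std::thread;
use std::time::{Duration, Instant};

/// Run `cmd`, returning `Ok(Some(status))` if it exits before `timeout`,
/// or `Ok(None)` after killing it once the deadline has passed.
pub fn run_with_deadline(cmd: &mut Command, timeout: Duration) -> io::Result<Option<ExitStatus>> {
    let mut child: Child = cmd
        .stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .spawn()?;
    let deadline = Instant::now() + timeout;

    loop {
        // Non-blocking check for exit.
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status));
        }
        if Instant::now() >= deadline {
            // Best effort: the process may have exited in the meantime.
            let _ = child.kill();
            // Reap the child so it does not linger as a zombie.
            child.wait()?;
            return Ok(None);
        }
        thread::sleep(Duration::from_millis(10));
    }
}
```

Callers can map `Ok(None)` to a "hook timed out" error so the voice request fails fast instead of hanging.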
| let client = reqwest::Client::builder() | ||
| .timeout(Duration::from_secs(config.timeout_seconds)) | ||
| .build() | ||
| .context("building reqwest client")?; |
A new reqwest::Client is constructed for every STT/TTS request. Creating clients is relatively expensive (connection pools, TLS config) and can become a throughput bottleneck. Consider reusing a single client (e.g., store it in ProxyState or cache per-upstream config) rather than rebuilding per call.
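One way to reuse a single client is a process-wide `OnceLock`. The sketch below uses a placeholder `HttpClient` struct in place of `reqwest::Client` so that it compiles with std alone; the build counter exists only to show that construction happens once. All names are hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Placeholder for an expensive-to-build HTTP client such as `reqwest::Client`.
pub struct HttpClient {
    pub user_agent: String,
}

static BUILD_COUNT: AtomicUsize = AtomicUsize::new(0);
static SHARED_CLIENT: OnceLock<HttpClient> = OnceLock::new();

/// Return the shared client, building it on first use only.
pub fn shared_client() -> &'static HttpClient {
    SHARED_CLIENT.get_or_init(|| {
        BUILD_COUNT.fetch_add(1, Ordering::SeqCst);
        HttpClient {
            user_agent: "voice-forwarder".to_string(),
        }
    })
}

/// How many times the client has been constructed (for demonstration).
pub fn build_count() -> usize {
    BUILD_COUNT.load(Ordering::SeqCst)
}
```

Since a shared client cannot carry a per-config timeout at build time, the timeout from `config.timeout_seconds` would move to each request instead.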
| "$CARGO" build --release -p clashd -p zeroclawed 2>&1 \ | ||
| | grep -E "^error|Compiling|Finished" || true | ||
| ok "Build complete" |
The cargo build pipeline masks build failures (... | grep ... || true) even though set -euo pipefail is enabled. If the build fails, the script will still print "Build complete" and then cp non-existent binaries, producing confusing errors. Consider removing the || true (and/or using grep ... || { cat; exit 1; }) so the script aborts on build failure while still filtering output.
| let unified_context_store = UnifiedContextStore::new( | ||
| config.context.buffer_size, | ||
| config.context.inject_depth, | ||
| config.context.persistent.as_ref(), | ||
| ) | ||
| .await?; | ||
|
|
||
| let context_store_arc = unified_context_store.into_in_memory()?; | ||
|
|
||
| // Clone the inner ContextStore for channel functions | ||
| let context_store = (*context_store_arc).clone(); |
UnifiedContextStore::new(...) can return a persistent store when persistent-context is enabled and [context.persistent] is configured, but main immediately calls into_in_memory()?, which will bail for the persistent variant. This makes the persistent-context feature unusable in practice. Consider plumbing UnifiedContextStore (or a dyn ContextStoreTrait) through channels/CommandHandler instead of forcing an in-memory ContextStore clone.
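A simplified, synchronous sketch of the trait-object plumbing suggested here. The names `ContextStoreTrait` and `CommandHandler` come from this thread, but the methods, the two store types, and the constructor are assumptions for illustration; the real stores are async and the persistent one is SQLite-backed.

```rust
use std::sync::{Arc, Mutex};

/// Assumed minimal interface shared by both store variants.
pub trait ContextStoreTrait: Send + Sync {
    fn push(&self, chat_key: &str, text: &str);
    fn exchange_count(&self, chat_key: &str) -> usize;
}

#[derive(Default)]
pub struct InMemoryStore {
    entries: Mutex<Vec<(String, String)>>,
}

impl ContextStoreTrait for InMemoryStore {
    fn push(&self, chat_key: &str, text: &str) {
        self.entries.lock().unwrap().push((chat_key.to_string(), text.to_string()));
    }

    fn exchange_count(&self, chat_key: &str) -> usize {
        self.entries.lock().unwrap().iter().filter(|(k, _)| k == chat_key).count()
    }
}

/// Placeholder for a persistent backend; here it only wraps the in-memory store.
#[derive(Default)]
pub struct PersistentStore {
    inner: InMemoryStore,
}

impl ContextStoreTrait for PersistentStore {
    fn push(&self, chat_key: &str, text: &str) {
        self.inner.push(chat_key, text);
    }

    fn exchange_count(&self, chat_key: &str) -> usize {
        self.inner.exchange_count(chat_key)
    }
}

/// The handler depends on the trait, so either variant can be plugged in.
pub struct CommandHandler {
    store: Arc<dyn ContextStoreTrait>,
}

impl CommandHandler {
    pub fn new(store: Arc<dyn ContextStoreTrait>) -> Self {
        Self { store }
    }

    /// Record a message and return how many exchanges the chat now has.
    pub fn handle(&self, chat_key: &str, text: &str) -> usize {
        self.store.push(chat_key, text);
        self.store.exchange_count(chat_key)
    }
}
```

With this shape `main` would hand the same `Arc<dyn ContextStoreTrait>` to every channel, and no `into_in_memory()` conversion would be needed.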
- Delete three orphaned files never wired into the module tree (retry_gateway.rs, auth_test.rs, delegation.rs)
- Read mock channel control_port from config (was hardcoded 9090)
- Document mock /send as an intentional stub with clear scope note
- Remove unused context object in clashd claude-code hook; log cwd/session_id directly in the tracing fields instead
- Fix setup-claude-hooks.sh: propagate cargo build failures instead of masking with || true; gate launchctl section on macOS
- Share a single reqwest::Client across voice forward calls (OnceLock) with per-request timeout; add 30s timeout to run_hook subprocess
- Stop sending "Bearer no-key" to unauthenticated HTTP backends: pass None through BackendConfig and skip Authorization header when api_key is empty
- Fix README: model routing patterns (llama* → llama/*), local_models schema (path/backend/port → hf_id/provider_type + mlx_lm section)
- Add comment explaining persistent-context feature is intentionally deferred until plumbed through all channel/handler call sites

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…voice pipeline (#10)

* feat: add AI model proxy, local model lifecycle, multi-provider routing, and voice pipeline

  Adds the following new capabilities on top of main:
  - OpenAI-compatible HTTP proxy server ([proxy] config)
  - Multi-provider routing via [[proxy.providers]] and [[proxy.model_routes]]
  - Local model lifecycle management ([local_models] config) with mlx_lm/llama.cpp support and hot-swap via POST /control/local/switch
  - Voice pipeline passthrough: POST /v1/audio/transcriptions, /v1/audio/speech, GET /v1/tools/manifest with optional shell hooks
  - Model alloy blending ([[alloys]]) with weighted/round_robin strategies
  - Traceloop and Helicone gateway integrations
  - Matrix channel rewritten with raw HTTP (removes matrix-sdk dependency)
  - Config validator (--validate flag)
  - Mock channel for testing
  - Persistent context store (optional feature)
  - Agent delegation and slash command interception scaffold
  - exclude loom-tests from default cargo test via workspace default-members

* style: apply cargo fmt across zeroclawed and clashd

* feat(clashd): add Claude Code hook endpoint, policy, and setup script

  Adds /hooks/claude-code endpoint that speaks Claude Code's PreToolUse hookSpecificOutput format, a Starlark policy tuned for Claude Code tools, and scripts/setup-claude-hooks.sh to wire clashd as the policy engine for Claude Code (builds, installs, launchd service, settings.json update).

* fix: address AI review feedback

  - retry_gateway: use retry_if predicate so 4xx/non-retryable errors are not retried (was computing should_retry but ignoring it)
  - persistent_context: remove unused Arc and Mutex imports
  - mock channel: remove advertised but unimplemented control_port config
  - README: fix model switch example (model_id -> model)
  - style: cargo fmt --all (clashd missed in previous fmt commit)

* feat(clashd): add zeroclaw audit hook endpoint and agent setup script

  Adds POST /hooks/zeroclaw-audit to receive zeroclaw webhook_audit fire-and-forget payloads: evaluates against policy and logs/warns on deny verdicts (monitoring only, zeroclaw does not read the response).

  Adds scripts/setup-agents.sh: detect-or-install opencode (brew), openclaw (npm), zeroclaw (brew) and wire clashd policy integration for each: zeroclaw webhook_audit, openclaw exec-approvals (restricted+ask), opencode plugin stub. Supports --configure-only, --install-only, --agents.

* refactor: rename security-gateway→security-proxy, clarify model gateway naming

  security-gateway crate renamed to security-proxy (binary, package name, workspace deps, crate path). Default port changed 8080→8888 to avoid conflict with the model gateway (8080). Adds SECURITY_PROXY_PORT env var override consistent with CLASHD_PORT.

  Removes misleading "Alloy proxy/Alloy Model Proxy Server" labels from the model gateway (proxy/mod.rs, handlers.rs, backend.rs): alloys are one routing feature; the gateway also includes Traceloop observability, retries, Helicone, multi-provider routing, and local model management.

* feat: add unified install.sh - builds all binaries, wires all agents

  Single entrypoint replacing setup-claude-hooks.sh + setup-agents.sh. Builds zeroclawed, clashd, security-proxy (release), installs to ~/.local/bin/, creates launchd services for both clashd and security-proxy. Wires clashd policy hooks for all four agents: Claude Code (PreToolUse hook), opencode (plugin stub), openclaw (exec-approvals), zeroclaw (webhook_audit + autonomy). Prompts before installing any missing tool (--yes skips prompts; --configure-only skips installs entirely). Correctly detects zeroclaw needing onboard before attempting service start.

* feat(install): add multi-node SSH deployment for Proxmox/homelab clusters

  install.sh gains --nodes-file <path> and --nodes-only flags. For each node in the JSON config: cross-compile locally (cross/zigbuild) or fall back to building on the remote via SSH; rsync binary + policy files; install systemd service (Linux) or launchd plist (macOS). Adds deploy/nodes.example.json documenting the node config format (host, user, ssh_key, arch, os, services, install_dir, config_dir). Supports x86_64/aarch64 Linux and macOS nodes from one build machine.

* fix: resolve all clippy and fmt failures from CI

  - security-proxy: security_gateway → security_proxy in integration tests
  - adversary-detector: sort_by → sort_by_key
  - host-agent/pct: collapse nested if into match guard
  - zeroclawed: redundant field names (shorthand), manual checked_div, strip_suffix, collapsible if let in matrix.rs, useless vec!, assert! for bool, let-else for single-variant match in main.rs (extracted into_in_memory() method on UnifiedContextStore to avoid irrefutable pattern lint across feature configurations)

* fix: rename security-gateway to security-proxy in CI workflow

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address all active Copilot review comments

  - Delete three orphaned files never wired into the module tree (retry_gateway.rs, auth_test.rs, delegation.rs)
  - Read mock channel control_port from config (was hardcoded 9090)
  - Document mock /send as an intentional stub with clear scope note
  - Remove unused context object in clashd claude-code hook; log cwd/session_id directly in the tracing fields instead
  - Fix setup-claude-hooks.sh: propagate cargo build failures instead of masking with || true; gate launchctl section on macOS
  - Share a single reqwest::Client across voice forward calls (OnceLock) with per-request timeout; add 30s timeout to run_hook subprocess
  - Stop sending "Bearer no-key" to unauthenticated HTTP backends: pass None through BackendConfig and skip Authorization header when api_key is empty
  - Fix README: model routing patterns (llama* → llama/*), local_models schema (path/backend/port → hf_id/provider_type + mlx_lm section)
  - Add comment explaining persistent-context feature is intentionally deferred until plumbed through all channel/handler call sites

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
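For readers skimming the commit log, the "Bearer no-key" fix described above can be illustrated with a small sketch. This is not the code from the PR: the `BackendConfig` shape, the `build_headers` helper, and the header representation are assumptions made for illustration, and the real implementation presumably sets headers on a reqwest request instead.

```rust
/// Hypothetical backend settings (an assumption, not the PR's actual struct).
/// `api_key` is `None` for backends that need no authentication.
struct BackendConfig {
    api_key: Option<String>,
}

/// Build the outgoing request headers as simple (name, value) pairs.
/// `Authorization` is added only when a non-empty key is configured,
/// so unauthenticated backends never receive a placeholder bearer token.
fn build_headers(cfg: &BackendConfig) -> Vec<(String, String)> {
    let mut headers = vec![(
        "Content-Type".to_string(),
        "application/json".to_string(),
    )];

    // Treat `None` and empty/whitespace-only keys the same way: no header.
    if let Some(key) = cfg.api_key.as_deref().filter(|k| !k.trim().is_empty()) {
        headers.push(("Authorization".to_string(), format!("Bearer {key}")));
    }

    headers
}
```

Modelling the key as `Option<String>` keeps the "no key" case explicit in the type, so no sentinel string such as "no-key" has to be passed around.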
Summary
Adds substantial new capabilities on top of main with a clean single commit rebased directly onto main HEAD:
- `[proxy]` config block, `POST /v1/chat/completions`, `GET /v1/models`, `GET /health`
- `[[proxy.providers]]` and `[[proxy.model_routes]]` pattern-matched to different backend URLs/keys/timeouts
- `[local_models]` manages mlx_lm.server or llama-server processes; `POST /control/local/switch` hot-swaps models at runtime
- `[proxy.voice.stt/tts]` routes audio to any OpenAI-compatible server; `GET /v1/tools/manifest` surfaces capabilities as tool definitions; optional shell hooks for pre/post processing
- `[[alloys]]` weighted/round_robin blending across multiple providers
- `--validate` flag for config file checking at deploy time

Replaces PR #9 (which had git history conflicts from the filter-branch operation).
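To make the routing and alloy items above more concrete, here is a minimal sketch of pattern-matched model routes and a round_robin alloy. It is an illustration under stated assumptions, not the PR's implementation: the `ModelRoute` and `RoundRobinAlloy` types, the rule that a trailing `*` matches any suffix, and first-match-wins ordering are all hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical route entry: a pattern such as "llama/*" and a provider name.
struct ModelRoute {
    pattern: &'static str,
    provider: &'static str,
}

/// Assumed matching rule: a trailing `*` matches any suffix,
/// otherwise the pattern must equal the model name exactly.
fn pattern_matches(pattern: &str, model: &str) -> bool {
    match pattern.strip_suffix('*') {
        Some(prefix) => model.starts_with(prefix),
        None => pattern == model,
    }
}

/// Return the provider of the first route whose pattern matches (first match wins).
fn resolve<'a>(routes: &'a [ModelRoute], model: &str) -> Option<&'a str> {
    routes
        .iter()
        .find(|route| pattern_matches(route.pattern, model))
        .map(|route| route.provider)
}

/// Hypothetical round_robin alloy: cycles through its members in order.
struct RoundRobinAlloy {
    members: Vec<&'static str>,
    next: AtomicUsize,
}

impl RoundRobinAlloy {
    fn new(members: Vec<&'static str>) -> Self {
        Self { members, next: AtomicUsize::new(0) }
    }

    /// Pick the next member; returns `None` if the alloy has no members.
    fn pick(&self) -> Option<&'static str> {
        if self.members.is_empty() {
            return None;
        }
        // The atomic counter lets concurrent handlers share one alloy by reference.
        let index = self.next.fetch_add(1, Ordering::Relaxed);
        Some(self.members[index % self.members.len()])
    }
}
```

With routes for `llama/*` and `gpt-4o`, a request for a model named `llama/3-8b` would resolve to the first provider, while an unmatched name yields `None` and would need a default route or an error response. A weighted strategy would replace the counter with a draw proportional to each member's weight.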
Test plan
- `cargo test`: all 18 test suites pass
- `cargo build --release -p zeroclawed` succeeds
- `--validate` validates config and exits cleanly
- `POST /control/local/switch` switches local model
- `GET /v1/tools/manifest` reflects configured capabilities