Skip to content

Agents, server-side tools and MCP#34

Merged
ScriptSmith merged 34 commits into
mainfrom
agents
May 24, 2026
Merged

Agents, server-side tools and MCP#34
ScriptSmith merged 34 commits into
mainfrom
agents

Conversation

@ScriptSmith

Copy link
Copy Markdown
Member

Adds:

  • Responses server-side tool execution loop, background mode
  • Containers and a server-side shell tool
  • Server-side MCP

@greptile-apps

greptile-apps Bot commented May 23, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR adds server-side agent capabilities to Hadrian: a Responses API tool-execution loop with background mode (queued + persisted), container sessions with a server-side shell tool, and server-side MCP with per-call approval gating. All three previous P1 findings from the prior review are addressed in this revision.

  • Background responses: a background=true flag queues the request as a DB row; a semaphore-bounded worker pool claims and executes rows via run_with_retry, which atomically resets status=in_progress and started_at before each attempt and only calls mark_background_failure after all retries are exhausted — fixing the prior terminal-state-before-retry race.
  • Inline domain-secret redaction: record_to_wire now calls redact_inline_domain_secrets on the echoed tools field so raw ShellDomainSecretInline.value strings stored for background execution are never returned to clients via GET /v1/responses/{id} — fixing the prior echo-in-plaintext issue.
  • Container file uploads: normalize_mnt_path applies the .. traversal guard uniformly to both the explicit path field and the Content-Disposition filename fallback, and file data is now read in chunks with a running byte-count check — fixing the two prior container-upload findings.

Confidence Score: 5/5

The three blocking issues from the prior review are all addressed; the only remaining finding is a non-blocking suggestion around hash stability for inline skill IDs.

All the significant correctness issues flagged in earlier reviews are resolved. The new code correctly defers terminal-state writes to after retry exhaustion, redacts inline secrets on retrieval, and guards both upload path branches against traversal. The one open finding (DefaultHasher for inline skill synthetic IDs) only matters during rolling binary upgrades and does not affect correctness within a single binary deployment.

src/services/responses_pipeline.rs — inline_skill_synthetic_id uses DefaultHasher whose stability across Rust versions is unspecified.

Important Files Changed

Filename Overview
src/services/background_executor.rs Correctly addresses prior review: stream errors no longer write terminal state before retries; failure is deferred to mark_background_failure after all retries are exhausted.
src/routes/api/responses_lookup.rs Prior P1 fix applied: inline domain-secret values are now redacted in record_to_wire via redact_inline_domain_secrets before echoing the tools field; SSE replay loop correctly terminates on terminal event types and on zero-event terminal rows.
src/services/responses_pipeline.rs Background and foreground paths correctly store the original (pre-skill-rewrite) instructions so skill preambles are never leaked through GET endpoints. DefaultHasher used for inline-skill synthetic IDs lacks a cross-version stability guarantee (P2).
src/routes/api/containers.rs Prior path-traversal finding addressed: normalize_mnt_path now applies the .. guard uniformly to both the explicit path_field and the filename fallback; incremental chunk-by-chunk file size cap check replaces the previous full-buffer approach.
src/services/mcp/client.rs Prior panic-on-multibyte-boundary issue fixed: strip_bearer_prefix now uses str::get(..7) which returns None on invalid boundaries instead of the panicking direct slice.
src/jobs/background_responses.rs run_with_retry atomically resets status+started_at before each attempt to prevent reaper interference; semaphore-bounded concurrency prevents response flood from saturating worker capacity.
src/services/mcp/executor.rs Approval gating correctly fails closed when persistence is unavailable; require_approval semantics (never/always/filter) are cleanly implemented including the read_only_hint matching rule.
src/jobs/responses_retention.rs Retention worker correctly reaps stuck in_progress rows and prunes expired response rows under a cluster-wide leader lock; MCP pending-approval sweep integrated in the same pass.

Sequence Diagram

sequenceDiagram
    participant Client
    participant Handler as POST /v1/responses
    participant DB
    participant Worker as Background Worker
    participant LLM as LLM Provider
    participant Shell as Shell Runtime (Sandbox)
    participant MCP as MCP Server

    Client->>Handler: "POST /v1/responses (background=true)"
    Handler->>DB: "INSERT responses (status=queued)"
    Handler-->>Client: "202 Accepted { id, status: queued }"

    Worker->>DB: claim_queued (SELECT FOR UPDATE SKIP LOCKED)
    DB-->>Worker: ResponseRecord
    Worker->>LLM: "execute_with_fallback (stream=true)"
    LLM-->>Worker: SSE stream

    loop Tool execution loop
        LLM-->>Worker: response.output_item.done (shell_call)
        Worker->>Shell: ExecRequest (commands, egress_policy, secrets)
        Shell-->>Worker: ExecEvent stream (stdout/stderr/exit_code)
        Worker->>LLM: continuation (function_call_output)

        LLM-->>Worker: response.output_item.done (mcp_call)
        Worker->>MCP: tools/call (with auth headers)
        MCP-->>Worker: CallToolResult
        Worker->>LLM: continuation (function_call_output)
    end

    LLM-->>Worker: response.completed
    Worker->>DB: "UPDATE responses (status=completed, last_seq_no)"

    Client->>Handler: "GET /v1/responses/{id}?stream=true"
    Handler->>DB: list_events_after(cursor)
    DB-->>Handler: SSE event log
    Handler-->>Client: "SSE replay -> data: [DONE]"
Loading
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
src/services/responses_pipeline.rs:342-348
**Unstable hash may break inline-skill mount paths across binary versions**

`DefaultHasher::new()` seeds to zero and is deterministic within a single binary, but Rust's documentation explicitly states the algorithm "is subject to change at any point in the future" — it has already changed once (SipHash 2-4 → SipHash 1-3). During a rolling deployment where a queued background response is created by binary V1 but executed by binary V2, the LLM's instructions would reference `/skills/hash_A` while the runtime mounts files at `/skills/hash_B`, causing skill files to be unreachable to every shell command the model issues in that request. A stable, content-addressed hash such as a truncated SHA-256 (or `uuid::Uuid::new_v5` with a fixed namespace) carries an explicit stability contract and costs roughly the same.

Reviews (7): Last reviewed commit: "Improve testing instructions" | Re-trigger Greptile

Comment thread src/services/mcp/client.rs
Comment thread src/routes/api/containers.rs Outdated
@ScriptSmith

Copy link
Copy Markdown
Member Author

@greptile-apps

Comment thread src/services/background_executor.rs
@ScriptSmith

Copy link
Copy Markdown
Member Author

@greptile-apps

Comment thread src/routes/api/chat.rs
Comment on lines +309 to +316
fn serialize_payload_for_storage(
payload: &crate::api_types::CreateResponsesPayload,
) -> serde_json::Value {
let mut value = serde_json::to_value(payload).unwrap_or(serde_json::Value::Null);
strip_input_file_data(&mut value);
strip_mcp_credentials(&mut value);
value
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 security Inline domain secret values stored and echoed in plaintext

serialize_payload_for_storage strips MCP authorization/headers credentials but does not strip ShellDomainSecretInline.value. A caller who sends a shell tool request with network_policy.domain_secrets[].type = "inline" (carrying a raw secret value like an API key) will have that value persisted verbatim in responses.request_payload. The field is then echoed back via GET /v1/responses/{id} because "tools" is in ECHO_FIELDS at responses_lookup.rs:93. The same MCP-credential-stripping pattern should apply here: redact (or omit) inline.value from the stored payload, mirroring how authorization is removed from mcp entries.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/routes/api/chat.rs
Line: 309-316

Comment:
**Inline domain secret values stored and echoed in plaintext**

`serialize_payload_for_storage` strips MCP `authorization`/`headers` credentials but does not strip `ShellDomainSecretInline.value`. A caller who sends a shell tool request with `network_policy.domain_secrets[].type = "inline"` (carrying a raw secret value like an API key) will have that value persisted verbatim in `responses.request_payload`. The field is then echoed back via `GET /v1/responses/{id}` because `"tools"` is in `ECHO_FIELDS` at `responses_lookup.rs:93`. The same MCP-credential-stripping pattern should apply here: redact (or omit) `inline.value` from the stored payload, mirroring how `authorization` is removed from `mcp` entries.

How can I resolve this? If you propose a fix, please make it concise.

@ScriptSmith ScriptSmith changed the title Agents Agents, server-side tools and MCP May 24, 2026
@ScriptSmith ScriptSmith merged commit daed2c1 into main May 24, 2026
19 of 20 checks passed
@ScriptSmith ScriptSmith deleted the agents branch May 30, 2026 12:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant