
fix(web): block SSRF in web fetch#1544

Merged: esengine merged 1 commit into esengine:main from T1anjiu:fix/web-fetch-ssrf-guard on May 22, 2026


T1anjiu commented May 22, 2026


Summary

Add SSRF protection to web_fetch so it cannot reach internal, loopback, or reserved network targets.

Problem

web_fetch previously accepted any http(s) URL and followed redirects automatically. That left a straightforward SSRF path:

  • direct access to 127.0.0.1 / ::1
  • access to private, link-local, and metadata IP ranges
  • redirect chains that start on a public host and end on an internal one

The impact is not just fetching the wrong page. It can expose local or LAN services to the model through a tool boundary that was meant to stay public.

Fix

  • In src/tools/web.ts, validate the scheme up front and only allow http: and https:.
  • Reject literal IP addresses that fall within internal or reserved ranges.
  • Resolve hostnames with dns.lookup() and reject any result that lands in an internal range.
  • Switch redirect handling from automatic following to manual processing, and re-check every hop.
  • Add a redirect cap to avoid long or looping chains.
  • Keep the existing timeout and response-size limits intact.
  • Cover the behavior with regression tests for:
    • direct loopback access
    • DNS names that resolve to internal addresses
    • redirects into internal addresses

Verification

  • npx vitest run tests/web-tools.test.ts --silent
  • npm run typecheck
  • npm run lint
  • npm run verify

All passed. lint still reports one pre-existing warning in an unrelated test file.

Notes

A few things were worth checking while doing this:

  • The DNS check adds one extra lookup before the request, which is intentional.
  • The guard is conservative: if a hostname resolves to both public and internal addresses, it gets blocked.
  • Redirects are no longer automatically trusted, which is the right tradeoff for this tool.
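The conservative DNS rule and the per-hop redirect check can be sketched together as below. This is an illustration under stated assumptions, not the merged implementation: `GuardDeps`, `assertPublic`, `guardedFetch`, and the `MAX_REDIRECTS` value of 5 are all hypothetical (the PR does not state its redirect cap), and the network calls are injected so the logic can be exercised without real DNS or HTTP.

```typescript
// Hypothetical shapes for illustration; not the PR's actual code.
interface Hop {
  status: number;
  location: string | null; // value of the Location header, if any
}

interface GuardDeps {
  resolve: (hostname: string) => Promise<string[]>; // all addresses for a host
  isInternal: (ip: string) => boolean;              // internal/reserved range check
  fetchOnce: (url: string) => Promise<Hop>;         // one request, redirects NOT followed
}

const MAX_REDIRECTS = 5; // assumed cap; the PR does not state the value

// Reject non-http(s) schemes and any host with at least one internal address.
async function assertPublic(url: URL, deps: GuardDeps): Promise<void> {
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    throw new Error(`blocked scheme: ${url.protocol}`);
  }
  const addresses = await deps.resolve(url.hostname);
  // Conservative rule: one internal address among many blocks the whole host.
  if (addresses.length === 0 || addresses.some((ip) => deps.isInternal(ip))) {
    throw new Error(`blocked internal target: ${url.hostname}`);
  }
}

// Follow redirects manually, re-validating every hop; returns the final URL.
async function guardedFetch(start: string, deps: GuardDeps): Promise<string> {
  let current = new URL(start);
  for (let hop = 0; hop <= MAX_REDIRECTS; hop++) {
    await assertPublic(current, deps);
    const res = await deps.fetchOnce(current.href);
    if (res.status < 300 || res.status >= 400 || res.location === null) {
      return current.href;
    }
    current = new URL(res.location, current.href); // resolve relative Location
  }
  throw new Error("too many redirects");
}
```

In a Node setting, `resolve` would plausibly wrap `dns.lookup()` with the `all: true` option and `fetchOnce` would wrap `fetch(url, { redirect: "manual" })`. Note that `URL.hostname` keeps the square brackets around IPv6 literals, so a real `resolve` would need to strip them before checking the address.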

@esengine esengine merged commit f0f1ac6 into esengine:main May 22, 2026
4 checks passed
esengine added a commit that referenced this pull request May 22, 2026
…se (#1565)

* chore(release): 0.49.0 — static-history TUI, queued steers, Bing default, lifecycle plans

Headline themes:
- TUI: Static-history renderer is the only path; virtual-viewport layers removed (#1529 stages 1-4)
- Chat: queued mid-turn steer handling so input mid-render doesn't drop or fight the live frame (#1501)
- Web search: default switches to Bing; dashboard engine switcher; Mojeek dropped (#1558)
- Plans: lifecycle evidence summaries surface why a plan is ready to accept (#1500)
- Desktop: native OS notifications for approvals + completion (#1519)
- i18n: CLI command output (/mcp /sessions /prune /theme) + approval-prompt labels translated (#1524, #1560)
- Security: SSRF block in web_fetch (#1544), edit-snapshot path containment (#1454), shell redirect sandbox (#1457), Task integrity guardrail (#1516)
- Tools: per-turn dispatch-rate limit (#1356); run_command discourages shell-based edits (#1514)
- Client: DeepSeek 429 → concurrency-limit hint (#1526); timeoutMs honored with AbortSignal (#1535); --no-proxy opt-out for direct route (#1507)
- Files: read/edit/restore preserves source encoding (GB18030 / UTF-8 BOM) (#1518)
- Context: pinned constraints survive folds + full tail capture (#1515, #1552)
- Refactor: lifecycle risk policy extracted into its own module (#1557)

See CHANGELOG for the full list.

* fix(context): align fold summary prefix with main agent for cache reuse

The summarizer call was sending a bespoke "You compress conversation
history" system prompt and no tools, guaranteeing a 0% cache hit
against the main agent's just-cached prefix. Reshape the request so
system + tools + head bytes mirror the live agent's last call — the
only novel bytes are the trailing summarize instruction.

Skill-pin handling now collects bodies read-only instead of stubbing
mid-head, so the cache prefix stays unbroken. The summarize
instruction names pinned skills so the model knows not to paraphrase
their bodies (which we append verbatim regardless).

Measured on a real session at 48.7K prompt tokens:
  OLD shape: 0.0% cache hit  → $0.145 per fold
  NEW shape: 99.6% cache hit → $0.015 per fold
  saving: 89.6% per fold

* tools: add fold-cache shape + live benchmarks

bench-fold-cache-shape.mjs replays real session jsonls, simulates
OLD vs NEW summary-call shapes at the fold point, and reports
byte-level shared-prefix with the main agent's preceding request.
Pure local — no API required.

bench-fold-cache-live.mjs sends one priming + two summary calls to
DeepSeek and reports prompt_cache_hit_tokens / cost for each shape.
Used to confirm the shape change actually translates to API-side
cache hits.

---------

Co-authored-by: reasonix <reasonix@deepseek.com>
