
fix(cli): refuse to extract zip entries that escape the destination (CWE-22 / Zip Slip) #7870

Closed
Jaeyoung Yun (JAE0Y2N) wants to merge 1 commit into langchain-ai:main from JAE0Y2N:fix/cwe22-zipslip-safe-extract


Jaeyoung Yun (JAE0Y2N) commented May 20, 2026


Resolves #7871.

While reviewing langgraph_cli.templates._download_repo_with_requests I noticed it passes the downloaded template archive straight through ZipFile.extractall(path). Python's extractall doesn't validate per-entry paths, so an archive containing entries like ../../etc/passwd (or an absolute path) writes outside the destination the user selected — classic Zip Slip / CWE-22.
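
To make the traversal concrete, here is a small sketch of how such member names resolve when they are joined onto a destination without validation. It does path arithmetic only and touches no files. The destination directory and the helper name `resolve_naively` are hypothetical. CPython's `zipfile` documentation describes `extract()` as stripping leading slashes and removing `..` components from member names, so it is worth checking how much of this a given interpreter already blocks. The sketch does not depend on that behavior.

```python
import posixpath


def resolve_naively(dest: str, member: str) -> str:
    """Join a member name onto dest the way a non-validating extractor would."""
    return posixpath.normpath(posixpath.join(dest, member))


dest = "/home/user/my-project"  # hypothetical destination chosen by the user

for member in ("app/main.py", "../../etc/passwd", "/tmp/anywhere"):
    target = resolve_naively(dest, member)
    # An entry is safe only if the destination is a prefix of the resolved path.
    inside = posixpath.commonpath([dest, target]) == dest
    print(f"{member!r} -> {target} (inside destination: {inside})")
```

Only the first entry stays under the destination. The `..` entry climbs out of it, and the absolute entry discards the destination entirely because `join` restarts at an absolute component.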

In normal use the templates come from langchain-ai/* template repos over HTTPS, so the realistic precondition for an exploit is a supply-chain compromise of one of those template repos or a TLS MITM against github.com. Both are unlikely, and I confirmed there's no user-controlled URL injection vector (any --template value that is not a key of TEMPLATE_ID_TO_CONFIG is rejected, and TEMPLATES is a hardcoded dict). So this is defense-in-depth rather than an active vulnerability.

That said, the fix is one small helper and matches the standard hardening that other project scaffolders have already adopted (npm tar v7+, pnpx create-X-app flows, etc.). Refusing malformed archives at the helper boundary is cheap and protects future code paths that might pull from less-trusted sources (community templates, mirrors, internal proxies, test fixtures).

What the patch does

_safe_extract(zip_file, path) runs before extractall and rejects any entry whose normalized path would land outside path:

  • Entries with absolute paths or drive-prefixed paths are rejected outright.
  • Every remaining entry is resolved via os.path.realpath(os.path.join(real_dest, member)) and compared to os.path.realpath(path) via os.path.commonpath. If the resolved entry is not under the destination, the helper raises an error.
  • The error is surfaced to the user (clear message, non-zero exit) rather than silently writing the offending files.

No behavior change for well-formed archives.

Test plan

  • python3 -c "import ast; ast.parse(open('libs/cli/langgraph_cli/templates.py').read())" — parses cleanly.
  • Manual reproducer (locally): build a 2-entry ZIP with one valid file and one entry named ../escape.txt, point _safe_extract at a temp dir, confirm it raises ValueError and exits non-zero. (Happy to add a unit test under libs/cli/tests/ if you'd like — let me know which test module is the right home; I don't see existing tests for templates.py.)
  • _download_repo_with_requests is unchanged for the well-formed path the production templates always take.
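
The two-entry archive from the manual reproducer can be built in memory, which may also be a convenient starting point for a unit-test fixture. This is a sketch: the function name `build_zip_slip_fixture` and the benign entry name are hypothetical, and only standard-library calls are used.

```python
import io
import zipfile


def build_zip_slip_fixture() -> bytes:
    """Return the bytes of a two-entry ZIP: one benign file, one traversal entry."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Hypothetical benign entry, shaped like a template archive member.
        archive.writestr("template-main/README.md", "hello\n")
        # Traversal entry: writestr stores the name as given.
        archive.writestr("../escape.txt", "should never be written\n")
    return buffer.getvalue()


# Inspect the stored names without extracting anything.
with zipfile.ZipFile(io.BytesIO(build_zip_slip_fixture())) as archive:
    print(archive.namelist())
```

A test under libs/cli/tests/ could open these bytes with `zipfile.ZipFile`, pass the archive to the helper with a temporary directory as the destination, and assert that `ValueError` is raised and that nothing was written outside that directory.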

…CWE-22)

`_download_repo_with_requests` in `langgraph_cli/templates.py` passes the
downloaded template archive straight through `ZipFile.extractall(path)`.
Python's `extractall` does not validate per-entry paths, so a malicious
archive containing entries like `../../etc/passwd` or `/tmp/anywhere` will
write to whatever the running user can write to — outside the project
directory the operator selected.

Today the templates are fetched from langchain-ai/* repos over HTTPS, so
the realistic precondition is a supply-chain compromise of one of those
template repos or a TLS MITM between `request.urlopen` and github.com.
Both are unlikely in normal use — but defense-in-depth is cheap here and
mirrors the standard hardening that other Python project scaffolders
have already adopted.

This patch introduces `_safe_extract`, which:

  * rejects entries with absolute paths or drive-prefixed paths outright
  * resolves each entry against the destination via `os.path.realpath`
    and refuses any entry whose resolved path is not within the
    destination (`os.path.commonpath` check)
  * surfaces a clear error and exits non-zero if a malformed archive is
    seen, rather than silently writing files outside the project

The `*-main` cleanup loop is unchanged — it only runs after extraction
succeeds and only touches entries inside `path`.

No behavior change for well-formed archives (which is every archive the
helper currently sees in production); this only refuses inputs that
would have escaped the destination today.

Refs: CWE-22 (Path Traversal), specifically the Zip-Slip variant
(https://snyk.io/research/zip-slip-vulnerability).

github-actions Bot commented May 20, 2026


This PR has been automatically closed because you are not assigned to the linked issue.

External contributors must be assigned to an issue before opening a PR for it. Please:

  1. Comment on the linked issue to request assignment from a maintainer
  2. Once assigned, your PR will be reopened automatically

Maintainers: reopen this PR or remove the missing-issue-link label to bypass this check.


Development

Successfully merging this pull request may close these issues.

[security] CLI: templates._download_repo_with_requests extracts ZIP archives without per-entry path validation (CWE-22 / Zip Slip)
