Skip to content

fix(gateway): harden WeCom callback against XML DoS and XXE#8686

Open
Dusk1e wants to merge 1 commit into
NousResearch:mainfrom
Dusk1e:fix/wecom-xml-dos-protection
Open

fix(gateway): harden WeCom callback against XML DoS and XXE#8686
Dusk1e wants to merge 1 commit into
NousResearch:mainfrom
Dusk1e:fix/wecom-xml-dos-protection

Conversation

@Dusk1e

@Dusk1e Dusk1e commented Apr 13, 2026

Copy link
Copy Markdown
Contributor

Summary

This PR hardens the WeCom callback endpoint against XML-based Denial of Service (DoS) and XML External Entity (XXE) attacks.

Why

The previous implementation directly parsed raw callback XML using ElementTree before validation. This exposed a vulnerability where large payloads or malicious DOCTYPE/ENTITY definitions could lead to resource exhaustion or information leakage.

Changes

  • Payload Sanitization: Added a 256 KB body size limit for WeCom callbacks.
  • Pre-parse Validation: Introduced regex-based extraction for the Encrypt field, avoiding full XML parsing for raw messages.
  • Parser Security: Added explicit checks to reject XML containing DOCTYPE or ENTITY tags before they reach the parser.
  • Decryption Safety: Only validated and decrypted content is processed, significantly reducing the attack surface.

Verification

Added comprehensive security tests in tests/gateway/test_wecom_callback.py:

  • Verified rejection of unsafe XML (XXE/Billion Laughs patterns).
  • Confirmed that large payloads (>256 KB) are blocked.
  • Validated that decryption only proceeds after pre-parse checks pass.

@LonggTeng

Copy link
Copy Markdown

Independent audit confirms the underlying issue this PR is addressing.

Two validation notes that may help review:

  1. The core risk is real because gateway/platforms/wecom_callback.py calls ET.fromstring(body) on the raw callback body before decryption or trust establishment.
  2. On the current Python runtime, stdlib xml.etree.ElementTree does expand internal entities. A small local test expanded a 349-byte XML payload into a 1,000,000-character text node, so the parser-amplification or DoS angle is not just theoretical.

One nuance: this endpoint is running under aiohttp, whose default client_max_size is 1 MiB unless overridden. So the impact is better described as pre-auth XML parser amplification and resource pressure, not unlimited raw body upload. The mitigation direction in this PR looks aligned with that threat model: avoid full XML parse on the untrusted outer body, reject dangerous constructs early, and apply a tighter body limit.

@alt-glitch alt-glitch added type/security Security vulnerability or hardening P1 High — major feature broken, no workaround platform/wecom WeCom / WeChat Work adapter comp/gateway Gateway runner, session dispatch, delivery labels Apr 28, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Related to #10192 — both harden WeCom callback XML parsing. #10192 uses defusedxml + 64KB limit; this PR uses regex extraction + DOCTYPE/ENTITY rejection + 256KB limit. Consider consolidating.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/gateway Gateway runner, session dispatch, delivery P1 High — major feature broken, no workaround platform/wecom WeCom / WeChat Work adapter type/security Security vulnerability or hardening

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants