fix(lexers): prevent ReDoS in archetype lexer GUID and ID patterns #3064

Merged
birkenfeld merged 1 commit into pygments:master from zsbahtiar:fix/archetype-redos-backtracking
Mar 25, 2026

@zsbahtiar
Contributor

@zsbahtiar zsbahtiar commented Mar 25, 2026

The GUID-matching regex at line 296 uses nested repeating quantifiers, (\d|[a-fA-F])+(-(\d|[a-fA-F])+){3,}, which causes catastrophic backtracking on crafted input. Additionally, the unbounded \w+ in the archetype_id pattern causes O(n²) behavior when the lexer tests the pattern at every position of a long non-matching input.
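To make the GUID concern concrete, here is a minimal sketch. It is not the diff from this PR: it assumes the alternation inside the repeat can be folded into a single character class, and the names GUID_NESTED, GUID_FLAT, SAMPLES and agree are hypothetical. re.ASCII is set so that \d and [0-9] cover the same characters.

```python
import re

# Pattern as quoted in the PR description: an alternation repeated inside a group.
GUID_NESTED = re.compile(r"(\d|[a-fA-F])+(-(\d|[a-fA-F])+){3,}", re.ASCII)

# Hypothetical flattened form (an assumption, not the merged change):
# one character class per hex run, no alternation inside the repeat.
GUID_FLAT = re.compile(r"[0-9a-fA-F]+(-[0-9a-fA-F]+){3,}", re.ASCII)

SAMPLES = [
    "15E82D77-7DB7-4F70-8D8E-EED6FF241B2D",  # well-formed GUID
    "abc-123-def-456",                       # one hex run plus three dashed runs
    "abc-123-def",                           # only two dashed runs
    "A" * 50 + "-",                          # same shape as the crafted input
]


def agree(text: str) -> bool:
    """Return True if both patterns accept or reject `text` identically."""
    nested = GUID_NESTED.fullmatch(text)
    flat = GUID_FLAT.fullmatch(text)
    return (nested is None) == (flat is None)


if __name__ == "__main__":
    for sample in SAMPLES:
        print(f"{sample[:40]!r:45} agree={agree(sample)}")
```

If the two forms agree on the inputs that matter, the flattened one is the safer choice because each character can be consumed in only one way.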

BEFORE

python3 -c "
import time
from pygments.lexers import AdlLexer
from pygments import lex
malicious_input = 'A' * 10000 + '-'
lexer = AdlLexer()
start = time.time()
list(lex(malicious_input, lexer))
elapsed = time.time() - start
print(f'Elapsed time: {elapsed:.2f}s')
"
Elapsed time: 4.34s

AFTER

python3 -c "
import time
from pygments.lexers import AdlLexer
from pygments import lex
malicious_input = 'A' * 10000 + '-'
lexer = AdlLexer()
start = time.time()
list(lex(malicious_input, lexer))
elapsed = time.time() - start
print(f'Elapsed time: {elapsed:.2f}s')
"
Elapsed time: 0.11s
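The gap between the two timings is consistent with the quadratic behavior described in the PR text. The sketch below illustrates that effect under stated assumptions and is not Pygments code: SIMPLE_UNBOUNDED and SIMPLE_BOUNDED are hypothetical stand-ins for the archetype_id rule, and scan_every_position only imitates a lexer retrying one rule at each offset.

```python
import re
import time

# Hypothetical stand-ins for the archetype_id rule (not the real Pygments regex).
SIMPLE_UNBOUNDED = re.compile(r"[a-zA-Z]\w+(-[a-zA-Z]\w+){2}\.v\d+")
SIMPLE_BOUNDED = re.compile(r"[a-zA-Z]\w{1,100}(-[a-zA-Z]\w{1,100}){2}\.v\d+")


def scan_every_position(pattern: re.Pattern, text: str) -> int:
    """Try `pattern` at every offset of `text`, as a lexer may do on bad input.

    Returns the number of offsets at which the pattern matched.
    """
    hits = 0
    for pos in range(len(text)):
        if pattern.match(text, pos) is not None:
            hits += 1
    return hits


def timed(pattern: re.Pattern, text: str) -> float:
    """Return wall-clock seconds spent scanning `text` with `pattern`."""
    start = time.perf_counter()
    scan_every_position(pattern, text)
    return time.perf_counter() - start


if __name__ == "__main__":
    for n in (1000, 2000, 4000):
        crafted = "A" * n + "-"
        print(
            n,
            f"unbounded={timed(SIMPLE_UNBOUNDED, crafted):.4f}s",
            f"bounded={timed(SIMPLE_BOUNDED, crafted):.4f}s",
        )
```

With the unbounded form, each attempt walks to the end of the run of word characters before failing, so doubling the input should roughly quadruple the time. With the bounded form, each attempt stops after at most 100 characters, so growth should stay close to linear. Exact numbers depend on the machine and Python version.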

Issues:

@MRigal

MRigal commented Mar 25, 2026

and solves #3065 (duplicate)

@zsbahtiar
Contributor Author

kindly check, thanks @birkenfeld @BruceMcRooster

@mehroz-muzaffar-32

@zsbahtiar any timeline on when it is expected to be merged?

@zsbahtiar
Contributor Author

zsbahtiar commented Mar 25, 2026

> any timeline on when it is expected to be merged?

not sure, the sooner the better though @mehroz-muzaffar-32
maybe @birkenfeld @Anteru can answer

Contributor

@BruceMcRooster BruceMcRooster left a comment


I successfully reproduced the performance degradation and improvement on the author's example for the lines 38–39 changes. This does seem to be the result of catastrophic backtracking in the archetype expression, because reverting just that resurfaced the issue.
I could not produce a meaningful performance difference for the line 296 change, but this is of course more of a hardening than an actual fix, so this is somewhat expected.

I also looked over some examples of ADL code (I've never actually worked with it before), which seem to suggest the artificial 100-character limit is reasonable.

As a last, slightly pedantic note: the extra-long line concerned me a little, but after checking other files (wc -L), line 38's 109 characters is certainly not the longest among the lexers, so it seems acceptable to leave. There also isn't a semantically clean way to break the expression, except perhaps with a newline after the namespace (the part followed by ::); any other split separates the identity from the concept name, which (from my limited understanding of ADL archetypes) fragments the pattern at a conceptual level.
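On the line-length point, Python can split a long pattern without changing it, because adjacent string literals are joined at compile time. The sketch below uses a hypothetical, simplified archetype-ID pattern (not the regex on line 38) and breaks it only after the optional namespace, as suggested in this comment. ONE_LINE, SPLIT and is_archetype_id are illustrative names.

```python
import re

# Hypothetical, simplified archetype-ID pattern written on a single line.
ONE_LINE = r"([a-zA-Z]\w{1,100}(\.[a-zA-Z]\w{1,100})*::)?[a-zA-Z]\w{1,100}(-[a-zA-Z]\w{1,100}){2}\.\w{1,100}\.v\d+"

# The same pattern split after the namespace (the part that ends in "::").
# Adjacent string literals are concatenated by the Python compiler.
SPLIT = (
    r"([a-zA-Z]\w{1,100}(\.[a-zA-Z]\w{1,100})*::)?"
    r"[a-zA-Z]\w{1,100}(-[a-zA-Z]\w{1,100}){2}\.\w{1,100}\.v\d+"
)


def is_archetype_id(text: str) -> bool:
    """Return True if `text` matches the simplified pattern in full."""
    return re.fullmatch(SPLIT, text) is not None


if __name__ == "__main__":
    print(SPLIT == ONE_LINE)
    print(is_archetype_id("org.openehr::openEHR-EHR-OBSERVATION.blood_pressure.v1"))
    print(is_archetype_id("openEHR-EHR-OBSERVATION.blood_pressure.v1"))
```

The split keeps the identity and concept name on one line, so the second literal still reads as a single unit.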

Overall, seems to effectively prevent ReDoS for this scenario, and I consider it ready to merge.

docxology added a commit to ActiveInferenceInstitute/GeneralizedNotationNotation that referenced this pull request Mar 25, 2026
PyPI remains at 2.19.2; use git pin to zsbahtiar/pygments@2be805d (pygments/pygments#3064) until an official release >2.19.2. Regenerated uv.lock.

Made-with: Cursor
@lucag-fenergo

Please +1 more approval for this

@pellepim

Thorough. 👍

@birkenfeld
Member

Well, this shouldn't do any harm at least.

Thanks for the fix, I'll try to get a release out this weekend.

As others have said, this is not considered a security bug. I just don't have energy to "fight" the CVE/GHSA.

@xgboosted

> Thanks for the fix, I'll try to get a release out this weekend.

Can the release happen ASAP please? The CI pipelines have been breaking for the last couple of days.

Thank you 😊

@phw

phw commented Mar 26, 2026

@xgboosted see birkenfeld's comment above

Edit: And also #3065 (comment)

@birkenfeld birkenfeld mentioned this pull request Mar 26, 2026
@Anteru Anteru added this to the 2.20.0 milestone Mar 26, 2026
@Anteru Anteru added the A-lexing area: changes to individual lexers label Mar 26, 2026
@kierun

kierun commented Mar 27, 2026

> > Thanks for the fix, I'll try to get a release out this weekend.
>
> Can the release happen ASAP please? The CI pipelines have been breaking for the last couple of days.
>
> Thank you 😊

If this is an issue, you can always ignore this one CVE. Most tools will allow you to do that…

@gwpmad

gwpmad commented Mar 27, 2026

> > > Thanks for the fix, I'll try to get a release out this weekend.
> >
> > Can the release happen ASAP please? The CI pipelines have been breaking for the last couple of days.
> > Thank you 😊
>
> If this is an issue, you can always ignore this one CVE. Most tools will allow you to do that…

Yep, if it's pip-audit that's complaining, --ignore-vuln CVE-2026-4539 works.

jcbianic added a commit to jcbianic/apicurio-serdes that referenced this pull request Mar 27, 2026
Pygments <=2.19.2 has a ReDoS in the ADL lexer (CVE-2026-4539).
The fix is merged upstream (pygments/pygments#3064) but not yet
released. Pygments is a dev-only transitive dependency.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
KingPegasus added a commit to KingPegasus/XrayRadar that referenced this pull request Mar 28, 2026
- **Email templates**
  - HTML emails now use the new PNG logo for branding in the header.
  - When `XRAYRADAR_BASE_URL` is localhost (or otherwise unsuitable for loading remote images), templates show a **text “XrayRadar” fallback** instead of a broken logo image.
- **Dependencies**
  - `requests>=2.33.0,<3.0` for CVE-2026-25645 (transitive via `resend`).
  - Dev extra: `pygments` installed from git **master** (includes Pygments [#3064](pygments/pygments#3064) for CVE-2026-4539); `uv.lock` pins the resolved commit. Revert to a normal PyPI constraint when a patched release ships.
  - Dev extra: `pip-audit` so `uv run pip-audit` audits the project `.venv`. CI and docs pass `--ignore-vuln CVE-2026-4539` because git-installed Pygments still reports version `2.19.2` in metadata until a patched PyPI release; remove the flag when switching back to PyPI Pygments.
@Anteru
Collaborator

Anteru commented Mar 29, 2026

Pygments 2.20, which fixes this, has been released.
