Skip to content

Fix support for feeds with XML preample + DTD#35

Merged
Alkarex merged 2 commits intofreshrssfrom
fix-xml-dtd
Apr 18, 2025
Merged

Fix support for feeds with XML preample + DTD#35
Alkarex merged 2 commits intofreshrssfrom
fix-xml-dtd

Conversation

@Alkarex
Copy link
Member

@Alkarex Alkarex commented Apr 18, 2025

Simple feeds with XML preample + DTD are crashing (see example in test) due to regression from simplepie@162a3d3

The culprit feature added support for html entities in xml seems to come from a misunderstanding of https://www.rssboard.org/rss-encoding-examples
And its implementation is buggy anyway, as it creates a second DOCTYPE declaration even when there is an existing one, which is completely invalid.

Example of feed: https://blog.plover.com/index.rss
Downstream Issue: FreshRSS/FreshRSS#7514

Upstream PR: simplepie#914

Simple feeds with XML preample + DTD are crashing (see example in test) due to regression from simplepie@162a3d3

The culprit feature *added support for html entities in xml* seems to come from a misunderstanding of
https://www.rssboard.org/rss-encoding-examples

Example of feed: https://blog.plover.com/index.rss
Downstream Issue: FreshRSS/FreshRSS#7514
if ($declaration->parse()) {
$data = substr($data, $pos + 2);
$data = '<?xml version="' . $declaration->version . '" encoding="' . $encoding . '" standalone="' . (($declaration->standalone) ? 'yes' : 'no') . '"?>' ."\n". $this->declare_html_entities() . $data;
$data = '<?xml version="' . $declaration->version . '" encoding="' . $encoding . '" standalone="' . (($declaration->standalone) ? 'yes' : 'no') . '"?>' . "\n" . $data;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the point of overriding the XML declaration anyway? Some kind of encoding issues?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

@Alkarex Alkarex merged commit 2e668d7 into freshrss Apr 18, 2025
20 checks passed
Alkarex added a commit to Alkarex/FreshRSS that referenced this pull request Apr 18, 2025
Alkarex added a commit to FreshRSS/FreshRSS that referenced this pull request Apr 18, 2025
Alkarex added a commit that referenced this pull request Jun 23, 2025
Alternative of simplepie#914 with support for XML feeds using HTML entities

Simple feeds with XML preample + DTD are crashing (see example in test) due to regression from simplepie@162a3d3
Its implementation is buggy, as it creates a second DOCTYPE declaration even when there is an existing one, which is completely invalid.

Example of feed: https://blog.plover.com/index.rss
Downstream Issue: FreshRSS/FreshRSS#7514
Downstream PR: #35

Note that the culprit feature *added support for html entities in xml* seems to come from a misunderstanding of
https://www.rssboard.org/rss-encoding-examples
HTML entities are only allowed in XML when there is a DTD declaring them. SimplePie is even allowing Atom documents with undeclared HTML entities - which I am not sure is on purpose.

Downstream issue: FreshRSS/FreshRSS#7687
Alkarex added a commit that referenced this pull request Jun 23, 2025
Alternative of simplepie#914 with support for XML feeds using HTML entities

Simple feeds with XML preample + DTD are crashing (see example in test) due to regression from simplepie@162a3d3
Its implementation is buggy, as it creates a second DOCTYPE declaration even when there is an existing one, which is completely invalid.

Example of feed: https://blog.plover.com/index.rss
Downstream Issue: FreshRSS/FreshRSS#7514
Downstream PR: #35

Note that the culprit feature *added support for html entities in xml* seems to come from a misunderstanding of
https://www.rssboard.org/rss-encoding-examples
HTML entities are only allowed in XML when there is a DTD declaring them. SimplePie is even allowing Atom documents with undeclared HTML entities - which I am not sure is on purpose.

Downstream issue: FreshRSS/FreshRSS#7687
@Alkarex
Copy link
Member Author

Alkarex commented Jun 23, 2025

Partial revert (allow HTML entities again) and new approach in #37

Alkarex added a commit that referenced this pull request Jun 24, 2025
* Fix support for feeds with XML preample + DTD, entities
Alternative of simplepie#914 with support for XML feeds using HTML entities

Simple feeds with XML preample + DTD are crashing (see example in test) due to regression from simplepie@162a3d3
Its implementation is buggy, as it creates a second DOCTYPE declaration even when there is an existing one, which is completely invalid.

Example of feed: https://blog.plover.com/index.rss
Downstream Issue: FreshRSS/FreshRSS#7514
Downstream PR: #35

Note that the culprit feature *added support for html entities in xml* seems to come from a misunderstanding of
https://www.rssboard.org/rss-encoding-examples
HTML entities are only allowed in XML when there is a DTD declaring them. SimplePie is even allowing Atom documents with undeclared HTML entities - which I am not sure is on purpose.

Downstream issue: FreshRSS/FreshRSS#7687

* Fix merge error
@Alkarex Alkarex deleted the fix-xml-dtd branch June 24, 2025 06:38
jtojnar pushed a commit to simplepie/simplepie that referenced this pull request Jun 29, 2025
Alternative of #914 with support for XML feeds using HTML entities

Simple feeds with XML preamble + DOCTYPE are crashing (see example in test) due to regression from 162a3d3
Its implementation is buggy, as it creates a second DOCTYPE declaration even when there is an existing one, which is completely invalid.

Example of feed: https://blog.plover.com/index.rss
Downstream Issue: FreshRSS/FreshRSS#7514
Downstream PR: FreshRSS#35

Note that the culprit feature *added support for html entities in xml* seems to come from a misunderstanding of
https://www.rssboard.org/rss-encoding-examples
HTML entities are only allowed in XML when there is a DTD declaring them. SimplePie is even allowing Atom documents with undeclared HTML entities - which I am not sure is on purpose.

Downstream issue: FreshRSS/FreshRSS#7687
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants