Skip to content

Better enclosures#4944

Merged
Alkarex merged 23 commits intoFreshRSS:edgefrom
Alkarex:better-enclosures
Jan 6, 2023
Merged

Better enclosures#4944
Alkarex merged 23 commits intoFreshRSS:edgefrom
Alkarex:better-enclosures

Conversation

@Alkarex
Copy link
Member

@Alkarex Alkarex commented Dec 14, 2022

#fix #4702
Improvement of #2898

@Alkarex Alkarex added this to the 1.21.0 milestone Dec 14, 2022
@Alkarex Alkarex marked this pull request as draft December 14, 2022 22:45
@Alkarex Alkarex marked this pull request as ready for review December 26, 2022 21:55
@Alkarex Alkarex mentioned this pull request Dec 29, 2022
@Alkarex Alkarex linked an issue Dec 29, 2022 that may be closed by this pull request
@math-GH
Copy link
Contributor

math-GH commented Dec 30, 2022

@Alkarex am I allowed to add my changes, that moves the HTML from content() to the .phtml? Or should I wait till this PR is merged and will open an additional PR later?

@Alkarex
Copy link
Member Author

Alkarex commented Dec 30, 2022

@Alkarex am I allowed to add my changes, that moves the HTML from content() to the .phtml? Or should I wait till this PR is merged and will open an additional PR later?

Do you mean #4943 ?

@Alkarex
Copy link
Member Author

Alkarex commented Dec 30, 2022

I have just spotted a couple of bugs in SimplePie that needs fixing for this PR to work with various encodings

@math-GH
Copy link
Contributor

math-GH commented Dec 30, 2022

@Alkarex am I allowed to add my changes, that moves the HTML from content() to the .phtml? Or should I wait till this PR is merged and will open an additional PR later?

Do you mean #4943 ?

#4943 was just the proof of concept.

I have just spotted a couple of bugs in SimplePie that needs fixing for this PR to work with various encodings

ok, then it will make sense to wait till this PR is done.

Encoding errors in enclosure links
}
else
{
$thumbnail['url'] = $this->sanitize($thumbnail['url'], SIMPLEPIE_CONSTRUCT_IRI);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thumbnail URL was previously returned unsanitized and unescaped, as opposed to other URLs returned by SimplePie

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: provide a base as parameter for absolutize_url to work. Check that other enclosure URLs are correct in this regard.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Alkarex
Copy link
Member Author

Alkarex commented Dec 30, 2022

@math-GH Tests very much welcome already.

I need to look into another SimplePie potential issue with a lack of absolutize_url for the thumbnail URL, but I will continue a bit later

Alkarex added a commit to FreshRSS/simplepie that referenced this pull request Dec 31, 2022
SimplePie only returned the first enclosure when using the RSS 2.0 syntax for enclosures, instead of returning them all.

Please merge simplepie#768 first.

Downtream PR: FreshRSS/FreshRSS#4944
Alkarex added a commit to FreshRSS/simplepie that referenced this pull request Dec 31, 2022
Add missing sanitize and absolute_url on thumbnail URL.

SimplePie normally sanitizes all URLs, but the thumbnail URL was not processed.

Related to simplepie#768

Downstream PR FreshRSS/FreshRSS#4944
@math-GH
Copy link
Contributor

math-GH commented Jan 5, 2023

This PR breaks the multi enclosures of Atom feeds.

Example feed: https://github.com/math-GH/testfeed/blob/main/Atomfeed-with-enclosure.xml
This 1 article has 4 enclosure images.

main branch: all 4 images will be shown
this PR: only 1 image will be shown

@Alkarex
Copy link
Member Author

Alkarex commented Jan 5, 2023

This PR breaks the multi enclosures of Atom feeds.

Example feed: https://github.com/math-GH/testfeed/blob/main/Atomfeed-with-enclosure.xml This 1 article has 4 enclosure images.

main branch: all 4 images will be shown this PR: only 1 image will be shown

@math-GH This is because the HTML content already contains those images, in order to avoid duplicates (cf. #1668 )

https://github.com/FreshRSS/FreshRSS/pull/4944/files#diff-318f03d8ac66816058c42f26b8ed4c5ff71306b6250d7096ad25e0a15813e8efR125-R127

The four enclosures are correctly stored in database and provided again e.g. via API or RSS view

{
	"enclosures": [{
			"url": "https://0.gravatar.com/avatar/347dfeae267d8d9b7c8bf959b3b97baa?s=96&d=identicon&r=G",
			"title": "tomsgedankenblog",
			"medium": "image"
		}, {
			"url": "https://agileverwaltungorg.files.wordpress.com/2022/10/grafik-4.png?w=605",
			"medium": "image"
		}, {
			"url": "https://agileverwaltungorg.files.wordpress.com/2022/10/grafik-3.png?w=605",
			"medium": "image"
		}, {
			"url": "https://agileverwaltungorg.files.wordpress.com/2022/10/grafik-5.png?w=605",
			"medium": "image"
		}
	]
}

Avoid generating all enclosures if not used
Clients are typically not showing the enclosures to the users (tested with News+, FeedMe, Readrops, Fluent Reader Lite)
@math-GH
Copy link
Contributor

math-GH commented Jan 6, 2023

@Alkarex That is a great improvement ❤️

@Alkarex
Copy link
Member Author

Alkarex commented Jan 6, 2023

Let's give it a try 🚀

@Alkarex Alkarex merged commit 8f9c414 into FreshRSS:edge Jan 6, 2023
@Alkarex Alkarex deleted the better-enclosures branch January 6, 2023 18:54
@math-GH
Copy link
Contributor

math-GH commented Jan 6, 2023

This PR breaks the multi enclosures of Atom feeds.
Example feed: https://github.com/math-GH/testfeed/blob/main/Atomfeed-with-enclosure.xml This 1 article has 4 enclosure images.
main branch: all 4 images will be shown this PR: only 1 image will be shown

@math-GH This is because the HTML content already contains those images, in order to avoid duplicates (cf. #1668 )

I do not know if it is a bug or improvement or a feature:
The thumbnail is now the blue avatar, that is the first enclosure.
What would make more sense?
To take the first enclosure as thumbnail? Or take the first image in the article?

@Alkarex
Copy link
Member Author

Alkarex commented Jan 8, 2023

I do not know if it is a bug or improvement or a feature: The thumbnail is now the blue avatar, that is the first enclosure. What would make more sense? To take the first enclosure as thumbnail? Or take the first image in the article?

@math-GH Compared to before, the sole difference should be that we now support the dedicated thumbnail RSS element, which we then use as first priority. Before this PR, we were already searching for the first image in enclosures and then in the HTML body.

mblaney pushed a commit to simplepie/simplepie that referenced this pull request Jan 12, 2023
* No URL Decode for enclosure links
This line was introduced in 8a5f601#diff-a710c236cc7775672edbe5840d8fb96b0e91b0d56eff709f6eace2da47c1fe43R349-R359

Seems to be a similar error than efb1d8e

Having an `urldecode` returns some potentially invalid URLs, while SimplePie does not do that for other returned URLs

Downstream PR FreshRSS/FreshRSS#4944

P.S. There are a few other bugs related to enclosure links, and I will try to address some of them in distinct PRs.

* Add tests

* Simplify tests

* Whitespace

* PHP 7.2 compat

* Update tests/Unit/EnclosureTest.php

Co-authored-by: Jan Tojnar <jtojnar@gmail.com>

Co-authored-by: Jan Tojnar <jtojnar@gmail.com>
mblaney pushed a commit to simplepie/simplepie that referenced this pull request Jan 12, 2023
* Sanitize thumbnail URL
Add missing sanitize and absolute_url on thumbnail URL.

SimplePie normally sanitizes all URLs, but the thumbnail URL was not processed.

Related to #768

Downstream PR FreshRSS/FreshRSS#4944

* Update tests syntax
mblaney pushed a commit to simplepie/simplepie that referenced this pull request Jan 19, 2023
* No URL Decode for enclosure links
This line was introduced in 8a5f601#diff-a710c236cc7775672edbe5840d8fb96b0e91b0d56eff709f6eace2da47c1fe43R349-R359

Seems to be a similar error than efb1d8e

Having an `urldecode` returns some potentially invalid URLs, while SimplePie does not do that for other returned URLs

Downstream PR FreshRSS/FreshRSS#4944

P.S. There are a few other bugs related to enclosure links, and I will try to address some of them in distinct PRs.

* Add tests

* Simplify tests

* Fix case of multiple RSS2.0 enclosures
SimplePie only returned the first enclosure when using the RSS 2.0 syntax for enclosures, instead of returning them all.

Please merge #768 first.

Downtream PR: FreshRSS/FreshRSS#4944

* Whitespace

* PHP 7.2 compat

* PHP 7.2 compat

* Update tests/Unit/EnclosureTest.php

Co-authored-by: Artur Weigandt <Art4@users.noreply.github.com>

* Update tests/Unit/EnclosureTest.php

Co-authored-by: Jan Tojnar <jtojnar@gmail.com>

* Update src/Item.php

Co-authored-by: Jan Tojnar <jtojnar@gmail.com>

* Update tests syntax

Co-authored-by: Artur Weigandt <Art4@users.noreply.github.com>
Co-authored-by: Jan Tojnar <jtojnar@gmail.com>
cjchung pushed a commit to cjchung/FreshRSS that referenced this pull request Jan 20, 2023
* Better enclosures
#fix FreshRSS#4702
Improvement of FreshRSS#2898

* A few fixes

* Better enclosure titles

* Improve thumbnails

* Implement thumbnail for HTML+XPath

* Avoid duplicate enclosures
#fix FreshRSS#1668

* Fix regex

* Add basic support for media:credit
And use <figure> for enclosures

* Fix link encoding + simplify code

* Fix some SimplePie bugs
Encoding errors in enclosure links

* Remove debugging syslog

* Remove debugging syslog

* SimplePie fix multiple RSS2 enclosures
#fix FreshRSS#4974

* Improve thumbnails

* Performance with yield
Avoid generating all enclosures if not used

* API keep providing enclosures inside content
Clients are typically not showing the enclosures to the users (tested with News+, FeedMe, Readrops, Fluent Reader Lite)

* Lint

* Fix API output enclosure

* Fix API content strcut

* API tolerate enclosures without a type
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] Enclosure shows only 1 image of many [Feature] Enclosures as part of greader API Duplicate images

2 participants