-
-
Notifications
You must be signed in to change notification settings - Fork 217
Generate HTML episode description only when needed. #1246
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I like the episode art url. gpodder-core has also added it, downloading the image when downloading the episode, deleting it when deleting the episode. They use it for the embedded player, though: useless to us. Nevermind: not the subject of this PR. I used to be against overriding the description when the author has taken the time to make it different than the html description. But I can't find examples anymore so I'm willing to let it go. Is performance OK when searching now that the text_description is generated for each keypress (and also for displaying when display description is enabled) ? If OK, no objection. As there is a schema change, shall we also add the chapters string field so we can add podlove chapters support in gpodder-core since 2014 later? This could go in a different PR but should be in the same release. Last missing field from gpodder-core episode model would be the subtitle, but I don't really see the point of it. What do you think. |
elelay
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR.
Please see previous comment. At least change the column name in SQLite.
Code seems fine but I haven't run the branch yet.
|
Field renamed.
I've been using a patch since August that does exactly that, and it displays the image in text and html shownotes, as well as in the episode tooltips. Just need to find some time to fix a couple minor issues and add settings to disable it.
http://feeds.feedburner.com/se-radio stores the same feed description in the text description for every episode, and each html description is unique per episode. There could be feeds that do the same but reverse the fields. The problem is that episode list and EQL only used the text description and shownotes used the html description unless it was empty. Shownotes, EQL and episode list should use the same description for consistency, so it doesn't make sense to keep both around if we are going to prioritize one over the other.
I actually didn't benchmark EQL and episode list performance, oops! Shownotes increased from 1.5μs to 3.8μs for html and 137μs to convert text to html. I figured this 91x increase was acceptable since shownotes have ~3ms of additional overhead. Of course rapidly up/down arrowing through episode list would be slower with shownotes open. The EQL and episode list went from 1.5μs to 78-105μs (text or html stripping) which is a 52-70x increase amplified by every episode in the database. For 1000 episodes, it would take 76-103ms longer (for me) to query the episode list per key press, and this could be much greater on a slower device. Refreshing the episode list would be almost the same speed for text descriptions because one_line_description() is now 74μs faster. I had text_description() strip the text description because youtube-dl and soundcloud did, but model.py didn't. I now see that podcastparser strips it for model.py. Not stripping in text_description() and html_description() will reduce text->html in shownotes from 137μs to 65μs, and EQL and episode list from 78-105μs to 3.8-105μs (still 2.5-70x). This would make episode list refreshing 74μs faster per text description episode than it is without this PR (after I remove stripping). So I think stripping the html description and memoizing it in the text description is needed.
I don't know what the chapters or subtitle fields would do, but I'd say add them to the database in this PR if they can be fully implemented before release. |
|
About chapters: having them in the database from podcastparser is a first step and quite easy: json.dumps them on feed update. Then:
|
e10bbcc to
41a369a
Compare
|
The text description is now cached at startup and during feed updates. EQL and episode list refresh is the same speed as it was before, although episode list might be bit faster due to it no longer stripping the description. The shownotes will take about 2% longer to update. What database fields are needed to store chapters? |
|
See branch auouymous-ondemand-html-description for 2 commits for chapters+subtitles in addition to your PR. |
|
See https://feeds.metaebene.me/cre/m4a for a feed with chapters |
elelay
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
41a369a to
564200e
Compare
|
I pushed your changes, and the subtitle and chapters are working for me on https://feeds.metaebene.me/cre/m4a episodes (with my debug extension). As for subtitle, |
|
Let's have a look at subtitles in our subscriptions:
more examples: |
|
http://feeds.themoth.org/themothpodcast
http://feeds.feedburner.com/se-radio, which has an HTML episode description and episode text description contains the channel description, puts the channel name in subtitle. :) |
8fe0bb8 to
fec6092
Compare
|
Rebased and applied SQL fix. |
fec6092 to
6c2bd0b
Compare
|
ok to drop subtitle then. Can you make it in this branch? |
@elelay What do you mean by that? |
|
ah, sorry it wasn't clear. |
Co-authored-by: Eric Le Lay <elelay@macports.org>
PR gpodder#1094 generated an HTML description for any episode that lacked one. That however increased the database size (almost double in worst case) because it was storing both text and html versions of each description. This fixes that by storing the episode thumbnail URL in the database and generating the HTML description only when shownotes are drawn. The text description is now cleared for episodes with an HTML description. This further reduces database size for feeds that provide both. It also fixes an issue for feeds that provide different text and HTML descriptions, because the short description would show the text description and shownotes would show the HTML description. And EQL only searched the text descriptions, which might not match what the user sees in the shownotes.
6c2bd0b to
0acaee4
Compare
|
Subtitle removed. |
|
Merged, thanks! |
WARNING: This patch changes the database schema and version. A backup will be created in your gPodder directory the first time you use this patch on an existing database.
PR #1094 generated an HTML description for any episode that lacked one. That however increased the database size (almost double in worst case) because it was storing both text and html versions of each description. This fixes that by storing the episode thumbnail URL in the database and generating the HTML description only when shownotes are drawn.
The text description is now cleared for episodes with an HTML description. This further reduces database size for feeds that provide both. It also fixes an issue for feeds that provide different text and HTML descriptions, because the short description would show the text description and shownotes would show the HTML description. And EQL only searched the text descriptions, which might not match what the user sees in the shownotes.