-
-
Notifications
You must be signed in to change notification settings - Fork 217
Support Apple Podcasts urls. #1556
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
tpikonen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this works and looks fine, except for the small nitpick.
|
https://podcasts.apple.com/it/podcast/il-podcast-di-alessandro-barbero-lezioni-e/id1501956064 works fine. However, https://podcasts.apple.com/us/podcast/marvels-wastelanders-black-widow/id1600546725, a podcast I found on the front page of podcasts.apple.com, does not work. |
|
@auouymous that page doesn't load in the browser too, it is stuck on 'Connecting to Apple Podcasts'. Try also any random nonexistent number as id, you'd get the same landing page. With real podcasts, this happens when the rss feed is gone/unreachable, parts of the apple website ui may feature the podcast still, but the actual podcast page is delisted. In this case, the feed, found otherwise ( A bigger issue though, looking at similar ex-paywalled Marvel podcasts, might be something like Marvel's Wastelanders: Hawkeye:
All Wastelanders podcasts are folded into one on Spotify, the Marvel webpage advertises this exclusively for all storylines, but that feed doesn't have all episodes folded in, maybe they are mid-migration or this is some 'have to be logged in / subscribed in iTunes' type of deal. I strongly suspect this is a Marvel thing, I have been using the lookup endpoint via userscript on the iTunes webpage for years, haven't had any issues with (by now ~300 subscribed) podcasts from different publishers. Still, a gPodder user would see a valid 'Apple Podcasts' listing, copy link to gPodder, and it wouldn't work. The whole iTunes Search API is old (json string as txt/javascript, docs archived in 2017), not sure I can coerce it to give up more data. I could do a more granular json response check, (nonsense || valid json, empty results || valid json, no feedurl), log something more meaningful ('No feedUrl in response, is this an exclusive podcast?'), alter the docstring to match (...tries to resolve...). On failure, it would return the original url, though, which would fail later (html). EDIT: It seems they all use the same lookup endpoint, so predictably they all choke on the hawkeye url. Might be a one-off. |
|
Better error messages would be nice to have, and I think parse_apple_podcasts_url() should return None if it fails to parse the URL. The None check after parse_youtube_url() adds the original url to a failed list instead of queuing it, and then causing 'not found' or 'invalid html' errors. We know the original apple URL will always fail if not resolved, correct? The I also just tried subscribing to https://www.youtube.com/c/jsgdajfgjsadfjh and its exception causes no dialog to appear. So I will need to fix parse_youtube_url() to not throw exceptions or to catch them at the callsite and set error_message to str(e). Thoughts? |
Yes
If you look through the review of my (albeit flawed) original patch, besides the original url, None used to be valid return, and the caller's responsibility to handle (1, 2). This was considered wasteful in gpo cli because of passing None around, despite the youtube parser being hooked into gtkui/main.py in the same manner. After the reorder, gpo cli would return "invalid url" early, but gui would still silently fail, as it does now with the youtube parser - to prevent that the succeeding None check should be extended to report errors, I agree. Based on the review the function now always returns some url, None only if None is passed, which mirrors the youtube parser. If there's consensus whether this is better/worse, I can change the function accordingly (I lean towards returning None on error as well). I think the additional error message return seems out of scope for this PR. Caller functions would have to be changed to accomodate it, both parsers (
The parser could be case-insensitive too (so In my opinion, youtube.py in general needs refractoring/simplification. There are multiple references to the now dead gdata.* youtube api, and code paths that are hard to follow (to me, I guess). We have |
|
Changes:
gpo cli now reports all unsuccessfully parsed apple podcasts links with "Invalid url: %s", and doesn't process them further. The gui doesn't either; EDIT: accelerators / prefixes in normalize_feed_url() are not resolved for the error message: This shouldn't even be a None return, but a successful parse of The length criteria (more than 8 chars) should not apply to ( accelerator + id ), in fact I think id should have len(id) > 0 and no whitespace after str.strip() requirement only. |
|
This is not strictly related to the apple podcasts parser, but I believe the validating of prefixed urls in normalize_feed_url() should be corrected - 3-char soundcloud usernames exist (2-char too, but less common), as well as 4-character youtube usernames, both would fail as 'prefix' urls currently. As I see it, we can
This would allow sc:123, yt:abcd. Longer prefixes would have shorter allowed id_min_length (len(url) - len(prefix)). Empty prefixed urls with a sufficiently long prefix would resolve to empty patterns
This would allow prefix:id_min_length. It would keep min_length == 8 for non-prefixed urls, id_min_length would be consistent across prefixes of any length. id_min_length == 1 would mean non-empty id for a prefix. With more changes it would allow for processing on expansion (moving the |
Inspired by #1541.
This PR adds support for Apple Podcasts web links in the format of
https://podcasts.apple.com/COUNTRY/podcast/PODCAST_TITLE/idPODCAST_ID.utils.parse_apple_podcasts_url() opens
https://itunes.apple.com/lookup?id=PODCAST_IDand returns thefeedUrlfield from the json response, or the original url on error, for further processing.I also cleaned up and updated 'accelerators' in utils.normalize_feed_url(), fixed the test for feedburner, and added one for apple podcasts.