News downloader atom support#2736
Conversation
|
The new atom changes look good to me 👍 @mwoz123 I should be able to help finish up the setting part this week, which will be my last nitpick before merging the plugin. Thank you for being patient with my slow review process ;P |
|
Can we merge the first PR (#2592) as first version, and keep working here? |
|
I merged from master, resolved conflicts. Btw don't know why the list of commits includes also commits from previous branch. |
|
Should be fine when squashed, but I personally tend to prefer |
|
@mwoz123 the easiest way to fix this is to do a clean checkout of the current koreader master, then cherry-pick your last atom commit: |
|
To be clear, how I'd do that is something along these lines. (If I'm doing something inefficiently or whatever, please tell me. ;-) ) git checkout master
# I've got this set up with a remote called upstream, see http://koreader.rocks/doc/topics/Collaborating.html
# git pull git@github.com:koreader/koreader.git will also work otherwise
git pull upstream master
# make a new, clean branch based on most recent upstream
git checkout -b new_atom_stuff
# now cherry pick the commit into this branch
git cherry-pick a50c8d7
# up to now it was all harmless, but this one's dangerous!
git push -f origin new_atom_stuff:news-atom-supportPS I'd say that usually it's much easier to do |
|
This can be simplified to: git fetch upstream
git checkout -b upstream/master new_atom_stuff
git cherry-pick a50c8d7
git push -f origin new_atom_stuff:news-atom-supportIf your local master branch is not clean, doing a |
|
That's not the same thing. Imo you're wrong even though you're also right. ;p I want to make sure master is up to date anyway. In the extremely unlikely event that you don't hav3 a clean master you just reset hard or abort rebase (depending), but you always keep master clean. That's what branches are for. In my opinion anyway. :p |
|
nowadays, I try to avoid developing on master branch completely ;) all my local branches are temporary branches that can be deleted anytime. I used to do this |
|
That's a very interesting point. I'll have to give it some thought with regard to Git in general vs GitHub-specific Git workflow. |
7e665dc to
8049564
Compare
| logger.dbg("NewsDownloader: Creating init configuration") | ||
| FFIUtil.copyFile(FFIUtil.joinPath(self.path, FEED_CONFIG_FILE), | ||
| FEED_CONFIG_PATH) | ||
| end |
There was a problem hiding this comment.
Those lines seems to be merged incorrectly. Will have to remove them next commit.
| if not feeds.rss or not feeds.rss.channel | ||
| or not feeds.rss.channel.title or not feeds.rss.channel.item then | ||
|
|
||
| local is_valid_rss = feeds.rss and feeds.rss.channel and feeds.rss.channel.title and feeds.rss.channel.item; |
There was a problem hiding this comment.
Nitpick unrelated to the code logic, but is_rss and is_valid_rss are different things. is_valid_atom or is_valid_rss means something like coming through https://validator.w3.org/feed/ unscathed.
There was a problem hiding this comment.
No. It means it's valid for this code, and shouldn't cause any exception on further processing.
| or not feeds.rss.channel.title or not feeds.rss.channel.item then | ||
|
|
||
| local is_valid_rss = feeds.rss and feeds.rss.channel and feeds.rss.channel.title and feeds.rss.channel.item; | ||
| local is_valid_atom = feeds.feed and feeds.feed.title and feeds.feed.entry.title and feeds.feed.entry.link; |
There was a problem hiding this comment.
If it's already detected as rss, then this check is redundant.
There was a problem hiding this comment.
It was just to it has structure code's expecting, and so shouldn't cause any problems during further processing.
| end | ||
|
|
||
| for index, feed in pairs(feeds.feed.entry) do | ||
| if index -1 == limit then |
There was a problem hiding this comment.
Right, it's missing. I'll have to fix it.
There was a problem hiding this comment.
fixed. limit is in method signature.
| if not feeds.rss or not feeds.rss.channel | ||
| or not feeds.rss.channel.title or not feeds.rss.channel.item then | ||
|
|
||
| local is_valid_rss = feeds.rss and feeds.rss.channel and feeds.rss.channel.title and feeds.rss.channel.item; |
There was a problem hiding this comment.
Nitpick unrelated to the code logic, but is_rss and is_valid_rss are different things. is_valid_atom or is_valid_rss means something like coming through https://validator.w3.org/feed/ unscathed.
|
|
||
| function NewsDownloader:commonFeedProcess(index, feed) | ||
|
|
||
| local news_dl_path = ("%s%s%s"):format(feed_output_dir, |
There was a problem hiding this comment.
Merging issue? Should be only four spaces. :-)
|
I've decided I don't like # whoops, first push
$ git push
fatal: The upstream branch of your current branch does not match
the name of your current branch. To push to the upstream branch
on the remote, use
git push upstream HEAD:master
To push to the branch of the same name on the remote, use
git push upstream luacheck
# careful, can't just copy the helpful message like you could if you branched from master!
$ git push origin luacheckI actually forgot, but that's the reason why I like to stick with where things are going rather than where they're coming from. |
|
Guys there're so many problems with that merge conflicts that I thought maybe we could now use specialized external library for RSS/Atom? Previously, before creating first PR, I tried to used them but I wasn't able to get my code working. RSS/Atom libraries I found:
If I remember right, previously with one of them I wasn't able to import lua/so file, and other one I wasn't able to implement in lua. |
|
@mwoz123 it's less than 50 lines of changes, it should not be that hard to resolve the conflicts. What issues are you running into right now? Have you tried the cherry-pick commands we suggested earlier? It's a good chance for you to learn how to resolve conflicts in git. Once you understand the concept, it would be really easy going forward. If your code already works, there is no point introducing another heavy dependency for couple lines of code. Unless yours is missing some major features. |
|
Yes I did cherry pick on master & force push. But got confused on some of conflicts in your implementation. As rss is a standard that is less formal (although I used most frequent rss 2.0 format) it might happen that a Web site will have unsupported by us rss format variation. I thought that maybe specialized lib would handle that.. What you think about it? |
| FFIUtil.copyFile(FFIUtil.joinPath(self.path, FEED_CONFIG_FILE), | ||
| FEED_CONFIG_PATH) | ||
| end | ||
| FEED_CONFIG_PATH = NEWS_DL_DIR .. FEED_CONFIG_FILE |
There was a problem hiding this comment.
Also generally ALL_CAPS means globals while I assume these are meant to be locals. I don't have time to look at in detail but they should probably be lowercase?
There was a problem hiding this comment.
four spaces :) I used tab there. Do you mean I should replace that with (doubled) 4 spaces?
It looks fine in Gedit.
In Javascript there something like "Minify".
Maybe there also something like that for lua/travis?
As travis is written in ruby I bet there some tools for that.
It would make more sense to use a tool than manual removal tailing whitespaces, changing tabs to 4 spaces etc? (But I guess it's a good thema for new topic)
What do you want me to do with that? replace the tab with spaces?
There was a problem hiding this comment.
Do you mean I should replace that with (doubled) 4 spaces?
Yes, the style guide calls for four spaces. I happen to be in the tab camp myself, but keeping things consistent really helps maintainability largely no matter what the standard is.
I apologize about the all caps. My remark was slightly contradictory to the style guide https://github.com/koreader/koreader-base/wiki/Coding-style#naming The distinction between constants and variables isn't quite the same as between globals and locals, although it probably holds up here?
It would make more sense to use a tool than manual removal tailing whitespaces, changing tabs to 4 spaces etc? (But I guess it's a good thema for new topic)
Possibly, but you have to be very careful not to sweep actual code logic errors under the rug. It's better to warn about it and have a person decide what to do with it imo. For personal use it should be simple enough to find a script such as this one: http://lua-users.org/wiki/SourceCodeFormatter (and of course you can also run luacheck locally)
|
Hopefully I fixed all review change requests. If I overlooked something please notify me. Question: |
|
Wouldn't you already have used something like It's probably better if each plugin keeps its settings contained without "polluting" the base program. |
|
Yes, we had all news downloader variables in external setting (lua) file but there was a request to merge it to one main.lua What should I do then? How you want that to be done?
|
| if not feeds.rss or not feeds.rss.channel | ||
| or not feeds.rss.channel.title or not feeds.rss.channel.item then | ||
|
|
||
| local is_valid_rss = feeds.rss and feeds.rss.channel and feeds.rss.channel.title and feeds.rss.channel.item; |
There was a problem hiding this comment.
It still says is_valid_rss (and atom) instead of just is_rss.
| local is_valid_atom = feeds.feed and feeds.feed.title and feeds.feed.entry.title and feeds.feed.entry.link; | ||
|
|
||
| if not is_valid_rss and not is_valid_atom then | ||
| logger.info('NewsDownloader: Got invalid feeds', feeds) |
There was a problem hiding this comment.
And these are really unrecognized feeds. Just because it's an unsupported, obscure form of RDF doesn't mean it's invalid.
|
That's doing what I said, not using global settings. :-) |
|
I've found a cool solution to allow editable news folder path. Full solution is ready. Please review. |
|
Not sure this file handling/acquiring function should have its download dir setting in the "reader" defaults. |
|
The settings module can be used by any plugin: http://koreader.rocks/doc/modules/luasettings.html Like I said above, you can call it using something like So instead of local custom_news_path = G_reader_settings:readSetting(custom_news_path_var_name)You do something like local newsreader_settings = LuaSettings:open(
DataStorage:getSettingsDir().."/newsdownloader_settings.lua")
local custom_news_path = newsreader_settings:readSetting(custom_news_path_var_name)That's basically all the global reader settings do, except when something doesn't need to be global it typically shouldn't be for usability, maintenance, as well as performance reasons. My other main remark is that instead of displaying an info message it would be more user-friendly to call a graphical folder picker. See an example here: koreader/frontend/ui/widget/opdsbrowser.lua Lines 574 to 587 in cca9c09 |
|
It would be too big change. I thought it could be simple, extra change, but it turns out it cannot. Please re-review. |
| end | ||
| end | ||
| UIManager:show(InfoMessage:new{ | ||
| text = T(_("Downloading finished. Could not process some feeds. Unsupported format in: %1"), unsupported_urls) |
There was a problem hiding this comment.
I'd say "Downloading news finished." to keep it consistent with the other message.
|
It basically looks good to me. However, I don't know exactly what @houqp was referring to when he said the following in #2736 (comment):
Was that already taken care of in the previous PR? |
|
I don't know what was on his mind. |
|
Sorry if this is a silly question but what does this merge add? (I'm not familiar enough with the code base to follow the commits). As far as I can tell, rss/news feature already esists? |
|
Atom is another (imo mostly better) syndication format. |
| self:processAtom(feeds, limit); | ||
| else | ||
| self:processRSS(feeds, limit); | ||
| end |
There was a problem hiding this comment.
This whole block can be optimized to the following to remove redundant if comparisons:
if feeds.rss and feeds.rss.channel and feeds.rss.channel.title and feeds.rss.channel.item then
self:processRSS(feeds, limit)
elseif feeds.feed and feeds.feed.title and feeds.feed.entry.title and feeds.feed.entry.link then
self:processAtom(feeds, limit)
else
table.insert(unsupported_feeds_urls, url)
return
endAlso notice ; is not needed in LUA.
There was a problem hiding this comment.
Huh, I thought the semicolons were already gone. Must've been some other ones.
|
Sorry for the late reply, I have been really busy in the last couple weeks.
When I made that comment, the other rss PR hasn't been merged yet. I have since sent a PR to that branch for the setting support. So it's already done in the PR. Feel free to merge this after my last review is resolved. |
| return | ||
| end | ||
|
|
||
| local unsupported_feeds_urls = {}; |
There was a problem hiding this comment.
Ok. I will remove all if it's a problem..
If you don't want any semicolons in code maybe you should think about lua check rule for that..
There was a problem hiding this comment.
luacheck only warns about semicolons that cause syntax errors such as do … end;. It's never been an issue before. I'm not sure it's worth the extra effort just because you like to sneak in semicolons. :-P
There was a problem hiding this comment.
@mwoz123 i agree a lot of this styling issues should be part of the static check process. It helps make the feedback loop shorter. I actually opened a feature request with luacheck on this topic: mpeterv/luacheck#84
If you are interested in adding this, that would be amazing.
There was a problem hiding this comment.
As I can see it's written in lua, so currently it's not language I can say I'm familiar with.
| return | ||
| end | ||
|
|
||
| local is_rss = feeds.rss and feeds.rss.channel and feeds.rss.channel.title and feeds.rss.channel.item; |
| end | ||
|
|
||
| local is_rss = feeds.rss and feeds.rss.channel and feeds.rss.channel.title and feeds.rss.channel.item; | ||
| local is_atom = feeds.feed and feeds.feed.title and feeds.feed.entry.title and feeds.feed.entry.link; |
| FILE_EXTENSION) | ||
| logger.dbg("NewsDownloader: News file will be stored to :", news_dl_path) | ||
| http.request({ url = url, sink = ltn12.sink.file(io.open(news_dl_path, 'w')), }) | ||
| self:commonFeedProcess(feed, feed_output_dir); |
| if index -1 == limit then | ||
| break | ||
| end | ||
| self:commonFeedProcess(feed, feed_output_dir); |
|
Done. |
|
Thanks! |
News downloader - support for Atom.