Skip to content

News downloader atom support#2736

Merged
Frenzie merged 15 commits intokoreader:masterfrom
mwoz123:news-atom-support
May 8, 2017
Merged

News downloader atom support#2736
Frenzie merged 15 commits intokoreader:masterfrom
mwoz123:news-atom-support

Conversation

@mwoz123
Copy link
Copy Markdown
Contributor

@mwoz123 mwoz123 commented Apr 8, 2017

News downloader - support for Atom.

@houqp
Copy link
Copy Markdown
Member

houqp commented Apr 11, 2017

The new atom changes look good to me 👍 @mwoz123 I should be able to help finish up the setting part this week, which will be my last nitpick before merging the plugin. Thank you for being patient with my slow review process ;P

@mwoz123
Copy link
Copy Markdown
Contributor Author

mwoz123 commented Apr 11, 2017

Can we merge the first PR (#2592) as first version, and keep working here?

@mwoz123
Copy link
Copy Markdown
Contributor Author

mwoz123 commented Apr 19, 2017

I merged from master, resolved conflicts.
Currently on master there's a bug that disallows to use upper menu, so I was not able to test it after merge.

Btw don't know why the list of commits includes also commits from previous branch.
If you know how to fix it please let me know otherwise we'll have to deal with it, but at least diff looks fine..

@Frenzie
Copy link
Copy Markdown
Member

Frenzie commented Apr 19, 2017

Should be fine when squashed, but I personally tend to prefer git pull -r to rebase on top of master.

@houqp
Copy link
Copy Markdown
Member

houqp commented Apr 22, 2017

@mwoz123 the easiest way to fix this is to do a clean checkout of the current koreader master, then cherry-pick your last atom commit: a50c8d7. Then do a force push to your news-atom-support branch.

@Frenzie
Copy link
Copy Markdown
Member

Frenzie commented Apr 22, 2017

To be clear, how I'd do that is something along these lines. (If I'm doing something inefficiently or whatever, please tell me. ;-) )

git checkout master
# I've got this set up with a remote called upstream, see http://koreader.rocks/doc/topics/Collaborating.html
# git pull git@github.com:koreader/koreader.git will also work otherwise
git pull upstream master
# make a new, clean branch based on most recent upstream
git checkout -b new_atom_stuff
# now cherry pick the commit into this branch
git cherry-pick a50c8d7
# up to now it was all harmless, but this one's dangerous!
git push -f origin new_atom_stuff:news-atom-support

PS I'd say that usually it's much easier to do git pull -r upstream master instead. This is kind of a special case.

@houqp
Copy link
Copy Markdown
Member

houqp commented Apr 22, 2017

This can be simplified to:

git fetch upstream
git checkout -b upstream/master new_atom_stuff
git cherry-pick a50c8d7
git push -f origin new_atom_stuff:news-atom-support

If your local master branch is not clean, doing a git pull upstream master might cause extra headaches ;)

@Frenzie
Copy link
Copy Markdown
Member

Frenzie commented Apr 22, 2017

That's not the same thing. Imo you're wrong even though you're also right. ;p I want to make sure master is up to date anyway.

In the extremely unlikely event that you don't hav3 a clean master you just reset hard or abort rebase (depending), but you always keep master clean. That's what branches are for. In my opinion anyway. :p

@houqp
Copy link
Copy Markdown
Member

houqp commented Apr 22, 2017

nowadays, I try to avoid developing on master branch completely ;) all my local branches are temporary branches that can be deleted anytime. I used to do this houqp-master thing to avoid tainting my local master branch. And later I figured if I only use local master as a clean starting point for other branches, then it's faster for me to just also use upstream/master. That way, i don't need to worry about synchronization with upstream master at all.

@Frenzie
Copy link
Copy Markdown
Member

Frenzie commented Apr 23, 2017

That's a very interesting point. I'll have to give it some thought with regard to Git in general vs GitHub-specific Git workflow.

@mwoz123 mwoz123 force-pushed the news-atom-support branch from 7e665dc to 8049564 Compare April 23, 2017 08:49
logger.dbg("NewsDownloader: Creating init configuration")
FFIUtil.copyFile(FFIUtil.joinPath(self.path, FEED_CONFIG_FILE),
FEED_CONFIG_PATH)
end
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those lines seems to be merged incorrectly. Will have to remove them next commit.

if not feeds.rss or not feeds.rss.channel
or not feeds.rss.channel.title or not feeds.rss.channel.item then

local is_valid_rss = feeds.rss and feeds.rss.channel and feeds.rss.channel.title and feeds.rss.channel.item;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lua does not use ;.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick unrelated to the code logic, but is_rss and is_valid_rss are different things. is_valid_atom or is_valid_rss means something like coming through https://validator.w3.org/feed/ unscathed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. It means it's valid for this code, and shouldn't cause any exception on further processing.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's kind of my point. :-P

or not feeds.rss.channel.title or not feeds.rss.channel.item then

local is_valid_rss = feeds.rss and feeds.rss.channel and feeds.rss.channel.title and feeds.rss.channel.item;
local is_valid_atom = feeds.feed and feeds.feed.title and feeds.feed.entry.title and feeds.feed.entry.link;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's already detected as rss, then this check is redundant.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was just to it has structure code's expecting, and so shouldn't cause any problems during further processing.

end

for index, feed in pairs(feeds.feed.entry) do
if index -1 == limit then
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is limit defined?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, it's missing. I'll have to fix it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed. limit is in method signature.

if not feeds.rss or not feeds.rss.channel
or not feeds.rss.channel.title or not feeds.rss.channel.item then

local is_valid_rss = feeds.rss and feeds.rss.channel and feeds.rss.channel.title and feeds.rss.channel.item;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick unrelated to the code logic, but is_rss and is_valid_rss are different things. is_valid_atom or is_valid_rss means something like coming through https://validator.w3.org/feed/ unscathed.


function NewsDownloader:commonFeedProcess(index, feed)

local news_dl_path = ("%s%s%s"):format(feed_output_dir,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merging issue? Should be only four spaces. :-)

@Frenzie
Copy link
Copy Markdown
Member

Frenzie commented Apr 23, 2017

I've decided I don't like git checkout upstream/master -b some_branch even within a GitHub workflow. The reason?

# whoops, first push
$ git push
fatal: The upstream branch of your current branch does not match
the name of your current branch.  To push to the upstream branch
on the remote, use

    git push upstream HEAD:master

To push to the branch of the same name on the remote, use

    git push upstream luacheck

# careful, can't just copy the helpful message like you could if you branched from master!
$ git push origin luacheck

I actually forgot, but that's the reason why I like to stick with where things are going rather than where they're coming from.

@mwoz123
Copy link
Copy Markdown
Contributor Author

mwoz123 commented Apr 23, 2017

Guys there're so many problems with that merge conflicts that I thought maybe we could now use specialized external library for RSS/Atom?

Previously, before creating first PR, I tried to used them but I wasn't able to get my code working.
Maybe I'm too less familiar with Lua:(

RSS/Atom libraries I found:

If I remember right, previously with one of them I wasn't able to import lua/so file, and other one I wasn't able to implement in lua.

@houqp
Copy link
Copy Markdown
Member

houqp commented Apr 24, 2017

@mwoz123 it's less than 50 lines of changes, it should not be that hard to resolve the conflicts. What issues are you running into right now? Have you tried the cherry-pick commands we suggested earlier? It's a good chance for you to learn how to resolve conflicts in git. Once you understand the concept, it would be really easy going forward.

If your code already works, there is no point introducing another heavy dependency for couple lines of code. Unless yours is missing some major features.

@mwoz123
Copy link
Copy Markdown
Contributor Author

mwoz123 commented Apr 24, 2017

Yes I did cherry pick on master & force push. But got confused on some of conflicts in your implementation.

As rss is a standard that is less formal (although I used most frequent rss 2.0 format) it might happen that a Web site will have unsupported by us rss format variation.
I don't know how real is that scenario.

I thought that maybe specialized lib would handle that..

What you think about it?

FFIUtil.copyFile(FFIUtil.joinPath(self.path, FEED_CONFIG_FILE),
FEED_CONFIG_PATH)
end
FEED_CONFIG_PATH = NEWS_DL_DIR .. FEED_CONFIG_FILE
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

four spaces :)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also generally ALL_CAPS means globals while I assume these are meant to be locals. I don't have time to look at in detail but they should probably be lowercase?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

four spaces :) I used tab there. Do you mean I should replace that with (doubled) 4 spaces?
It looks fine in Gedit.

In Javascript there something like "Minify".
Maybe there also something like that for lua/travis?
As travis is written in ruby I bet there some tools for that.
It would make more sense to use a tool than manual removal tailing whitespaces, changing tabs to 4 spaces etc? (But I guess it's a good thema for new topic)

What do you want me to do with that? replace the tab with spaces?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean I should replace that with (doubled) 4 spaces?

Yes, the style guide calls for four spaces. I happen to be in the tab camp myself, but keeping things consistent really helps maintainability largely no matter what the standard is.

I apologize about the all caps. My remark was slightly contradictory to the style guide https://github.com/koreader/koreader-base/wiki/Coding-style#naming The distinction between constants and variables isn't quite the same as between globals and locals, although it probably holds up here?

It would make more sense to use a tool than manual removal tailing whitespaces, changing tabs to 4 spaces etc? (But I guess it's a good thema for new topic)

Possibly, but you have to be very careful not to sweep actual code logic errors under the rug. It's better to warn about it and have a person decide what to do with it imo. For personal use it should be simple enough to find a script such as this one: http://lua-users.org/wiki/SourceCodeFormatter (and of course you can also run luacheck locally)

@mwoz123
Copy link
Copy Markdown
Contributor Author

mwoz123 commented May 4, 2017

Hopefully I fixed all review change requests. If I overlooked something please notify me.

Question:
local news_download_dir_name = G_reader_settings:readSetting("news_download_dir_name")
Would that be the right way to make this variable editable on Android?
(as only settings.reader.lua is on writeable android location)

@Frenzie
Copy link
Copy Markdown
Member

Frenzie commented May 4, 2017

Wouldn't you already have used something like LuaSettings:open(DataStorage:getSettingsDir().."/newsdownloader_settings.lua") previously?

It's probably better if each plugin keeps its settings contained without "polluting" the base program.

@mwoz123
Copy link
Copy Markdown
Contributor Author

mwoz123 commented May 4, 2017

Yes, we had all news downloader variables in external setting (lua) file but there was a request to merge it to one main.lua

What should I do then? How you want that to be done?

  • Re-externalize all variables or just that one to NewsDownloader config lua?
  • put that to config file to settings.reader.lua?
  • or other way?

if not feeds.rss or not feeds.rss.channel
or not feeds.rss.channel.title or not feeds.rss.channel.item then

local is_valid_rss = feeds.rss and feeds.rss.channel and feeds.rss.channel.title and feeds.rss.channel.item;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It still says is_valid_rss (and atom) instead of just is_rss.

local is_valid_atom = feeds.feed and feeds.feed.title and feeds.feed.entry.title and feeds.feed.entry.link;

if not is_valid_rss and not is_valid_atom then
logger.info('NewsDownloader: Got invalid feeds', feeds)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And these are really unrecognized feeds. Just because it's an unsupported, obscure form of RDF doesn't mean it's invalid.

@Frenzie
Copy link
Copy Markdown
Member

Frenzie commented May 5, 2017

That's doing what I said, not using global settings. :-)

@mwoz123
Copy link
Copy Markdown
Contributor Author

mwoz123 commented May 5, 2017

I've found a cool solution to allow editable news folder path.
Please check this commit 611879a

Full solution is ready. Please review.

@KenMaltby
Copy link
Copy Markdown

Not sure this file handling/acquiring function should have its download dir setting in the "reader" defaults.

@Frenzie
Copy link
Copy Markdown
Member

Frenzie commented May 5, 2017

The settings module can be used by any plugin: http://koreader.rocks/doc/modules/luasettings.html

Like I said above, you can call it using something like LuaSettings:open(DataStorage:getSettingsDir().."/newsdownloader_settings.lua")

So instead of

local custom_news_path = G_reader_settings:readSetting(custom_news_path_var_name)

You do something like

local newsreader_settings = LuaSettings:open(
        DataStorage:getSettingsDir().."/newsdownloader_settings.lua")
local custom_news_path = newsreader_settings:readSetting(custom_news_path_var_name)

That's basically all the global reader settings do, except when something doesn't need to be global it typically shouldn't be for usability, maintenance, as well as performance reasons.

My other main remark is that instead of displaying an info message it would be more user-friendly to call a graphical folder picker. See an example here:

text = gettext("Set download directory"),
callback = function()
require("ui/downloadmgr"):new{
title = gettext("Choose download directory"),
onConfirm = function(path)
logger.dbg("set download directory to", path)
G_reader_settings:saveSetting("download_dir", path)
UIManager:nextTick(function()
UIManager:close(self.download_dialog)
self:createNewDownloadDialog(path, buttons)
UIManager:show(self.download_dialog)
end)
end,
}:chooseDir()

@mwoz123
Copy link
Copy Markdown
Contributor Author

mwoz123 commented May 5, 2017

It would be too big change.
"Own download folder" doesn't have to be part of PR Atom support.

I thought it could be simple, extra change, but it turns out it cannot.
So it was removed from this PR (reverted commit).
It will be done in next PR if I found time.

Please re-review.

Frenzie
Frenzie previously requested changes May 5, 2017
end
end
UIManager:show(InfoMessage:new{
text = T(_("Downloading finished. Could not process some feeds. Unsupported format in: %1"), unsupported_urls)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say "Downloading news finished." to keep it consistent with the other message.

@Frenzie
Copy link
Copy Markdown
Member

Frenzie commented May 5, 2017

It basically looks good to me. However, I don't know exactly what @houqp was referring to when he said the following in #2736 (comment):

I should be able to help finish up the setting part this week, which will be my last nitpick before merging the plugin.

Was that already taken care of in the previous PR?

@mwoz123
Copy link
Copy Markdown
Contributor Author

mwoz123 commented May 5, 2017

I don't know what was on his mind.
@houqp ?

@z3t0
Copy link
Copy Markdown

z3t0 commented May 6, 2017

Sorry if this is a silly question but what does this merge add? (I'm not familiar enough with the code base to follow the commits). As far as I can tell, rss/news feature already esists?

@Frenzie
Copy link
Copy Markdown
Member

Frenzie commented May 6, 2017

Atom is another (imo mostly better) syndication format.

@Frenzie Frenzie dismissed their stale review May 6, 2017 09:08

It's been addressed.

self:processAtom(feeds, limit);
else
self:processRSS(feeds, limit);
end
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole block can be optimized to the following to remove redundant if comparisons:

if feeds.rss and feeds.rss.channel and feeds.rss.channel.title and feeds.rss.channel.item then
    self:processRSS(feeds, limit)
elseif feeds.feed and feeds.feed.title and feeds.feed.entry.title and feeds.feed.entry.link then
    self:processAtom(feeds, limit)
else
    table.insert(unsupported_feeds_urls, url)
    return
end

Also notice ; is not needed in LUA.

Copy link
Copy Markdown
Member

@Frenzie Frenzie May 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, I thought the semicolons were already gone. Must've been some other ones.

@houqp
Copy link
Copy Markdown
Member

houqp commented May 7, 2017

Sorry for the late reply, I have been really busy in the last couple weeks.

I should be able to help finish up the setting part this week, which will be my last nitpick before merging the plugin.

When I made that comment, the other rss PR hasn't been merged yet. I have since sent a PR to that branch for the setting support. So it's already done in the PR.

Feel free to merge this after my last review is resolved.

return
end

local unsupported_feeds_urls = {};
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another semicolon.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I will remove all if it's a problem..

If you don't want any semicolons in code maybe you should think about lua check rule for that..

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

luacheck only warns about semicolons that cause syntax errors such as do … end;. It's never been an issue before. I'm not sure it's worth the extra effort just because you like to sneak in semicolons. :-P

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mwoz123 i agree a lot of this styling issues should be part of the static check process. It helps make the feedback loop shorter. I actually opened a feature request with luacheck on this topic: mpeterv/luacheck#84

If you are interested in adding this, that would be amazing.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I can see it's written in lua, so currently it's not language I can say I'm familiar with.

return
end

local is_rss = feeds.rss and feeds.rss.channel and feeds.rss.channel.title and feeds.rss.channel.item;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

semicolon

end

local is_rss = feeds.rss and feeds.rss.channel and feeds.rss.channel.title and feeds.rss.channel.item;
local is_atom = feeds.feed and feeds.feed.title and feeds.feed.entry.title and feeds.feed.entry.link;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

semicolon

FILE_EXTENSION)
logger.dbg("NewsDownloader: News file will be stored to :", news_dl_path)
http.request({ url = url, sink = ltn12.sink.file(io.open(news_dl_path, 'w')), })
self:commonFeedProcess(feed, feed_output_dir);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one more

if index -1 == limit then
break
end
self:commonFeedProcess(feed, feed_output_dir);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

semicolon

@mwoz123
Copy link
Copy Markdown
Contributor Author

mwoz123 commented May 8, 2017

Done.

@Frenzie Frenzie merged commit 4eb8664 into koreader:master May 8, 2017
@Frenzie
Copy link
Copy Markdown
Member

Frenzie commented May 8, 2017

Thanks!

@mwoz123 mwoz123 deleted the news-atom-support branch October 28, 2017 17:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants