imdb_watchlist plugin fixed issue #2035#2050
Conversation
| entries = [] | ||
| for imdb_id in imdb_ids: | ||
| entry = Entry() | ||
| if not 'title' in json_data[imdb_id] or not 'primary' in json_data[imdb_id]['title'] or not 'href' in json_data[imdb_id]['title']['primary'] or not 'title' in json_data[imdb_id]['title']['primary']: |
| if 'type' in json_data[imdb_id]['title']: | ||
| imdb_type = json_data[imdb_id]['title']['type'] | ||
| else: | ||
| imdb_type = '' |
There was a problem hiding this comment.
You should just initialize this to an empty string at the start of the loop
| total_item_count = len(json_vars['list']['items']) | ||
| else: | ||
| total_item_count = 0 | ||
| if total_item_count == 0: |
There was a problem hiding this comment.
if not total_item_count is more pythonic
| if 'list' in json_vars and 'items' in json_vars['list']: | ||
| total_item_count = len(json_vars['list']['items']) | ||
| else: | ||
| total_item_count = 0 |
There was a problem hiding this comment.
Just initialize this to 0 above the if.
| page = self.fetch_page(task, url, params, headers) | ||
| # log.debug(page.text) | ||
| re_result = re.search('IMDbReactInitialState.push\((.+?)\);\n', page.text) | ||
| json_text = re_result.group(1) |
There was a problem hiding this comment.
This will crash if it doesn't find a match. Should be wrapped in a try-except and spit out a suitable error message informing the user that the plugin may be broken
There was a problem hiding this comment.
Thanks, cvium for the thorough code review!
I left line 121 out of the try-except, since the fetch_page function has its own code that handles HTTPErrors or wrong http status codes.
|
Thanks, cvium for the thorough code review! |
| params = {'ids': ','.join(imdb_ids)} | ||
| url = 'http://www.imdb.com/title/data' | ||
| try: | ||
| page = task.requests.get(url, params=params) |
There was a problem hiding this comment.
If you give it the argument raise_status=True it will raise an exception when the error code is 4xx, 5xx etc. I think that's what you're going for?
And fyi, Requests raises a RequestException, so catch that instead of HTTPError.
| if page.status_code != 200: | ||
| raise plugin.PluginError('Unable to get imdb title data. Html status code was: %d.' % page.status_code) | ||
| json_data = json.loads(page.text) | ||
| log.verbose('imdb list contains %d items' % len(json_data)) |
There was a problem hiding this comment.
All logging calls should use implicit string interpolation ie. the logger takes an arbitrary number of arguments that it will use for string interpolation. As an example, the above line should be log.verbose('imdb list contains %d items', len(json_data)) (notice the comma).
|
|
||
| items = soup.find_all(attrs={'data-item-id': True, 'class': 'list_item'}) | ||
| items = soup.find_all('div', class_='lister-item') | ||
| if len(items) == 0: |
There was a problem hiding this comment.
if not items is the pythonic way
| page = self.fetch_page(task, url, params, headers) | ||
| try: | ||
| re_result = re.search('IMDbReactInitialState.push\((.+?)\);\n', page.text) | ||
| json_text = re_result.group(1) |
There was a problem hiding this comment.
I don't think there's any need for an explicit variable here. You can just change the line below to json.loads(re_result.group(1)).
| raise plugin.PluginError( | ||
| 'Unable to get imdb list from imdb react widget.' + | ||
| ' Either the list is empty or the imdb parser plugin is broken.' + | ||
| ' Original error: %s.' % e.args[0]) |
There was a problem hiding this comment.
I think str(e) is better. You lose some information with e.args[0] (like what kind of exception it is). You will need to add from builtins import * # noqa pylint: disable=unused-import, redefined-builtin to the list of imports though for compatibility.
| return | ||
| imdb_ids = [] | ||
| for item in json_vars['list']['items']: | ||
| if is_valid_imdb_title_id(item['const']): |
There was a problem hiding this comment.
This will crash if const is not present. Better to use dict.get() in this case.
| try: | ||
| page = task.requests.get(url, params=params) | ||
| except HTTPError as e: | ||
| raise plugin.PluginError(e.args[0]) |
cvium
left a comment
There was a problem hiding this comment.
Overall it's pretty good. Just a few corrections needed.
|
I think I got it now. Also checked entire code again for non-pythonic mishaps ;-). |
| if 'all' not in config['type']: | ||
| title_types = [] | ||
| for title_type in config['type']: | ||
| title_types.append(TITLE_TYPE_MAP[title_type]) |
There was a problem hiding this comment.
Just a style suggestion, list comprehension fits here well:
title_types = [TITLE_TYPE_MAP[title_type] for title_type in config['type']| url = 'http://www.imdb.com/title/data' | ||
| page = self.fetch_page(task, url, params, headers) | ||
| try: | ||
| json_data = json.loads(page.text) |
There was a problem hiding this comment.
You can use .json() on a returned Response
| for imdb_id in imdb_ids: | ||
| imdb_type = '' | ||
| entry = Entry() | ||
| if not ('title' in json_data[imdb_id] and |
There was a problem hiding this comment.
I deliberately added the parentheses here in order to be able to break the long if statement into several lines (as suggested by cvium).
Also: AFAIK, I would have to change this to if not ... or not ... or not ... : in order to remove the parentheses and still get the same result. Therefore, the parentheses save me several additional negations.
For these reasons, I would like to keep the parentheses if that is o.k.
| log.debug('First entry (imdb id: %s) looks like this: %s', imdb_ids[0], json_data[imdb_ids[0]]) | ||
| entries = [] | ||
| for imdb_id in imdb_ids: | ||
| imdb_type = '' |
There was a problem hiding this comment.
Does it makes sense to init this to an empty string?
|
Thanks, @liiight for the thorough code review! |
Motivation for changes:
imdb_watchlist plugin did not return any entries any more. See issue #2035 .
Detailed changes:
Addressed issues:
Log and/or tests output (preferably both):
new JSON parser:
updated html parser:
To Do: