[added] new search plugin for private tracker torrentday #1597
liiight merged 8 commits into Flexget:develop from zosky:zosky-td-1
I used torrentleech as a starting point. The main difference is that TL uses a username/password to log in, generate the cookies and access the search pages. TD's login page has a captcha, so instead I made the 3 cookies it needs required config keys. This should be fine, because in my browser they all have an expiry date of 2038. Beyond that, the two sites have slightly different CSS, so I'm looking for some different classes and divs. (This is my first PR ever, so please be gentle.)
sorted out tabs and spaces
flexget/plugins/sites/torrentday.py
Outdated
if 'url' not in entry:
    log.error("Didn't actually get a URL...")
else:
    log.debug("Got the URL: %s" % entry['url'])
You can pass the args to the logger and let it do the string formatting, i.e. use a comma instead of %.
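A minimal sketch of what the reviewer is suggesting, using only the standard `logging` module. The logger name and the placeholder `entry` dict are assumptions for illustration, not taken from the plugin:

```python
import logging

# Logger name is an assumption; the plugin defines its own `log` object.
log = logging.getLogger('torrentday')

# Placeholder entry for illustration only.
entry = {'url': 'https://example.invalid/download.php/1/file.torrent'}

if 'url' not in entry:
    log.error("Didn't actually get a URL...")
else:
    # Pass the value as an argument: the logger interpolates %s itself,
    # and only when the record is actually emitted.
    log.debug('Got the URL: %s', entry['url'])
```

With the comma form, the formatting work is skipped whenever debug logging is disabled.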
cvium left a comment
Pass the arguments to the logger instead of doing explicit string formatting and handle the requests exceptions. Seems fine otherwise.
flexget/plugins/sites/torrentday.py
Outdated
cookies["pass"] = config['passkey']
cookies["__cfduid"] = config['cfduid']

page = requests.get(url, cookies=cookies).content
You need more exception handling
exception handling & better debug logging
flexget/plugins/sites/torrentday.py
Outdated
try:
    page = requests.get(url, cookies=cookies).content
except RequestException as e:
    raise PluginError('Could not connect to torrentday: %s', str(e))
PluginError only takes one argument. You have to do the string formatting here.
flexget/plugins/sites/torrentday.py
Outdated
if not isinstance(config, dict):
    config = {}
# sort = SORT.get(config.get('sort_by', 'seeds'))
You should remove any useless comments
flexget/plugins/sites/torrentday.py
Outdated
# find the torrent names
title = tr.find("a", { "class": "torrentName" })
entry['title'] = title.contents[0]
log.debug('title: %s' % title.contents[0])
flexget/plugins/sites/torrentday.py
Outdated
# construct download URL
torrent_url = ( "https://www.torrentday.com/" + torrent_url + '?torrent_pass=' + config['rss_key'] )
log.debug('RSS-ified download link: %s' % torrent_url)
flexget/plugins/sites/torrentday.py
Outdated
# urllib.quote will crash if the unicode string has non ascii characters, so encode in utf-8 beforehand
url = ('https://www.torrentday.com/browse.php?search=' +
       quote(query.encode('utf-8')) + filter_url)
log.debug('Using %s as torrentday search url' % url)
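For readers on Python 3, here is a small sketch of the same encoding step with `urllib.parse.quote`. The search string is a made-up example; in Python 3 `quote` accepts both `str` and `bytes`, so the explicit UTF-8 encode is harmless rather than required:

```python
from urllib.parse import quote

# Hypothetical search string containing a non-ASCII character.
query = 'café society'

# Encode to UTF-8 first, as the snippet above does, then percent-encode.
url = 'https://www.torrentday.com/browse.php?search=' + quote(query.encode('utf-8'))
print(url)
```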
flexget/plugins/sites/torrentday.py
Outdated
try:
    page = requests.get(url, cookies=cookies).content
except RequestException as e:
    raise PluginError('Could not connect to torrentday')
You could've changed it to raise PluginError('Could not connect to torrentday: %s' % e)
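The pattern being asked for can be sketched without any third-party imports. Both `PluginError` and `fetch` below are hypothetical stand-ins written for this example (the real `PluginError` lives in FlexGet, and the plugin calls `requests.get` directly); the point is only that the message is formatted with `%` before it is handed to the exception:

```python
class PluginError(Exception):
    """Hypothetical stand-in for FlexGet's PluginError: one message argument."""

    def __init__(self, value):
        super().__init__(value)
        self.value = value


def fetch(get):
    """Call `get` (a hypothetical zero-argument fetcher) and wrap connection failures."""
    try:
        return get()
    except OSError as e:
        # requests' RequestException derives from IOError (OSError in Python 3).
        # Format the message here, since the exception takes a single string.
        raise PluginError('Could not connect to torrentday: %s' % e)
```

Keeping the original error text in the message makes the failure much easier to diagnose from the task log.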
flexget/plugins/sites/torrentday.py
Outdated
Search for name from torrentday.
"""

if not isinstance(config, dict):
No need for this, your schema means config cannot be anything other than a dict
flexget/plugins/sites/torrentday.py
Outdated
categories = [categories]
# If there are any text categories, turn them into their id number
categories = [c if isinstance(c, int) else CATEGORIES[c] for c in categories]
filter_url = '&cata=yes&c%s=1&clear-new=1' % ','.join(str(c) for c in categories)
Not mandatory, but I prefer passing URL params as a dict to requests, makes it more readable:
params = {'cata': 'yes', 'c%s' % ','.join(str(c) for c in categories): 1, 'clear-new': 1}
Then add it with params=params in the requests call. Just a suggestion.
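To show what the dict form turns into, here is a rough sketch that encodes it with the standard library's `urlencode` instead of `requests`, so it runs without extra packages. The category id `7` is a made-up value; with `requests`, passing `params=params` should produce an equivalent query string:

```python
from urllib.parse import urlencode

# Hypothetical category id, for illustration only.
categories = [7]

params = {
    'cata': 'yes',
    'c%s' % ','.join(str(c) for c in categories): 1,
    'clear-new': 1,
}

# Build the query string from the dict instead of concatenating by hand.
query_string = urlencode(params)
print(query_string)  # cata=yes&c7=1&clear-new=1
```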
use params rather than putting it all in the url
also removed check for 'config is dict': not necessary, schema mandates it
and fixed crash in scraping seed/leech by stripping number formatting
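The seed/leech fix mentioned in this commit might look roughly like the helper below. It assumes the "number formatting" is a comma thousands separator (the commit message does not say), and `parse_count` is a hypothetical name, not a function from the plugin:

```python
def parse_count(text):
    """Turn a scraped count such as '1,234' into an int.

    Assumes commas are the only formatting in the scraped text.
    """
    return int(text.strip().replace(',', ''))


print(parse_count('1,234'))  # 1234
```

Calling `int('1,234')` directly raises `ValueError`, which is the kind of crash the commit describes.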
Is it possible to get the cookie data by logging into the site? Grabbing them from the browser manually sucks.
Looks fine to me besides the cumbersome cookie usage.
I don't like it either, but it works. Their login page has reCAPTCHA, so I can't go through the front door and catch their cookies. Any suggestions?
One final change I'd like to see is cleaning up your inconsistent use of quotes for strings. Sometimes you use double quotes, other times single quotes. It has to be one or the other. Single quotes would probably be more in line with the rest of the code.
as requested to match the rest of the project
Motivation for changes:
TD is my primary private tracker and TL secondary
i'd like to discover stuff. maybe others will too
Detailed changes:
Config usage if relevant (new plugin or updated schema):
Log and/or tests output (preferably both):
https://dl.dropboxusercontent.com/u/28529352/flexget-torrentday-test.log