Added urlrewrite plugin for Rlsbb #1884

Merged
liiight merged 2 commits into Flexget:develop from thawn:rlsbb_urlrewrite
Jun 24, 2017

Conversation

thawn commented Jun 21, 2017

Motivation for changes:

The rlsbb RSS feed does not include all available download links (notably, rapidgator links are missing).
This urlrewrite plugin grabs the additional links from the rlsbb.ru website and adds them to the urls field of the entry.

This plugin works similarly to the rmz urlrewrite plugin, which was pulled recently:
#1879

Detailed changes:

rlsbb.ru urlrewriter
Version 0.1

Configuration

rlsbb:
  filehosters_re:
    - domain\.com
    - domain2\.org
  link_text_re:
    - UPLOADGiG
    - NiTROFLARE
    - RAPiDGATOR
  parse_comments: no

filehosters_re: Only add links that match any of the regular expressions 
  listed under filehosters_re.
link_text_re: search for <a> tags where the text (not the url) matches 
  one of the given regular expressions. The href property of these <a> tags
  will be used as the url (or urls).
parse_comments: whether the plugin should also parse the comments or only 
  the main post. Note that it is highly recommended to use filehosters_re
  if you enable parse_comments. Otherwise, the plugin may return too
  many and even some potentially dangerous links. 
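
The interplay of link_text_re and filehosters_re described above can be sketched roughly as follows. This is only an illustration, not the plugin's actual code: the function name `select_links` and the `(text, href)` pair input are hypothetical, and it assumes the `<a>` tags have already been extracted from the page and that patterns are matched with `re.search`.

```python
import re


def select_links(anchors, link_text_re, filehosters_re=None):
    """Return hrefs whose link text matches link_text_re.

    anchors: iterable of (text, href) pairs (hypothetical input format).
    filehosters_re: optional list of patterns; when given, an href must
    also match at least one of them to be kept.
    """
    text_patterns = [re.compile(p) for p in link_text_re]
    host_patterns = [re.compile(p) for p in (filehosters_re or [])]
    urls = []
    for text, href in anchors:
        # Keep only <a> tags whose visible text matches a link_text_re entry.
        if not any(p.search(text) for p in text_patterns):
            continue
        # If host filters are configured, the URL itself must match one.
        if host_patterns and not any(p.search(href) for p in host_patterns):
            continue
        urls.append(href)
    return urls


anchors = [
    ("RAPiDGATOR", "https://rapidgator.net/file/abc"),
    ("Homepage", "https://example.com/"),
    ("UPLOADGiG", "https://uploadgig.com/file/xyz"),
]
filtered = select_links(anchors, ["RAPiDGATOR", "UPLOADGiG"], [r"rapidgator\.net"])
unfiltered = select_links(anchors, ["RAPiDGATOR", "UPLOADGiG"])
```

With a host filter only the rapidgator link survives. Without one, every link whose text matches is kept, which is why filtering matters once comments are parsed as well.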

If more than one valid link is found, the url of the entry is rewritten to
the first link found. The complete list of valid links is placed in the
'urls' field of the entry.
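
As a rough sketch of that behaviour (an assumption-laden illustration, not the plugin's code): the entry is modelled here as a plain dict, `apply_links` is a hypothetical helper, and leaving the entry untouched when no links are found is an assumption the description does not cover.

```python
def apply_links(entry, links):
    """Rewrite entry['url'] to the first link and store all links in entry['urls']."""
    if not links:
        # Assumption: with no valid links, leave the entry as it was.
        return entry
    entry['url'] = links[0]      # single url: first valid link found
    entry['urls'] = list(links)  # complete list of valid links
    return entry


entry = {'title': 'Some.Release', 'url': 'https://rlsbb.ru/some-release/'}
apply_links(entry, ['https://host-a.example/1', 'https://host-b.example/2'])
```

After the call, 'url' holds only the first link while 'urls' carries the complete list.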

Therefore, it is recommended that you configure your output to use the
'urls' field instead of the 'url' field.

For example, to use jdownloader 2 as output, you would use the exec plugin:
exec:
  - echo "text={{urls}}" >> "/path/to/jd2/folderwatch/{{title}}.crawljob"

Config usage if relevant (new plugin or updated schema):

    rlsbb:
      filehosters_re:
        - domain\.com
        - domain2\.org
      link_text_re:
        - UPLOADGiG
        - NiTROFLARE
        - RAPiDGATOR
      parse_comments: no

Log and/or tests output (preferably both):

DEBUG  rlsbb  TASK  Searching <entry url> for a tags where the text matches one of: <link_text_re>
DEBUG  rlsbb  TASK  Found link elements: <html <a> elements>
DEBUG  rlsbb  TASK  Comment parsing enabled: found <number> comments
WARNING  rlsbb  TASK  You have enabled comment parsing but you did not define any filehoster_re filter. You may get a lot of unwanted and potentially dangerous links from the comments.
DEBUG  rlsbb  TASK  Using filehosters_re filters: <list of filter expressions>
DEBUG  rlsbb  TASK  Url: "<link found at entry url>" matched filter: <expression>

VERBOSE  rlsbb  TASK  Found <number> links at <entry url>

To Do:

  • Add ability to work as a search plugin.


@event('plugin.register')
def register_plugin():
    plugin.register(UrlRewriteRlsbb, 'rlsbb', interfaces=[

Was this line messing with pep8? It doesn't seem that long to me.

I guess this was a relic from before I set pep8 line width to 120. Fixed it now.

@liiight liiight merged commit 2266d71 into Flexget:develop Jun 24, 2017

thawn commented Jun 25, 2017 via email
