Add option to escape unescaped ampersands#2218
Add option to escape unescaped ampersands#2218gazpachoking merged 18 commits intoFlexget:developfrom asm0dey:patch-1
Conversation
Right now some trackers (e.g. newstudio) do not eascape ampersands with `&` This commits adds logics to fix that. Option is disabled by default because it adds some processing overhead and needs to go thru whole XML content and then rebuild new XML.
flexget/plugins/input/rss.py
Outdated
| elif not in_cdata_block and char == '<' and content[idx:idx+9] == '<![CDATA[': | ||
| in_cdata_block = True | ||
| elif in_cdata_block and char == ']' and content[idx:idx+3] == ']]>': | ||
| should_change = False |
There was a problem hiding this comment.
This doesn't really do anything (you probably forgot to change it). You probably want to do it like this instead, since it becomes a little more complex to skip ahead 3 characters while enumerating:
elif in_cdata_block and char == '>' and content[idx-2:idx+1] == ']]>':
in_cdata_block= False
There was a problem hiding this comment.
Thank you for comment and finding performance issue
flexget/plugins/input/rss.py
Outdated
| char = '&' | ||
| elif not in_cdata_block and char == '<' and content[idx:idx+9] == '<![CDATA[': | ||
| in_cdata_block = True | ||
| elif in_cdata_block and char == ']' and content[idx-2:idx+1] == ']]>': |
There was a problem hiding this comment.
Maybe it's better to test on ']' because it can be seen in XML not such often as '>'?
There was a problem hiding this comment.
You're already inside a CDATA block though.
There was a problem hiding this comment.
But it's common practice to put XML inside CDATA
|
It would be nice to have a couple of tests though. |
|
I would love to write some, but have no idea how to do it |
|
I can write tests in naive form, but don't know how to write them in python we should test that And test should work for those strings, which are used inside flexget |
|
I do not understand ho they work :( How do they get content on localhost? |
|
The relative path points to https://github.com/Flexget/Flexget/blob/develop/flexget/tests/rss.xml |
|
You can just create a new test XML-file and write a few tests in the same vein as the existing ones. |
gazpachoking
left a comment
There was a problem hiding this comment.
You should have more luck if you deal only with bytes in this function. Left some notes on how to accomplish that.
|
@gazpachoking is there anything else I can do to get this commit merged? |
Motivation for changes:
Right now some trackers (e.g. newstudio) do not eascape ampersands with
&Detailed changes:
This commits adds logics to fix that. Option is disabled by default because it adds some processing overhead and needs to go thru whole XML content and then rebuild new XML.
Config usage if relevant (new plugin or updated schema):