Skip to content

Coerce title to unicode type in TorrentLeech search plugin#1302

Closed
danielkza wants to merge 1 commit intoFlexget:developfrom
danielkza:torrentleech-search-coerce-title
Closed

Coerce title to unicode type in TorrentLeech search plugin#1302
danielkza wants to merge 1 commit intoFlexget:developfrom
danielkza:torrentleech-search-coerce-title

Conversation

@danielkza
Copy link
Copy Markdown
Contributor

Pulling it directly from the BeautifulSoup model will cause the whole
document tree to be stored in the Entry. It can cause excessive
recursion errors during serialization, and is not necessary at all.

Fixes #1301.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor Author

@danielkza danielkza Jul 29, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Since future.builtins.* is already imported, it should just be a matter of using str() instead, correct?

Pulling it directly from the BeautifulSoup model will cause the whole
document tree to be stored in the Entry. It can cause excessive
recursion errors during serialization, and is not necessary at all.

Fixes Flexget#1301.
@danielkza danielkza force-pushed the torrentleech-search-coerce-title branch from cec51bf to 540e088 Compare July 29, 2016 07:47
@paranoidi
Copy link
Copy Markdown
Member

use:

link.contents[0].text

@danielkza
Copy link
Copy Markdown
Contributor Author

danielkza commented Jul 29, 2016

Looking at the BS4 source code, it doesn't seem like .text provides any guarantee that it will return a plain string. NavigableString is itself an instance of unicode (str in Py3) mixed with the element trait, and the BS4 documentation recommends calling unicode():

If you want to use a NavigableString outside of Beautiful Soup, you should call unicode() on it to turn it into a normal Python Unicode string. If you don’t, your string will carry around a reference to the entire Beautiful Soup parse tree, even when you’re done using Beautiful Soup. This is a big waste of memory.

@paranoidi
Copy link
Copy Markdown
Member

Hmm, I see ... then we have a few other places where the same fix needs to be done.

@gazpachoking
Copy link
Copy Markdown
Member

gazpachoking commented Aug 3, 2016

Hmm, I think we used to cast all strings to unicode when setting on entry and lost a bit of that on the py3 upgrade. Perhaps we should add something here to solve this for all cases?

raise EntryUnicodeError(key, value)

Maybe

elif isinstance(value, str) and type(value) not in (str, text_type):
    # Cast string subclasses (e.g. NavigableString) back to regular strings
    value = str(value)

text_type is from future.utils

@liiight
Copy link
Copy Markdown
Member

liiight commented Aug 3, 2016

I'm with you @gazpachoking, that seems like a more correct approach

@cvium
Copy link
Copy Markdown
Contributor

cvium commented Aug 8, 2016

Any progress on this?

@danielkza
Copy link
Copy Markdown
Contributor Author

I don't know enough about flexget and how DB interactions are handled to
know the best place for a global fix. I can try to work it with some help
on where to look :)

Em seg, 8 de ago de 2016 10:55, Claus Vium notifications@github.com
escreveu:

Any progress on this?


You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
#1302 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAtnTrBzUwAYTjFDs7v3yYZJXkLoTHwQks5qdzVVgaJpZM4JX7y5
.

@cvium
Copy link
Copy Markdown
Contributor

cvium commented Aug 8, 2016

Well, @gazpachoking proposed a fix. Try it.

@danielkza
Copy link
Copy Markdown
Contributor Author

I somehow missed his comment, my bad.

Em seg, 8 de ago de 2016 11:04, Claus Vium notifications@github.com
escreveu:

Well, @gazpachoking https://github.com/gazpachoking proposed a fix. Try
it.


You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
#1302 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAtnTpnE72_wrRS2Rgfp7I-X2b7RC2M3ks5qdzbZgaJpZM4JX7y5
.

@danielkza
Copy link
Copy Markdown
Contributor Author

Closed in favor of generic solution in #1337.

@danielkza danielkza closed this Aug 15, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants