@blueset
Contributor

blueset commented Jan 11, 2021

Previously, PTB produced incorrect results when formatting a message into the Markdown v2 format (Message.(text|caption)_markdown_v2). Specifically, when any of the need-to-escape characters _*~`>#+-=|{}.! appeared in a nested entity, the last few characters were repeated at the end of the range.

For example:

| Message | What PTB gives | Expected result |
| --- | --- | --- |
| `<b><hashtag>#boldhashtag</hashtag></b>` | `*#boldhashtag*g` | `*#boldhashtag*` |
| `<b><i>a{b+c}d</i></b>` | `*_a\{b\+c\}d_\\}d*` | `*_a\{b\+c\}d_*` |

* <hashtag /> is used here only to represent the hashtag entity; Telegram does not support such a tag.

The issue appears to be caused by the use of the escaped text sequence for nested entities, which creates a discrepancy in text length because of the inserted \ characters.

This PR fixes the issue by using the original text (orig_text) as a basis for nested entity parsing. It also includes updated tests to address this issue.
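To illustrate the suspected cause, here is a minimal sketch of how slicing the escaped text with offsets that refer to the original text picks the wrong characters. The names `ESCAPE_CHARS` and `escape_md2`, and the entity offset and length, are hypothetical and only for illustration; this is not PTB's actual implementation, and the example sticks to ASCII so plain string indices are enough.

```python
# Characters listed above as needing a backslash escape in Markdown v2.
ESCAPE_CHARS = "_*~`>#+-=|{}.!"


def escape_md2(text: str) -> str:
    """Hypothetical helper: prefix each need-to-escape character with a backslash."""
    return "".join("\\" + ch if ch in ESCAPE_CHARS else ch for ch in text)


original = "a{b+c}d"
escaped = escape_md2(original)  # three backslashes are inserted

# Hypothetical nested entity covering the last two characters of the original text.
offset, length = 5, 2

# Offsets are defined against the original text, so only this slice is correct.
from_original = original[offset:offset + length]
# Applying the same offsets to the escaped text lands on different characters.
from_escaped = escaped[offset:offset + length]

print(repr(escaped))
print(repr(from_original), repr(from_escaped))
```

Slicing the original text first and escaping the slice afterwards keeps the offsets valid, which is the idea behind using orig_text for nested entities.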

blueset changed the title from "Fix: overly escape and offset error markdown v2 symbols when nested" to "Fix: overly escape and offset error for markdown v2 symbols when nested" on Jan 11, 2021
@Bibo-Joshi
Member

Hi, thanks for the finding and the PR! On a closer look, the same problem happens with HTML formatting (replace a{b+c} with a<b+c}, where the < needs to be escaped in HTML). Would you mind applying the fix to _parse_html and adjusting the tests accordingly?
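A rough sketch of the same effect in HTML, using the standard library's `html.escape` as a stand-in for whatever escaping `_parse_html` performs (an assumption; the entity offset and length are also hypothetical):

```python
from html import escape

original = "a<b+c}d"
escaped = escape(original)  # "<" becomes "&lt;", adding three characters

# Hypothetical nested entity covering "b+c}d" in the original text.
offset, length = 2, 5

# The offsets match the original text but not the escaped text.
from_original = original[offset:offset + length]
from_escaped = escaped[offset:offset + length]

print(repr(escaped))
print(repr(from_original), repr(from_escaped))
```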

Signed-off-by: Eana Hufwe <ilove@1a23.com>
@blueset
Contributor Author

blueset commented Jan 12, 2021

@Bibo-Joshi, nice spot! I just added the same fix to the HTML parser as well. For the tests, I included the > symbol in the message text, which is the only character that must be escaped in both formats.

blueset changed the title from "Fix: overly escape and offset error for markdown v2 symbols when nested" to "Fix: overly escape and offset error for markdown v2 and HTML symbols when nested" on Jan 12, 2021
@Bibo-Joshi
Member

Perfect, thanks for the contribution :)

Bibo-Joshi merged commit be54cf4 into python-telegram-bot:master on Jan 12, 2021
github-actions bot locked and limited conversation to collaborators on Jan 13, 2021