I noticed a difference in how iframe child content is parsed between 1.19.1 and 1.20.1, and I don't know which behavior is correct.
Given this HTML fragment:
<blockquote>
<p>
Some content
<script>alert('script outside iframe, oh noes!');</script></p>
<iframe>Content inside iframe<script>alert('script between iframe, oh noes!');</script></iframe>
</blockquote>
Cleaning it with String safeHtml = Jsoup.clean(unsafeHtml, Safelist.relaxed()); gives the following results:
1.19.1:
<blockquote>
<p>Some content</p>Content inside iframe <script>alert('script between iframe, oh noes!');</script>
</blockquote>
1.20.1:
<blockquote>
<p>Some content</p>
</blockquote>
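For anyone who wants to compare the two versions, here is a minimal sketch of how I reproduce this. The class name IframeCleanRepro and the helper cleanFragment are just names I made up for the example; it assumes the jsoup jar for the version under test is on the classpath. Run it once against 1.19.1 and once against 1.20.1 and compare the printed output.

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Safelist;

// Hypothetical reproduction class; only Jsoup.clean and Safelist.relaxed()
// come from jsoup itself.
public class IframeCleanRepro {
    // The fragment from this report, joined into a single string.
    static final String UNSAFE_HTML =
        "<blockquote>\n"
        + "<p>\n"
        + "Some content\n"
        + "<script>alert('script outside iframe, oh noes!');</script></p>\n"
        + "<iframe>Content inside iframe"
        + "<script>alert('script between iframe, oh noes!');</script></iframe>\n"
        + "</blockquote>";

    // Cleans the fragment with the relaxed safelist, as in the report.
    static String cleanFragment(String html) {
        return Jsoup.clean(html, Safelist.relaxed());
    }

    public static void main(String[] args) {
        // Print the cleaned markup so the two versions can be diffed.
        System.out.println(cleanFragment(UNSAFE_HTML));
    }
}
```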
Using git bisect, I traced the change to commit 3704342, in which the pretty printer was rewritten.
Is the iframe removal the right behavior?
Is there a way to replicate the old behavior, with the child content getting escaped?
Thank you for any help or insight.