Fix parsing of bogus comments after end tags #507

Merged
jdm merged 1 commit into servo:master from AcqRel:tokenizer-bugfix
Aug 26, 2023

AcqRel (Contributor) commented Aug 25, 2023

This fixes a bug in the tokenizer where the tag name was included in a bogus comment after an appropriate end tag.

For example, this:

<style></style ><!a>

is incorrectly parsed as the following:

<style></style><!--stylea-->

instead of the expected:

<style></style><!--a-->

For this bug to trigger, the end tag needs to be parsed in one of the raw text states (RCDATA, RAWTEXT, or Script data) and have whitespace or a slash after the tag name. I don't know how the contents of the temporary buffer end up inside the comment, but clearing the temporary buffer when exiting the RawEndTagName state seems to be enough to fix it.
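
To make the failure mode concrete, here is a minimal toy model of it in Rust. It is a sketch under stated assumptions, not html5ever's actual code: `ToyTokenizer`, `raw_end_tag_name`, `bogus_comment`, and `tokenize_example` are hypothetical names, and the model simply assumes that some later step picks up whatever is left in the temporary buffer and prepends it to the comment data. How that happens in the real tokenizer is left open, as in the description above.

```rust
/// Toy model of a tokenizer that shares one temporary buffer between states.
/// All names here are hypothetical; this is not html5ever's implementation.
struct ToyTokenizer {
    temp_buf: String,
    /// When true, the buffer is cleared on leaving the end tag name state,
    /// which models the fix described above.
    clear_on_exit: bool,
}

impl ToyTokenizer {
    fn new(clear_on_exit: bool) -> Self {
        ToyTokenizer {
            temp_buf: String::new(),
            clear_on_exit,
        }
    }

    /// Models the raw end tag name state: the tag name is accumulated in
    /// the temporary buffer, and the state is then left because whitespace
    /// or a slash follows the name.
    fn raw_end_tag_name(&mut self, name: &str) {
        self.temp_buf.push_str(name);
        if self.clear_on_exit {
            self.temp_buf.clear();
        }
    }

    /// Models a later bogus comment. ASSUMPTION: stale buffer contents are
    /// re-read here and end up in front of the comment data.
    fn bogus_comment(&mut self, data: &str) -> String {
        let stale = std::mem::take(&mut self.temp_buf);
        format!("<!--{}{}-->", stale, data)
    }
}

/// Runs the `</style ><!a>` scenario through the toy model and returns the
/// serialized comment.
fn tokenize_example(clear_on_exit: bool) -> String {
    let mut tok = ToyTokenizer::new(clear_on_exit);
    tok.raw_end_tag_name("style");
    tok.bogus_comment("a")
}

fn main() {
    // Without clearing, the tag name leaks into the comment.
    println!("{}", tokenize_example(false)); // prints "<!--stylea-->"
    // With clearing on exit, only the comment data remains.
    println!("{}", tokenize_example(true)); // prints "<!--a-->"
}
```

In this model the only difference between the two runs is whether the buffer is cleared when the end tag name state is left, which mirrors the shape of the fix.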

jdm (Member) left a comment

This looks reasonable! Thank you!

jdm enabled auto-merge August 26, 2023 14:03
jdm added this pull request to the merge queue Aug 26, 2023
Merged via the queue into servo:master with commit aa11b3b Aug 26, 2023
AcqRel deleted the tokenizer-bugfix branch August 26, 2023 14:38