Handling of "misnested" formatting tags does not match standard HTML behavior #1075

@vassudanagunta

Given the following input:

plain <b>bold <i>italic bold </b>italic </i>plain

htmlparser2 generates events equivalent to the following:

plain <b>bold <i>italic bold </i></b>italic plain

whereas the HTML5 spec (and de facto browser behavior before HTML5) interprets it as the following:

plain <b>bold <i>italic bold </i></b><i>italic </i>plain

You can confirm this behavior by opening the attached file in your browser and then looking at the rendered results as well as inspecting the DOM. This is also specified by the HTML Living Standard: 13.2.10.1 Misnested tags: <b><i></b></i>. See also https://stackoverflow.com/a/8766163/8910547

Expected behavior

Keep the current behavior, with the following two changes:

  1. Generate an implied <i> open tag event between the </b> close tag and "italic " text events.
  2. Do NOT skip the </i> close event between the "italic " and "plain" text events.
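To make the two changes concrete, here is a minimal, self-contained sketch of the requested event sequence. It does not use htmlparser2 and is not the full adoption agency algorithm from the HTML spec. The `ParseEvent` type and the `tokenize`, `repairMisnested`, and `serialize` helpers are hypothetical names made up for this illustration, and the tokenizer only understands attribute-free tags. When a close tag skips over still-open formatting elements, the sketch closes them, emits the requested close tag, and immediately reopens them as implied open events.

```typescript
// Hypothetical event shape for illustration; NOT htmlparser2's callback API.
type ParseEvent =
  | { type: "open"; name: string; implied?: boolean }
  | { type: "close"; name: string; implied?: boolean }
  | { type: "text"; data: string };

// Subset of the formatting elements named by the HTML spec.
const FORMATTING = new Set(["b", "i", "em", "strong", "u", "s", "code", "small", "big", "tt"]);

// Tokenize attribute-free tags and text only; enough for the example input.
function tokenize(html: string): ParseEvent[] {
  const events: ParseEvent[] = [];
  const re = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)>|([^<]+)/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(html)) !== null) {
    if (m[3] !== undefined) {
      events.push({ type: "text", data: m[3] });
    } else {
      events.push({ type: m[1] ? "close" : "open", name: m[2].toLowerCase() });
    }
  }
  return events;
}

// Simplified repair: assumes reopening right after the close tag is acceptable.
function repairMisnested(events: ParseEvent[]): ParseEvent[] {
  const out: ParseEvent[] = [];
  const stack: string[] = []; // names of currently open elements
  for (const ev of events) {
    if (ev.type === "text") { out.push(ev); continue; }
    if (ev.type === "open") { stack.push(ev.name); out.push(ev); continue; }
    const idx = stack.lastIndexOf(ev.name);
    if (idx === -1) continue; // stray close tag with no open element: dropped
    const above = stack.splice(idx + 1); // elements opened after the target
    stack.pop(); // the target element itself
    // Close the intervening elements innermost-first, then the target.
    for (const name of [...above].reverse()) out.push({ type: "close", name, implied: true });
    out.push(ev);
    // Change 1: reopen the formatting elements that were cut short.
    for (const name of above) {
      if (FORMATTING.has(name)) {
        stack.push(name);
        out.push({ type: "open", name, implied: true });
      }
    }
    // Change 2 follows for free: the later </i> now finds an open <i> and is emitted.
  }
  // Close anything still open at end of input.
  for (const name of [...stack].reverse()) out.push({ type: "close", name, implied: true });
  return out;
}

function serialize(events: ParseEvent[]): string {
  return events
    .map((e) => (e.type === "text" ? e.data : e.type === "open" ? `<${e.name}>` : `</${e.name}>`))
    .join("");
}

const repaired = serialize(
  repairMisnested(tokenize("plain <b>bold <i>italic bold </b>italic </i>plain"))
);
console.log(repaired);
```

The spec itself defers the reopening until the next token that needs it (by reconstructing the active formatting elements); reopening eagerly, as above, yields the same event order for this input but is only an approximation in general.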

Labels: wontfix (Out of scope for the project)
