Failure to normalize documents with HTML comments after </body> #4282

@westonruter

Bug Description

When markup appears after </body> or </html>, the document's original <body> is lost.
The bug is in \Amp\AmpWP\Dom\Document::normalize_document_structure(), specifically in how the pattern \Amp\AmpWP\Dom\Document::HTML_STRUCTURE_BODY_END_TAG is used here:

amp-wp/src/Dom/Document.php

Lines 492 to 497 in 2026afc

} elseif ( ! preg_match( self::HTML_STRUCTURE_BODY_END_TAG, $content, $matches ) ) {
	// Only <body> missing.
	// @todo This is an expensive regex operation, look into further optimization.
	$content = preg_replace( self::HTML_STRUCTURE_HEAD_TAG, '$0<body>', $content, 1 );
	$content .= '</body>';
}

When the pattern fails to match, this branch treats the document as missing its <body> and adds a new one, which blows away the existing one.
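The failure mode can be illustrated in isolation. The following is a minimal sketch, not the plugin's code: STRICT_BODY_END and has_body_end() are hypothetical names, and the pattern is an assumed stand-in for HTML_STRUCTURE_BODY_END_TAG that tolerates only whitespace and an optional </html> after </body>.

```php
<?php
// Hypothetical pattern, NOT the plugin's actual constant: it accepts only
// whitespace and an optional </html> between </body> and the end of input.
const STRICT_BODY_END = '#</body\s*>\s*(?:</html\s*>)?\s*$#i';

/**
 * Hypothetical helper: reports whether the end of the body was recognized.
 */
function has_body_end( string $html ): bool {
	return 1 === preg_match( STRICT_BODY_END, $html );
}

$clean    = '<html><head></head><body class="home"><p>Hi</p></body></html>';
$trailing = $clean . "\n<!-- a comment! -->";

var_dump( has_body_end( $clean ) );    // bool(true)
var_dump( has_body_end( $trailing ) ); // bool(false)
```

With a pattern like this, a single trailing comment is enough to make the check fail, and a caller that reads "no match" as "no <body>" would then inject a second one.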

Expected Behaviour

The original <body>, with all of its valid attributes, should not be removed during normalization when markup or HTML comments appear after </body> or </html>. Any markup appearing after </body> should get moved inside it during normalization, although beware of moving HTML comment nodes, as this may cause problems with validation. This came up previously in #4104.
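One possible shape for the expected behaviour is sketched below. It is only an illustration under stated assumptions: move_trailing_markup_into_body() is a hypothetical helper, not part of the plugin; it assumes the closing tag is written exactly as </body>; and it sidesteps the comment concern by dropping trailing comments rather than moving them, which is just one option.

```php
<?php
/**
 * Hypothetical helper (not part of the plugin): moves markup that follows
 * </body> back inside the body, dropping HTML comments instead of moving them.
 */
function move_trailing_markup_into_body( string $html ): string {
	// Split at the last </body>; everything after it is trailing content.
	$pos = strripos( $html, '</body>' );
	if ( false === $pos ) {
		return $html; // No closing body tag found; leave the input untouched.
	}

	$before   = substr( $html, 0, $pos );
	$trailing = substr( $html, $pos + strlen( '</body>' ) );

	// Strip the closing </html> and any comments from the trailing part.
	$trailing = preg_replace( '#</html\s*>#i', '', $trailing );
	$trailing = preg_replace( '#<!--.*?-->#s', '', $trailing );
	$trailing = trim( $trailing );

	// Re-close the document with the surviving markup inside the body.
	return $before . $trailing . '</body></html>';
}

echo move_trailing_markup_into_body(
	"<html><head></head><body class=\"home\"><p>Hi</p></body>\n<!-- a comment! -->\n</html>"
);
// Prints: <html><head></head><body class="home"><p>Hi</p></body></html>
```

Because the opening <body> tag is never touched, its attributes survive regardless of what follows the closing tag.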

Steps to reproduce

Modify the footer.php of Twenty Twenty to end with:

	</body>
</html>
<!-- a comment! -->

or

	</body>
<!-- a comment! -->
</html>

Then look at an AMP page in Standard/Transitional mode. Notice that the body element of the page is now just:

<body id="body">

Whereas it should be:

<body class="home blog logged-in admin-bar enable-search-modal has-no-pagination showing-comments show-avatars footer-top-visible customize-support" id="body">



Metadata

Labels

Bug (Something isn't working), P0 (High priority)
