Enforce UTF-8 content-type header #4296

Merged
westonruter merged 2 commits into develop from enforce-utf8-content-type-header
Feb 17, 2020

@schlessera (Collaborator) commented Feb 15, 2020

Summary

We already convert the encoding from non-UTF-8 to UTF-8 within the DOMDocument extension, but WordPress still outputs a content-type header with the old charset.

This PR fixes that by enforcing UTF-8 in the content-type header, and also removes a trigger_error call that is no longer needed.
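
For illustration only, here is a minimal sketch of the idea in Python rather than the plugin's PHP. It is not the code in this PR: the helper names `enforce_utf8_content_type` and `transcode_to_utf8` are hypothetical, and the sample header and body are assumptions chosen to mirror a latin1 site.

```python
import re

# Matches an existing charset parameter in a Content-Type header value,
# e.g. "; charset=ISO-8859-1" or '; charset="latin1"'.
_CHARSET_RE = re.compile(r';\s*charset\s*=\s*"?[^";]*"?', re.IGNORECASE)


def enforce_utf8_content_type(value: str) -> str:
    """Hypothetical helper: return the header value with charset forced to utf-8."""
    without_charset = _CHARSET_RE.sub("", value).strip()
    return f"{without_charset}; charset=utf-8"


def transcode_to_utf8(body: bytes, source_charset: str) -> bytes:
    """Hypothetical helper: re-encode a response body from the site's charset to UTF-8."""
    return body.decode(source_charset).encode("utf-8")


# Assumed example: a site configured for latin1 (ISO-8859-1).
latin1_body = "café".encode("iso-8859-1")
utf8_body = transcode_to_utf8(latin1_body, "iso-8859-1")
header = enforce_utf8_content_type("text/html; charset=ISO-8859-1")
```

The point of the sketch is that the body and the header have to agree: once the markup has been converted to UTF-8, a header that still announces the old charset would make browsers decode the page incorrectly.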

Fixes #855

Checklist

  • My pull request is addressing an open issue (please create one otherwise).
  • My code is tested and passes existing tests.
  • My code follows the Engineering Guidelines (updates are often made to the guidelines, check it out periodically).

@westonruter (Member)

I confirmed this worked by forcing a site to be created in latin1 and then creating a post with the table in https://cs.stanford.edu/people/miles/iso8859.html

The Latin1 characters rendered as expected in AMP in both Transitional and Reader modes, with the page itself actually being UTF-8 because of the AMP plugin.

@westonruter requested a review from kienstra February 16, 2020 01:06

@westonruter (Member) left a comment

@kienstra please double-check this using your testing instructions as well.

@westonruter added this to the v1.5 milestone Feb 16, 2020

@kienstra (Contributor) commented Feb 17, 2020

Looks Good!

Hi @schlessera,
Nice, the AMP and non-AMP URLs now look the same, even when the non-AMP URL uses latin1.

Testing this PR with these instructions, the two versions look the same:

Non-AMP

[Image: non-amp-version]

AMP

[Image: amp-version]

@schlessera (Collaborator, Author)

Phew! The endless fiddling with DOMDocument bugs and encoding weirdnesses was successful after all!

@westonruter merged commit b8ccb97 into develop Feb 17, 2020
@westonruter deleted the enforce-utf8-content-type-header branch February 17, 2020 16:45
@westonruter changed the title from "Enfore UTF-8 content-type header" to "Enforce UTF-8 content-type header" Mar 23, 2020

Successfully merging this pull request may close these issues.

Support WordPress installs with non-UTF-8 charsets

4 participants