Skip to content

File: reduce scope of whitespace trimming#919

Merged
jtojnar merged 3 commits intomasterfrom
wip/jtojnar/reduce-ws-mangling
Jul 29, 2025
Merged

File: reduce scope of whitespace trimming#919
jtojnar merged 3 commits intomasterfrom
wip/jtojnar/reduce-ws-mangling

Conversation

@jtojnar
Copy link
Member

@jtojnar jtojnar commented Jun 26, 2025

Trimming final whitespace would mangle UTF-16BE encoded files ending with whitespace. For example a new-line (\x00\x0A), would be turned into half a code-point \x00. Let’s just trim at the beginning of the file.

Similarly, trimming initial whitespace can be issue with UTF-16LE (\x0A\x00 would again get mangled into \x00). Let’s ensure that the whitespace is immediately followed by <.

The trim was introduced in #445 since XML declaration cannot be preceded by anything other than BOM, and XML declaration starts with <, so the requirement should be fine.

It might still mangle body of some weird-ass encoding but that is unlikely to be encountered nowadays.

Also add a regression test.

This is a follow-up to #917.

Depends on #918

@jtojnar jtojnar mentioned this pull request Jun 26, 2025
@jtojnar jtojnar added this to the 1.9.0 milestone Jun 26, 2025
@jtojnar jtojnar force-pushed the wip/jtojnar/reduce-ws-mangling branch from 9d6c3c1 to a812eb3 Compare June 27, 2025 03:58
@jtojnar
Copy link
Member Author

jtojnar commented Jun 27, 2025

Also added a regression test.

@jtojnar
Copy link
Member Author

jtojnar commented Jun 27, 2025

By the way, I do not think File is correct place for this, as it won’t work with Psr18Client but that is for the future.

@jtojnar jtojnar force-pushed the wip/jtojnar/reduce-ws-mangling branch from 018b5ca to fcb82db Compare June 27, 2025 07:51
@pull-request-size pull-request-size bot added size/L and removed size/M labels Jun 27, 2025
@jtojnar jtojnar force-pushed the wip/jtojnar/reduce-ws-mangling branch from e2dc6d8 to da824c0 Compare June 27, 2025 08:29
@jtojnar jtojnar marked this pull request as draft June 27, 2025 08:30
@jtojnar jtojnar force-pushed the wip/jtojnar/reduce-ws-mangling branch from da824c0 to 2e8011b Compare June 27, 2025 14:09
@jtojnar jtojnar marked this pull request as ready for review June 27, 2025 14:10
@jtojnar jtojnar force-pushed the wip/jtojnar/reduce-ws-mangling branch from 2e8011b to ca01c0f Compare June 27, 2025 18:12
@jtojnar jtojnar force-pushed the wip/jtojnar/reduce-ws-mangling branch from ca01c0f to aa15f55 Compare June 28, 2025 15:59
@jtojnar jtojnar marked this pull request as draft June 29, 2025 01:09
@jtojnar jtojnar force-pushed the wip/jtojnar/reduce-ws-mangling branch 2 times, most recently from 1c14ebd to ddd2a8d Compare June 30, 2025 19:46
@jtojnar jtojnar marked this pull request as ready for review June 30, 2025 19:46
jtojnar added a commit that referenced this pull request Jul 29, 2025
mf2 library uses implicit nullable types, which were deprecated in PHP 8.4:
https://wiki.php.net/rfc/deprecate-implicitly-nullable-types

This causes the response body of the HTTP mock server in tests to be polluted with the following warning:

    Deprecated: Mf2\Parser::parse(): Implicitly marking parameter $context as nullable is deprecated, the explicit nullable type must be used instead in vendor/mf2/mf2/Mf2/Parser.php on line 1373

This breaks tests in <#838> and <#919>.

Until the upstream fix (microformats/php-mf2#264) is merged, let’s uninstall mf2 in PHP 8.4 CI.

Works around: #921

Co-authored-by: Artur Weigandt <Art4@users.noreply.github.com>
@jtojnar jtojnar force-pushed the wip/jtojnar/reduce-ws-mangling branch from ddd2a8d to cec0005 Compare July 29, 2025 07:06
@jtojnar jtojnar requested a review from Art4 July 29, 2025 07:09
jtojnar added 3 commits July 29, 2025 23:41
Trimming final whitespace would mangle UTF-16BE encoded files ending with whitespace. For example a new-line (`\x00\x0A`), would be turned into half a code-point `\x00`. Let’s just trim at the beginning of the file.

Similarly, trimming initial whitespace can be issue with UTF-16LE (`\x0A\x00` would again get mangled into `\x00`). Let’s ensure that the whitespace is immediately followed by `<`.

The trim was introduced in 989645e since XML declaration cannot be preceded by anything other than BOM, and XML declaration starts with `<`, so the requirement should be fine.

It might still mangle body of some weird-ass encoding but that is unlikely to be encountered nowadays.

This is a follow-up to 458d746.
It was introduced in 989645e and further refined in the parent commit.
@jtojnar jtojnar force-pushed the wip/jtojnar/reduce-ws-mangling branch from cec0005 to d38e63d Compare July 29, 2025 21:43
@jtojnar jtojnar merged commit d38e63d into master Jul 29, 2025
20 checks passed
@jtojnar jtojnar deleted the wip/jtojnar/reduce-ws-mangling branch July 29, 2025 21:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants