Auto-port 5.0: Pin multipart Content-Type / Content-Transfer-Encoding case folding to Locale.US#16782
Merged
Merged
Conversation
…o Locale.US (#16768) ## Problem Two spots in the HTTP multipart codec normalize a header value with `String.toLowerCase()` without an explicit Locale, then compare the result to lowercase ASCII constants. The JVM default Locale governs that call, and on a Turkish (`tr_TR`) JVM the ASCII `'I'` lowercases to `'ı'` (U+0131). A perfectly valid uppercase form silently misses the comparison. Concretely on a Turkish-locale JVM: - `HttpPostMultipartRequestDecoder` decoding a part with `Content-Transfer-Encoding: BINARY` (uppercase, RFC 2045 §6.1 says mechanism tokens are case-insensitive) lowercases the value to `"bınary"`, fails the equality check against the `binary` / `7bit` / `8bit` constants, and throws `ErrorDataDecoderException: TransferEncoding Unknown: bınary`. - `HttpPostRequestEncoder.finalizeRequest()` skipping a pre-existing `Content-Type: MULTIPART/form-data; boundary=...` header (mixed case is allowed by the Content-Type ABNF, RFC 7231 §3.1.1.1) lowercases the value to `"multıpart/..."`, misses the `startsWith("multipart/form-data")` check, and the original mixed-case header survives alongside the freshly-built multipart Content-Type the encoder is about to add. The outgoing request ends up with **two** `Content-Type` headers. The values being compared against are lowercase ASCII tokens and have no business consulting the JVM locale. ## Fix Pin both `String.toLowerCase()` calls to `Locale.US`: - `codec-http/src/main/java/io/netty/handler/codec/http/multipart/HttpPostMultipartRequestDecoder.java` — `getFileUpload(String delimiter)` lowercases the `Content-Transfer-Encoding` value before matching it against `BIT7` / `BIT8` / `BINARY` mechanism tokens. - `codec-http/src/main/java/io/netty/handler/codec/http/multipart/HttpPostRequestEncoder.java` — `finalizeRequest()` lowercases each existing `Content-Type` before checking if it is the multipart or `application/x-www-form-urlencoded` form (which the encoder is responsible for setting fresh). ## Tests | Change Point | Test | |--------------|------| | Decoder Content-Transfer-Encoding lowercase under Turkish | `HttpPostMultiPartRequestDecoderTest.testUppercaseBinaryTransferEncodingUnderTurkishLocale` — installs the Turkish locale (restored in `finally`), feeds a multipart body with `Content-Transfer-Encoding: BINARY`, and asserts the file part decodes successfully. Without the fix this throws `TransferEncoding Unknown: bınary`. | | Encoder Content-Type lowercase under Turkish | `HttpPostRequestEncoderTest.testFinalizeRemovesPreexistingMultipartContentTypeUnderTurkishLocale` — pre-sets `Content-Type: MULTIPART/form-data; boundary=preexisting`, runs the encoder under the Turkish locale, and asserts only one Content-Type header survives. Without the fix the request ends up with two Content-Type headers. | Both tests fail deterministically on the unfixed code (one as an assertion failure, the other as a thrown `ErrorDataDecoderException`) and pass after the `Locale.US` change. Verification: ``` mvn -pl codec-http -am test \ -Dtest='HttpPostRequestEncoderTest#testFinalizeRemovesPreexistingMultipartContentTypeUnderTurkishLocale,HttpPostMultiPartRequestDecoderTest#testUppercaseBinaryTransferEncodingUnderTurkishLocale' \ -Dsurefire.failIfNoSpecifiedTests=false # Tests run: 2, Failures: 0, Errors: 0, Skipped: 0 ``` ## Impact - Behavior on en/CJK/most locales: unchanged (the previous default-locale lowercase already produced ASCII). - Behavior on Turkish (and other locales with non-ASCII case mappings): a multipart upload with uppercase `Content-Transfer-Encoding: BINARY` now decodes instead of throwing; an outgoing multipart request with a pre-existing mixed-case `Content-Type: MULTIPART/...` no longer leaks a duplicate Content-Type header. - API: no signature change. All callers continue through the existing public API. - Risk: bounded — `Locale.US` is the standard Netty pattern for protocol-string normalization (see e.g. `WebSocketClientHandshaker.java:761`). Sibling fix to #16765, which pinned the same locale gotcha in `HttpVersion`, `RtspMethods`, and `RtspVersions`. --------- Co-authored-by: Norman Maurer <norman_maurer@apple.com> (cherry picked from commit b9eacea)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Auto-port of #16768 to 5.0
Cherry-picked commit: b9eacea
Problem
Two spots in the HTTP multipart codec normalize a header value with
String.toLowerCase()without an explicit Locale, then compare the result to lowercase ASCII constants. The JVM default Locale governs that call, and on a Turkish (tr_TR) JVM the ASCII'I'lowercases to'ı'(U+0131). A perfectly valid uppercase form silently misses the comparison.Concretely on a Turkish-locale JVM:
HttpPostMultipartRequestDecoderdecoding a part withContent-Transfer-Encoding: BINARY(uppercase, RFC 2045 §6.1 says mechanism tokens are case-insensitive) lowercases the value to"bınary", fails the equality check against thebinary/7bit/8bitconstants, and throwsErrorDataDecoderException: TransferEncoding Unknown: bınary.HttpPostRequestEncoder.finalizeRequest()skipping a pre-existingContent-Type: MULTIPART/form-data; boundary=...header (mixed case is allowed by the Content-Type ABNF, RFC 7231 §3.1.1.1) lowercases the value to"multıpart/...", misses thestartsWith("multipart/form-data")check, and the original mixed-case header survives alongside the freshly-built multipart Content-Type the encoder is about to add. The outgoing request ends up with twoContent-Typeheaders.The values being compared against are lowercase ASCII tokens and have no business consulting the JVM locale.
Fix
Pin both
String.toLowerCase()calls toLocale.US:codec-http/src/main/java/io/netty/handler/codec/http/multipart/HttpPostMultipartRequestDecoder.java—getFileUpload(String delimiter)lowercases theContent-Transfer-Encodingvalue before matching it againstBIT7/BIT8/BINARYmechanism tokens.codec-http/src/main/java/io/netty/handler/codec/http/multipart/HttpPostRequestEncoder.java—finalizeRequest()lowercases each existingContent-Typebefore checking if it is the multipart orapplication/x-www-form-urlencodedform (which the encoder is responsible for setting fresh).Tests
HttpPostMultiPartRequestDecoderTest.testUppercaseBinaryTransferEncodingUnderTurkishLocale— installs the Turkish locale (restored infinally), feeds a multipart body withContent-Transfer-Encoding: BINARY, and asserts the file part decodes successfully. Without the fix this throwsTransferEncoding Unknown: bınary.HttpPostRequestEncoderTest.testFinalizeRemovesPreexistingMultipartContentTypeUnderTurkishLocale— pre-setsContent-Type: MULTIPART/form-data; boundary=preexisting, runs the encoder under the Turkish locale, and asserts only one Content-Type header survives. Without the fix the request ends up with two Content-Type headers.Both tests fail deterministically on the unfixed code (one as an assertion failure, the other as a thrown
ErrorDataDecoderException) and pass after theLocale.USchange.Verification:
Impact
Content-Transfer-Encoding: BINARYnow decodes instead of throwing; an outgoing multipart request with a pre-existing mixed-caseContent-Type: MULTIPART/...no longer leaks a duplicate Content-Type header.Locale.USis the standard Netty pattern for protocol-string normalization (see e.g.WebSocketClientHandshaker.java:761).Sibling fix to #16765, which pinned the same locale gotcha in
HttpVersion,RtspMethods, andRtspVersions.