Stricter HTTP/1.1 chunk extension parsing by chrisvest · Pull Request #16537 · netty/netty

chrisvest · 2026-03-24T17:50:01Z

Motivation:
Chunk extensions can include quoted string values, which themselves can include linebreaks and escapes for quotes. We need to parse these properly to ensure we find the correct start of the chunk data.

Modification:

Implement full RFC 9112 HTTP/1.1 compliant parsing of chunk start lines.
Add test cases from the Funky Chunks research: https://w4ke.info/2025/10/29/funky-chunks-2.html
This inclues chunk extensions with quoted strings that have linebreaks in them, and quoted strings that use escape codes.
Remove a test case that asserted support for control characters in the middle of chunk start lines, including after a naked chunk length field. Such control characters are not permitted by the standard.

Result:
Prevents HTTP message smuggling through carefully crafted chunk extensions.

Revert the ByteProcessor changes
Add a benchmark for HTTP/1.1 chunk decoding
Fix chunk initial line decoding

The initial line was not correctly truncated at its line break and ended up including some of the chunk contents.

Failing to parse chunk size must throw NumberFormatException
Line breaks are completely disallowed within chunk extensions

Change the chunk parsing back to its original code, because we know that line breaks are not supposed to occur within chunk extensions at all. This means doing the SWAR search should be suitable.

Modify the byte processor and add it as a validation step of the parsed chunk start line.

Update the tests to match.

Fix checkstyle

(cherry picked from commit 3b76df1)

* Stricter HTTP/1.1 chunk extension parsing Motivation: Chunk extensions can include quoted string values, which themselves can include linebreaks and escapes for quotes. We need to parse these properly to ensure we find the correct start of the chunk data. Modification: - Implement full RFC 9112 HTTP/1.1 compliant parsing of chunk start lines. - Add test cases from the Funky Chunks research: https://w4ke.info/2025/10/29/funky-chunks-2.html - This inclues chunk extensions with quoted strings that have linebreaks in them, and quoted strings that use escape codes. - Remove a test case that asserted support for control characters in the middle of chunk start lines, including after a naked chunk length field. Such control characters are not permitted by the standard. Result: Prevents HTTP message smuggling through carefully crafted chunk extensions. * Revert the ByteProcessor changes * Add a benchmark for HTTP/1.1 chunk decoding * Fix chunk initial line decoding The initial line was not correctly truncated at its line break and ended up including some of the chunk contents. * Failing to parse chunk size must throw NumberFormatException * Line breaks are completely disallowed within chunk extensions Change the chunk parsing back to its original code, because we know that line breaks are not supposed to occur within chunk extensions at all. This means doing the SWAR search should be suitable. Modify the byte processor and add it as a validation step of the parsed chunk start line. Update the tests to match. * Fix checkstyle (cherry picked from commit 3b76df1)

franz1981 · 2026-03-24T19:31:31Z

Sorry for been late @chrisvest
Here a suggestion to move to a single table lookup..
The per-byte Match[] scan has two costs: iterating N matchers per byte, and a double dependent load (match.get(value) → STATES_BY_ORDINAL[next]) where each load can't start until the previous resolves — worst case 3–5 serialised dependent loads on states with multiple matchers.

Both can be eliminated by flattening all transitions into a single byte[N_STATES * 256] and tracking state as a plain int. Hot path: one shift, one OR, one array load — no loop, no second dependent load. Footprint: 7 × 256 = 1,792 bytes in a single contiguous array vs ~1,320 bytes scattered across 30 heap objects (15 Match + 15 long[] backing arrays) with pointer chasing.

There is a correctness fix bundled here: the current matchers in ChunkExtValStart and ChunkExtValToken overlap on specific bytes (" at 0x22 and ; at 0x3B respectively). The original code is safe because it is first-match-wins. A naïve table compilation would be last-write-wins and silently produce wrong transitions for those bytes. The fix is to make the matchers explicitly disjoint by adding '"' and ';' to the relevant exclusion lists, and to enforce disjointness unconditionally at build time:

private static final byte[] TRANSITIONS = compile();

static byte[] compile() {
    Match[][] stateMatchers = new Match[N_STATES][];

    stateMatchers[SIZE] = new Match[]{
            new Match(SIZE).chars("0123456789abcdefABCDEF \t"),
            new Match(CHUNK_EXT_NAME).chars(";")
    };
    stateMatchers[CHUNK_EXT_NAME] = new Match[]{
            new Match(CHUNK_EXT_NAME)
                    .range(0x21, 0x7E)
                    .chars(" \t")
                    .chars("(),/:<=>?@[\\]{}", false),
            new Match(CHUNK_EXT_VAL_START).chars("=")
    };
    stateMatchers[CHUNK_EXT_VAL_START] = new Match[]{
            new Match(CHUNK_EXT_VAL_START).chars(" \t"),
            new Match(CHUNK_EXT_VAL_QUOTED).chars("\""),
            // explicitly exclude '"' (0x22) — disjoint with the matcher above
            new Match(CHUNK_EXT_VAL_TOKEN)
                    .range(0x21, 0x7E)
                    .chars("(),/:<=>?@[\\]{}\"", false)
    };
    stateMatchers[CHUNK_EXT_VAL_QUOTED] = new Match[]{
            new Match(CHUNK_EXT_VAL_QUOTED_ESCAPE).chars("\\"),
            new Match(CHUNK_EXT_VAL_QUOTED_END).chars("\""),
            // 0x23-0x5B excludes '"' (0x22), 0x5D-0x7E excludes '\' (0x5C) — disjoint by range
            new Match(CHUNK_EXT_VAL_QUOTED)
                    .chars("\t !")
                    .range(0x23, 0x5B)
                    .range(0x5D, 0x7E)
                    .range(0x80, 0xFF)
    };
    stateMatchers[CHUNK_EXT_VAL_QUOTED_ESCAPE] = new Match[]{
            new Match(CHUNK_EXT_VAL_QUOTED)
                    .chars("\t ")
                    .range(0x21, 0x7E)
                    .range(0x80, 0xFF)
    };
    stateMatchers[CHUNK_EXT_VAL_QUOTED_END] = new Match[]{
            new Match(CHUNK_EXT_VAL_QUOTED_END).chars("\t "),
            new Match(CHUNK_EXT_NAME).chars(";")
    };
    stateMatchers[CHUNK_EXT_VAL_TOKEN] = new Match[]{
            // explicitly exclude ';' (0x3B) — disjoint with the matcher below
            new Match(CHUNK_EXT_VAL_TOKEN)
                    .range(0x21, 0x7E)
                    .chars("(),/:<=>?@[\\]{};", false),
            new Match(CHUNK_EXT_NAME).chars(";")
    };

    byte[] table = new byte[N_STATES * 256];
    Arrays.fill(table, (byte) -1);
    for (int state = 0; state < N_STATES; state++) {
        int base = state << 8;
        for (Match m : stateMatchers[state]) {
            for (int i = m.nextSetBit(0); i >= 0; i = m.nextSetBit(i + 1)) {
                if (table[base | i] != -1) {
                    throw new IllegalStateException(
                            "overlapping matchers at byte 0x" + Integer.toHexString(i) +
                            " in state " + state);
                }
                table[base | i] = (byte) m.then;
            }
        }
    }
    return table;
}

private int state = SIZE;

@Override
public boolean process(byte value) {
    int next = TRANSITIONS[state << 8 | (value & 0xFF)];
    if (next < 0) {
        if (state == SIZE) {
            throw new NumberFormatException("Invalid chunk size");
        }
        throw new InvalidChunkExtensionException("Invalid chunk extension");
    }
    state = next;
    return true;
}

& 0xFF on value is required since byte is signed and used as an array index.
Match objects and stateMatchers are GC'd after static init — zero runtime footprint.
The IllegalStateException in compile() is unconditional (not an assert) so overlap bugs are caught at class load regardless of JVM flags.

chrisvest · 2026-03-24T20:12:03Z

@franz1981 Let's look at that in a follow-up PR. I'm in the middle of cutting a release, and will have to include this one as is.

chrisvest added this to the 4.1.132.Final milestone Mar 24, 2026

chrisvest merged commit 60e53c9 into netty:4.1 Mar 24, 2026
17 of 18 checks passed

chrisvest deleted the 4.1-merge branch March 24, 2026 20:20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Stricter HTTP/1.1 chunk extension parsing#16537

Stricter HTTP/1.1 chunk extension parsing#16537
chrisvest merged 1 commit into
netty:4.1from
chrisvest:4.1-merge

chrisvest commented Mar 24, 2026

Uh oh!

franz1981 commented Mar 24, 2026 •

edited

Loading

Uh oh!

chrisvest commented Mar 24, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

chrisvest commented Mar 24, 2026

Uh oh!

franz1981 commented Mar 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

chrisvest commented Mar 24, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

franz1981 commented Mar 24, 2026 •

edited

Loading