Skip to content

FlowControlHandler no longer honors setAutoRead(false) made by a downstream handler while dequeuing (regression in 4.2.15) #16945

@lhotari

Description

@lhotari

Expected behavior

When a downstream handler calls ctx.channel().config().setAutoRead(false) from inside its
channelRead method, FlowControlHandler should stop releasing further queued messages
immediately and hold them until auto-read is re-enabled or Channel#read() is called.

This is the behavior documented in the class javadoc of FlowControlHandler itself:

 * class MyExampleHandler extends ChannelInboundHandlerAdapter {
 *   @Override
 *   public void channelRead(ChannelHandlerContext ctx, Object msg) {
 *     if (msg instanceof HttpRequest) {
 *       ctx.channel().config().setAutoRead(false);
 *
 *       // The FlowControlHandler will hold any subsequent events that
 *       // were emitted by HttpObjectDecoder until auto reading is turned
 *       // back on or Channel#read() is being called.
 *     }
 *   }
 * }

Up to and including Netty 4.1.135.Final and 4.2.14.Final, this worked because the dequeue loop
re-checked config.isAutoRead() before releasing each queued message:

// 4.1.135.Final / 4.2.14.Final
private int dequeue(ChannelHandlerContext ctx, int minConsume) {
    int consumed = 0;
    while (queue != null && (consumed < minConsume || config.isAutoRead())) {
        Object msg = queue.poll();
        ...
        ctx.fireChannelRead(msg);
    }
    ...
}

If fireChannelRead(msg) caused a downstream handler to disable auto-read, the very next loop
iteration observed it and stopped, leaving the remaining messages queued.

Actual behavior

Since 4.2.15.Final (#16837, the 4.2 backport of #15053 "FlowControlHandler: Suppress duplicate
channelReadComplete after draining queue"), the auto-read decision is made once up front and the
dequeue loop no longer consults config.isAutoRead():

// 4.2.15.Final
@Override
public void channelRead(ChannelHandlerContext ctx, Object msg) throws Exception {
    ...
    queue.offer(msg);
    if (config.isAutoRead()) {
        dequeueAll(ctx);          // maxConsume = -1, unlimited
    } else if (unsatisfiedReads > 0) {
        dequeueOne(ctx);
        ...
    }
}

private int dequeue(ChannelHandlerContext ctx, int maxConsume) {
    int consumed = 0;
    while (queue != null && (consumed < maxConsume || maxConsume < 0)) {
        Object msg = queue.poll();
        ...
        ctx.fireChannelRead(msg);
    }
    ...
}

Once dequeueAll starts draining, a downstream handler disabling auto-read from inside
fireChannelRead is ignored — the entire queue is delivered anyway. This bites whenever the
queue holds more than one message, i.e. on resume: messages accumulate while auto-read is
disabled, and re-enabling auto-read (setAutoRead(true)read()dequeueAll) dumps the
whole queued backlog even if the downstream handler re-disables auto-read while processing the
first delivered message.

The same rewrite has since landed on the 4.1 branch as well (#16912, after 4.1.135.Final was
released), so the next 4.1.x release will have the same behavior.

Why this matters

The per-message auto-read responsiveness is the core value of FlowControlHandler for
server-side request throttling: a decoder (e.g. ByteToMessageDecoder) can emit many messages
from a single read, and setAutoRead(false) alone cannot stop those already-decoded messages —
that is exactly the gap FlowControlHandler is documented to close.

Concrete real-world impact: Apache Pulsar places FlowControlHandler directly before its
connection handler and pauses connections by calling setAutoRead(false) from inside
channelRead (publish rate limiting, max pending requests, memory limits). After upgrading
to Netty 4.2.15, Pulsar's rate limiters and back pressure regressed: throttled connections that
resumed delivered their entire queued backlog before the renewed setAutoRead(false) took
effect, allowing publish rates and inflight messages well beyond the configured limits.

Steps to reproduce

Run the JUnit test below (e.g. dropped into handler/src/test/java/io/netty/handler/flow/).
Validated against the netty/netty source tree:

Ref Result
netty-4.1.135.Final tag PASS
netty-4.2.14.Final tag PASS
4.2.15.Final (via dependency) FAIL
4.2 branch HEAD (7bae566a93) FAIL
4.1 branch HEAD (465c9e6e47, includes #16912) FAIL

The failures are:

org.opentest4j.AssertionFailedError: phase 2 (resume releases exactly one message) ==> expected: <2> but was: <5>

The scenario: messages accumulate in FlowControlHandler's queue while auto-read is disabled.
Auto-read is then re-enabled, and the downstream handler — still over its limit — disables it
again while processing the first released message. On 4.1.135/4.2.14 exactly one queued message
is released per resume; on 4.2.15 the entire queue is drained regardless.

Minimal yet complete reproducer code (or URL to code)

import static org.junit.jupiter.api.Assertions.assertEquals;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;
import io.netty.channel.embedded.EmbeddedChannel;
import io.netty.handler.flow.FlowControlHandler;
import java.util.ArrayList;
import java.util.List;
import org.junit.jupiter.api.Test;

public class FlowControlHandlerAutoReadTest {

    @Test
    public void autoReadDisabledDuringChannelReadStopsDelivery() {
        List<Object> received = new ArrayList<>();
        ChannelInboundHandlerAdapter throttlingHandler = new ChannelInboundHandlerAdapter() {
            @Override
            public void channelRead(ChannelHandlerContext ctx, Object msg) {
                received.add(msg);
                // pause the connection while "processing" each message,
                // like the example in FlowControlHandler's javadoc
                ctx.channel().config().setAutoRead(false);
            }
        };
        EmbeddedChannel channel = new EmbeddedChannel(new FlowControlHandler(), throttlingHandler);

        // The downstream handler disables auto-read while processing the first message;
        // messages 2..5 are held in FlowControlHandler's queue (works on all versions).
        channel.writeInbound("1", "2", "3", "4", "5");
        assertEquals(1, received.size());

        // Resume: the downstream handler is still over its limit and disables auto-read
        // again while processing the first released message.
        //
        // 4.1.135 / 4.2.14: passes - exactly one more queued message is released; the
        //                   dequeue loop observes the renewed setAutoRead(false) and stops.
        // 4.2.15:           fails with received.size() == 5 - dequeueAll() drains the whole
        //                   queue, ignoring the auto-read change made by the downstream
        //                   handler mid-drain.
        channel.config().setAutoRead(true);
        assertEquals(2, received.size());

        channel.finishAndReleaseAll();
    }
}

Netty version

Regression introduced in 4.2.15.Final via #16837 (backport of #15053). 4.2.14.Final and
4.1.135.Final behave as documented. The 4.1 branch contains the same rewrite via #16912
(unreleased at the time of writing) — verified: the reproducer also fails on the current 4.1
branch HEAD, so the next 4.1.x release will be affected too.

JVM version (e.g. java -version)

Reproduced on OpenJDK 21 (Amazon Corretto 21.0.11) — not JVM-specific.

OS version (e.g. uname -a)

Reproduced on both macOS 15 (NIO transport) and Linux (epoll transport, GitHub Actions
ubuntu-latest) — not transport-specific; the changed code is in the common handler module.

Additional context

If the new up-front dequeueAll behavior is intentional for the auto-read-enabled case
("behave transparently when auto-read is on"), it would be worth considering that the only
way a downstream handler can request a pause is from inside channelRead — at which point
auto-read was necessarily still enabled when the drain started. With the current code there is
no way to stop the drain of already-queued messages, which contradicts the class javadoc's
documented usage example. A possible fix would be to re-check config.isAutoRead() in the
dequeue loop condition for the unlimited (maxConsume < 0) case, restoring the pre-4.2.15
behavior while keeping the channelReadComplete-related fixes from #15053.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions