Skip to content

Close Channel and fail bootstrap when setting a ChannelOption causes …#15896

Merged
normanmaurer merged 3 commits into
4.2from
close_on_failure
Dec 3, 2025
Merged

Close Channel and fail bootstrap when setting a ChannelOption causes …#15896
normanmaurer merged 3 commits into
4.2from
close_on_failure

Conversation

@normanmaurer

@normanmaurer normanmaurer commented Nov 19, 2025

Copy link
Copy Markdown
Member

…an error

Motivation:

We should close the Channel and fail the future of the bootstrap if during setting a ChannelOption we observe an error. At the moment we just log which might make things hard to debug and leave the Channel in an unexpected state.

Modifications:

  • Adjust code to also close and faile the future
  • Add testcases

Result:

Related to #15860 (comment)

@normanmaurer normanmaurer added this to the 4.2.8.Final milestone Nov 19, 2025
@normanmaurer

Copy link
Copy Markdown
Member Author

I think we should also cherry-pick this into 4.1

…an error

Motivation:

We should close the Channel and fail the future of the bootstrap if during setting a ChannelOption we observe an error. At the moment we just log which might make things hard to debug.

Modifications:

- Adjust code to also close and faile the future
- Add testcases

Result:

Related to #15860 (comment)
@WhiteTrashLord

Copy link
Copy Markdown

Debugging is not really the issue. The main problem is that Netty will continue even if one or more of the channel options couldn't be set. Depending on the channel option this may have undesired consequences.

@normanmaurer

Copy link
Copy Markdown
Member Author

@WhiteTrashLord reworded... Thanks

Comment thread transport/src/main/java/io/netty/bootstrap/AbstractBootstrap.java Outdated
@He-Pin

He-Pin commented Nov 19, 2025

Copy link
Copy Markdown
Contributor

practical , but may cause problem in production,there are many wrong configured jars

@WhiteTrashLord

Copy link
Copy Markdown

practical , but may cause problem in production,there are many wrong configured jars

Yeah, actually I was a bit surprised that Norman accepted this change so eagerly :D. This change makes a lot of sense and I really need it. But there is a high chance that it will break things for some users who have ignored warnings so far.

Maybe this can be the default behavior in Netty 5. For me it would be okay to just add a System property like io.netty.channel.throwChannelOptionException in Netty 4.1 and 4.2.

io.netty.channel.throwChannelOptionException would be false by default

The code would look like this:

if ( PROP_THROWCHANNELOPTIONEXCEPTION == true )
{
throw t;
}
else
{
logger.warn(
"Failed to set channel option '{}' with value '{}' for channel '{}' of type '{}'",
option, value, channel, channel.getClass(), t);
}

@normanmaurer

Copy link
Copy Markdown
Member Author

@WhiteTrashLord honestly there should be no breakage and if there is it is for a good reason imho

@normanmaurer

Copy link
Copy Markdown
Member Author

@chrisvest @yawkat @franz1981 @bryce-anderson do you think we should better add a property to fallback to the old behavior so people can set it when they run into issues? I still think the default should be changed to what we have here now tho.

@normanmaurer

Copy link
Copy Markdown
Member Author

And also @trustin ^^

@franz1981

Copy link
Copy Markdown
Contributor

It's 4.2 so I would:

  • left the new behavior opt in
  • then just force the new behavior leaving an option for the old
  • eventually fully remove the old

This in 3-4 ticks releases

@WhiteTrashLord

Copy link
Copy Markdown

If you google for "Failed to set channel option" netty you can find a few results like https://groups.google.com/g/vertx/c/pR89FCWeYJA

I would do it like this:
Make this the new standard behavior in 4.2 since this version is just a few months old.
Adding a system property for 4.1

@chrisvest

Copy link
Copy Markdown
Member

@chrisvest @yawkat @franz1981 @bryce-anderson do you think we should better add a property to fallback to the old behavior so people can set it when they run into issues? I still think the default should be changed to what we have here now tho.

I agree with this. Add a property to revert to 4.1 behavior. I don't think we need to stagger the change of the default, because no one will pay attention to the changed default until it impacts them, anyway. The few teams who do pay attention will notice our call-out in the release notes.

@normanmaurer

Copy link
Copy Markdown
Member Author

Added a property to restore old behavior but close on failure by default

@chrisvest chrisvest left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@normanmaurer normanmaurer merged commit 4643540 into 4.2 Dec 3, 2025
47 of 52 checks passed
@normanmaurer normanmaurer deleted the close_on_failure branch December 3, 2025 19:41
@normanmaurer normanmaurer added needs-cherry-pick-4.1 This PR should be cherry-picked to 4.1 once merged. needs-cherry-pick-5.0 This PR should be cherry-picked to 5.0 once merged. labels Dec 3, 2025
normanmaurer added a commit that referenced this pull request Dec 3, 2025
#15896)

…an error

Motivation:

We should close the Channel and fail the future of the bootstrap if
during setting a ChannelOption we observe an error. At the moment we
just log which might make things hard to debug and leave the Channel in
an unexpected state.

People can go back to the old behavior by using `-Dio.netty.bootstrap.closeOnSetOptionFailure=false`.

Modifications:

- Adjust code to also close and faile the future
- Add testcases

Result:

Related to
#15860 (comment)
normanmaurer added a commit that referenced this pull request Dec 4, 2025
#15970)

…… (#15896)

…an error

Motivation:

We should close the Channel and fail the future of the bootstrap if
during setting a ChannelOption we observe an error. At the moment we
just log which might make things hard to debug and leave the Channel in
an unexpected state.

People can go back to the old behavior by using
`-Dio.netty.bootstrap.closeOnSetOptionFailure=false`.

Modifications:

- Adjust code to also close and faile the future
- Add testcases

Result:

Related to
#15860 (comment)
normanmaurer added a commit that referenced this pull request Dec 4, 2025
#15896)

…an error

Motivation:

We should close the Channel and fail the future of the bootstrap if
during setting a ChannelOption we observe an error. At the moment we
just log which might make things hard to debug and leave the Channel in
an unexpected state.

People can go back to the old behavior by using `-Dio.netty.bootstrap.closeOnSetOptionFailure=false`.

Modifications:

- Adjust code to also close and faile the future
- Add testcases

Result:

Related to
#15860 (comment)
normanmaurer added a commit that referenced this pull request Dec 4, 2025
#15972)

…… (#15896)

…an error

Motivation:

We should close the Channel and fail the future of the bootstrap if
during setting a ChannelOption we observe an error. At the moment we
just log which might make things hard to debug and leave the Channel in
an unexpected state.

People can go back to the old behavior by using
`-Dio.netty.bootstrap.closeOnSetOptionFailure=false`.

Modifications:

- Adjust code to also close and faile the future
- Add testcases

Result:

Related to
#15860 (comment)
minwoox pushed a commit to line/armeria that referenced this pull request May 8, 2026
 Motivation:

  `ChannelUtil.applyDefaultChannelOptions()` overflows when `idleTimeoutMillis` is near `Long.MAX_VALUE`. The expression `idleTimeoutMillis + TCP_USER_TIMEOUT_BUFFER_MILLIS` (where the buffer is 5,000)
   wraps the `long` type to a large negative number, and `Ints.saturatedCast()` then clamps it to `Integer.MIN_VALUE` (-2,147,483,648).

  Linux rejects this negative value for `TCP_USER_TIMEOUT` via `setsockopt()` with `EINVAL`. Starting with **Netty 4.1.131+**, channel option values are validated at set time and throw exceptions
  instead of being silently logged internally (see netty/netty#15896). This means the overflow now causes a hard failure rather than a silent misconfiguration.

  Modifications:

  - Use `LongMath.saturatedAdd` to prevent the `long` overflow before passing the result to `Ints.saturatedCast`:
    ```java
    final int tcpUserTimeoutMillis =
            Ints.saturatedCast(LongMath.saturatedAdd(idleTimeoutMillis,
                                                     TCP_USER_TIMEOUT_BUFFER_MILLIS));
  - Add a regression test that passes Long.MAX_VALUE as idleTimeoutMillis and asserts the result is Integer.MAX_VALUE.

  Result:

  - Long.MAX_VALUE idle timeout no longer overflows — produces Integer.MAX_VALUE (~24.8 day timeout) instead of Integer.MIN_VALUE (rejected by Linux).
  - For normal values (e.g., 2000), Math.min is a no-op and behavior is unchanged.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs-cherry-pick-4.1 This PR should be cherry-picked to 4.1 once merged. needs-cherry-pick-5.0 This PR should be cherry-picked to 5.0 once merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants