Skip to content

Improve Direct Buffer accesses via VarHandle#15504

Merged
chrisvest merged 1 commit into
netty:4.2from
franz1981:4.2_varhandle_improving_buffers
Aug 7, 2025
Merged

Improve Direct Buffer accesses via VarHandle#15504
chrisvest merged 1 commit into
netty:4.2from
franz1981:4.2_varhandle_improving_buffers

Conversation

@franz1981

Copy link
Copy Markdown
Contributor

Motivation:

With Unsafe gone the existing Direct's ByteBuf have too much overhead compared to the unsafe versions

Modifications:

Removed redundant bound checks and byte swapping conversions by using VarHandle

Result:

Faster safe Direct ByteBuffers (Fixes #15503)

@franz1981

Copy link
Copy Markdown
Contributor Author

This is blocked by #15486

@franz1981

Copy link
Copy Markdown
Contributor Author

@normanmaurer @chrisvest I see that adaptive is using mostly unpooled direct ByteBuf, whilst the "old" allocator PooledDirectByteBuf - which means that I should "fix" that one too - let me know wdyt?

@franz1981

franz1981 commented Jul 18, 2025

Copy link
Copy Markdown
Contributor Author

The improvements, so far (at b2824591d322770f81f633fff11eb65142de4628) are:

                                                     BEFORE         |       AFTER
Benchmark                                      Score   Error  Units | Score   Error  Units 
ByteBufAccessBenchmark.getBytes                7.020 ± 0.136  ns/op | 4.573 ± 0.135  ns/op
ByteBufAccessBenchmark.getBytesConstantOffset  5.856 ± 0.296  ns/op | 4.853 ± 0.029  ns/op
ByteBufAccessBenchmark.readBytes               6.708 ± 0.030  ns/op | 5.694 ± 0.097  ns/op
ByteBufAccessBenchmark.setBytes                9.082 ± 0.097  ns/op | 6.149 ± 0.076  ns/op
ByteBufAccessBenchmark.setBytesConstantOffset  7.143 ± 0.069  ns/op | 5.195 ± 0.340  ns/op

whilst the batchy numbers are still far behind Unsafe (reported in the related issue) because we now have an unrolling of 4 instead of 8.

Comment thread buffer/src/main/java/io/netty/buffer/UnpooledDirectByteBuf.java Dismissed
@franz1981

franz1981 commented Jul 18, 2025

Copy link
Copy Markdown
Contributor Author

@chrisvest
The funny thing is: if we have cleaners which wraps MemorySegment - and under the hood NIO Direct buffers uses such - maybe we could just do the same and re-use the one in the clearner, slicing it and using it instead of the NIO buffer.

I have to perform few benchmarks vs MemorySegment but I can see already that some of the benchmarks I did requires reading buffer -> segment + limit + capacity

@franz1981

franz1981 commented Jul 18, 2025

Copy link
Copy Markdown
Contributor Author

@normanmaurer @chrisvest for this, do I need to care about the jemalloc pooled buffers as well?

Fyi I have tried before/after just with JDK 24 but it could be worthy to try with 11/17 as well - which meant doesn't have that good varHandle optimizations.
Right now this improve the adaptive case only and in theory I can push this beyond what it does by having a segment based buffer too

@chrisvest

Copy link
Copy Markdown
Member

do I need to care about the jemalloc pooled buffers as well?

I think it's fine to at least do them one at a time, in separate PRs. The pooled allocator will be used for cases where the adaptive allocator turns out to be a poor fit, and as an intermediate step in upgrading systems from 4.1 to 4.2.

@normanmaurer

Copy link
Copy Markdown
Member

@franz1981 please rebase

Motivation:

With Unsafe gone the existing Direct's ByteBuf have too much overhead compared to the unsafe versions

Modifications:

Removed redundant bound checks and byte swapping conversions by using VarHandle

Result:

Faster safe Direct ByteBuffers (Fixes netty#15503)
@franz1981 franz1981 force-pushed the 4.2_varhandle_improving_buffers branch from b282459 to 533cd3b Compare July 21, 2025 00:53
@franz1981 franz1981 marked this pull request as ready for review July 21, 2025 00:53
@franz1981

Copy link
Copy Markdown
Contributor Author

Done ptal @normanmaurer

@franz1981

Copy link
Copy Markdown
Contributor Author

If this look good @chrisvest we can go for it?

@chrisvest chrisvest merged commit 15b8bbf into netty:4.2 Aug 7, 2025
18 checks passed
@chrisvest

Copy link
Copy Markdown
Member

Thanks, Franz!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Safe Direct Buffer accesses are not as fast as Unsafe ones

4 participants