Eliminate redundant bounds checks in CompositeByteBuf accessors#16525
Eliminate redundant bounds checks in CompositeByteBuf accessors#16525normanmaurer merged 2 commits intonetty:4.2from
Conversation
a621787 to
8847e11
Compare
Here we go:
|
|
@yawkat It's a tiny improvement, but for free, basically ^^ |
Motivation: Every _getXxx/_setXxx in CompositeByteBuf delegates to the underlying buffer's public API (e.g. c.buf.getByte()), which re-checks bounds that the composite level already validated. Additionally, sequential readByte() calls pay a binary search cost on every component boundary crossing, since findComponent0 only caches the last accessed Component reference without its array index. Modifications: - Add an `abuf` field to Component that caches `(AbstractByteBuf) buf` at construction time, or null for non-AbstractByteBuf wrappers - Use abuf._getXxx() (no bounds check) when available, fall back to buf.getXxx() otherwise - Add sequentialReadBytes and sequentialGetBytes JMH benchmarks Result: Redundant bounds checks eliminated on the hot path.
8847e11 to
b45c2a5
Compare
|
Here we go, another round of improvements - this time for multi-byte requires a bit more work so probably not worthy. |
|
|
||
| // weak cache - check it first when looking for component | ||
| private Component lastAccessed; | ||
| private int lastAccessedIndex; |
There was a problem hiding this comment.
JOL says this will fix into existing wasted space, so basically free ^^
| this.srcBuf = srcBuf; | ||
| this.srcAdjustment = srcOffset - offset; | ||
| this.buf = buf; | ||
| this.abuf = buf instanceof AbstractByteBuf ? (AbstractByteBuf) buf : null; |
There was a problem hiding this comment.
this is going to use the most optimized bound-check free version of each getXYZ access
|
@normanmaurer ready to go for me: I don't plan to put much more efforts here yet, as with minimal changes I see some measurable improvement, so, good ROI |
Motivation: Sequential readByte() calls inherited from AbstractByteBuf pay a binary search cost on every component boundary crossing. For composites with many small components (e.g. decompressor output), this is the dominant cost — async-profiler shows 43% of CPU in findIt(). Modifications: - Add lastAccessedIndex field to track the array position of the cached component - Override readByte() to use lastAccessed cache directly, skipping the findComponent0 indirection - Add findComponentForRead() that advances to the next non-empty component in O(1) instead of binary search, falling back to findIt() only on random jumps - Add direct buffer variants to the sequential benchmark Result: Sequential readByte() throughput improved by up to 61% for small-chunk composites (SMALL_CHUNKS 10240, JDK 25).
b45c2a5 to
84ceadc
Compare
|
PTAL @normanmaurer this should be enough for now. Mostly low hanging fruits ;) |
Motivation: Every _getXxx/_setXxx in CompositeByteBuf delegates to the underlying buffer's public API (e.g. c.buf.getByte()), which re-checks bounds that the composite level already validated. Modifications: - Add an `abuf` field to Component that caches `(AbstractByteBuf) buf` at construction time, or null for non-AbstractByteBuf wrappers - Use abuf._getXxx() (no bounds check) when available, fall back to buf.getXxx() otherwise - Add sequentialReadBytes and sequentialGetBytes JMH benchmarks Result: Redundant bounds checks eliminated on the hot path. (cherry picked from commit 6001499)
|
Auto-port PR for 5.0: #16534 |
…accessors (#16534) Auto-port of #16525 to 5.0 Cherry-picked commit: 6001499 --- Motivation: Every _getXxx/_setXxx in CompositeByteBuf delegates to the underlying buffer's public API (e.g. c.buf.getByte()), which re-checks bounds that the composite level already validated. Modifications: - Add an `abuf` field to Component that caches `(AbstractByteBuf) buf` at construction time, or null for non-AbstractByteBuf wrappers - Use abuf._getXxx() (no bounds check) when available, fall back to buf.getXxx() otherwise - Add sequentialReadBytes and sequentialGetBytes JMH benchmarks Result: Redundant bounds checks eliminated on the hot path. Co-authored-by: Francesco Nigro <nigro.fra@gmail.com>
Motivation:
Every _getXxx/_setXxx in CompositeByteBuf delegates to the underlying buffer's public API (e.g. c.buf.getByte()), which re-checks bounds that the composite level already validated.
Modifications:
abuffield to Component that caches(AbstractByteBuf) bufat construction time, or null for non-AbstractByteBuf wrappersResult:
Redundant bounds checks eliminated on the hot path.