205 random access data performance clean v2#605

Merged
anthony-swirldslabs merged 4 commits into hashgraph:main from
kaldun-tech:205-RandomAccessData-performance-clean-v2
Sep 17, 2025

Conversation

@kaldun-tech
Contributor

@kaldun-tech kaldun-tech commented Sep 11, 2025

Description:

Adds benchmarks and corrects performance issues in DirectBufferedData.contains()

Related issue(s):

Fixes #205

Notes for reviewer:
Added a new performance benchmark. To run it: ./gradlew :pbj-runtime:jmh -Pjmh.include=".*DirectBufferedData.contains."

Addressed the small array regression by increasing BULK_COMPARISON_THRESHOLD from 8 to 32 bytes based on comprehensive performance testing.

09-16-25 Results of RandomAccessDataContainsBench before the code changes to DirectBufferedData (see branch 205-RandomAccessData-performance-benchmark):
image

09-16-25 Updated results of benchmarks with code changes:
image

The new threshold test is run from pbj-core/pbj-runtime: java -cp "build/classes/java/main:build/classes/java/test" com.hedera.pbj.runtime.io.buffer.DirectBufferedDataThresholdTest
It estimates the best threshold for DirectBufferedData.contains() and is much faster than running the full JMH benchmarks for threshold evaluation. Its results indicate that the previous 8-byte threshold performed inconsistently, and that a 32-byte threshold gives the best overall balance for typical usage patterns. Individual runs may show anomalies caused by caching behavior or memory allocation.
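
For readers who want to see the idea behind such a quick threshold check, here is a minimal sketch. It is not the DirectBufferedDataThresholdTest from this PR: the class name, the method names, and the use of ByteBuffer.mismatch (Java 11+) as the bulk path are all assumptions made for illustration. It times a byte-by-byte comparison against a bulk comparison for the pattern sizes benchmarked here.

```java
import java.nio.ByteBuffer;
import java.util.function.BooleanSupplier;

/** Hypothetical timing harness; NOT the DirectBufferedDataThresholdTest from this PR. */
final class ThresholdTimingSketch {

    /** Compares the start of data with pattern one byte at a time (no allocation). */
    static boolean byteByByte(ByteBuffer data, byte[] pattern) {
        for (int i = 0; i < pattern.length; i++) {
            if (data.get(i) != pattern[i]) {
                return false;
            }
        }
        return true;
    }

    /** Compares the same range in one bulk call; allocates two small buffer views. */
    static boolean bulk(ByteBuffer data, byte[] pattern) {
        ByteBuffer window = data.duplicate().position(0).limit(pattern.length);
        return window.mismatch(ByteBuffer.wrap(pattern)) == -1;
    }

    /** Average nanoseconds per call over the given number of iterations. */
    static double averageNanos(BooleanSupplier op, int iterations) {
        int hits = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            if (op.getAsBoolean()) {
                hits++;
            }
        }
        long elapsed = System.nanoTime() - start;
        if (hits != iterations) {
            throw new IllegalStateException("comparison unexpectedly failed");
        }
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        ByteBuffer data = ByteBuffer.allocateDirect(256);
        for (int i = 0; i < 256; i++) {
            data.put(i, (byte) i);
        }
        for (int size : new int[] {4, 8, 16, 32, 64, 128, 256}) {
            byte[] pattern = new byte[size];
            for (int i = 0; i < size; i++) {
                pattern[i] = (byte) i;
            }
            // Warm up both paths so the JIT has likely compiled them before timing.
            averageNanos(() -> byteByByte(data, pattern), 100_000);
            averageNanos(() -> bulk(data, pattern), 100_000);
            double loopNs = averageNanos(() -> byteByByte(data, pattern), 1_000_000);
            double bulkNs = averageNanos(() -> bulk(data, pattern), 1_000_000);
            System.out.printf("%3d bytes: loop %.1f ns, bulk %.1f ns%n", size, loopNs, bulkNs);
        }
    }
}
```

Numbers from a loop like this are rough and vary between runs and machines, so they are only useful for narrowing down candidate thresholds; the JMH benchmarks remain the better reference for final figures.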

Key Improvements:

  • Small arrays (≤32 bytes): Now use byte-by-byte comparison (no allocation overhead)
  • Large arrays (64+ bytes): Maintain bulk optimization benefits (3.28x speedup for 256 bytes)
  • Eliminates allocation overhead for the problematic 8-16 byte range noted in code review
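
The dispatch described in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PBJ source: the class and method names are hypothetical, it works on a plain java.nio.ByteBuffer, and ByteBuffer.mismatch (Java 11+) stands in for the UnsafeUtils-based bulk helper. Only the constant name BULK_COMPARISON_THRESHOLD and its value of 32 come from this PR.

```java
import java.nio.ByteBuffer;

/** Illustrative only: names and structure are assumptions, not the PBJ implementation. */
final class ThresholdContainsSketch {

    /** Mirrors the 32-byte threshold described in this PR. */
    static final int BULK_COMPARISON_THRESHOLD = 32;

    /** Returns true if the bytes of buffer starting at offset equal pattern. */
    static boolean matchesAt(ByteBuffer buffer, int offset, byte[] pattern) {
        // Reject ranges that would run past the readable part of the buffer.
        if (offset < 0 || pattern.length > buffer.limit() - offset) {
            return false;
        }
        if (pattern.length <= BULK_COMPARISON_THRESHOLD) {
            // Small pattern: absolute reads in a loop, no temporary objects.
            for (int i = 0; i < pattern.length; i++) {
                if (buffer.get(offset + i) != pattern[i]) {
                    return false;
                }
            }
            return true;
        }
        // Large pattern: one bulk comparison. The two buffer views allocated here
        // are the kind of per-call overhead the small path avoids.
        ByteBuffer window = buffer.duplicate().position(offset).limit(offset + pattern.length);
        return window.mismatch(ByteBuffer.wrap(pattern)) == -1;
    }
}
```

Both branches return the same answer for any input; the threshold only selects which one runs, which is why it can be tuned from benchmark data without affecting correctness.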

Threshold Analysis:

The 32-byte threshold provides optimal balance. The threshold analysis table shows performance at different thresholds, demonstrating why 32 bytes was chosen over the original 8 bytes.

  • 4-32 byte patterns: Fast byte-by-byte comparison
  • 64+ byte patterns: Efficient bulk operations
  • Addresses GC concerns by avoiding frequent small allocations
image

Checklist

  • Documented (Code comments, README, etc.)
  • Tested (unit, integration, etc.)

@kaldun-tech kaldun-tech requested review from a team as code owners September 11, 2025 22:00
@kaldun-tech
Contributor Author

kaldun-tech commented Sep 11, 2025

The DCO seems to still be having issues. I might need a walkthrough of how to do this correctly. EDIT: Got it. Had to dig into the docs further

@kaldun-tech kaldun-tech force-pushed the 205-RandomAccessData-performance-clean-v2 branch from 997338d to 4440f8b Compare September 12, 2025 01:59
Contributor

@anthony-swirldslabs anthony-swirldslabs left a comment

Thanks for the contribution! Overall, the fix looks good to me. But I posted a few comments. Please have a look.

kaldun-tech added a commit to kaldun-tech/hedera-pbj that referenced this pull request Sep 16, 2025
…nsive benchmarks

  Addresses code review feedback on PR hashgraph#605 by improving threshold analysis and resolving small array performance regression.

  Changes:
  - Increase BULK_COMPARISON_THRESHOLD from 8 to 32 bytes based on performance testing
  - Add comprehensive benchmarks for pattern sizes 4, 8, 16, 32, 64, 128, 256 bytes
  - Add original implementation benchmarks for baseline comparison
  - Create DirectBufferedDataThresholdTest utility for quick threshold analysis
  - Update both contains() method variants to use consistent threshold

  Performance results with 32-byte threshold:
  - Small arrays (≤32 bytes): Use byte-by-byte comparison (no allocation overhead)
  - Large arrays (64+ bytes): Use bulk optimization (up to 5.6x speedup for 256 bytes)
  - Eliminates small array performance regression noted in code review
@kaldun-tech kaldun-tech force-pushed the 205-RandomAccessData-performance-clean-v2 branch from 4440f8b to e000088 Compare September 16, 2025 21:03
@kaldun-tech
Contributor Author

@anthony-swirldslabs I believe your concerns are now correctly addressed

Contributor

@anthony-swirldslabs anthony-swirldslabs left a comment

@kaldun-tech :

@anthony-swirldslabs I believe your concerns are now correctly addressed

Great work. Thank you! The code changes look good to me. Could you please fix the DCO check one last time so that we can merge the PR?

kaldun-tech added a commit to kaldun-tech/hedera-pbj that referenced this pull request Sep 16, 2025
Signed-off-by: kaldun-tech <tsmereka@protonmail.com>
@kaldun-tech kaldun-tech force-pushed the 205-RandomAccessData-performance-clean-v2 branch from e000088 to e268248 Compare September 16, 2025 23:44
kaldun-tech added a commit to kaldun-tech/hedera-pbj that referenced this pull request Sep 17, 2025
Signed-off-by: kaldun-tech <tsmereka@protonmail.com>
@kaldun-tech kaldun-tech force-pushed the 205-RandomAccessData-performance-clean-v2 branch from e268248 to 47484ac Compare September 17, 2025 00:44
@kaldun-tech
Contributor Author

DCO should be fixed now

Contributor

@anthony-swirldslabs anthony-swirldslabs left a comment

LGTM. Thank you!

kaldun-tech added a commit to kaldun-tech/hedera-pbj that referenced this pull request Sep 17, 2025
Signed-off-by: kaldun-tech <tsmereka@protonmail.com>
@kaldun-tech kaldun-tech force-pushed the 205-RandomAccessData-performance-clean-v2 branch from 613c682 to 40d0459 Compare September 17, 2025 22:03
anthony-swirldslabs pushed a commit to kaldun-tech/hedera-pbj that referenced this pull request Sep 17, 2025
Signed-off-by: kaldun-tech <tsmereka@protonmail.com>
Signed-off-by: Anthony Petrov <anthony@swirldslabs.com>
@anthony-swirldslabs anthony-swirldslabs force-pushed the 205-RandomAccessData-performance-clean-v2 branch from 40d0459 to 0b93fdc Compare September 17, 2025 22:14
Signed-off-by: kaldun-tech <tsmereka@protonmail.com>
Signed-off-by: Anthony Petrov <anthony@swirldslabs.com>
  Refactored common code into bulkContains helper method for efficient
  memory operations on larger data sets (>8 bytes). Uses UnsafeUtils
  bulk operations instead of byte-by-byte comparison.

Signed-off-by: kaldun-tech <tsmereka@protonmail.com>
Signed-off-by: Anthony Petrov <anthony@swirldslabs.com>
…nsive benchmarks

  Addresses code review feedback on PR hashgraph#605 by improving threshold analysis and resolving small array performance regression.

  Changes:
  - Increase BULK_COMPARISON_THRESHOLD from 8 to 32 bytes based on performance testing
  - Add comprehensive benchmarks for pattern sizes 4, 8, 16, 32, 64, 128, 256 bytes
  - Add original implementation benchmarks for baseline comparison
  - Create DirectBufferedDataThresholdTest utility for quick threshold analysis
  - Update both contains() method variants to use consistent threshold

  Performance results with 32-byte threshold:
  - Small arrays (≤32 bytes): Use byte-by-byte comparison (no allocation overhead)
  - Large arrays (64+ bytes): Use bulk optimization (up to 5.6x speedup for 256 bytes)
  - Eliminates small array performance regression noted in code review

Signed-off-by: kaldun-tech <tsmereka@protonmail.com>
Signed-off-by: Anthony Petrov <anthony@swirldslabs.com>
  Fixes Windows CRLF line endings that were inadvertently introduced
  in recent performance optimization commits.

Signed-off-by: kaldun-tech <tsmereka@protonmail.com>
Signed-off-by: Anthony Petrov <anthony@swirldslabs.com>
@anthony-swirldslabs anthony-swirldslabs force-pushed the 205-RandomAccessData-performance-clean-v2 branch from 0b93fdc to 314a5cf Compare September 17, 2025 22:22
@anthony-swirldslabs anthony-swirldslabs merged commit f3f3eb3 into hashgraph:main Sep 17, 2025
10 checks passed

Development

Successfully merging this pull request may close these issues.

Improve performance: RandomAccessData.contains/matchesPrefix

2 participants