Skip to content

Conversation

@pan3793
Copy link
Member

@pan3793 pan3793 commented Dec 12, 2025

What changes were proposed in this pull request?

Add LZ4TPCDSDataBenchmark, test LZ4CompressionCodec against TPCDS catalog_sales.dat (SF1), the size is about 283M.

Why are the changes needed?

Add a benchmark to measure the perf impact of lz4 security upgrading.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added benchmark result. Since the change has a refactor that touched ZSTDTPCDSDataBenchmark, also updated its benchmark results.

Was this patch authored or co-authored using generative AI tooling?

No.

@pan3793
Copy link
Member Author

pan3793 commented Dec 12, 2025

TL;DR - my test results show lz4-java 1.10.1 is about 10~15% slower on lz4 compression than 1.8.0, and is about 5% slower on lz4 decompression even with migrating to suggested safeDecompressor (#53454)

Here is my local test result:

lz4-java 1.8.0 with fastDecompressor (state before aa65bda)

[info] Running benchmark: Benchmark LZ4CompressionCodec
[info]   Running case: Compression 4 times
[info]   Stopped after 2 iterations, 4143 ms
[info] OpenJDK 64-Bit Server VM 17.0.15+6-LTS on Linux 6.17.9-76061709-generic
[info] Intel(R) Core(TM) i5-9500 CPU @ 3.00GHz
[info] Benchmark LZ4CompressionCodec:            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] Compression 4 times                                2064           2072          11          0.0   515881622.3       1.0X
[info] Running benchmark: Benchmark LZ4CompressionCodec
[info]   Running case: Decompression 4 times
[info]   Stopped after 3 iterations, 2659 ms
[info] OpenJDK 64-Bit Server VM 17.0.15+6-LTS on Linux 6.17.9-76061709-generic
[info] Intel(R) Core(TM) i5-9500 CPU @ 3.00GHz
[info] Benchmark LZ4CompressionCodec:            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] Decompression 4 times                               843            886          45          0.0   210840272.0       1.0X

lz4-java 1.10.1 with fastDecompressor (current master state, significant decompress perf drop!!!)

[info] Running benchmark: Benchmark LZ4CompressionCodec
[info]   Running case: Compression 4 times
[info]   Stopped after 2 iterations, 4740 ms
[info] OpenJDK 64-Bit Server VM 17.0.15+6-LTS on Linux 6.17.9-76061709-generic
[info] Intel(R) Core(TM) i5-9500 CPU @ 3.00GHz
[info] Benchmark LZ4CompressionCodec:            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] Compression 4 times                                2351           2370          27          0.0   587850958.3       1.0X
[info] Running benchmark: Benchmark LZ4CompressionCodec
[info]   Running case: Decompression 4 times
[info]   Stopped after 2 iterations, 4147 ms
[info] OpenJDK 64-Bit Server VM 17.0.15+6-LTS on Linux 6.17.9-76061709-generic
[info] Intel(R) Core(TM) i5-9500 CPU @ 3.00GHz
[info] Benchmark LZ4CompressionCodec:            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] Decompression 4 times                              2070           2074           5          0.0   517477273.3       1.0X

lz4-java 1.10.1 with safeDecompressor (#53454)

[info] Running benchmark: Benchmark LZ4CompressionCodec
[info]   Running case: Compression 4 times
[info]   Stopped after 2 iterations, 4729 ms
[info] OpenJDK 64-Bit Server VM 17.0.15+6-LTS on Linux 6.17.9-76061709-generic
[info] Intel(R) Core(TM) i5-9500 CPU @ 3.00GHz
[info] Benchmark LZ4CompressionCodec:            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] Compression 4 times                                2360           2365           7          0.0   589977281.7       1.0X
[info] Running benchmark: Benchmark LZ4CompressionCodec
[info]   Running case: Decompression 4 times
[info]   Stopped after 3 iterations, 2742 ms
[info] OpenJDK 64-Bit Server VM 17.0.15+6-LTS on Linux 6.17.9-76061709-generic
[info] Intel(R) Core(TM) i5-9500 CPU @ 3.00GHz
[info] Benchmark LZ4CompressionCodec:            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] Decompression 4 times                               886            914          43          0.0   221537120.3       1.0X

@pan3793 pan3793 marked this pull request as ready for review December 15, 2025 02:43
@pan3793
Copy link
Member Author

pan3793 commented Dec 15, 2025

@LuciferYang, I did a refactor to make LZ4TPCDSDataBenchmark and ZStandardTPCDSDataBenchmark share the common code, also added the benchmard result generated by CI, could you please take another look?

/**
* Any code before running any benchmark, e.g., data preparation
*/
def beforeAll(): Unit = {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestion is not to add the beforeAll() method in the current pr If we find it to be widely applicable, we can add this method in a subsequent pr and also extract beforeAll() for more microbenchmark scenarios.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed in f02e559

@LuciferYang
Copy link
Contributor

Merged into master. Thanks @pan3793

@dongjoon-hyun
Copy link
Member

Although I didn't get a chance to take a look at the detail, thank you so much for adding additional benchmark, @pan3793 and @LuciferYang !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants