Skip to content

ZSTD Support for Streaming Compression #3798

Draft
sarthakaggarwal97 wants to merge 37 commits into
valkey-io:unstablefrom
sarthakaggarwal97:streaming-compression-rio-pr-zstd-v1.5.7
Draft

ZSTD Support for Streaming Compression #3798
sarthakaggarwal97 wants to merge 37 commits into
valkey-io:unstablefrom
sarthakaggarwal97:streaming-compression-rio-pr-zstd-v1.5.7

Conversation

@sarthakaggarwal97

@sarthakaggarwal97 sarthakaggarwal97 commented May 20, 2026

Copy link
Copy Markdown
Contributor

Built on top #3531.

Zstd is great for compression ratio and would be useful for both replication compression and rdb compression. The save time worse as compared to LZ4 because it tries to compress much harder. The load time is faster because since it compressed harder, it has much lesser data to load from the disk.

Zstd is specially useful for replication compression across regions where network costs are quite significant as compared to compute. We want to save as much data as possible while transferring data over the regions.

zstd157_v4_on zstd157_v4_off zstd157_blockmesh_bars

Add streaming LZ4-backed RDB compression with rio decorators, stream envelope handling, integration changes, and the follow-up fixes and config cleanup needed on top of unstable.

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
- Remove dead code: rdbIsValidMagic() and unused #include <string.h> in rdb.h
- Remove redundant first RIO_FLAG_SKIP_RDB_CHECKSUM set in rdbSaveInternal
- Remove unrelated changes: config_parse_depth, USE_FAST_FLOAT, write-make-settings
- Validate full 8-byte VKCS envelope in aof.c rdbFileUsesStreamingCompression
- Add SAFETY comment for rdbRioHasCorruptCompressedInput cast invariant
- Rename all snake_case identifiers to camelCase per Valkey conventions:
  types (compression_algo_t -> compressionAlgo, stream_compressor_t ->
  streamCompressor, compress_rio_t -> compressRio, etc.), functions
  (stream_writer_create -> streamWriterCreate, compress_rio_finish ->
  compressRioFinish, write_vkcs_envelope -> writeVkcsEnvelope, etc.),
  and static variables (compression_lz4_codec_impl -> compressionLz4CodecImpl)
- Drop _t suffix from all types to match Valkey convention
- Fix typo: streamWriterIsErrord -> streamWriterIsErrored
- Replace silent dummy buffer allocation with assert(needed > 0) in
  streamWriterEnsureOutBuf

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
* Address streaming RDB compression review

* Skip RDB CRC for streaming compression

* Remove brittle 32-bit compression unit test

---------

Co-authored-by: Sarthak Aggarwal <sarthagg@amazon.com>

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
---------

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
- rdbSaveInternal: the old comment claimed per-string LZF was disabled
  'when algo != LZF', but the actual gate is RIO_FLAG_STREAMING_COMPRESSION
  on the wrapper rio. Standalone rios (DUMP, AOF rewrite, diskless) keep
  using LZF regardless of algo.
- rdbInputStreamPrepare: flag the synchronous probe IO so a future
  non-blocking caller (replication) doesn't accidentally block the loop.
- rdbRioHasCorruptCompressedInput: the cast is sound today only because
  one producer sets RIO_FLAG_STREAMING_DECOMPRESSION. Replace the
  'SAFETY' assertion with a note telling the next person to add a type
  discriminator before adding a second producer.
- rdbSaveRawString: minor wording cleanup on the per-string LZF gate.

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
Match the surrounding Valkey style: drop comments that restate the
code, drop the Ownership/Threading/Returns boilerplate from headers,
collapse repeated 'capacity-shortage is retriable' notes into one
explanation per function. Behavior is unchanged.

Net -285 lines; integration tests (21/21) and build are clean.

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
@coderabbitai

coderabbitai Bot commented May 20, 2026

Copy link
Copy Markdown

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 34811f4e-03a7-4d41-9810-9d8bd469c521

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@sarthakaggarwal97 sarthakaggarwal97 force-pushed the streaming-compression-rio-pr-zstd-v1.5.7 branch 3 times, most recently from 6054859 to 215df50 Compare May 21, 2026 00:08
@codecov

codecov Bot commented May 21, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 92.90414% with 151 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.18%. Comparing base (bee30d4) to head (23c1bbe).
⚠️ Report is 24 commits behind head on unstable.

Files with missing lines Patch % Lines
src/compression_stream.c 89.97% 37 Missing ⚠️
src/compression_rio.c 73.87% 29 Missing ⚠️
src/valkey-check-rdb.c 58.73% 26 Missing ⚠️
src/rio.c 67.18% 21 Missing ⚠️
src/rdb.c 85.56% 14 Missing ⚠️
src/compression.c 86.20% 8 Missing ⚠️
src/aof.c 70.00% 6 Missing ⚠️
src/compression_zstd.c 93.33% 5 Missing ⚠️
src/unit/test_compression.cpp 99.74% 3 Missing ⚠️
src/compression_lz4.c 97.40% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #3798      +/-   ##
============================================
+ Coverage     76.82%   77.18%   +0.35%     
============================================
  Files           162      168       +6     
  Lines         81021    83083    +2062     
============================================
+ Hits          62248    64128    +1880     
- Misses        18773    18955     +182     
Files with missing lines Coverage Δ
src/config.c 78.93% <ø> (ø)
src/rdb.h 100.00% <ø> (ø)
src/rio.h 100.00% <ø> (ø)
src/server.h 100.00% <ø> (ø)
src/compression_lz4.c 97.40% <97.40%> (ø)
src/unit/test_compression.cpp 99.74% <99.74%> (ø)
src/compression_zstd.c 93.33% <93.33%> (ø)
src/aof.c 80.17% <70.00%> (-0.14%) ⬇️
src/compression.c 86.20% <86.20%> (ø)
src/rdb.c 77.22% <85.56%> (+0.14%) ⬆️
... and 4 more

... and 21 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

roshkhatri added a commit to roshkhatri/valkey that referenced this pull request May 28, 2026
Adds replication wire compression on top of valkey-io#3531 with lz4 as the first
supported codec for the incremental replication stream. The replication
stream from primary to replica is wrapped in a VKCS envelope (using
STREAM_KIND_REPL) and compressed as a single long-lived frame at the
per-replica buffer layer. Default behavior is unchanged with
'replcompression no'; existing replicas without the new capability stay
uncompressed.

Negotiation is per-replica via the existing PSYNC handshake; a new
REPLICA_CAPA_COMPRESSION capability lets each side opt in independently.
Compression runs inline on the IO thread that owns the replica's write
job; no dedicated compression thread, no IPC, no reordering. Optional
sticky thread affinity (lazy ownership + event-driven rebalance) keeps
the long-lived LZ4 frame state on a single IO thread for cache locality.

Configs:
  replcompression                   bool, default no
  repl-compression-thread-affinity  bool, default yes

Internal constants:
  REPLICA_CAPA_COMPRESSION         (1 << 4)
  REPL_COMPRESSION_ALGO            ALGO_LZ4
  REPL_COMPRESSION_LEVEL           0  (LZ4 fast mode)
  REPL_COMPRESSION_BATCH_LIMIT     1 MB raw input per dispatch
  REPL_STREAM_DECODER_OUTPUT_MAX   256 MB

INFO replication per-replica fields:
  compression=lz4, compressed_bytes, uncompressed_bytes,
  compression_ratio, compression_errors, compression_cpu_usec,
  debug_compression_pending_drains, debug_thread_switches

INFO replication server-level (replica side):
  repl_decompression_errors, repl_decompression_cpu_usec,
  repl_decompressed_bytes_total, repl_apply_cpu_usec,
  repl_apply_batches

CI adds a test-replication-compression job that runs the
replication-tagged integration tests with replcompression=yes to
exercise compression across the broader replication test surface.

Tests: 18 streamReader push-mode unit tests + 3 replCompression unit
tests + 27 integration tests.

Performance (BlockMesh tweets, 3M keys x ~315 byte JSON values, 1,073
MB uncompressed per replica, 30 clients, pipeline 50, 2 cross-region
replicas):
  LZ4 level 0 (default): 0.48 ratio, 52% bandwidth saved, 2.5s
                         compression CPU per replica, <1% throughput
                         overhead vs uncompressed baseline.
Affinity ON vs OFF: throughput unchanged (118.6K vs 118.1K keys/s) but
thread switches drop from ~800K to ~30 per replica.

ZSTD support follows in valkey-io#3798.

Related to valkey-io#3531.

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
roshkhatri added a commit to roshkhatri/valkey that referenced this pull request May 28, 2026
Adds replication wire compression on top of valkey-io#3531 with lz4 as the first
supported codec for the incremental replication stream. The replication
stream from primary to replica is wrapped in a VKCS envelope (using
STREAM_KIND_REPL) and compressed as a single long-lived frame at the
per-replica buffer layer. Default behavior is unchanged with
'replcompression no'; existing replicas without the new capability stay
uncompressed.

Negotiation is per-replica via the existing PSYNC handshake; a new
REPLICA_CAPA_COMPRESSION capability lets each side opt in independently.
Compression runs inline on the IO thread that owns the replica's write
job; no dedicated compression thread, no IPC, no reordering. Optional
sticky thread affinity (lazy ownership + event-driven rebalance) keeps
the long-lived LZ4 frame state on a single IO thread for cache locality.

Configs:
  replcompression                   bool, default no
  repl-compression-thread-affinity  bool, default yes

Internal constants:
  REPLICA_CAPA_COMPRESSION         (1 << 4)
  REPL_COMPRESSION_ALGO            ALGO_LZ4
  REPL_COMPRESSION_LEVEL           0  (LZ4 fast mode)
  REPL_COMPRESSION_BATCH_LIMIT     1 MB raw input per dispatch
  REPL_STREAM_DECODER_OUTPUT_MAX   256 MB

INFO replication per-replica fields:
  compression=lz4, compressed_bytes, uncompressed_bytes,
  compression_ratio, compression_errors, compression_cpu_usec,
  debug_compression_pending_drains, debug_thread_switches

INFO replication server-level (replica side):
  repl_decompression_errors, repl_decompression_cpu_usec,
  repl_decompressed_bytes_total, repl_apply_cpu_usec,
  repl_apply_batches

CI adds a test-replication-compression job that runs the
replication-tagged integration tests with replcompression=yes to
exercise compression across the broader replication test surface.

Tests: 18 streamReader push-mode unit tests + 3 replCompression unit
tests + 27 integration tests.

Performance (BlockMesh tweets, 3M keys x ~315 byte JSON values, 1,073
MB uncompressed per replica, 30 clients, pipeline 50, 2 cross-region
replicas):
  LZ4 level 0 (default): 0.48 ratio, 52% bandwidth saved, 2.5s
                         compression CPU per replica, <1% throughput
                         overhead vs uncompressed baseline.
Affinity ON vs OFF: throughput unchanged (118.6K vs 118.1K keys/s) but
thread switches drop from ~800K to ~30 per replica.

ZSTD support follows in valkey-io#3798.

Related to valkey-io#3531.

Signed-off-by: Roshan Khatri <rvkhatri@amazon.com>
---------

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
Drop the clamp detail (an internal streamReaderCreate concern) and keep
only the caller-facing contract that buffer_size must be nonzero.

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
The loaderr path exits the process, so resetting reading_config_file
there is dead code. It was a leftover from the reverted config_parse_depth
change and is not present upstream.

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
Collapse the save-notice if/else into one serverLog that always names the
active algorithm (none/lzf/lz4), instead of branching the message.

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
Replace em dashes in the compression comments with ASCII punctuation and
add a short note above streamWriterCreate describing the writer/reader
push/pull model and the finish-before-free requirement.

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
- Simplify the streaming APIs: caller-owned structs with Init/Free and
  descriptive self-parameter names; codec used only for the algorithm
  vtable.
- Store the compression algorithm id directly in the VKCS envelope and
  drop the separate codec enum plus its translation functions.
- Make the rio decorator opaque: track connection-backed streams with a
  flag instead of a transport-type enum.
- Route LZ4 contexts through zmalloc and assert on caller errors; keep
  runtime errors for corrupt or external input.
- Document the feed functions' streaming contract, mark the retriable
  capacity-shortage returns, and drop a redundant probe re-zero.

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
valkey-check-rdb reported offsets via rioTell() on every read, and
rdbLoadProgressCallback computed the transport offset via rioTell() on
every read even when no progress event was due. Read processed_bytes
directly for the offset, and only compute the transport position when a
progress event is actually reported.

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
Replace the separate rdb-compression-algo config with a single
rdbcompression no|yes|lz4-stream. 'yes' keeps the legacy per-string LZF
path; 'lz4-stream' selects whole-file streaming LZ4 RDB saves.

Update valkey.conf and the RDB compression, check-rdb, and replication
AOF sync tests to use the new config value.

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
---------

Signed-off-by: Sarthak Aggarwal <sarthagg@amazon.com>
@sarthakaggarwal97 sarthakaggarwal97 force-pushed the streaming-compression-rio-pr-zstd-v1.5.7 branch from 215df50 to 5e484e7 Compare June 23, 2026 15:28
@sarthakaggarwal97 sarthakaggarwal97 requested a review from hpatro June 23, 2026 15:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants