Skip to content

Fix off-by-one boundary in lpEncodeBacklen() for 3 values#3601

Merged
enjoy-binbin merged 1 commit into
valkey-io:unstablefrom
fanpei91:unstable
May 1, 2026
Merged

Fix off-by-one boundary in lpEncodeBacklen() for 3 values#3601
enjoy-binbin merged 1 commit into
valkey-io:unstablefrom
fanpei91:unstable

Conversation

@fanpei91

@fanpei91 fanpei91 commented May 1, 2026

Copy link
Copy Markdown
Contributor

The function lpEncodeBacklen() uses <= 127 for the 1-byte case but < 16383, < 2097151, and < 268435455 for the subsequent cases. This means the exact values 16383, 2097151, and 268435455 (i.e. 2^14-1, 2^21-1, 2^28-1) unnecessarily use one extra byte than needed:

  • l < 1638316383 (2^14-1) uses 3 bytes instead of 2
  • l < 20971512097151 (2^21-1) uses 4 bytes instead of 3
  • l < 268435455268435455 (2^28-1) uses 5 bytes instead of 4

The decoding side (lpDecodeBacklen) is unaffected since it parses continuation bits continuously without discrete range checks.

This is a correctness issue and has no impact on data integrity since encoding and decoding use the same function boundaries, but it wastes up to 1 byte per affected entry.

The function lpEncodeBacklen() uses `<= 127` for the 1-byte case but
`< 16383`, `< 2097151`, and `< 268435455` for the subsequent cases.
This means the exact values 16383, 2097151, and 268435455 (i.e. 2^14-1,
2^21-1, 2^28-1) unnecessarily use one extra byte than needed.

For example, 16383 (0x3FFF) can be encoded in 2 bytes but is currently
encoded in 3 bytes due to the boundary check being off by one.

Signed-off-by: fanpei91 <fanpei91@gmail.com>

@enjoy-binbin enjoy-binbin left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks.

@enjoy-binbin enjoy-binbin merged commit 46d37e4 into valkey-io:unstable May 1, 2026
56 of 57 checks passed
@codecov

codecov Bot commented May 1, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.62%. Comparing base (cba0510) to head (38080c0).
⚠️ Report is 4 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #3601      +/-   ##
============================================
- Coverage     76.64%   76.62%   -0.03%     
============================================
  Files           160      160              
  Lines         80475    80475              
============================================
- Hits          61684    61660      -24     
- Misses        18791    18815      +24     
Files with missing lines Coverage Δ
src/listpack.c 93.12% <100.00%> (ø)

... and 26 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

lucasyonge pushed a commit that referenced this pull request May 11, 2026
The function lpEncodeBacklen() uses `<= 127` for the 1-byte case but `<
16383`, `< 2097151`, and `< 268435455` for the subsequent cases. This
means the exact values 16383, 2097151, and 268435455 (i.e. 2^14-1,
2^21-1, 2^28-1) unnecessarily use one extra byte than needed:

- `l < 16383` → `16383` (2^14-1) uses 3 bytes instead of 2
- `l < 2097151` → `2097151` (2^21-1) uses 4 bytes instead of 3
- `l < 268435455` → `268435455` (2^28-1) uses 5 bytes instead of 4

The decoding side (`lpDecodeBacklen`) is unaffected since it parses
continuation bits continuously without discrete range checks.

This is a correctness issue and has no impact on data integrity since
encoding and decoding use the same function boundaries, but it wastes up
to 1 byte per affected entry.

Signed-off-by: fanpei91 <fanpei91@gmail.com>
lucasyonge pushed a commit that referenced this pull request May 12, 2026
The function lpEncodeBacklen() uses `<= 127` for the 1-byte case but `<
16383`, `< 2097151`, and `< 268435455` for the subsequent cases. This
means the exact values 16383, 2097151, and 268435455 (i.e. 2^14-1,
2^21-1, 2^28-1) unnecessarily use one extra byte than needed:

- `l < 16383` → `16383` (2^14-1) uses 3 bytes instead of 2
- `l < 2097151` → `2097151` (2^21-1) uses 4 bytes instead of 3
- `l < 268435455` → `268435455` (2^28-1) uses 5 bytes instead of 4

The decoding side (`lpDecodeBacklen`) is unaffected since it parses
continuation bits continuously without discrete range checks.

This is a correctness issue and has no impact on data integrity since
encoding and decoding use the same function boundaries, but it wastes up
to 1 byte per affected entry.

Signed-off-by: fanpei91 <fanpei91@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants