erasure-code/isa: fix cache collision causing buffer overflow by tchaikov · Pull Request #66894 · ceph/ceph

tchaikov · 2026-01-13T01:07:11Z

This commit fixes a critical cache key collision bug in the ISA erasure code plugin that could lead to heap-buffer-overflow and silent data corruption.

Problem:

The decoding table cache was indexed only by matrix type and erasure signature (available/missing chunk pattern), but did NOT include the (k,m) erasure code configuration parameters. This caused different EC configurations with similar erasure patterns to collide in the cache, leading to incorrectly-sized cached buffers being reused.

AddressSanitizer Report:

==4904==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x5160001397b8 at pc 0x5de8e415296b bp 0x7ffc82260310 sp 0x7ffc8225fad0 READ of size 576 at 0x5160001397b8 thread T0
    #0 __asan_memcpy
    #1 ErasureCodeIsaTableCache::getDecodingTableFromCache()
       .../ErasureCodeIsaTableCache.cc:260:5
    #2 ErasureCodeIsaDefault::isa_decode()
       .../ErasureCodeIsa.cc:490:15

0x5160001397b8 is located 0 bytes after 568-byte region [0x516000139580,0x5160001397b8) allocated by:
    #0 posix_memalign
    #1 ceph::buffer::raw_combined::alloc_data_n_controlblock()
    #2 ErasureCodeIsaTableCache::putDecodingTableToCache()
       .../ErasureCodeIsaTableCache.cc:319:18

Root Cause:

Scenario illustrating the bug:

First decode operation: k=2, m=1, erasure pattern "+0+2-1"
- Creates cache entry with key "+0+2-1"
- Allocates buffer: 2*(1+2)*32 = 192 bytes
Second decode operation: k=3, m=3, same erasure pattern "+0+2-1"
- Looks up cache with key "+0+2-1" → COLLISION
- Retrieves 192-byte buffer but needs 3*(3+3)*32 = 576 bytes
- Result: Heap-buffer-overflow (reads 384 bytes beyond allocation)

Worse scenario (silent corruption):

First decode: k=3, m=3 → caches 576-byte table
Second decode: k=2, m=1 → retrieves wrong table
- Uses incorrect decoding matrix
- Result: Silent data corruption during recovery

Solution:

Include k and m parameters in cache signature

Old format: "+0+2+3-1-4"
New format: "k3m2a+0+2+3e-1-4"

Test Fix:

Also fixes a buffer overflow in TestErasureCodePlugins.cc where hashes_bl offset was calculated using chunk_size instead of sizeof(uint32_t), causing reads beyond the CRC buffer.

Production Impact:

Backward Compatibility: FULLY COMPATIBLE

Cache is ephemeral (in-memory only, not persisted)
Cache cleared on process restart
Rolling upgrades safe - each OSD restart gets fixed code
Old cache entries automatically invalidated on upgrade
No wire protocol or on-disk format changes
No configuration changes required
No breaking changes

Data Integrity: SIGNIFICANTLY IMPROVED

Eliminates silent data corruption risk
Eliminates heap-buffer-overflow crashes
Cache now correctly isolated by (k,m) configuration
Correct decoding tables always used for recovery
No risk of corrupting user data from the fix itself

Performance: IMPROVED

Fewer memory allocations (single buffer vs string concatenation)
No regression: Cache effectiveness maintained, just with correct keys
Negligible overhead: Signature generation tiny compared to EC operations

Why Users Haven't Complained:

Several factors likely prevented widespread reports:

Low probability conditions required:
- Need multiple EC pools with DIFFERENT (k,m) configurations
- Need similar erasure patterns across pools
- Need cache collision to occur during actual recovery operations
- Recovery operations are relatively rare in healthy clusters
Crash vs silent corruption detection:
- Buffer overflows (easier to detect) occur when k2,m2 > k1,m1
- Silent corruption (harder to detect) occurs when k2,m2 < k1,m1
- Crashes might be attributed to other causes
- Data corruption only detected during scrub or data verification
Common deployment patterns:
- Many deployments use single EC configuration cluster-wide
- Default EC configurations (k=2,m=1 or k=4,m=2) reduce collision space
- Erasure pattern variety may be insufficient for collisions
ISA plugin usage:
- Not universally deployed (requires Intel ISA-L library)
- Some sites use jerasure plugin instead
- Plugin selection depends on hardware and configuration
Detection difficulty:
- ASan not enabled in production builds
- Silent corruption only appears during:
  - Degraded reads with recovery
  - Scrub operations * Deep-scrub verification
- Corrupted data might not be immediately accessed

Fixes: https://tracker.ceph.com/issues/74382

Contribution Guidelines

To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

Tracker (select at least one)
- References tracker ticket
- Very recent bug; references commit where it was introduced
- New feature (ticket optional)
- Doc update (no ticket needed)
- Code cleanup (no ticket needed)
Component impact
- Affects Dashboard, opened tracker ticket
- Affects Orchestrator, opened tracker ticket
- No impact that needs to be tracked
Documentation (select at least one)
- Updates relevant documentation
- No doc update is appropriate
Tests (select at least one)
- Includes unit test(s)
- Includes integration test(s)
- Includes bug reproducer
- No tests

Show available Jenkins commands

jenkins test classic perf Jenkins Job | Jenkins Job Definition
jenkins test crimson perf Jenkins Job | Jenkins Job Definition
jenkins test signed Jenkins Job | Jenkins Job Definition
jenkins test make check Jenkins Job | Jenkins Job Definition
jenkins test make check arm64 Jenkins Job | Jenkins Job Definition
jenkins test submodules Jenkins Job | Jenkins Job Definition
jenkins test dashboard Jenkins Job | Jenkins Job Definition
jenkins test dashboard cephadm Jenkins Job | Jenkins Job Definition
jenkins test api Jenkins Job | Jenkins Job Definition
jenkins test docs ReadTheDocs | Github Workflow Definition
jenkins test ceph-volume all Jenkins Jobs | Jenkins Jobs Definition
jenkins test windows Jenkins Job | Jenkins Job Definition
jenkins test rook e2e Jenkins Job | Jenkins Job Definition

You must only issue one Jenkins command per-comment. Jenkins does not understand
comments with more than one command.

This commit fixes a critical cache key collision bug in the ISA erasure code plugin that could lead to heap-buffer-overflow and silent data corruption. Problem: -------- The decoding table cache was indexed only by matrix type and erasure signature (available/missing chunk pattern), but did NOT include the (k,m) erasure code configuration parameters. This caused different EC configurations with similar erasure patterns to collide in the cache, leading to incorrectly-sized cached buffers being reused. AddressSanitizer Report: ------------------------ ==4904==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x5160001397b8 at pc 0x5de8e415296b bp 0x7ffc82260310 sp 0x7ffc8225fad0 READ of size 576 at 0x5160001397b8 thread T0 #0 __asan_memcpy #1 ErasureCodeIsaTableCache::getDecodingTableFromCache() .../ErasureCodeIsaTableCache.cc:260:5 #2 ErasureCodeIsaDefault::isa_decode() .../ErasureCodeIsa.cc:490:15 0x5160001397b8 is located 0 bytes after 568-byte region [0x516000139580,0x5160001397b8) allocated by: #0 posix_memalign #1 ceph::buffer::raw_combined::alloc_data_n_controlblock() #2 ErasureCodeIsaTableCache::putDecodingTableToCache() .../ErasureCodeIsaTableCache.cc:319:18 Root Cause: ----------- Scenario illustrating the bug: 1. First decode operation: k=2, m=1, erasure pattern "+0+2-1" - Creates cache entry with key "+0+2-1" - Allocates buffer: 2*(1+2)*32 = 192 bytes 2. Second decode operation: k=3, m=3, same erasure pattern "+0+2-1" - Looks up cache with key "+0+2-1" → COLLISION - Retrieves 192-byte buffer but needs 3*(3+3)*32 = 576 bytes - Result: Heap-buffer-overflow (reads 384 bytes beyond allocation) Worse scenario (silent corruption): 1. First decode: k=3, m=3 → caches 576-byte table 2. Second decode: k=2, m=1 → retrieves wrong table - Uses incorrect decoding matrix - Result: Silent data corruption during recovery Solution: --------- Include k and m parameters in cache signature - Old format: "+0+2+3-1-4" - New format: "k3m2a+0+2+3e-1-4" Test Fix: --------- Also fixes a buffer overflow in TestErasureCodePlugins.cc where hashes_bl offset was calculated using chunk_size instead of sizeof(uint32_t), causing reads beyond the CRC buffer. Production Impact: ------------------ Backward Compatibility: FULLY COMPATIBLE - Cache is ephemeral (in-memory only, not persisted) - Cache cleared on process restart - Rolling upgrades safe - each OSD restart gets fixed code - Old cache entries automatically invalidated on upgrade - No wire protocol or on-disk format changes - No configuration changes required - No breaking changes Data Integrity: - Eliminates silent data corruption risk - Eliminates heap-buffer-overflow crashes - Cache now correctly isolated by (k,m) configuration - Correct decoding tables always used for recovery - No risk of corrupting user data from the fix itself Why Users Haven't Complained: ------------------------------ Several factors likely prevented widespread reports: 1. Low probability conditions required: - Need multiple EC pools with DIFFERENT (k,m) configurations - Need similar erasure patterns across pools - Need cache collision to occur during actual recovery operations - Recovery operations are relatively rare in healthy clusters 2. Crash vs silent corruption detection: - Buffer overflows (easier to detect) occur when k2,m2 > k1,m1 - Silent corruption (harder to detect) occurs when k2,m2 < k1,m1 - Crashes might be attributed to other causes - Data corruption only detected during scrub or data verification 3. Common deployment patterns: - Many deployments use single EC configuration cluster-wide - Default EC configurations (k=2,m=1 or k=4,m=2) reduce collision space - Erasure pattern variety may be insufficient for collisions 4. ISA plugin usage: - Not universally deployed (requires Intel ISA-L library) - Some sites use jerasure plugin instead - Plugin selection depends on hardware and configuration 5. Detection difficulty: - ASan not enabled in production builds - Silent corruption only appears during: * Degraded reads with recovery * Scrub operations * Deep-scrub verification - Corrupted data might not be immediately accessed Fixes: https://tracker.ceph.com/issues/74382 Signed-off-by: Kefu Chai <k.chai@proxmox.com>

tchaikov · 2026-01-14T02:50:12Z

jenkins test api

tchaikov · 2026-01-14T05:03:43Z

jenkins test make check

tchaikov · 2026-01-14T05:03:55Z

jenkins test make check arm64

tchaikov · 2026-01-14T08:24:56Z

jenkins test make check arm64

tchaikov · 2026-01-14T09:21:32Z

jenkins test make check arm64

tchaikov · 2026-01-15T06:57:49Z

@ceph/core Hello maintainers, could you please review this PR when you have a chance? I believe this addresses a critical issue that may warrant prioritized attention. Thank you!

rzarzynski · 2026-01-26T20:43:17Z

src/erasure-code/isa/ErasureCodeIsa.cc

+    }
  }

+  std::string erasure_signature(sig_buf);


It's all about the uniqueness across instances of ErasureCodeIsaTableCache. The factory class ErasureCodePluginIsa has exactly one shared among all created ErasureCodeIsaDefault instances.

class ErasureCodePluginIsa : public ceph::ErasureCodePlugin { public: ErasureCodeIsaTableCache tcache; int factory(const std::string &directory, ceph::ErasureCodeProfile &profile, ceph::ErasureCodeInterfaceRef *erasure_code, std::ostream *ss) override; };

int ErasureCodePluginIsa::factory(const std::string &directory, ceph::ErasureCodeProfile &profile, ceph::ErasureCodeInterfaceRef *erasure_code, std::ostream *ss) { ErasureCodeIsa *interface; std::string technique; technique = profile.find("technique")->second; std::string _m = profile.find("m")->second; if ((technique == "reed_sol_van")) { interface = new ErasureCodeIsaDefault(tcache, technique, ErasureCodeIsaDefault::kVandermonde, _m); } else if ((technique == "cauchy")) { interface = new ErasureCodeIsaDefault(tcache, technique, ErasureCodeIsaDefault::kCauchy, _m); } else { *ss << "technique=" << technique << " is not a valid coding technique. " << " Choose one of the following: " << "reed_sol_van," << "cauchy" << std::endl; return -ENOENT; } int r = interface->init(profile, ss); if (r) { delete interface; return r; } *erasure_code = ceph::ErasureCodeInterfaceRef(interface); return 0; }

The alternative solution could be to grow the number of cache instances available.
The bug seems to very old, dating back to at least 2014.
Other, non-Jerasure plugins might be affected as well.

The bug seems to very old, dating back to at least 2014.

please see the tracker ticket, this issue was introduced in the day 0, so all branches are impacted.

Other, non-Jerasure plugins might be affected as well.

as you explained above and also see the tracker ticket, this caching machinery is specific to the ISA erasure code plugin. so only ISA erasure code plugin is impacted.

this caching machinery is specific to the ISA erasure code plugin. so only ISA erasure code plugin is impacted.

Yeah, Shec also has a global tcache but the naming schema is different than for ISA.

indeed. what shec's ErasureCodeShecTableCache is using is a combination of techinique, k, m, c and w. so it's immune to this problem.

see

ceph/src/erasure-code/shec/ErasureCodeShecTableCache.cc

Line 151 in 566f177

*encoding_table[technique][k][m][c][w] = ec_in_table;

jamiepryde · 2026-02-03T17:17:55Z

I have been trying to trigger the data corruption bug on a vstart cluster, but have not succeeded yet. A couple of points I noticed that I think reduce the chance of this bug being hit:

I think the value of K in the profiles must be the same for there to be a signature collision, as we always add K number of +%d to the signature. So e.g. a system with pools using 2+2 and 2+3 ISA-L profiles could hit the bug, but a system using 2+2 and 4+2 wouldn't. See

  for (i = 0, r = 0; i < k; i++, r++) {
    char id[128];
    while (erasure_contains(erasures, r))
      r++;

    decode_index[i] = r;

    snprintf(id, sizeof (id), "+%d", r);
    erasure_signature += id;
  }

We only use the cached decode tables, and so can only hit the bug if M > 1 and number of errors is > 1 (or M > 1 and the error is in a parity shard other than the first one). If M or number of errors is 1 (and the first parity shard) then we use ISA-L xor optimization instead. See


  if ((m == 1) || 
      ((matrixtype == kVandermonde) && (nerrs == 1) && (erasures[0] < (k + 1)))) {
    // single parity decoding
    dout(20) << "isa_decode: reconstruct using xor_gen [" << erasures[0] << "]" << dendl;
    isa_xor(recover_buf, recover_buf[k], blocksize, k);
    return 0;
  }

  unsigned char d[k * (m + k)];
  unsigned char decode_tbls[k * (m + k)*32];
  unsigned char *p_tbls = decode_tbls;

  int decode_index[k];

  std::string erasure_signature;

NitzanMordhai · 2026-02-15T15:19:53Z

Rados approved: https://tracker.ceph.com/projects/rados/wiki/MAIN#httpstrackercephcomissues74568

tchaikov requested a review from a team as a code owner January 13, 2026 01:07

github-actions bot added core tests labels Jan 13, 2026

tchaikov force-pushed the wip-ec-isa-fix-cache-collision branch 2 times, most recently from 12069bf to 8d27247 Compare January 13, 2026 01:29

tchaikov mentioned this pull request Jan 13, 2026

script/run-make: enable ASan #56537

Draft

14 tasks

tchaikov force-pushed the wip-ec-isa-fix-cache-collision branch 2 times, most recently from 351c1f5 to 044abd8 Compare January 13, 2026 07:28

tchaikov removed the request for review from a team January 13, 2026 11:49

tchaikov marked this pull request as draft January 13, 2026 11:49

tchaikov force-pushed the wip-ec-isa-fix-cache-collision branch from 044abd8 to 7e66871 Compare January 14, 2026 02:12

tchaikov marked this pull request as ready for review January 14, 2026 05:50

tchaikov requested a review from a team January 15, 2026 06:55

rzarzynski approved these changes Jan 26, 2026

View reviewed changes

rzarzynski requested review from aainscow and bill-scales January 26, 2026 20:59

aainscow requested a review from jamiepryde January 26, 2026 22:34

tchaikov added the needs-qa label Jan 27, 2026

bill-scales approved these changes Jan 27, 2026

View reviewed changes

aainscow approved these changes Jan 29, 2026

View reviewed changes

SrinivasaBharath added the wip-bharath9-testing label Feb 2, 2026

NitzanMordhai merged commit b549669 into ceph:main Feb 15, 2026
16 checks passed

tchaikov deleted the wip-ec-isa-fix-cache-collision branch February 15, 2026 15:23

SrinivasaBharath removed the wip-bharath9-testing label Feb 17, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

erasure-code/isa: fix cache collision causing buffer overflow#66894

erasure-code/isa: fix cache collision causing buffer overflow#66894
NitzanMordhai merged 1 commit intoceph:mainfrom
tchaikov:wip-ec-isa-fix-cache-collision

tchaikov commented Jan 13, 2026 •

edited

Loading

Uh oh!

tchaikov commented Jan 14, 2026

Uh oh!

tchaikov commented Jan 14, 2026

Uh oh!

tchaikov commented Jan 14, 2026

Uh oh!

tchaikov commented Jan 14, 2026

Uh oh!

tchaikov commented Jan 14, 2026

Uh oh!

tchaikov commented Jan 15, 2026

Uh oh!

rzarzynski Jan 26, 2026

Uh oh!

tchaikov Jan 27, 2026

Uh oh!

rzarzynski Jan 27, 2026

Uh oh!

tchaikov Feb 4, 2026

Uh oh!

jamiepryde commented Feb 3, 2026

Uh oh!

NitzanMordhai commented Feb 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

Conversation

tchaikov commented Jan 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem:

AddressSanitizer Report:

Root Cause:

Solution:

Test Fix:

Production Impact:

Why Users Haven't Complained:

Contribution Guidelines

Checklist

Uh oh!

tchaikov commented Jan 14, 2026

Uh oh!

tchaikov commented Jan 14, 2026

Uh oh!

tchaikov commented Jan 14, 2026

Uh oh!

tchaikov commented Jan 14, 2026

Uh oh!

tchaikov commented Jan 14, 2026

Uh oh!

tchaikov commented Jan 15, 2026

Uh oh!

rzarzynski Jan 26, 2026

Choose a reason for hiding this comment

Uh oh!

tchaikov Jan 27, 2026

Choose a reason for hiding this comment

Uh oh!

rzarzynski Jan 27, 2026

Choose a reason for hiding this comment

Uh oh!

tchaikov Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

jamiepryde commented Feb 3, 2026

Uh oh!

NitzanMordhai commented Feb 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

tchaikov commented Jan 13, 2026 •

edited

Loading