Bug
The compression_jobs.clp_config, compression_tasks.clp_paths_to_compress, and query_jobs.job_config columns are defined as VARBINARY(60000) in initialize-orchestration-db.py. When a brotli-compressed msgpack config exceeds 60,000 bytes, MySQL/MariaDB silently truncates the data on INSERT. The truncated blob is a valid-length binary value but an incomplete brotli stream, causing brotli.decompress() to fail with brotli.error: decoder failed.
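The failure mode described above can be reproduced in isolation. The sketch below is only an illustration: it uses zlib from the Python standard library as a stand-in for brotli so that it runs without third-party packages, and `simulate_truncating_insert` is a hypothetical helper that mimics a column keeping only the first 60,000 bytes. It is not CLP code, and the real error type would be brotli.error rather than zlib.error.

```python
import random
import zlib

COLUMN_LIMIT = 60_000  # VARBINARY(60000), as stated in the report


def simulate_truncating_insert(blob: bytes, limit: int = COLUMN_LIMIT) -> bytes:
    """Hypothetical stand-in for an INSERT that keeps only the first `limit` bytes."""
    return blob[:limit]


def try_decompress(blob: bytes) -> bool:
    """Return True if `blob` is a complete compressed stream, False otherwise."""
    try:
        zlib.decompress(blob)
        return True
    except zlib.error:
        # zlib (the stand-in for brotli here) rejects an incomplete stream.
        return False


# Poorly compressible payload, so the compressed form exceeds the column limit.
rng = random.Random(0)
payload = bytes(rng.getrandbits(8) for _ in range(100_000))

compressed = zlib.compress(payload)
stored = simulate_truncating_insert(compressed)

print("compressed size exceeds limit:", len(compressed) > COLUMN_LIMIT)
print("stored size:", len(stored))
print("original decompresses:", try_decompress(compressed))
print("stored copy decompresses:", try_decompress(stored))
```

The stored value has exactly the column's maximum length, which matches the observation that every failing blob was exactly 60,000 bytes.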
### Proposed Fix
- Change the column types from VARBINARY(60000) to MEDIUMBLOB (16 MB max) in initialize-orchestration-db.py. MEDIUMBLOB adds only 1 byte of storage overhead per row (3-byte length prefix vs. 2-byte) and is compatible with all existing application code: Python mysql-connector, Node.js mysql2, and Rust sqlx all handle blob types identically to varbinary.
- Add error handling in the compression scheduler's job processing loop so a single corrupted job doesn't crash the entire scheduler. The bad job should be marked as FAILED and skipped:

      for job_row in jobs:
          job_id = job_row["id"]
          try:
              # Decompress, unpack, and validate; a truncated blob raises here.
              clp_io_config = ClpIoConfig.model_validate(
                  msgpack.unpackb(brotli.decompress(job_row["clp_config"]))
              )
          except Exception:
              # logger.exception also records the traceback of the decode failure.
              logger.exception("Failed to decode clp_config for job %s, marking as failed.", job_id)
              # mark job as FAILED and continue
              continue
- Apply the same resilience fix in the WebUI's mapCompressionMetadataRows: skip rows with corrupted blobs instead of crashing the entire endpoint.
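Changing initialize-orchestration-db.py presumably only affects newly created databases, so an existing deployment would likely need the same type change applied in place. A minimal sketch of generating those statements follows. The table and column names are taken from the report as written (real table names may carry a prefix), `build_alter_statements` is a hypothetical helper, and the NOT NULL attribute is an assumption: MODIFY COLUMN restates the whole column definition, so the actual nullability and defaults should be checked against the current schema first.

```python
# (table, column) pairs named in the report; actual table names may differ.
TARGET_COLUMNS = [
    ("compression_jobs", "clp_config"),
    ("compression_tasks", "clp_paths_to_compress"),
    ("query_jobs", "job_config"),
]


def build_alter_statements(columns=TARGET_COLUMNS, nullable=False):
    """Build one ALTER TABLE statement per column (hypothetical helper).

    `nullable` is an assumption about the existing schema; MODIFY COLUMN
    replaces the full column definition, so it must match the original.
    """
    null_clause = "NULL" if nullable else "NOT NULL"
    return [
        f"ALTER TABLE `{table}` MODIFY COLUMN `{column}` MEDIUMBLOB {null_clause}"
        for table, column in columns
    ]


for statement in build_alter_statements():
    print(statement + ";")
```

The printed statements are meant to be reviewed and run manually. Rows that were already truncated stay corrupted after the type change, which is why the error handling in the items above is still needed.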
CLP version
0.10.0
Environment
EKS (should occur in any environment)
Reproduction steps
This occurs in high-throughput S3 ingestion scenarios where the log-ingestor's Buffer accumulates many S3 object metadata IDs before flushing. With buffer_flush_threshold set to 4 GB (the default) and many small S3 objects (1-10 MB each), a single compression job can reference thousands of object IDs. The resulting ClpIoConfig, serialized as msgpack and compressed with brotli, then exceeds 60,000 bytes.
In our environment (4 TB/hr ingestion, ~100 compression workers), we observed 24 out of ~1,700 pending jobs with exactly 60,000-byte clp_config blobs, all of which failed to decompress. Valid jobs in the same table had configs ranging from 2,000 to 5,500 bytes.
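The object counts in the reproduction steps can be checked with some rough arithmetic. The sketch below assumes decimal units, one compression job per buffer flush, and a flush that happens exactly at the threshold. It also ignores any fixed-size part of the config. Both helper functions are hypothetical and exist only for this estimate.

```python
GB = 10**9  # decimal units are an assumption; the report only says "4GB"
MB = 10**6

FLUSH_THRESHOLD = 4 * GB  # buffer_flush_threshold default, per the report
COLUMN_LIMIT = 60_000     # VARBINARY(60000)


def objects_per_job(threshold_bytes: int, object_bytes: int) -> int:
    """Estimate how many equally sized objects fit into one flush."""
    return threshold_bytes // object_bytes


def byte_budget_per_object(object_count: int, limit: int = COLUMN_LIMIT) -> float:
    """Compressed config bytes available per object before the column overflows."""
    return limit / object_count


for size_mb in (1, 10):
    count = objects_per_job(FLUSH_THRESHOLD, size_mb * MB)
    budget = byte_budget_per_object(count)
    print(f"{size_mb} MB objects: {count} per job, {budget} bytes of config each")
```

Under these assumptions, 1 MB objects give about 4,000 references per job, which leaves roughly 15 bytes of compressed config per reference before the 60,000-byte limit is reached. With 10 MB objects the budget is about 150 bytes per reference.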