clp-s: Correctly report uncompressed size of archives during archive-splitting (fixes #469). #463

Merged
gibber9809 merged 1 commit into y-scope:main from gibber9809:archive-size-fix
Jul 3, 2024

Conversation

@gibber9809

Contributor

Description

This PR fixes a bug in how uncompressed size is attributed when an archive is split partway through parsing a buffer of JSON objects. Previously, the entire buffer was attributed to the uncompressed size of the archive before the split instead of being divided between the archives before and after the split. We solve this by adding a new function to JsonFileIterator that reports the total number of bytes consumed by the caller, as opposed to the total number of bytes read by JsonFileIterator, which is what we used before.
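
To illustrate the difference between the two counters, here is a minimal, self-contained sketch. It does not reproduce the actual JsonFileIterator code: `BufferedRecordReader`, `bytes_read`, `bytes_consumed`, and `archive_sizes` are hypothetical names, the input is assumed to be newline-delimited JSON, and an archive split is simulated after a fixed number of records.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical stand-in for a buffered reader of newline-delimited JSON.
// This is NOT clp-s's JsonFileIterator; all names are illustrative.
class BufferedRecordReader {
public:
    // buffer_size must be nonzero.
    BufferedRecordReader(std::string_view input, std::size_t buffer_size)
            : m_input{input},
              m_buffer_size{buffer_size} {}

    // Hands the next record to the caller; returns false at end of input.
    bool next(std::string_view& record) {
        if (m_consumed >= m_input.size()) {
            return false;
        }
        auto const newline = m_input.find('\n', m_consumed);
        auto const end = (std::string_view::npos == newline) ? m_input.size() : newline + 1;
        // Simulate filling the buffer in whole chunks until the record is covered.
        while (m_read < end) {
            m_read = std::min(m_read + m_buffer_size, m_input.size());
        }
        record = m_input.substr(m_consumed, end - m_consumed);
        m_consumed = end;
        assert(m_consumed <= m_read);
        return true;
    }

    // Bytes pulled into the buffer so far (may run ahead of the caller).
    std::size_t bytes_read() const { return m_read; }

    // Bytes actually handed to the caller so far.
    std::size_t bytes_consumed() const { return m_consumed; }

private:
    std::string_view m_input;
    std::size_t m_buffer_size;
    std::size_t m_read{0};
    std::size_t m_consumed{0};
};

// Simulates splitting into a new archive every `records_per_archive` records
// and returns the uncompressed size attributed to each archive.
std::vector<std::size_t> archive_sizes(
        std::string_view input,
        std::size_t buffer_size,
        std::size_t records_per_archive,
        bool use_consumed
) {
    BufferedRecordReader reader{input, buffer_size};
    auto counter = [&] { return use_consumed ? reader.bytes_consumed() : reader.bytes_read(); };

    std::vector<std::size_t> sizes;
    std::size_t last_mark{0};
    std::size_t records_in_archive{0};
    std::string_view record;
    while (reader.next(record)) {
        ++records_in_archive;
        if (records_in_archive == records_per_archive) {
            // Archive split: attribute everything since the previous split.
            sizes.push_back(counter() - last_mark);
            last_mark = counter();
            records_in_archive = 0;
        }
    }
    if (records_in_archive > 0) {
        sizes.push_back(counter() - last_mark);
    }
    return sizes;
}
```

With four 8-byte records, a 32-byte buffer, and a split every two records, the read-based counter attributes all 32 bytes to the first archive and 0 to the second, while the consumed-based counter attributes 16 bytes to each. Both sum to the input size, which is why a check on the total alone would not catch the misattribution.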

Validation performed

  • Validated that, during archive splitting, each archive is attributed its share of a buffer of JSON
  • Validated that the uncompressed sizes of all archives sum to the total file size

@wraymo left a comment

Contributor

Do we want to create an issue for it? And can we use an affirmative title like "Correctly report..."?

@gibber9809 changed the title from "clp-s: Fix bug where reported uncompressed size for an archive can be incorrect" to "clp-s: Correctly report uncompressed size of archives during archive-splitting (fixes #469)." on Jul 3, 2024
@gibber9809 merged commit 3c1f0ad into y-scope:main on Jul 3, 2024
jackluo923 pushed a commit to jackluo923/clp that referenced this pull request Dec 4, 2024
@gibber9809 deleted the archive-size-fix branch on January 29, 2025 15:51
junhaoliao pushed a commit to junhaoliao/clp that referenced this pull request May 17, 2026
Development

Successfully merging this pull request may close these issues.

clp-s: Reported uncompressed size can be incorrect after archive-splitting

2 participants