core-clp: Add CLI command to extract a compressed file as IR. #420

Merged
haiqi96 merged 75 commits into y-scope:main from haiqi96:ArchiveToIRCmd
Jun 12, 2024

@haiqi96 haiqi96 commented May 30, 2024

References

based on #417

Description

This change adds an IR decompression execution path to the clp executable.

The PR contains two notable changes:

  1. The PR introduces a new command, clp i. The command lets the user decompress a file split into one or more IR files by providing the orig_file_id and a message index. It also lets the user pick a custom threshold for the uncompressed IR size and a directory to temporarily write IRs to.
  2. Since the message_index and the orig_file_id can uniquely identify a file split, we implemented a simplified decompression logic in IrDecompression.cpp. Compared to the decompression.cpp,
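
To illustrate what a size threshold on the uncompressed IR means in practice, here is a minimal sketch of threshold-based splitting. It is not the code from this PR: `split_by_size_threshold` is a hypothetical helper, each inner vector stands in for one IR file, and the rule of closing a chunk as soon as its accumulated size reaches the threshold is an assumption made for the example.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not from the PR): groups messages into chunks and
// closes the current chunk once its accumulated size reaches
// `threshold_bytes`. Each inner vector stands in for one IR file on disk.
std::vector<std::vector<std::string>> split_by_size_threshold(
        std::vector<std::string> const& messages,
        std::size_t threshold_bytes
) {
    std::vector<std::vector<std::string>> chunks;
    std::vector<std::string> current;
    std::size_t current_size = 0;

    for (auto const& msg : messages) {
        current.push_back(msg);
        current_size += msg.size();
        if (current_size >= threshold_bytes) {
            // Threshold reached: finish this chunk and start a new one.
            chunks.push_back(std::move(current));
            current.clear();
            current_size = 0;
        }
    }
    // Keep any trailing messages that never reached the threshold.
    if (!current.empty()) {
        chunks.push_back(std::move(current));
    }
    return chunks;
}
```

Because the check runs after a message is appended, a chunk can exceed the threshold by up to one message; whether the real command behaves this way is not stated in the description.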

Validation performed

To validate the functionality, we compressed a 64MB file into archive(s). We then decompressed it into multiple IRs, decoded and concatenated them, and did a binary comparison with the original file.

We used two configurations to cover all the possible cases:

  1. Compressed a 64MB hadoop log using a smaller encoded file size and archive size, such that the original file was split into 3 splits across 2 archives. We then decompressed all 3 IRs by running clp 3 times, using a different message index each time.

  2. Compressed the 64MB hadoop log using default settings, so only one file and archive were generated. We then decompressed the IR using a 32MB threshold, generating 3 IRs on disk.
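
The final step of the validation above (concatenate the decoded IRs, then compare byte-for-byte with the original) can be sketched as follows. This is an illustrative check under stated assumptions, not the tooling that was actually used: `read_file_bytes` and `concatenation_matches_original` are hypothetical names, the decoded parts are assumed to already exist as plain files listed in their original order, and C++17 is assumed for `std::filesystem`.

```cpp
#include <filesystem>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: reads a whole file into memory as raw bytes.
std::string read_file_bytes(std::filesystem::path const& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        throw std::runtime_error("Failed to open " + path.string());
    }
    return std::string(
            std::istreambuf_iterator<char>(in),
            std::istreambuf_iterator<char>()
    );
}

// Hypothetical helper: returns true if the decoded parts, concatenated in the
// given order, are byte-identical to the original file.
bool concatenation_matches_original(
        std::vector<std::filesystem::path> const& decoded_parts,
        std::filesystem::path const& original
) {
    std::string combined;
    for (auto const& part : decoded_parts) {
        combined += read_file_bytes(part);
    }
    return combined == read_file_bytes(original);
}
```

Loading everything into memory keeps the sketch short and is workable for a 64MB input; a streaming comparison would be the better choice for much larger files.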

@kirkrodrigues kirkrodrigues marked this pull request as ready for review June 10, 2024 14:27
@kirkrodrigues kirkrodrigues self-requested a review June 10, 2024 21:52
Comment thread components/core/src/clp/clp/IrDecompression.hpp Outdated
Comment thread components/core/src/clp/clp/CommandLineArguments.hpp Outdated
Comment thread components/core/src/clp/clp/CommandLineArguments.hpp Outdated
Comment thread components/core/src/clp/clp/CommandLineArguments.cpp Outdated
Comment thread components/core/src/clp/clp/run.cpp Outdated
Comment thread components/core/src/clp/clp/run.cpp Outdated
Comment thread components/core/src/clp/clp/run.cpp Outdated
Comment thread components/core/src/clp/clp/run.cpp Outdated
Comment thread components/core/src/clp/clp/FileDecompressor.inc Outdated
Comment thread components/core/src/clp/clp/FileDecompressor.hpp Outdated
@haiqi96 haiqi96 requested a review from kirkrodrigues June 11, 2024 21:59
Comment thread components/core/src/clp/clp/CommandLineArguments.hpp Outdated
Comment thread components/core/src/clp/clp/run.cpp Outdated
Comment thread components/core/src/clp/clp/run.cpp
Comment thread components/core/src/clp/clp/FileDecompressor.hpp
Comment thread components/core/src/clp/clp/decompression.hpp Outdated
Comment thread components/core/src/clp/clp/decompression.cpp Outdated
Comment thread components/core/src/clp/clp/CommandLineArguments.cpp Outdated
Comment thread components/core/src/clp/clp/CommandLineArguments.cpp Outdated
haiqi96 and others added 4 commits June 11, 2024 21:09

@kirkrodrigues kirkrodrigues left a comment

For the PR title, how about:

core-clp: Add CLI command to extract a file from an archive as IR.

haiqi96 commented Jun 12, 2024

For the PR title, how about:

core-clp: Add CLI command to extract a file from an archive as IR.

how about core-clp: Add CLI command to extract a compressed file as IR.

"An archive" gives me the impression that the user needs to specify an archive.

@kirkrodrigues

For the PR title, how about:
core-clp: Add CLI command to extract a file from an archive as IR.

how about core-clp: Add CLI command to extract a compressed file as IR.

sgtm

@haiqi96 haiqi96 changed the title from "core-clp: add Archive to IR decompression as a command line option for clp" to "core-clp: Add CLI command to extract a compressed file as IR." Jun 12, 2024
@haiqi96 haiqi96 merged commit d5fcd6b into y-scope:main Jun 12, 2024
@haiqi96 haiqi96 deleted the ArchiveToIRCmd branch June 28, 2024 14:43
jackluo923 pushed a commit to jackluo923/clp that referenced this pull request Dec 4, 2024
junhaoliao pushed a commit to junhaoliao/clp that referenced this pull request May 17, 2026