fix(clp-package): Update stale `validate_dataset_exists` references and add default dataset fallback in `sbin` scripts (fixes #2059). #2060
Description
#1992 renamed `validate_dataset_exists` to `validate_datasets_exist` (singular to plural) in `native/utils.py` and updated its signature to accept `list[str]`, but two callers were not updated:
- `native/decompress.py`: import (line 39) and call site (line 151)
- `native/archive_manager.py`: import (line 31) and call site (line 203)

Since these are module-level imports, the `ImportError` prevents the entire `decompress` module from loading, making all `decompress.sh` subcommands (`x`, `i`, `j`) non-functional.
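The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the CLP code itself: `demo_utils` and `demo_decompress` are hypothetical stand-ins for `native/utils.py` and a stale caller, written to a temporary directory so that the import can be attempted safely.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def try_import_stale_caller() -> str:
    """Import a module whose top-level import refers to a renamed function."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Hypothetical stand-in for the utilities module after the rename.
        (root / "demo_utils.py").write_text(
            "def validate_datasets_exist(datasets):\n"
            "    return list(datasets)\n"
        )
        # Hypothetical stand-in for a caller that still imports the old name.
        (root / "demo_decompress.py").write_text(
            "from demo_utils import validate_dataset_exists\n"
            "\n"
            "def main():\n"
            "    return 0\n"
        )
        sys.path.insert(0, tmp)
        try:
            importlib.import_module("demo_decompress")
            return "loaded"
        except ImportError as err:
            # The module body never finishes executing, so main() is unreachable.
            return type(err).__name__
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("demo_utils", None)
            sys.modules.pop("demo_decompress", None)


if __name__ == "__main__":
    print(try_import_stale_caller())
```

Because the import statement runs when the module is loaded, no function in the stale caller can be reached, regardless of which subcommand the user asked for.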
This PR:
- Updates both callers to use `validate_datasets_exist`, wrapping the single dataset string in a list to match the new signature.
- Adds a default dataset fallback in `native/decompress.py` for the `j` (`extract-json`) subcommand, matching the pattern used by every other script (`compress.py`, `decompress.py` (non-native), `search.py`, `archive_manager.py`). Previously, `native/decompress.py` errored when `--dataset` was omitted, unlike all other scripts, which fall back to the `"default"` dataset.

Checklist
breaking change.
Validation performed
Tested on the built package (`task` to build, then `sbin/start-clp.sh`) with both clp-JSON (clp-s) and clp-TEXT (clp) storage engines.
Part A: clp-JSON (clp-s engine)
Setup
Task: Build package, start CLP, and compress test data into both the default and a
named dataset.
Commands:
Output:
Archive IDs:
Scenario 1: `extract-json` without `--dataset` (default dataset fallback)
Task: Verify that `j` without `--dataset` falls back to the `"default"` dataset instead of erroring with "Dataset unspecified".
Command:
Output:
Explanation: Before this fix, this command would fail with an `ImportError`. Even after fixing the import, the old code would reject this call with "Dataset unspecified, but must be specified for command `j`". Now it correctly falls back to the `"default"` dataset, matching the behavior of all other scripts.
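The fallback and the list-wrapping described in this PR can be sketched as follows. This is an illustration under assumptions, not the code in `native/decompress.py`: the constant name `DEFAULT_DATASET_NAME`, the helper `resolve_dataset`, and the simplified `fake_validate_datasets_exist` (which checks against an in-memory set rather than the real metadata store) are all hypothetical.

```python
from typing import Optional

# Assumed constant name; the value "default" is taken from the PR description.
DEFAULT_DATASET_NAME = "default"


def resolve_dataset(cli_dataset: Optional[str]) -> str:
    """Fall back to the default dataset when --dataset was omitted."""
    return DEFAULT_DATASET_NAME if cli_dataset is None else cli_dataset


def fake_validate_datasets_exist(datasets: list[str], known: set[str]) -> None:
    """Simplified stand-in for a validator that accepts a list of dataset names."""
    missing = [name for name in datasets if name not in known]
    if missing:
        raise ValueError(f"Unknown dataset(s): {missing}")


if __name__ == "__main__":
    known_datasets = {"default", "logs"}
    dataset = resolve_dataset(None)
    # The single dataset string is wrapped in a list to match the plural signature.
    fake_validate_datasets_exist([dataset], known_datasets)
    print(dataset)
```

Resolving the dataset before validation keeps the validator's interface purely list-based, so single-dataset callers only differ in how they build the list.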
Scenario 2: `extract-json` with `--dataset`
Task: Verify `j` with an explicit dataset filter works.
Command:
Output:
Scenario 3: `extract-json` with `--target-chunk-size`
Task: Verify `j` with a custom chunk size works.
Command:
Output:
Scenario 4: `extract-ir` by `--orig-file-id`
Task: Verify the `i` subcommand loads without `ImportError` when using `--orig-file-id`.
Command:
Output:
Explanation: The error is expected: IR extraction is only supported for the `clp` storage engine, and the test data was compressed with `clp-s`. The important thing is that the module loaded successfully (no `ImportError`).
Scenario 5: `extract-ir` by `--orig-file-path`
Task: Verify the `i` subcommand loads without `ImportError` when using `--orig-file-path`.
Command:
Output:
Scenario 6: `extract-ir` with `--target-uncompressed-size`
Task: Verify the `i` subcommand loads with the optional `--target-uncompressed-size` argument.
Command:
Output:
Scenario 7: `--orig-file-id` and `--orig-file-path` are mutually exclusive
Task: Verify that providing both `--orig-file-id` and `--orig-file-path` is rejected by argparse.
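For readers unfamiliar with the mechanism this scenario exercises, argparse's `add_mutually_exclusive_group` can be sketched as below. The parser here is a hypothetical, reduced stand-in: only the two option names come from this PR, while the program name and the absence of other arguments are assumptions.

```python
import argparse
from typing import Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Build a reduced parser with the two options in one exclusive group."""
    parser = argparse.ArgumentParser(prog="decompress-demo")  # hypothetical name
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--orig-file-id")
    group.add_argument("--orig-file-path")
    return parser


def parse_or_none(argv: Sequence[str]) -> Optional[argparse.Namespace]:
    """Return the parsed namespace, or None if argparse rejected the input."""
    try:
        return build_parser().parse_args(argv)
    except SystemExit:
        # argparse reports the conflict on stderr and exits; treat that as rejection.
        return None


if __name__ == "__main__":
    print(parse_or_none(["--orig-file-id", "abc"]))
    print(parse_or_none(["--orig-file-id", "abc", "--orig-file-path", "/tmp/x.log"]))
```

Declaring the constraint on the parser means the conflict is caught during argument parsing, before any extraction logic runs.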
Command:
Output:
Explanation: argparse correctly enforces the mutually exclusive group defined in `decompress.py`.
Scenario 8: `extract-file` (x) loads without ImportError
Task: Verify the `x` subcommand is not broken by the import issue.
Command:
Output:
Explanation: The error is expected for `clp-s` data. The module loaded successfully without `ImportError`.
Part B: clp-TEXT (clp engine)
Switched to the `clp` (text) storage engine to test full end-to-end decompression, including `extract-file` and `extract-ir`, which are only supported by the `clp` engine.
Setup (clp-TEXT)
Task: Stop CLP-S, clean data, start with text config, and compress hive-24hr text
logs.
Commands:
Output:
File info (from DB):
Scenario 9: `extract-file` by path (clp-TEXT)
Task: Verify file extraction works with the `clp` engine.
Command:
Output: Command completed successfully (exit code 0, no output).
Scenario 10: `extract-file` with `--extraction-dir` (clp-TEXT)
Task: Verify file extraction to a custom directory.
Command:
Verification:
Explanation: File was extracted to the specified directory with the original path
structure preserved.
Scenario 11: `extract-ir` by `--orig-file-id` (clp-TEXT)
Task: Verify IR extraction works end-to-end with a real file ID.
Command:
Output:
Scenario 12: `extract-ir` by `--orig-file-path` (clp-TEXT)
Task: Verify IR extraction works with a file path lookup.
Command:
Output:
Scenario 13: `extract-ir` with `--target-uncompressed-size` (clp-TEXT)
Task: Verify IR extraction with a custom target uncompressed size.
Command:
Output:
Scenario 14: `--orig-file-id` and `--orig-file-path` mutually exclusive (clp-TEXT)
Task: Confirm mutual exclusivity is enforced under the `clp` engine as well.
Command:
Output: