Bug
If a compression command fails because it was given invalid paths, the dataset is still entered into the metadata database. As a result, tools such as the dataset-manager utilities list and operate on a dataset that has no archives.
CLP version
v0.10.0
Environment
Ubuntu jammy
Reproduction steps
- Start clp-json and compress twice: once with a path that exists, and once with a path that doesn’t exist.
- Run dataset-manager.sh list; you’ll see both of the datasets that you specified, even though the compression with the nonexistent path failed.
- Run ls in clp-package/var/data/archives, and you will see that the only directory present there is the successful dataset.
- Run dataset-manager.sh del --all and you’ll see both of the datasets being reported as deleted from the archives and the metadata database.
Sample output:
quinnmitchell@baker21:/home/quinnmitchell/clp/build/clp-package$ cd /home/quinnmitchell/clp/build/clp-package \
&& ./sbin/compress.sh \
--timestamp-key 'timestamp' \
--dataset 'invalid' \
/home/quinnmitchell/invalid
Container clp-package-clp-runtime-run-5e60ef3d5841 Creating
Container clp-package-clp-runtime-run-5e60ef3d5841 Created
2026-03-10T16:55:16.946 INFO [compress] Compression job 8 submitted.
2026-03-10T16:55:17.449 ERROR [compress] Compression failed. At least one of your input paths could not be processed. See the error log at 'user/job_8_failed_paths.txt' inside your configured logs directory (`logs_directory`) for more details.
quinnmitchell@baker21:/home/quinnmitchell/clp/build/clp-package$ cd /home/quinnmitchell/clp/build/clp-package \
&& ./sbin/compress.sh \
--timestamp-key 'timestamp' \
--dataset 'real' \
/home/quinnmitchell/clp/integration-tests/tests/data/json_multifile/logs
Container clp-package-clp-runtime-run-2f7351001714 Creating
Container clp-package-clp-runtime-run-2f7351001714 Created
2026-03-10T16:55:42.001 INFO [compress] Compression job 9 submitted.
2026-03-10T16:55:43.007 INFO [compress] Compressed 8.90KB into 3.02KB (2.95x). Speed: 16.77KB/s.
2026-03-10T16:55:43.509 INFO [compress] Compression finished.
2026-03-10T16:55:43.509 INFO [compress] Compressed 8.90KB into 3.02KB (2.95x). Speed: 15.32KB/s.
quinnmitchell@baker21:/home/quinnmitchell/clp/build/clp-package$ cd /home/quinnmitchell/clp/build/clp-package \
&& ./sbin/admin-tools/dataset-manager.sh list
Container clp-package-clp-runtime-run-a0f884875a5e Creating
Container clp-package-clp-runtime-run-a0f884875a5e Created
2026-03-10T16:55:52.274 INFO [dataset_manager] Found 2 datasets.
2026-03-10T16:55:52.275 INFO [dataset_manager] invalid
2026-03-10T16:55:52.275 INFO [dataset_manager] real
quinnmitchell@baker21:/home/quinnmitchell/clp/build/clp-package$ cd /home/quinnmitchell/clp/build/clp-package/var/data/archives
quinnmitchell@baker21:/home/quinnmitchell/clp/build/clp-package/var/data/archives$ ls
real
quinnmitchell@baker21:/home/quinnmitchell/clp/build/clp-package/var/data/archives$ cd /home/quinnmitchell/clp/build/clp-package
quinnmitchell@baker21:/home/quinnmitchell/clp/build/clp-package$ ./sbin/admin-tools/dataset-manager.sh del --all
Container clp-package-clp-runtime-run-fdf79ab1b8d4 Creating
Container clp-package-clp-runtime-run-fdf79ab1b8d4 Created
2026-03-10T17:05:09.261 INFO [dataset_manager] Deleted archives of dataset `invalid`.
2026-03-10T17:05:09.512 INFO [dataset_manager] Deleted dataset `invalid` from the metadata database.
2026-03-10T17:05:09.513 INFO [dataset_manager] Deleted archives of dataset `real`.
2026-03-10T17:05:09.787 INFO [dataset_manager] Deleted dataset `real` from the metadata database.
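To make the mismatch easier to spot, the check from the reproduction steps could be scripted roughly as follows. This is only a sketch, not part of CLP: `find_orphan_datasets` is a hypothetical helper, and it assumes that dataset names appear in the `dataset-manager.sh list` output as `INFO [dataset_manager] <name>` lines, as in the sample output above, and that each successfully compressed dataset has a directory of the same name under the archives directory.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of CLP): reads `dataset-manager.sh list`
# output on stdin and prints every dataset name that has no matching
# directory under the archives directory given as $1.
find_orphan_datasets() {
    local archives_dir="$1"
    local name
    # Assumption: dataset lines look like
    #   <timestamp> INFO [dataset_manager] <name>
    # Other lines (e.g. "Container ... Created") are dropped by `sed -n`,
    # and the "Found N datasets." summary line is dropped by `grep -v`.
    sed -n 's/^.*INFO \[dataset_manager\] //p' \
        | grep -Ev '^Found [0-9]+ datasets?\.$' \
        | while IFS= read -r name; do
            [ -d "$archives_dir/$name" ] || printf '%s\n' "$name"
        done
}
```

From the clp-package directory, it could be used as `./sbin/admin-tools/dataset-manager.sh list 2>&1 | find_orphan_datasets var/data/archives`. The `2>&1` is there because it is not clear which stream the log lines are written to. With the state shown above, this should print only the dataset whose compression failed.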