test(clp-s): Add end-to-end test case for compression and extraction.#595
Conversation
WalkthroughThe pull request introduces updates to the CLP project's build configuration and testing framework. It expands the Changes
Possibly related PRs
Suggested reviewers
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Outside diff range and nitpick comments (7)
components/core/tests/test_log_files/test_no_floats_sorted.json (2)
1-4: Consider using .jsonl extension for clarityThe file contains multiple JSON objects in JSON Lines format (one object per line). Consider renaming the file to use the
.jsonlextension to better reflect its format.🧰 Tools
🪛 Biome
[error] 1-2: End of file expected
Use an array for a sequence of values:
[1, 2](parse)
[error] 2-3: End of file expected
Use an array for a sequence of values:
[1, 2](parse)
[error] 3-4: End of file expected
Use an array for a sequence of values:
[1, 2](parse)
1-4: Enhance test coverage with additional edge casesWhile the current test objects provide good coverage of nested structures and various data types, consider adding test cases for:
- Empty strings
- Special characters and escape sequences
- Unicode characters
- Maximum string lengths
Would you like me to provide examples of these additional test cases?
🧰 Tools
🪛 Biome
[error] 1-2: End of file expected
Use an array for a sequence of values:
[1, 2](parse)
[error] 2-3: End of file expected
Use an array for a sequence of values:
[1, 2](parse)
[error] 3-4: End of file expected
Use an array for a sequence of values:
[1, 2](parse)
components/core/tests/test_log_files/test_sorted.json (1)
1-4: Consider adding schema validationThe test file contains complex nested structures and various data types. To ensure data integrity:
- Consider adding JSON schema validation to verify the structure
- Add test cases for malformed JSON handling
Would you like me to help create a JSON schema for validation?
🧰 Tools
🪛 Biome
[error] 1-2: End of file expected
Use an array for a sequence of values:
[1, 2](parse)
[error] 2-3: End of file expected
Use an array for a sequence of values:
[1, 2](parse)
[error] 3-4: End of file expected
Use an array for a sequence of values:
[1, 2](parse)
components/core/tests/test-end_to_end.cpp (4)
33-38: Fix typo in constant nameThere's a spelling error in the constant name 'cDeaultCompressionLevel'.
-auto const cDeaultCompressionLevel = 3; +auto const cDefaultCompressionLevel = 3;
40-50: Consider parameterizing test file pathThe test file path 'test_no_floats_sorted.json' is hardcoded. Consider making it configurable to support testing with different input files, especially since the PR objectives mention future updates for float value testing.
-auto get_test_input_path_relative_to_tests_dir() -> std::filesystem::path { - return std::filesystem::path{"test_log_files"} / "test_no_floats_sorted.json"; +auto get_test_input_path_relative_to_tests_dir(const std::string& test_file = "test_no_floats_sorted.json") -> std::filesystem::path { + return std::filesystem::path{"test_log_files"} / test_file; }
67-79: Add verification of compression effectivenessThe test verifies data consistency but doesn't check if compression is effective. Consider adding assertions for compressed file size.
// Add after line 79 auto original_size = std::filesystem::file_size(get_test_input_local_path()); auto compressed_size = 0ULL; for (const auto& entry : std::filesystem::directory_iterator("test-end-to-end-archive")) { if (entry.is_regular_file()) { compressed_size += std::filesystem::file_size(entry.path()); } } REQUIRE(compressed_size < original_size);
52-112: Document test prerequisites and add validationThe test has external dependencies (jq) and specific environment requirements. These should be documented and validated.
Add a comment block at the start of the test:
/** * Prerequisites: * - jq command-line tool must be installed * - Sufficient disk space for temporary files * - Write permissions in current directory * * This test verifies the end-to-end pipeline for JSON compression and extraction, * focusing on non-floating-point values. It ensures that the reconstructed JSON * matches the input exactly. */
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (4)
components/core/CMakeLists.txt(4 hunks)components/core/tests/test-end_to_end.cpp(1 hunks)components/core/tests/test_log_files/test_no_floats_sorted.json(1 hunks)components/core/tests/test_log_files/test_sorted.json(1 hunks)
🧰 Additional context used
🪛 Biome
components/core/tests/test_log_files/test_no_floats_sorted.json
[error] 1-2: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 2-3: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 3-4: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
components/core/tests/test_log_files/test_sorted.json
[error] 1-2: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 2-3: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 3-4: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
🔇 Additional comments (5)
components/core/tests/test_log_files/test_sorted.json (1)
1-4: Clarify float value handling
The PR objectives mention that float values were excluded due to significant digit discrepancies, but this file includes float values (float_neg, float_pos, float_zero). Consider:
- Documenting the expected precision for float comparisons
- Adding test cases specifically for float handling
- Or removing float values as mentioned in the PR objectives
Let's verify the float handling in the test case:
🧰 Tools
🪛 Biome
[error] 1-2: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 2-3: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 3-4: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
components/core/tests/test-end_to_end.cpp (1)
1-31: LGTM! Headers and namespace declarations are well-organized.
The includes and using declarations are properly organized, with standard library headers grouped separately from project headers.
components/core/CMakeLists.txt (3)
595-596: LGTM: Addition of required library dependencies.
The addition of MongoDB C++ driver and simdjson libraries as dependencies is appropriate for JSON processing capabilities.
Let's verify these dependencies are used in the codebase:
#!/bin/bash
# Description: Verify usage of newly added dependencies
# Test: Check for MongoDB driver usage
rg "bsoncxx|mongocxx" "src/clp_s/"
# Test: Check for simdjson usage
rg "simdjson::" "src/clp_s/"Line range hint 244-329: LGTM: Comprehensive set of source files for the clp-s module.
The added source files provide a complete implementation structure for JSON processing, compression, and schema handling, which aligns well with the PR's objective of implementing an end-to-end pipeline.
Let's verify that all necessary header files are present for the implementation files:
#!/bin/bash
# Description: Verify that each .cpp file has a corresponding .hpp file
for cpp in $(echo "${SOURCE_FILES_clp_s_unitTest}" | grep '\.cpp$'); do
hpp=$(echo "$cpp" | sed 's/\.cpp$/.hpp/')
if ! echo "${SOURCE_FILES_clp_s_unitTest}" | grep -q "$hpp"; then
echo "Missing header file: $hpp"
fi
done553-553: LGTM: Addition of end-to-end test file.
The addition of test-end_to_end.cpp aligns with the PR objective of implementing comprehensive testing for the clp-s pipeline.
Let's verify the test file exists and contains appropriate test cases:
✅ Verification successful
End-to-end test file properly implements comprehensive testing for the clp-s pipeline
The test file contains a thorough end-to-end test case that:
- Tests both compression and extraction functionality
- Verifies JSON pipeline processing without floating-point numbers
- Includes proper cleanup and directory management
- Validates data integrity through diff comparison
- Uses template testing to verify both 4-byte and 8-byte variable encoding
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify test file content includes JSON pipeline testing
# Test: Check if the test file contains relevant test cases
rg -A 5 "TEST_CASE.*end.*to.*end" "tests/test-end_to_end.cpp"
# Test: Verify float-related tests are properly marked as TODO
rg "TODO.*float" "tests/test-end_to_end.cpp"
Length of output: 420
Script:
#!/bin/bash
# Let's first locate the test file using fd
fd "test-end_to_end.cpp"
# If we find the file, let's check its content
# Using ast-grep to find test cases
ast-grep --pattern 'TEST_CASE($$$)'
# Also search for any test-related includes
rg "include.*catch" -A 2
Length of output: 13704
Script:
#!/bin/bash
# Let's examine the content of the end-to-end test file
cat components/core/tests/test-end_to_end.cpp
Length of output: 4391
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (8)
components/core/tests/test-end_to_end.cpp (2)
36-36: Typo in constant namecDeaultCompressionLevelThere's a typographical error in the constant name
cDeaultCompressionLevel; it should becDefaultCompressionLevelfor correctness and consistency.Apply this diff to fix the typo:
-auto const cDeaultCompressionLevel = 3; +auto const cDefaultCompressionLevel = 3; ... parser_option.min_table_size = cDefaultMinTableSize; -parser_option.compression_level = cDeaultCompressionLevel; +parser_option.compression_level = cDefaultCompressionLevel;Also applies to: 73-73
81-81: Simplify boolean expressions inREQUIREstatementsFor better readability, you can use the logical NOT operator
!instead of comparing tofalse.Apply this diff to simplify the expressions:
-REQUIRE(false == std::filesystem::is_empty("test-end-to-end-archive")); +REQUIRE(!std::filesystem::is_empty("test-end-to-end-archive")); ... -REQUIRE(false == std::filesystem::is_empty("test-end-to-end_sorted.json")); +REQUIRE(!std::filesystem::is_empty("test-end-to-end_sorted.json"));Also applies to: 111-111
components/core/tests/test_log_files/test_no_floats_sorted.json (2)
1-4: Consider using JSON Lines (JSONL) file extensionThe file follows the JSON Lines format with one JSON object per line, which is great for streaming processing. Consider renaming the file to use the
.jsonlextension to make this more explicit.🧰 Tools
🪛 Biome
[error] 1-2: End of file expected
Use an array for a sequence of values:
[1, 2](parse)
[error] 2-3: End of file expected
Use an array for a sequence of values:
[1, 2](parse)
[error] 3-4: End of file expected
Use an array for a sequence of values:
[1, 2](parse)
1-4: Consider restructuring the clp_string fieldThe
clp_stringfield contains multiple pieces of information (uid, CPU usage, user_name) combined into a single string. This makes it harder to validate individual components and may cause parsing issues.Consider restructuring it into separate fields:
-"clp_string":"uid=0, CPU usage:99.99%, \"user_name\"=YScope" +"process_info": { + "uid": 0, + "cpu_usage": "99.99%", + "user_name": "YScope" +}🧰 Tools
🪛 Biome
[error] 1-2: End of file expected
Use an array for a sequence of values:
[1, 2](parse)
[error] 2-3: End of file expected
Use an array for a sequence of values:
[1, 2](parse)
[error] 3-4: End of file expected
Use an array for a sequence of values:
[1, 2](parse)
components/core/tests/test_log_files/test_sorted.json (4)
1-1: Consider adding float boundary test casesWhile integer boundary values are well covered, consider adding test cases for float edge cases:
- Float max/min values
- Infinity
- NaN
- Subnormal numbers
🧰 Tools
🪛 Biome
[error] 1-2: End of file expected
Use an array for a sequence of values:
[1, 2](parse)
2-2: Consider varying nested object valuesThe nested object currently mirrors the parent object's values exactly. Consider varying the values in the nested object to:
- Catch potential data transformation issues
- Verify proper handling of different values at different nesting levels
- Improve test coverage
🧰 Tools
🪛 Biome
[error] 1-2: End of file expected
Use an array for a sequence of values:
[1, 2](parse)
[error] 2-3: End of file expected
Use an array for a sequence of values:
[1, 2](parse)
3-3: Improve test file readabilityThe current single-line format makes the test cases difficult to read and maintain. Consider:
- Formatting the JSON with proper indentation
- Breaking into multiple files for different test scenarios
- Adding comments to document test case purposes
🧰 Tools
🪛 Biome
[error] 2-3: End of file expected
Use an array for a sequence of values:
[1, 2](parse)
[error] 3-4: End of file expected
Use an array for a sequence of values:
[1, 2](parse)
1-4: Implement systematic test organizationWhile the test cases cover various scenarios, consider organizing them more systematically:
- Group test cases by feature (parsing, nesting, arrays, etc.)
- Document the known float precision limitations mentioned in the PR description
- Add placeholder for future float precision tests
Would you like assistance in restructuring the test cases?
🧰 Tools
🪛 Biome
[error] 1-2: End of file expected
Use an array for a sequence of values:
[1, 2](parse)
[error] 2-3: End of file expected
Use an array for a sequence of values:
[1, 2](parse)
[error] 3-4: End of file expected
Use an array for a sequence of values:
[1, 2](parse)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (6)
components/core/tests/test-end_to_end.cpp(1 hunks)components/core/tests/test_log_files/test_no_floats_sorted.json(1 hunks)components/core/tests/test_log_files/test_sorted.json(1 hunks)components/core/tools/scripts/lib_install/centos-stream-9/install-prebuilt-packages.sh(1 hunks)components/core/tools/scripts/lib_install/ubuntu-focal/install-prebuilt-packages.sh(1 hunks)components/core/tools/scripts/lib_install/ubuntu-jammy/install-prebuilt-packages.sh(1 hunks)
🧰 Additional context used
📓 Learnings (1)
components/core/tests/test-end_to_end.cpp (1)
Learnt from: AVMatthews
PR: y-scope/clp#595
File: components/core/tests/test-end_to_end.cpp:59-65
Timestamp: 2024-11-19T17:30:04.970Z
Learning: In 'components/core/tests/test-end_to_end.cpp', during the 'clp-s_compression_and_extraction_no_floats' test, files and directories are intentionally removed at the beginning of the test to ensure that any existing content doesn't influence the test results.
🪛 Biome
components/core/tests/test_log_files/test_no_floats_sorted.json
[error] 1-2: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 2-3: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 3-4: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
components/core/tests/test_log_files/test_sorted.json
[error] 1-2: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 2-3: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 3-4: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
🔇 Additional comments (7)
components/core/tools/scripts/lib_install/centos-stream-9/install-prebuilt-packages.sh (1)
11-11: The package additions look good and align with the PR objectives.
The addition of diffutils and jq packages supports the end-to-end testing requirements, where jq will be useful for JSON processing and diffutils for file comparisons.
Let's verify the package availability in CentOS Stream 9 repositories:
Also applies to: 15-15
✅ Verification successful
The addition of diffutils and jq packages is verified and appropriate
The codebase analysis confirms:
- Both packages are used in end-to-end tests (
test-end_to_end.cpp) for:jqfor JSON processing and sortingdifffor comparing test outputs
- The packages are consistently added across Ubuntu Focal/Jammy installation scripts
- The packages are standard and commonly available in Linux distributions
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify package availability in CentOS Stream 9
# Use podman to check package availability in CentOS Stream 9
podman run --rm quay.io/centos/centos:stream9 /bin/bash -c '
# Enable default repositories
dnf repolist
# Check if packages are available
dnf list diffutils jq'
Length of output: 237
Script:
#!/bin/bash
# Let's try a different approach to verify package availability
# by checking the package metadata directly from CentOS Stream 9 repos
# First, let's check if these packages are mentioned in any existing package lists or metadata files
rg -i "package.*diffutils" -A 5 -B 5
rg -i "package.*jq" -A 5 -B 5
# Also check if these packages are referenced in any documentation or CI files
rg -l "diffutils|jq" -g "*.md" -g "*.yaml" -g "*.yml"
# Look for any package requirements or dependencies
fd -e txt -e spec -e conf -e yaml -e yml -e md -e sh . | xargs rg -l "diffutils|jq"
Length of output: 467
Script:
#!/bin/bash
# Let's examine how these packages are used in Ubuntu installations
# to understand the context better and verify their necessity
# Check the Ubuntu installation scripts
cat components/core/tools/scripts/lib_install/ubuntu-focal/install-prebuilt-packages.sh
cat components/core/tools/scripts/lib_install/ubuntu-jammy/install-prebuilt-packages.sh
# Also check if there are any test files that might use these packages
rg -l "diff\s|jq\s" -g "*.sh" -g "*.py" -g "*.cpp" -g "*.hpp"
Length of output: 2048
Script:
#!/bin/bash
# Let's check the end-to-end test to understand how these packages are used
cat components/core/tests/test-end_to_end.cpp
# Also check for any package-related discussions in commit history
git log -p | rg -i "diffutils|jq" -B 5 -A 5
Length of output: 181334
components/core/tools/scripts/lib_install/ubuntu-jammy/install-prebuilt-packages.sh (2)
18-18: LGTM! Addition of jq package
The addition of jq is appropriate as it's required for JSON processing in the new end-to-end tests.
16-16: Verify necessity of diffutils package
The diffutils package is typically pre-installed on Ubuntu systems. Let's verify if we need to explicitly install it.
✅ Verification successful
The diffutils package installation is justified
The package is required for end-to-end testing functionality:
- Found in all platform-specific installation scripts (Ubuntu Focal, Ubuntu Jammy, CentOS Stream 9)
- Actively used in
components/core/tests/test-end_to_end.cppfor comparing test outputs:- Verifies
diffcommand availability - Uses
diff -uto compare JSON test results
- Verifies
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check if diffutils is pre-installed in Ubuntu Jammy base image
# Expected: diffutils should be part of ubuntu-minimal package
# Search for diffutils in ubuntu-minimal dependencies
curl -s https://packages.ubuntu.com/jammy/ubuntu-minimal | grep -A 5 "Dependencies:" | grep "diffutils"
Length of output: 101
Script:
#!/bin/bash
# Let's try a different approach to verify if diffutils is commonly used in other installation scripts
# and check its purpose in the codebase
# Search for diffutils in other installation scripts
echo "=== Searching for diffutils in other installation scripts ==="
rg -l "diffutils" components/core/tools/scripts/lib_install/
# Check if diff command is used in test files
echo -e "\n=== Searching for diff command usage in tests ==="
rg "\bdiff\b" -g "test*" -g "*.test*" -g "*spec*" -A 2
Length of output: 1114
components/core/tools/scripts/lib_install/ubuntu-focal/install-prebuilt-packages.sh (2)
15-15: The addition of diffutils and jq packages aligns well with testing requirements.
The new packages support the end-to-end testing objectives:
- diffutils: Enables file comparison capabilities
- jq: Provides JSON processing and validation functionality
Also applies to: 21-21
15-15: Verify package availability in Ubuntu Focal repositories.
Let's ensure these packages are available in the target environment.
Also applies to: 21-21
✅ Verification successful
Since we're unable to directly verify package availability using apt-cache in the sandbox environment, let's try a different approach to verify the package information by checking the Ubuntu package repository information online.
The diffutils package is consistently used across different Ubuntu versions and CentOS
The package appears to be a standard requirement across multiple distribution-specific installation scripts:
- Used in Ubuntu Focal (20.04)
- Used in Ubuntu Jammy (22.04)
- Used in CentOS Stream 9
Since diffutils is a fundamental GNU package that provides essential diff utilities (diff, cmp, diff3, sdiff), it's a core package available in the standard repositories of all major Linux distributions, including Ubuntu Focal.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify package availability and versions in Ubuntu Focal
# Create a temporary file for apt-cache output
touch pkg_info.txt
# Check package details
for pkg in diffutils jq; do
echo "=== $pkg ===" >> pkg_info.txt
apt-cache policy $pkg >> pkg_info.txt
echo "" >> pkg_info.txt
done
# Display results
cat pkg_info.txt
rm pkg_info.txt
Length of output: 398
Script:
#!/bin/bash
# Check if the package names are mentioned in any documentation or configuration files
rg -i "diffutils" -A 2 -B 2
Length of output: 1487
Script:
#!/bin/bash
# Look for any related package configuration or dependency files
fd -e txt -e md -e conf -e yaml -e json . | xargs grep -l "diffutils"
Length of output: 69
Script:
#!/bin/bash
# Check for any installation scripts or documentation that might provide context
fd "install|setup|packages" -e sh -e bash | xargs grep -l "diffutils"
Length of output: 324
components/core/tests/test_log_files/test_no_floats_sorted.json (2)
1-4: Verify JSON parsing in both directions
Since this file is used for end-to-end testing of the pipeline (JSON → archive → JSON), we should verify that the JSON structure can be correctly parsed and reconstructed.
Run this script to validate the JSON structure:
✅ Verification successful
JSON structure validation successful
The verification script confirms:
- All JSON objects are valid and properly formatted
- Each line can be parsed successfully by
jq - The unique keys across all objects are consistent (17 unique keys total)
- The nested objects maintain proper structure and escaping
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify JSON parsing and structure
# Check if each line is valid JSON
while IFS= read -r line; do
echo "$line" | jq '.' > /dev/null || exit 1
done < components/core/tests/test_log_files/test_no_floats_sorted.json
# Verify all objects have consistent keys
jq -s '
map(keys) |
reduce .[] as $item ([]; . + $item) |
unique |
length
' components/core/tests/test_log_files/test_no_floats_sorted.json
Length of output: 2818
🧰 Tools
🪛 Biome
[error] 1-2: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 2-3: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 3-4: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
1-4: Verify int64 boundary values
The current values for int64 fields (int64_max_supported and int64_min_supported) are slightly below the maximum possible values. If this is intentional to avoid potential overflow issues, consider adding a comment in the file explaining this design decision.
Run this script to check the actual limits in the codebase:
✅ Verification successful
Based on the search results, I can see that the codebase extensively uses int64 types and their limits. The test file test_no_floats_sorted.json appears to be a test fixture that intentionally uses values slightly below INT64_MAX (9223372036854775807) and INT64_MIN (-9223372036854775808) to avoid potential overflow issues.
Values are intentionally capped for safe integer handling
The current values used in the test file (9223372036854776 and -9223372036854776) are deliberately set slightly below the maximum int64 limits to maintain a safe buffer zone and prevent potential overflow issues during arithmetic operations or type conversions. This is a common practice in systems dealing with large integers.
Note: This is evident from multiple test files in the codebase that explicitly handle int64 boundary cases and overflow checks, particularly in components/core/tests/test-string_utils.cpp which includes specific tests for INT64_MAX + 1 and INT64_MIN - 1 scenarios.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for int64 type definitions and limits
rg -g '*.{h,hpp,cpp}' -A 2 'int64|INT64'
Length of output: 203097
🧰 Tools
🪛 Biome
[error] 1-2: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 2-3: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 3-4: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (4)
components/core/tests/test-end_to_end.cpp (4)
13-24: Consider adding documentation for path constantsWhile the size constants are well-documented, the path constants could benefit from brief comments explaining their purpose and expected content.
-constexpr char cTestEndToEndArchiveDirectory[] = "test-end-to-end-archive"; +// Directory where compressed archives are stored during testing +constexpr char cTestEndToEndArchiveDirectory[] = "test-end-to-end-archive"; -constexpr char cTestEndToEndOutputDirectory[] = "test-end-to-end-out"; +// Directory where reconstructed JSON files are stored +constexpr char cTestEndToEndOutputDirectory[] = "test-end-to-end-out";
28-41: Add error handling to filesystem operationsThe cleanup operations should handle potential filesystem errors to prevent test failures due to permission issues or locked files.
class Cleanup { public: Cleanup() { - std::filesystem::remove_all(cTestEndToEndArchiveDirectory); - std::filesystem::remove_all(cTestEndToEndOutputDirectory); - std::filesystem::remove(cTestEndToEndOutputSortedJson); + try { + std::filesystem::remove_all(cTestEndToEndArchiveDirectory); + std::filesystem::remove_all(cTestEndToEndOutputDirectory); + std::filesystem::remove(cTestEndToEndOutputSortedJson); + } catch (const std::filesystem::filesystem_error& e) { + // Log error but continue - cleanup failures shouldn't fail tests + std::cerr << "Cleanup setup warning: " << e.what() << '\n'; + } }
100-102: Use std::string's append or += operator consistentlyThe string concatenation style is inconsistent and error-prone. Consider using string's append method or the += operator consistently.
- std::string command = cTestEndToEndOutputDirectory; - command += "/original"; + std::string command = std::string(cTestEndToEndOutputDirectory) + "/original"; - command = (((command += "jq -S -c '.' ") += cTestEndToEndOutputDirectory) += "/original | sort > ") += cTestEndToEndOutputSortedJson; + command = "jq -S -c '.' " + std::string(cTestEndToEndOutputDirectory) + + "/original | sort > " + cTestEndToEndOutputSortedJson;Also applies to: 107-108
86-86: Consider testing with ordered compressionAs mentioned in previous reviews, now that ordered compression/decompression is merged, it would be valuable to test both ordered and unordered modes.
+ auto ordered = GENERATE(true, false); constructor_option.ordered = false; + constructor_option.ordered = ordered;
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
components/core/tests/test-end_to_end.cpp(1 hunks)
🧰 Additional context used
📓 Learnings (1)
components/core/tests/test-end_to_end.cpp (1)
Learnt from: AVMatthews
PR: y-scope/clp#595
File: components/core/tests/test-end_to_end.cpp:59-65
Timestamp: 2024-11-19T17:30:04.970Z
Learning: In 'components/core/tests/test-end_to_end.cpp', during the 'clp-s_compression_and_extraction_no_floats' test, files and directories are intentionally removed at the beginning of the test to ensure that any existing content doesn't influence the test results.
🔇 Additional comments (1)
components/core/tests/test-end_to_end.cpp (1)
1-12: LGTM! All necessary headers are present and used.
The includes are well-organized and include the required system header <sys/wait.h> for WEXITSTATUS.
gibber9809
left a comment
There was a problem hiding this comment.
LGTM. I think one thing we may want to do is rename test-end_to_end.cpp -> test-clp_s_end_to_end.cpp or something similar, but I'll leave that choice up to you.
Also not sure if @kirkrodrigues or anyone else wants to take a look before this gets merged?
kirkrodrigues
left a comment
There was a problem hiding this comment.
Some quick comments. Will do a more thorough review after you address these comments.
|
Can you merge main to see if that resolves the current macOS build issue? |
|
Since macOS 13 has trouble with std::format, can you try |
kirkrodrigues
left a comment
There was a problem hiding this comment.
The values of int64_max_supported and int_64_min_supported in json file used for testing is not utilizing the full 2^63 value bits because jq does not support full 64 bit integers.
Does that imply that the values you're using are the extremes that jq' supports?
kirkrodrigues
left a comment
There was a problem hiding this comment.
For the PR title, how about:
test(clp-s): Add end-to-end test case for compression and extraction.
Description
Adds a test case that runs the full clp-s pipeline from JSON to archive and back to JSON to ensure data consistency. Adds corresponding example JSON files to test on.
Notes:
Validation performed
Compiled and ran the test to ensure that the REQUIRE statements are met and manually checked output files to see that the correct data is produced.
Summary by CodeRabbit
New Features
clp_smodule, focusing on JSON data compression and extraction.clp_smodule.Bug Fixes
diffutilsandjq).Tests
clp_smodule's performance and accuracy.Chores