GH-45073: [C++][Parquet] Fix generation of repetition levels for encryption test data#45074

Merged
pitrou merged 2 commits into apache:main from adamreeve:encryption-repetition-fix
Jan 6, 2025

Conversation

@adamreeve
Contributor

@adamreeve adamreeve commented Dec 19, 2024

Rationale for this change

This makes the test data readable by other Parquet implementations that validate the repetition levels.

What changes are included in this PR?

  • Corrects the generation of encryption test files so that the int64 list columns correctly start lists with repetition level 0.
  • Updates the parquet-testing submodule to use the corrected files.
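
To illustrate the first point, here is a minimal sketch of how repetition levels for a single-level list column are expected to look: the first value of each list carries repetition level 0 (start of a new row), and the remaining values carry level 1. The helper names `MakeListRepetitionLevels` and `CountRows` are hypothetical and are not taken from the Arrow test code; the sketch also assumes every list is non-empty.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Arrow): builds repetition levels for a
// column of int64 lists with max repetition level 1.
// Assumption: every list has at least one element.
std::vector<int16_t> MakeListRepetitionLevels(const std::vector<int>& list_lengths) {
  std::vector<int16_t> rep_levels;
  for (int length : list_lengths) {
    assert(length > 0 && "this sketch only handles non-empty lists");
    for (int i = 0; i < length; ++i) {
      // Level 0 starts a new list (a new row); level 1 continues the current list.
      const int16_t level = (i == 0) ? 0 : 1;
      rep_levels.push_back(level);
    }
  }
  return rep_levels;
}

// Hypothetical helper: a reader that validates repetition levels can recover
// the row count by counting the zeros, since each row begins with level 0.
int64_t CountRows(const std::vector<int16_t>& rep_levels) {
  return std::count(rep_levels.begin(), rep_levels.end(), 0);
}
```

For example, two lists of lengths 2 and 3 would yield the levels 0, 1, 0, 1, 1, and counting the zeros gives back two rows. If a list's first value were written with a non-zero level, a validating reader would see fewer rows than expected or reject the data.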

Are these changes tested?

Yes, covered by existing tests.

Are there any user-facing changes?

No

@adamreeve
Contributor Author

The tests will fail until the test data is first fixed in apache/parquet-testing#65 and the submodule updated here.

@adamreeve adamreeve force-pushed the encryption-repetition-fix branch from b326a78 to dbde2b0 Compare January 3, 2025 02:58
@adamreeve adamreeve marked this pull request as ready for review January 3, 2025 03:10
@adamreeve adamreeve requested a review from wgtmac as a code owner January 3, 2025 03:10
Member

@wgtmac wgtmac left a comment

+1

@github-actions github-actions bot added the awaiting committer review label and removed the awaiting review label Jan 3, 2025
@adamreeve
Contributor Author

The integration build error looks like #45128. The error message differs from the one reported there, but it is the same as the current failure on the main branch:

/arrow/ci/scripts/integration_arrow_build.sh: line 62: /arrow/java/ci/scripts/java_jni_build.sh: No such file or directory

@pitrou pitrou force-pushed the encryption-repetition-fix branch from dbde2b0 to 17dd3c2 Compare January 6, 2025 11:02
Member

@pitrou pitrou left a comment

+1, thanks for this @adamreeve

@pitrou pitrou merged commit a931aff into apache:main Jan 6, 2025
@pitrou pitrou removed the awaiting committer review label Jan 6, 2025
@adamreeve adamreeve deleted the encryption-repetition-fix branch January 6, 2025 20:22
@conbench-apache-arrow

After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit a931aff.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 5 possible false positives for unstable benchmarks that are known to sometimes produce them.

zeroshade added a commit to apache/arrow-go that referenced this pull request Mar 12, 2025
As a result of apache/arrow#45073, the test
files in the parquet-testing repo were updated
(apache/parquet-testing#65).

This required updating the corresponding tests here as well,
similar to apache/arrow#45074