GH-39444: [C++][Parquet] Fix crash in Modular Encryption#39623
GH-39444: [C++][Parquet] Fix crash in Modular Encryption#39623pitrou merged 2 commits intoapache:mainfrom
Conversation
|
Thanks for opening a pull request! If this is not a minor PR. Could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project. Then could you also rename the pull request title in the following format? or In the case of PARQUET issues on JIRA the title also supports: See also: |
|
|
|
Looks like I am still getting a segmentation fault randomly in InternalFileDecryptor when WipeOut is called. Ill look into seeing if I can add a mutex around all_decryptors_ |
|
It looks like all but the C++ / AMD64 Windows 2019 C++17 AVX2 build was completed without issue. I am not sure what the issue is with the C++ / AMD64 Windows 2019 C++17 AVX2, build but it seems unrelated. |
wgtmac
left a comment
There was a problem hiding this comment.
Thanks for the thorough investigation and quick fix!
|
The C++ side looks good to me. I have also refactored the C++ tests. Could you take a look as well? @jorisvandenbossche @westonpace @pitrou |
|
@adamreeve @westonpace Do you want to take another pass? |
westonpace
left a comment
There was a problem hiding this comment.
@wgtmac no additional concerns on my end
|
@github-actions crossbow submit -g python -g cpp |
There was a problem hiding this comment.
I'm not sure this will work on builds where Parquet is disabled. Let's watch for CI.
There was a problem hiding this comment.
We might not have an actual test build without parquet .. (but indeed in general we always import pyarrow.parquet within a try/except to let this run on versions without parquet support as well)
There was a problem hiding this comment.
I was just deducting that from the fact we got no failures in the builds you triggered.
But looking at our tasks.yml, it seems the python minimal builds are not included in the "python" group ...
There was a problem hiding this comment.
Testing and potentially fixing in #40505 (and at once adding the minimal tests to the python group)
This comment was marked as outdated.
This comment was marked as outdated.
|
Question: the caching was probably there for a reason, is it ok to remove it? Would it be beneficial to reimplement it in a thread-safe way? (the PR description mentions network requests) |
|
Hmm, I was assuming the Cython compatibility issues were fixed? @raulcd |
|
Ah, that's because this PR needs rebasing. @tolleybot Can you please rebase/merge on git main? |
…void issues in multithreaded environment. Removed unsed column_metatdata_map_ and column_data_map_ class attributes Updates to include new test for python and c++ that test a large number of rows used for encryption and decryption of a dataset adding header to unit test adding a mutex to guard EVP_CIPHER_CTX in AesDecryptorImpl update for formatting removed warning concerning type conversion from double to int64 adding a mutex to guard EVP_CIPHER_CTX in AesDecryptorImpl. My previous commit actually did the same for AesEncryptorImpl Moved mutex to InternalFileDecryptor for protection of :vector<std::weak_ptr<encryption::AesDecryptor>> all_decryptors_. This seems to have fixed the issue Update cpp/src/parquet/encryption/internal_file_decryptor.cc Co-authored-by: Gang Wu <ustcwg@gmail.com> removed unneeded #include <mutex> This commit includes the following changes to the InternalFileDecryptor class: - Moved the `mutex_` declaration closer to `all_decryptors_`. This improves the readability of the code and makes it clear that `mutex_` is primarily used for protecting the `all_decryptors_` vector. - Made `mutex_` mutable. This allows the mutex to be locked in `const` member functions - Added a `GUARDED_BY(mutex_)` comment to `all_decryptors_`. This explicitly documents that access to `all_decryptors_` is protected by `mutex_` Removed lock in InternalFileDecryptor::WipeOutDecryptionKey(). Made mutex mutable. Update python/pyarrow/tests/test_dataset_encryption.py Co-authored-by: Adam Reeve <adreeve@gmail.com> Update python/pyarrow/tests/test_dataset_encryption.py Co-authored-by: Adam Reeve <adreeve@gmail.com> - adding a block to limit the scope of our lock from lines 147-152 in internal_file_decryptor.cc - changed #define ROW_COUNT 32769 to use constexpr in large_row_parquet_encrypt_test.cc simplify tests included a lock in WipeOutDecryptionKeys Update python/pyarrow/tests/test_dataset_encryption.py Co-authored-by: Antoine Pitrou <pitrou@free.fr> removing unneeded init
8677414 to
c8f2cc4
Compare
|
@github-actions crossbow submit -g python -g cpp |
|
Revision: c8f2cc4 Submitted crossbow builds: ursacomputing/crossbow @ actions-c569d19f76 |
pitrou
left a comment
There was a problem hiding this comment.
Thanks a lot @tolleybot . I'm going to merge after the latest update.
|
@raulcd Not a blocker, but if you ever need to do a RC3, this would be a good candidate for inclusion :-) |
|
After merging your PR, Conbench analyzed the 7 benchmarking runs that have been run so far on merge-commit dd6d728. There were 8 benchmark results indicating a performance regression:
The full Conbench report has more details. It also includes information about 3 possible false positives for unstable benchmarks that are known to sometimes produce them. |
Rationale for this change:
This pull request addresses a critical issue (GH-39444) in the C++/Python components of Parquet, specifically a segmentation fault occurring when processing encrypted datasets over 2^15 rows. The fix involves modifications in
cpp/src/parquet/encryption/internal_file_decryptor.cc, particularly inInternalFileDecryptor::GetColumnDecryptor. The caching of theDecryptorobject was removed to resolve the multithreading issue causing the segmentation fault and encryption failures.What changes are included in this PR?
Decryptorobject caching inInternalFileDecryptor::GetColumnDecryptor.large_row_parquet_encrypt_test.ccfor C++ and an update totest_dataset_encryption.pywithtest_large_row_encryption_decryptionfor Python.Are these changes tested?
Yes, the unit tests (
large_row_parquet_encrypt_test.ccandtest_large_row_encryption_decryptionintest_dataset_encryption.py) have been added to ensure the reliability and effectiveness of these changes.Are there any user-facing changes?
No significant user-facing changes, but the update significantly improves the backend stability and reliability of Parquet file handling. Calling DecryptionKeyRetriever::GetKey could be an expensive operation potentially involving network calls to key management servers.