ENH: Skip MD5 key derivation for AES-256 encrypted PDFs#3694
ENH: Skip MD5 key derivation for AES-256 encrypted PDFs#3694stefan6419846 merged 1 commit intopy-pdf:mainfrom
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3694 +/- ##
=======================================
Coverage 97.43% 97.43%
=======================================
Files 55 55
Lines 10005 10008 +3
Branches 1838 1839 +1
=======================================
+ Hits 9748 9751 +3
Misses 149 149
Partials 108 108 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Thanks for the report and PR. I guess this relates to section 7.6.3.3 (Algorithm 1.A) of the PDF 2.0 specification? Could we document such a reference inside the code please? Additionally, are we able to somehow add a corresponding test for this to ensure it does not accidentally break? |
daa8df7 to
5cd75cd
Compare
|
@stefan6419846 Thanks, I've added a note there explaining why the MD5 computation is guarded for V>=5. For the test, I've added test_aes256_decrypt_does_not_call_md5 which monkeypatches hashlib.md5 to raise a ValueError (simulating FIPS) |
5cd75cd to
45ac2f7
Compare
For V>=5 PDFs, the encryption key is used directly without MD5. MD5 computation in _make_crypt_filter() only runs for V<=4, allowing AES-256 encrypted PDFs to be read on FIPS-enabled systems where hashlib.md5() is blocked. RC4 and AES-128 encrypted PDFs will still correctly fail on FIPS systems, as their key requires MD5.
45ac2f7 to
b1db9f9
Compare
## What's new ### Security (SEC) - Disallow custom XML entity declarations for XMP metadata (#3724) by @stefan6419846 ### New Features (ENH) - Skip MD5 key derivation for AES-256 encrypted PDFs (#3694) by @Ygnas ### Bug Fixes (BUG) - Use remove_orphans in compress_identical_objects (#3310) by @j-t-1 - Fix PdfReadError when xref table contains comments before trailer (#3710) by @rassie - Correctly verify AES padding during decryption (#3699) by @stefan6419846 - Fix stale object cache from non-authoritative object streams (#3698) by @astahlman - Fix extract_links pairing when annotations include non-links (#3687) by @ReinerBRO ### Documentation (DOC) - Add AI policy (#3717) by @stefan6419846 [Full Changelog](6.9.2...6.10.0)
For V>=5 PDFs, the encryption key is used directly without MD5. MD5 computation in _make_crypt_filter() only runs for V<=4, allowing AES-256 encrypted PDFs to be read on FIPS-enabled systems where hashlib.md5() is blocked.
RC4 and AES-128 encrypted PDFs will still correctly fail on FIPS systems, as their key requires MD5.