PARQUET-1477: Thrift crypto updates#124
Conversation
| 2: optional binary footer_key_metadata | ||
|
|
||
| /** Offset of encrypted Parquet footer **/ | ||
| 3: required i64 footer_offset |
There was a problem hiding this comment.
The 'footer_offset' field still exists in the diagram at the section 'Encrypted footer mode' of the design document (https://github.com/ggershinsky/parquet-format/blob/p1232-encryption-docs/Encryption.md).
There was a problem hiding this comment.
this is an outdated link. the current doc is at
https://github.com/apache/parquet-format/blob/encryption/Encryption.md
There was a problem hiding this comment.
Please update the Parquet-1178(https://docs.google.com/document/d/1T89G7xR0zHFV1f2pjTO28jtfVm8qoNVGEJQ70Rsk-bY) to let it point to the current doc.
| * Retrieval metadata of key used for signing the footer. | ||
| * Used only in encrypted files with plaintext footer. | ||
| */ | ||
| 9: optional binary footer_signing_key_metadata |
There was a problem hiding this comment.
This is not mentioned in the diagram 'Plaintext footer mode' of the design.
There was a problem hiding this comment.
|
|
||
| /** In files encrypted with AAD prefix without storing it, | ||
| * readers must supply the prefix **/ | ||
| 3: optional bool supply_aad_prefix |
There was a problem hiding this comment.
Are 'supply_aad_prefix' and 'aad_prefix'/'aad_file_unique' mutual exclusive? If yes, should we make it as union?
| @@ -863,23 +869,27 @@ struct ColumnIndex { | |||
| } | |||
|
|
|||
| struct AesGcmV1 { | |||
There was a problem hiding this comment.
AesGcmV1 and AesGcmCtrV1 seems identical. If they are same, can we just use one structure instead of duplicating?
There was a problem hiding this comment.
- we use them to specify two different algorithms. since we dont use enums, its a union with two structs
- the design is signed off. we shouldn't change it, lets only make sure the Thrift pr conforms to the merged design.
|
@shangxinli, please have a look at the new doc for encryption. If you want to know more about some of the choices in that doc, please look at #114 where most of the discussion happened. Let's focus in these pull requests on getting the code to match the doc. |
Thanks. I just looked at the new doc and reviewed the pr again. It looks good to me. I also tested it with a spark application by encrypting a column and decrypt it. It works. |
|
@shangxinli Thanks for the review and test, sounds good. |
gszadovszky
left a comment
There was a problem hiding this comment.
It looks good to me.
Will wait an additional 24 hours and merge to the feature branch.
No description provided.