feature-1659: Initial draft #1680
  decipher.init(Cipher.DECRYPT_MODE, secretKey, new GCMParameterSpec(tagSize, ivBytes));
  return decipher.doFinal(encryptedData);
} catch (Exception e) {
  throw new RuntimeException("Error while decrypting data", e);
WDYT about creating a new exception class for encryption, like EncryptionException, that extends ArcadeDBException (so it's a RuntimeException)?
Yes, I'll do it that way.
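A minimal sketch of what the suggestion above could look like, assuming AES/GCM with a 12-byte IV prepended to the ciphertext and a 128-bit tag. `BaseDbException` is a hypothetical stand-in for ArcadeDBException so the example compiles on its own, and `AesGcmSketch`, `IV_BYTES` and `TAG_BITS` are illustrative names, not the PR's actual code:

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical stand-in for ArcadeDBException (assumption, so the sketch is self-contained).
class BaseDbException extends RuntimeException {
  BaseDbException(final String message, final Throwable cause) {
    super(message, cause);
  }
}

// Dedicated unchecked exception for encryption failures, as proposed in the review.
class EncryptionException extends BaseDbException {
  EncryptionException(final String message, final Throwable cause) {
    super(message, cause);
  }
}

class AesGcmSketch {
  static final int IV_BYTES = 12;  // assumed IV length
  static final int TAG_BITS = 128; // assumed authentication tag size

  static SecretKey newKey() {
    try {
      final KeyGenerator generator = KeyGenerator.getInstance("AES");
      generator.init(256);
      return generator.generateKey();
    } catch (GeneralSecurityException e) {
      throw new EncryptionException("Error while generating key", e);
    }
  }

  // Returns IV || ciphertext+tag; a fresh random IV is used for every call.
  static byte[] encrypt(final SecretKey key, final byte[] data) {
    try {
      final byte[] iv = new byte[IV_BYTES];
      new SecureRandom().nextBytes(iv);
      final Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
      cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
      final byte[] encrypted = cipher.doFinal(data);
      return ByteBuffer.allocate(iv.length + encrypted.length).put(iv).put(encrypted).array();
    } catch (GeneralSecurityException e) {
      throw new EncryptionException("Error while encrypting data", e);
    }
  }

  // Reads the IV from the first IV_BYTES bytes, then decrypts and verifies the rest.
  static byte[] decrypt(final SecretKey key, final byte[] encryptedData) {
    try {
      final Cipher decipher = Cipher.getInstance("AES/GCM/NoPadding");
      decipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, encryptedData, 0, IV_BYTES));
      return decipher.doFinal(encryptedData, IV_BYTES, encryptedData.length - IV_BYTES);
    } catch (GeneralSecurityException | IllegalArgumentException e) {
      throw new EncryptionException("Error while decrypting data", e);
    }
  }
}
```

Catching GeneralSecurityException instead of Exception keeps unrelated bugs (for example a NullPointerException) from being reported as encryption failures; whether the PR should do the same is a design choice left to the authors.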
public void serializeValue(final Database database, final Binary serialized, final byte type, Object value) {
  if (value == null)
    return;
  Binary content = dataEncryption != null ? new Binary() : serialized;
Why do you need to create a new Binary() object when the content is encrypted?
case BinaryTypes.TYPE_RID:
  break;
default:
  serialized.putBytes(dataEncryption.encrypt(content.toByteArray()));
I see, you're encrypting every single value. Why not encrypt the whole Binary at the end, once all the values are serialized? This would be much faster, and you wouldn't need to differentiate special cases like RIDs. WDYT?
Thanks, I'll have a look
I looked through the code to find the best place to do it. Binary is referenced in plenty of places, and interactions with BaseRecord#buffer seemed to be what I was looking for. However, that is also nested fairly deep inside many classes, which makes it difficult to intercept the right moment to perform the write/read: the buffer gets passed around and is often accessed directly to perform partial reads.
In the end, I chose to implement encryption in BinarySerializer, in serializeProperties(), deserializeProperties(), deserializeProperty() and hasProperty(). The idea is to write all properties encrypted at once, and do the reverse on read. The issue I ran into is that serializeProperties() writes two parts: header and content. The header stores propertiesCount and, for each propertyKey, the bufferPosition of its value. This allows reading just one value from the whole buffer instead of reading all properties to find one. However, this isn't possible with the goal above. I have to either refactor the code to read and decode all properties from the buffer, regardless of which one is requested, and then filter by key, or leave the existing behaviour.
I could also refactor the structure of BaseRecord#binary to distinguish between metadata (like the ID) and data (encrypted), but that would mean even more refactoring, and it wouldn't solve cherry-picking properties anyway.
I understand that the current implementation, which encrypts each value, adds a performance cost. I haven't measured how it would affect our product yet, but it is a mandatory requirement for us anyway.
What do you think?
Sorry for the delay on this review. It makes sense: it's better to avoid encrypting the metadata, so the properties you're requesting can still be accessed quickly. Also, this makes it easier to encrypt a single property by adding some configuration (in Schema -> Type -> Property).
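To illustrate the trade-off discussed in this thread, here is a simplified sketch of a record with a plain header (property keys plus value offsets) and individually encrypted values. The layout, the class `RecordLayoutSketch` and the `xorStandIn` transform are assumptions made for illustration only; they are not ArcadeDB's actual binary format, and the XOR transform is a placeholder, not real encryption:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Map;
import java.util.function.UnaryOperator;

class RecordLayoutSketch {
  // Assumed layout: [headerLength][count][(keyLength, key, valueOffset, valueLength) * count][values]
  // Fixed 1 KB scratch buffers keep the sketch short; a real serializer would grow them.
  static byte[] serialize(final Map<String, String> properties, final UnaryOperator<byte[]> encrypt) {
    final ByteBuffer header = ByteBuffer.allocate(1024);
    final ByteBuffer content = ByteBuffer.allocate(1024);
    header.putInt(properties.size());
    for (final Map.Entry<String, String> entry : properties.entrySet()) {
      final byte[] key = entry.getKey().getBytes(StandardCharsets.UTF_8);
      // Only the value is encrypted; the key and its offset stay readable.
      final byte[] value = encrypt.apply(entry.getValue().getBytes(StandardCharsets.UTF_8));
      header.putInt(key.length).put(key).putInt(content.position()).putInt(value.length);
      content.put(value);
    }
    header.flip();
    content.flip();
    final int headerLength = header.remaining();
    return ByteBuffer.allocate(4 + headerLength + content.remaining())
        .putInt(headerLength).put(header).put(content).array();
  }

  // Scans the plain header and decrypts only the requested value; returns null if absent.
  static String readProperty(final byte[] record, final String name, final UnaryOperator<byte[]> decrypt) {
    final ByteBuffer buffer = ByteBuffer.wrap(record);
    final int contentStart = 4 + buffer.getInt();
    final int count = buffer.getInt();
    for (int i = 0; i < count; i++) {
      final byte[] key = new byte[buffer.getInt()];
      buffer.get(key);
      final int offset = buffer.getInt();
      final int length = buffer.getInt();
      if (name.equals(new String(key, StandardCharsets.UTF_8))) {
        final int from = contentStart + offset;
        final byte[] encrypted = Arrays.copyOfRange(record, from, from + length);
        return new String(decrypt.apply(encrypted), StandardCharsets.UTF_8);
      }
    }
    return null;
  }

  // Placeholder transform so the sketch runs without a key; NOT real encryption.
  static byte[] xorStandIn(final byte[] input) {
    final byte[] output = new byte[input.length];
    for (int i = 0; i < input.length; i++)
      output[i] = (byte) (input[i] ^ 0x5A);
    return output;
  }
}
```

Because the header is left in the clear, reading one property calls `decrypt` once, whatever the number of properties in the record. Encrypting the whole buffer as a single blob would instead require decrypting everything before any offset could be used.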
OK, I'm doing some tests before merging it. I'd like to provide a way to define encryption without using Java, so it would also work in a standard client/server configuration. @gramian WDYT about it?
Thanks, @pawellhasa, for your contribution!
I think the feature is useful. I have some questions:
To set the encryption now, you have to call this API before using the database (right after opening it):
database.setDataEncryption(DefaultDataEncryption.useDefaults(DefaultDataEncryption.getSecretKeyFromPasswordUsingDefaults(password, salt)));
Good idea about a micro benchmark to see the cost of writing and reading with the default settings. Also, if we provide a new boolean property
@pawellhasa WDYT?
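For readers unfamiliar with deriving a key from a password and salt, the following is a rough sketch using the JDK's PBKDF2 support. It is not the implementation of getSecretKeyFromPasswordUsingDefaults; the class `PasswordKeySketch`, the String salt, the algorithm, the iteration count and the key length are all assumptions chosen for illustration:

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

class PasswordKeySketch {
  static final String ALGORITHM = "PBKDF2WithHmacSHA256"; // assumed KDF
  static final int ITERATIONS = 65536;                    // assumed iteration count
  static final int KEY_BITS = 256;                        // assumed AES key length

  // Derives an AES key deterministically from the password and salt.
  static SecretKey deriveKey(final String password, final String salt) {
    final PBEKeySpec spec = new PBEKeySpec(password.toCharArray(),
        salt.getBytes(StandardCharsets.UTF_8), ITERATIONS, KEY_BITS);
    try {
      final SecretKeyFactory factory = SecretKeyFactory.getInstance(ALGORITHM);
      final byte[] keyBytes = factory.generateSecret(spec).getEncoded();
      return new SecretKeySpec(keyBytes, "AES");
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("Error while deriving key from password", e);
    } finally {
      spec.clearPassword(); // wipe the internal copy of the password
    }
  }
}
```

The same password and salt always produce the same key, which is what lets a database opened later with the same credentials read data written earlier. It also means the salt has to be stored or configured somewhere outside the encrypted data.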
Selecting which properties to encrypt can help avoid the computing cost of encryption. I am away, so I won't be able to think about it properly until next week. I'll find out the impact on our end and whether we would consider encrypting just part of the data worth it.
NP. I'd prefer to complete this part and be done in terms of API (and SQL) before updating the docs, since that's when other users start using it.
ATM we don't want to exclude properties from encryption in our product. If something changes, we'll discuss it.
What does this PR do?
Initial draft of an encryption implementation at the serialisation level
Motivation
Provide REST encryption
Related issues
#1659
Additional Notes
This is a draft, so I'm looking for suggestions
Checklist
mvn clean package command