-
-
Notifications
You must be signed in to change notification settings - Fork 4.6k
feat(encryption): Add EncryptedJSONField #104167
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a lot of comments on purpose to make it easier to understand what is going on
wedamija
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't totally understand the value of using a jsonb column here. It looks like we're just going to be storing the entire blob of data under one key name. In that case it may as well just be a text column.
How are we planning on using this field by the way? To some extent I wonder if it would be better to have this act as a normal jsonb field, but it encrypts only the values of keys that we specify in the model definition. A lot of our configs have just one key that should be encrypted, and it's generally useful to be able to see non-sensitive config in plaintext.
We could also have modes you set the field in. DEFAULT_ENCRYPT and DEFAULT_PLAINTEXT or something. Then you have a list of field that should be either encrypted or exempt from encryption, depending on the mode
Yes, but it has to be a drop-in replacement too for our current
I agree with this, that it would be better if everything but specific subset of keys was to be encrypted, but then this can explode in complexity in case we want to support nested keys too. An argument against it is the simplicity of misuse, I was trying to make this as simple to use, and to make it as hard as possible to misuse, and i am afraid that using extra config parameters will quickly lead to misuse, or us forgetting to add another key to the config once we start to write another sensitive key to the field. Since we are dealing with sensitive data, I would rather we go with overly cautious approach, and if there is a data that needs to be queried and filtered on, it can live in the separate column. @oioki what's your thoughts on this? Also, fwiw, I have talked with ecosystem team which owns all of the tables that would for now use |
|
I agree with @vgrozdanic here. It is easy to make a mistake and have new sensitive subfields unencrypted. The current implementation provides a middle-ground between encrypting the whole database row and subfields. If some new data has to be queryable and is non-sensitive, let's put it to a separate field not mixed with sensitive data. |
Is this field going to be applied to all json fields? I think that you need to consult more with product engineers before pushing this onto folks. We've been without encrypted json fields for years, and I think it's worth taking the time to do it right now that we're putting in the effort. |
@wedamija no, just to a small subset that contain sensitive data (5 tables listed here, for options table - sensitive data shouldn't be there in a first place, so we'll figure that out separately, because that field also need to be migrated from For now only in the integration related tables we plan to migrate current json fields to encrypted version of it, and we have already consulted with Ecosystem team, and we had a backend TSC about this topic where no bigger concerns where raised. |
Introduces encrypted version of Django's
JSONField, and it is a drop-in replacement for it.Since Django's field uses
jsonbpostgres field under the hood, we have to save encrypted value as a valid JSON, we can't save it as text or binary data.Data is stored as jsonb in the database with the encrypted payload wrapped in a structure:
{"sentry_encrypted_field_value": "enc:method:key_id:data"}(sentry_encrypted_field_valueis picked to avoid conflicts with current key values, but even if it happens, and we can't decrypt it, we will return it as-is, we won't crash).Basically how it works:
sentry_encrypted_field_valuekey) → store as jsonbsentry_encrypted_field_valueis present) → decrypt → parse JSON → Python dict objectCloses TET-512: Create EncryptedJSONField