Skip to content

Conversation

@vgrozdanic
Copy link
Member

@vgrozdanic vgrozdanic commented Dec 1, 2025

Introduces encrypted version of Django's JSONField, and it is a drop-in replacement for it.

Since Django's field uses jsonb postgres field under the hood, we have to save encrypted value as a valid JSON, we can't save it as text or binary data.

Data is stored as jsonb in the database with the encrypted payload wrapped in a structure: {"sentry_encrypted_field_value": "enc:method:key_id:data"} (sentry_encrypted_field_value is picked to avoid conflicts with current key values, but even if it happens, and we can't decrypt it, we will return it as-is, we won't crash).

Basically how it works:

  • Encryption: Python dict object → JSON to string → encrypt → wrap in JSON object (under sentry_encrypted_field_value key) → store as jsonb
  • Decryption: load jsonb → check for wrapper (check if sentry_encrypted_field_value is present) → decrypt → parse JSON → Python dict object
  • Fallback: If no wrapper present, return parsed JSON as-is

Closes TET-512: Create EncryptedJSONField

@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Dec 1, 2025
@vgrozdanic vgrozdanic requested review from a team and oioki December 1, 2025 17:40
@vgrozdanic vgrozdanic marked this pull request as ready for review December 1, 2025 17:40
@linear
Copy link

linear bot commented Dec 1, 2025

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a lot of comments on purpose to make it easier to understand what is going on

Copy link
Member

@wedamija wedamija left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't totally understand the value of using a jsonb column here. It looks like we're just going to be storing the entire blob of data under one key name. In that case it may as well just be a text column.

How are we planning on using this field by the way? To some extent I wonder if it would be better to have this act as a normal jsonb field, but it encrypts only the values of keys that we specify in the model definition. A lot of our configs have just one key that should be encrypted, and it's generally useful to be able to see non-sensitive config in plaintext.

We could also have modes you set the field in. DEFAULT_ENCRYPT and DEFAULT_PLAINTEXT or something. Then you have a list of field that should be either encrypted or exempt from encryption, depending on the mode

@vgrozdanic
Copy link
Member Author

I don't totally understand the value of using a jsonb column here. It looks like we're just going to be storing the entire blob of data under one key name. In that case it may as well just be a text column.

Yes, but it has to be a drop-in replacement too for our current JSONField which is actually a jsonb column. It's better than performing complex migrations and double writing to two columns (for some time) if we would be using a text column.

How are we planning on using this field by the way? To some extent I wonder if it would be better to have this act as a normal jsonb field, but it encrypts only the values of keys that we specify in the model definition. A lot of our configs have just one key that should be encrypted, and it's generally useful to be able to see non-sensitive config in plaintext.

We could also have modes you set the field in. DEFAULT_ENCRYPT and DEFAULT_PLAINTEXT or something. Then you have a list of field that should be either encrypted or exempt from encryption, depending on the mode

I agree with this, that it would be better if everything but specific subset of keys was to be encrypted, but then this can explode in complexity in case we want to support nested keys too.

An argument against it is the simplicity of misuse, I was trying to make this as simple to use, and to make it as hard as possible to misuse, and i am afraid that using extra config parameters will quickly lead to misuse, or us forgetting to add another key to the config once we start to write another sensitive key to the field.

Since we are dealing with sensitive data, I would rather we go with overly cautious approach, and if there is a data that needs to be queried and filtered on, it can live in the separate column. @oioki what's your thoughts on this?

Also, fwiw, I have talked with ecosystem team which owns all of the tables that would for now use EncryptedJSONField, they said that they could build in some views into Sentry admin that will let them see config keys for debugging, but they are ok with having the whole column value encrypted.

@oioki
Copy link
Member

oioki commented Dec 2, 2025

I agree with @vgrozdanic here. It is easy to make a mistake and have new sensitive subfields unencrypted. The current implementation provides a middle-ground between encrypting the whole database row and subfields. If some new data has to be queryable and is non-sensitive, let's put it to a separate field not mixed with sensitive data.

@wedamija
Copy link
Member

wedamija commented Dec 2, 2025

I agree with @vgrozdanic here. It is easy to make a mistake and have new sensitive subfields unencrypted. The current implementation provides a middle-ground between encrypting the whole database row and subfields. If some new data has to be queryable and is non-sensitive, let's put it to a separate field not mixed with sensitive data.

Is this field going to be applied to all json fields? I think that you need to consult more with product engineers before pushing this onto folks. We've been without encrypted json fields for years, and I think it's worth taking the time to do it right now that we're putting in the effort.

@vgrozdanic
Copy link
Member Author

Is this field going to be applied to all json fields? I think that you need to consult more with product engineers before pushing this onto folks. We've been without encrypted json fields for years, and I think it's worth taking the time to do it right now that we're putting in the effort.

@wedamija no, just to a small subset that contain sensitive data (5 tables listed here, for options table - sensitive data shouldn't be there in a first place, so we'll figure that out separately, because that field also need to be migrated from LegacyJSONField). Because of the small overhead this adds, it's recommended to use it only when the field contains some data that needs to be encrypted.

For now only in the integration related tables we plan to migrate current json fields to encrypted version of it, and we have already consulted with Ecosystem team, and we had a backend TSC about this topic where no bigger concerns where raised.

@vgrozdanic vgrozdanic merged commit 341f0c1 into master Dec 3, 2025
66 checks passed
@vgrozdanic vgrozdanic deleted the vg/encrypted-json-field branch December 3, 2025 08:08
@github-actions github-actions bot locked and limited conversation to collaborators Dec 18, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Scope: Backend Automatically applied to PRs that change backend components

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants