fix: Fix nested column obfuscation #20981
| str := column.ValueStr(i) | ||
| for _, jc := range jcs { | ||
| val := gjson.Get(column.ValueStr(i), jc.columnPath) | ||
| if val.Exists() && val.Type == gjson.String { |
This check was the biggest problem: for queries like smth.#.X, gjson returns an array rather than a string, so the `val.Type == gjson.String` condition is never met.
| updatedRecord.Column(0).(*array.String).Value(1)) | ||
| assert.Equal(t, | ||
| fmt.Sprintf(`{"foo":{"bar":["%s ca978112ca1bbdcafac231b39a23dc4da786eff8147c4e72b9807785afee48bb","%s 3e23e8160039594a33894f6564e1b1348bbd7a0088d42c4acb73eeaed59c009d","c"]},"hello":"world"}`, redactedByCQMessage, redactedByCQMessage), | ||
| fmt.Sprintf(`{"foo":{"bar":["%s ac8d8342bbb2362d13f0a559a3621bb407011368895164b628a54f7fc33fc43c","%s c100f95c1913f9c72fc1f4ef0847e1e723ffe0bde0b36e5f36c13f81fe8c26ed","c"]},"hello":"world"}`, redactedByCQMessage, redactedByCQMessage), |
Not sure whether we'd consider this a breaking change, since it only affects JSON columns. I could probably make it backward compatible if needed.
The only downside of this approach is that nested columns will be obfuscated with a single SHA value, because gjson returns an array.
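To make the two points above concrete, here is a minimal standard-library sketch. It assumes the `"%s %x"` format seen in the diff; `redactedPlaceholder`, `hexDigest`, and `redact` are hypothetical names for illustration (the real value of `redactedByCQMessage` is not shown in this conversation). Hashing the raw JSON token (quotes included) gives a different digest than hashing the unquoted string, and hashing a whole raw array collapses it into one redaction string. The digest of the unquoted `a` is the `ca978112…` value that appears in one of the two expectations in the test diff; reading that as the reason the expected hashes changed is an inference.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// redactedPlaceholder stands in for redactedByCQMessage, whose actual
// value is not shown in this PR conversation (assumption).
const redactedPlaceholder = "<redacted>"

// hexDigest returns the lowercase hex SHA-256 of s.
func hexDigest(s string) string {
	return fmt.Sprintf("%x", sha256.Sum256([]byte(s)))
}

// redact mirrors the "%s %x" format used in the diff, applied to a raw JSON token.
func redact(raw string) string {
	return fmt.Sprintf("%s %x", redactedPlaceholder, sha256.Sum256([]byte(raw)))
}

func main() {
	// Unquoted string vs. raw JSON token: the digests differ.
	fmt.Println(hexDigest("a"))   // ca978112ca1bbdcafac231b39a23dc4da786eff8147c4e72b9807785afee48bb
	fmt.Println(hexDigest(`"a"`)) // a different digest, because the quotes are hashed too

	// A whole raw array becomes one redaction string, not one per element.
	fmt.Println(redact(`["a","b"]`))
}
```

Per-element hashes would require resolving each array index to its own path before calling the setter, which is what the suggestion further down attempts.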
| if val.Exists() { | ||
| if modified, err := sjson.Set(str, jc.columnPath, fmt.Sprintf("%s %x", redactedByCQMessage, sha256.Sum256([]byte(val.Raw)))); err == nil { |
| if val.Exists() { | |
| if modified, err := sjson.Set(str, jc.columnPath, fmt.Sprintf("%s %x", redactedByCQMessage, sha256.Sum256([]byte(val.Raw)))); err == nil { | |
| if val.Exists() { | |
| if val.Type != gjson.String { | |
| // Check if value is an array and if so iterate through its elements and replace the # in the column path with the index | |
| for i, arrayVal := range val.Array() { | |
| updatedColumnPath := strings.Replace(jc.columnPath, "#", fmt.Sprintf("%d", i), 1) | |
| if modified, err := sjson.Set(str, updatedColumnPath, fmt.Sprintf("%s %x", redactedByCQMessage, sha256.Sum256([]byte(arrayVal.Raw)))); err == nil { | |
| str = modified | |
| } | |
| } | |
| } else { | |
| if modified, err := sjson.Set(str, jc.columnPath, fmt.Sprintf("%s %x", redactedByCQMessage, sha256.Sum256([]byte(val.Raw)))); err == nil { | |
| str = modified | |
| continue | |
| } | |
| } |
What about something like this?
Yeah, that works for the basic scenario, but for deeply nested paths like 'object1.object2.#.nested_object1.nested_object2.#.nested2_object1' gjson returns an array of arrays of values, e.g. '[[1,2],[3,4]]',
so we'd need to replace every # with the index of the corresponding element, like:
object1.object2.0.nested_object1.nested_object2.0.nested2_object1
object1.object2.0.nested_object1.nested_object2.1.nested2_object1
etc. Of course, that's also possible to do.
On top of that, to be fully compliant we'd need to handle #(), which is also a valid query that lets you conditionally select elements.
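For reference, the index expansion described above could be sketched roughly as follows. This is not what the PR implements: it uses only the standard library, walks a decoded document rather than going through gjson, and `expand`, `extend`, and `concretePaths` are hypothetical helpers. It handles only the plain `#` wildcard and deliberately ignores `#()` queries.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// expand walks the decoded JSON value along segments and returns every
// concrete path, replacing each "#" with the indexes of the array found there.
func expand(node any, segments, prefix []string) []string {
	if len(segments) == 0 {
		return []string{strings.Join(prefix, ".")}
	}
	seg, rest := segments[0], segments[1:]
	switch v := node.(type) {
	case map[string]any:
		child, ok := v[seg]
		if !ok {
			return nil // key missing: nothing to obfuscate on this branch
		}
		return expand(child, rest, extend(prefix, seg))
	case []any:
		if seg != "#" {
			return nil // this sketch only handles the plain "#" wildcard
		}
		var out []string
		for i, child := range v {
			out = append(out, expand(child, rest, extend(prefix, strconv.Itoa(i)))...)
		}
		return out
	}
	return nil // scalar reached before the path was consumed
}

// extend returns a copy of prefix with seg appended, so branches never share storage.
func extend(prefix []string, seg string) []string {
	next := make([]string, 0, len(prefix)+1)
	next = append(next, prefix...)
	return append(next, seg)
}

// concretePaths is a hypothetical helper, not part of the plugin.
func concretePaths(doc, path string) ([]string, error) {
	var root any
	if err := json.Unmarshal([]byte(doc), &root); err != nil {
		return nil, err
	}
	return expand(root, strings.Split(path, "."), nil), nil
}

func main() {
	doc := `{"object1":{"object2":[
		{"nested_object1":{"nested_object2":[{"nested2_object1":1},{"nested2_object1":2}]}},
		{"nested_object1":{"nested_object2":[{"nested2_object1":3},{"nested2_object1":4}]}}
	]}}`
	paths, err := concretePaths(doc, "object1.object2.#.nested_object1.nested_object2.#.nested2_object1")
	if err != nil {
		panic(err)
	}
	for _, p := range paths {
		fmt.Println(p) // one fully indexed path per leaf value
	}
}
```

Each resulting path could then be passed to the setter individually, at the cost of one set call per leaf.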

Since nested JSON values shouldn't be used as pk/pkc, we decided with @murarustefaan not to implement that right now. I've added a TODO to reconsider it in the future.
…-does-not-work-with-nested-array
🤖 I have created a release *beep* *boop*

---

## [2.6.1](plugins-transformer-basic-v2.6.0...plugins-transformer-basic-v2.6.1) (2025-07-09)

### Bug Fixes

* Fix nested column obfuscation ([#20981](#20981)) ([96d55cd](96d55cd))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Summary
This PR fixes obfuscation of nested columns