fix: Fix nested column obfuscation#20981

Merged
kodiakhq[bot] merged 4 commits into main from
feature/eng-1671-bug-obfuscation-transformer-does-not-work-with-nested-array
Jul 9, 2025

Conversation

@przste-go

Summary

⚠️ If you're contributing to a plugin please read this section of the contribution guidelines 🧑‍🎓 before submitting this PR ⚠️

This PR fixes obfuscation of nested columns.

@przste-go przste-go requested review from a team and jon-s58 July 8, 2025 14:23
str := column.ValueStr(i)
for _, jc := range jcs {
	val := gjson.Get(column.ValueStr(i), jc.columnPath)
	if val.Exists() && val.Type == gjson.String {
Author

@przste-go przste-go Jul 8, 2025

This check was the biggest problem: for queries like smth.#.X, gjson returns an array rather than a string.

updatedRecord.Column(0).(*array.String).Value(1))
assert.Equal(t,
fmt.Sprintf(`{"foo":{"bar":["%s ca978112ca1bbdcafac231b39a23dc4da786eff8147c4e72b9807785afee48bb","%s 3e23e8160039594a33894f6564e1b1348bbd7a0088d42c4acb73eeaed59c009d","c"]},"hello":"world"}`, redactedByCQMessage, redactedByCQMessage),
fmt.Sprintf(`{"foo":{"bar":["%s ac8d8342bbb2362d13f0a559a3621bb407011368895164b628a54f7fc33fc43c","%s c100f95c1913f9c72fc1f4ef0847e1e723ffe0bde0b36e5f36c13f81fe8c26ed","c"]},"hello":"world"}`, redactedByCQMessage, redactedByCQMessage),
Author

I'm not sure whether we consider this a breaking change, since it only affects JSON columns. I could probably make it backwards compatible if needed.
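
To illustrate why the expected digests in the test differ: the old value ca978112… is the SHA-256 of the bare string a, whereas the new code passes val.Raw to the hash, which presumably includes the surrounding JSON quotes. The sketch below uses only the standard library; the helper name digest is hypothetical and not part of the PR.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// digest returns the hex-encoded SHA-256 of s.
// (Hypothetical helper for illustration; not taken from the PR.)
func digest(s string) string {
	return fmt.Sprintf("%x", sha256.Sum256([]byte(s)))
}

func main() {
	decoded := "a" // the string value after JSON decoding
	raw := `"a"`   // the same value as a raw JSON token, quotes included

	fmt.Println(digest(decoded)) // hash of the decoded string
	fmt.Println(digest(raw))     // hash of the raw token: a different digest
}
```

Because the two inputs differ by the quote characters, any previously stored obfuscated value for a JSON string column would no longer match after the change.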

@przste-go
Author

The only downside of this approach is that all matched nested values are obfuscated with a single SHA value, because gjson returns them as one array:
{
"top_foo" : [ {
"foo" : "Redacted by CloudQuery | efe3e67a3f277ec61b7857edfa2df04488f78b8b9ef1ecd4b388ff656f143df3"
}, {
"foo" : "Redacted by CloudQuery | efe3e67a3f277ec61b7857edfa2df04488f78b8b9ef1ecd4b388ff656f143df3"
}, {
"foo" : "Redacted by CloudQuery | efe3e67a3f277ec61b7857edfa2df04488f78b8b9ef1ecd4b388ff656f143df3"
} ]
}
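
A minimal standard-library sketch of why every element ends up with the same digest: if the hash input is the raw text of the whole array, one replacement string is computed and reused, whereas hashing each element's raw token yields distinct values. The function names and the hard-coded prefix are assumptions for illustration; the real transformer uses gjson, sjson and its redactedByCQMessage constant.

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
)

// redact formats a replacement value in the style shown above.
// The prefix is an assumed stand-in for redactedByCQMessage.
func redact(raw []byte) string {
	return fmt.Sprintf("Redacted by CloudQuery | %x", sha256.Sum256(raw))
}

// sharedDigest hashes the raw JSON of the whole array once, so every
// element receives the same replacement.
func sharedDigest(values []string) ([]string, error) {
	raw, err := json.Marshal(values) // e.g. ["a","b","c"]
	if err != nil {
		return nil, err
	}
	replacement := redact(raw)
	out := make([]string, len(values))
	for i := range out {
		out[i] = replacement
	}
	return out, nil
}

// perElementDigest hashes each element's raw JSON token separately.
func perElementDigest(values []string) ([]string, error) {
	out := make([]string, len(values))
	for i, v := range values {
		raw, err := json.Marshal(v) // e.g. "a", quotes included
		if err != nil {
			return nil, err
		}
		out[i] = redact(raw)
	}
	return out, nil
}

func main() {
	values := []string{"a", "b", "c"}
	shared, _ := sharedDigest(values)
	perElement, _ := perElementDigest(values)
	fmt.Println(shared)     // three identical replacements
	fmt.Println(perElement) // three distinct replacements
}
```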

Comment on lines +436 to +437
if val.Exists() {
	if modified, err := sjson.Set(str, jc.columnPath, fmt.Sprintf("%s %x", redactedByCQMessage, sha256.Sum256([]byte(val.Raw)))); err == nil {
Collaborator

Suggested change
if val.Exists() {
	if modified, err := sjson.Set(str, jc.columnPath, fmt.Sprintf("%s %x", redactedByCQMessage, sha256.Sum256([]byte(val.Raw)))); err == nil {
if val.Exists() {
	if val.Type != gjson.String {
		// Check if value is an array and if so iterate through its elements and replace the # in the column path with the index
		for i, arrayVal := range val.Array() {
			updatedColumnPath := strings.Replace(jc.columnPath, "#", fmt.Sprintf("%d", i), 1)
			if modified, err := sjson.Set(str, updatedColumnPath, fmt.Sprintf("%s %x", redactedByCQMessage, sha256.Sum256([]byte(arrayVal.Raw)))); err == nil {
				str = modified
			}
		}
	} else {
		if modified, err := sjson.Set(str, jc.columnPath, fmt.Sprintf("%s %x", redactedByCQMessage, sha256.Sum256([]byte(val.Raw)))); err == nil {
			str = modified
			continue
		}
	}

What about something like this?

Author

@przste-go przste-go Jul 9, 2025

Yeah, that works for the basic scenario, but for deeply nested paths like 'object1.object2.#.nested_object1.nested_object2.#.nested2_object1' we receive an array of arrays of values, '[[1,2],[3,4]]',
so we'd need to replace each # with the index of the matching element, like:
object1.object2.0.nested_object1.nested_object2.0.nested2_object1
object1.object2.0.nested_object1.nested_object2.1.nested2_object1
etc. Of course, that's also possible to do.
On top of that, to be fully compliant we'd need to handle #(), which is also a valid query that lets you conditionally select elements.
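
The index expansion described above could be sketched roughly as follows. This is a standard-library illustration, not the PR's code and not how gjson works internally: expandPaths, walk and extend are hypothetical helpers, and the sketch handles only plain keys, numeric indexes and a bare # segment. It does not support #() queries.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// expandPaths resolves every "#" segment of a dot-separated path against doc
// and returns the concrete paths that exist, with "#" replaced by indexes.
func expandPaths(doc []byte, path string) ([]string, error) {
	var root any
	if err := json.Unmarshal(doc, &root); err != nil {
		return nil, err
	}
	var out []string
	walk(root, strings.Split(path, "."), nil, &out)
	return out, nil
}

// extend returns a fresh slice so recursive branches never share storage.
func extend(prefix []string, seg string) []string {
	next := make([]string, 0, len(prefix)+1)
	next = append(next, prefix...)
	return append(next, seg)
}

// walk descends one path segment at a time, fanning out on "#".
func walk(node any, rest, prefix []string, out *[]string) {
	if len(rest) == 0 {
		*out = append(*out, strings.Join(prefix, "."))
		return
	}
	seg, tail := rest[0], rest[1:]
	switch v := node.(type) {
	case map[string]any:
		if child, ok := v[seg]; ok {
			walk(child, tail, extend(prefix, seg), out)
		}
	case []any:
		if seg == "#" {
			for i, child := range v {
				walk(child, tail, extend(prefix, strconv.Itoa(i)), out)
			}
			return
		}
		if i, err := strconv.Atoi(seg); err == nil && i >= 0 && i < len(v) {
			walk(v[i], tail, extend(prefix, seg), out)
		}
	}
}

func main() {
	doc := []byte(`{"object1":{"object2":[
		{"nested_object1":{"nested_object2":[{"nested2_object1":1},{"nested2_object1":2}]}},
		{"nested_object1":{"nested_object2":[{"nested2_object1":3},{"nested2_object1":4}]}}]}}`)
	paths, err := expandPaths(doc, "object1.object2.#.nested_object1.nested_object2.#.nested2_object1")
	if err != nil {
		panic(err)
	}
	for _, p := range paths {
		fmt.Println(p)
	}
}
```

Each concrete path could then be hashed and set individually, which would give every leaf its own digest at the cost of one lookup and one write per element.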

Author

As nested JSON values shouldn't be used as pk/pkc, we decided with @murarustefaan not to implement that right now, but I've added a TODO to reconsider it in the future.

@przste-go przste-go added the automerge Automatically merge once required checks pass label Jul 9, 2025
@kodiakhq kodiakhq bot merged commit 96d55cd into main Jul 9, 2025
15 checks passed
@kodiakhq kodiakhq bot deleted the feature/eng-1671-bug-obfuscation-transformer-does-not-work-with-nested-array branch July 9, 2025 10:29
kodiakhq bot pushed a commit that referenced this pull request Jul 9, 2025
🤖 I have created a release *beep* *boop*
---


## [2.6.1](plugins-transformer-basic-v2.6.0...plugins-transformer-basic-v2.6.1) (2025-07-09)


### Bug Fixes

* Fix nested column obfuscation ([#20981](#20981)) ([96d55cd](96d55cd))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Labels

automerge Automatically merge once required checks pass

3 participants