fix: Change logic for batch writing to write when batch size is reached, not exceeded#2153
fix: Change logic for batch writing to write when batch size is reached, not exceeded#2153kodiakhq[bot] merged 3 commits intomainfrom
Conversation
murarustefaan
left a comment
There was a problem hiding this comment.
Can't the described behavior be fixed by running tests in destinations with lower batch sizes / timeouts, rather than configuring the same delete message to be sent more times than it's needed?
8daf1f9 to
b95f7ee
Compare
|
@murarustefaan Yeah I guess that's a fair point. I updated the PR to change the logic for batch writers to use |
murarustefaan
left a comment
There was a problem hiding this comment.
Yeah I don't think this will impact anything, it will flush at 1 instead of 2 for batch_size: 1and at 5000 instead of 5001 at batch_size: 5000
|
Could you change the title and description of this PR, @maaarcelino ? |
erezrokah
left a comment
There was a problem hiding this comment.
Agree with the change, seems like a left over from when we refactored the code to support multi row batches.
I would first try to update a few destinations (e.g. Postgres, S3 and MySQL) to this branch sha to see if this triggers any test failures, as existing tests might rely on this behavior
writers/batchwriter/batchwriter.go
Outdated
| l := int64(len(w.deleteRecordMessages)) | ||
| w.deleteRecordLock.Unlock() | ||
| if w.batchSize > 0 && l > w.batchSize { | ||
| if w.batchSize > 0 && l >= w.batchSize { |
There was a problem hiding this comment.
| if w.batchSize > 0 && l >= w.batchSize { | |
| limit := batch.CappedAt(0, w.batchSize) | |
| limit.AddRows(l) | |
| if limit.ReachedLimit() { |
I think this is more aligned with the rest of the codebase e.g.
same comment for other places where we check the limit
There was a problem hiding this comment.
Done in 05b4212. Tested and it works correctly
🤖 I have created a release *beep* *boop* --- ## [4.80.2](v4.80.1...v4.80.2) (2025-05-19) ### Bug Fixes * Change logic for batch writing to write when batch size is reached, not exceeded ([#2153](#2153)) ([58c8a1e](58c8a1e)) * Flush DeleteRecord messages when batch writer is flushed ([#2154](#2154)) ([791c865](791c865)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
The current logic for batch writing prevents us from cleanly testing the
DeleteRecordhandling in the plugin. Currently we have to setBatchSize: 1and send twoDeleteRecordmessages in the test, which is not the cleanest logic. This PR changes the logic so the batch is flushed when the batch size is reached, not exceeded (so forBatchSize: 1it will flush after oneDeleteRecordis received).This also matches the logic we have in https://github.com/cloudquery/plugin-sdk/blob/main/internal/batch/cap.go#L7.