-
Notifications
You must be signed in to change notification settings - Fork 161
chore: simplify Bulk Writer #1379
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
28204df to
2900034
Compare
Codecov Report
@@ Coverage Diff @@
## master #1379 +/- ##
==========================================
- Coverage 98.52% 98.51% -0.01%
==========================================
Files 32 32
Lines 19464 19341 -123
Branches 1374 1357 -17
==========================================
- Hits 19176 19053 -123
Misses 284 284
Partials 4 4
Continue to review full report at Codecov.
|
2900034 to
b480bf7
Compare
b480bf7 to
dffa081
Compare
|
Updated PR to remove state transitions (they are now handled internally in BulkCommitOperation). Also removed list of pending operations and instead added a pointer to the last scheduled operation. |
ca36340 to
ab5c463
Compare
thebrianchen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, after taking the time to understand this, I am speechless. This is so sleek, so clean. Thanks for putting the time into this!
dev/src/bulk-writer.ts
Outdated
| /** | ||
| * @param ref The document reference being written to. | ||
| * @param type The type of operation that created this write. | ||
| * @param op A closure that encapsulates the API call which adds this write to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lint fix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
dev/src/bulk-writer.ts
Outdated
| constructor( | ||
| public ref: firestore.DocumentReference<unknown>, | ||
| private type: 'create' | 'set' | 'update' | 'delete', | ||
| private op: (bulkCommitBatch: BulkCommitBatch) => void, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: add readonly where applicable here and below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done (it's applicable everywhere)
dev/src/bulk-writer.ts
Outdated
| * when the batch's response is received. This includes batches from both the | ||
| * batchQueue and retryBatchQueue. | ||
| * The batch that is currently used to schedule operations. Once this batch | ||
| * reached maximum capacity, a new batch is created. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: s/reached/reaches
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| // TODO(b/158502664): Use actual delete timestamp. | ||
| const DELETE_TIMESTAMP_SENTINEL = Timestamp.fromMillis(0); | ||
|
|
||
| const status = (response.status || [])[i]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In L216, we access status.code, but based on L208, status could be undefined (though it shouldn't normally happen). Should we add an assert statement before accessing status.code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the code is undefined, we pass it as undefined to the customer. That seems to be the reasonable thing to do, since we don't want to make up our own status code. But as you said, the status code should always be set.
dev/src/bulk-writer.ts
Outdated
| /** | ||
| * Sets the maximum number of allowed operations in a batch. | ||
| * Schedules the provided operations on current BulkCommitBatch. | ||
| * Sends the BulkCommitBatch if it reach maximum capacity. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/reach/reaches
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| ref, | ||
| type, | ||
| op, | ||
| this._sendFn.bind(this), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious, is there a behavioral difference between using bind here vs. using a lambda?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I was just hoping prettier wouldn't span this across 6 lines, but I was wrong. Now I am too lazy to go back.
dev/src/bulk-writer.ts
Outdated
| * reference, or creates one if no eligible batches are found. | ||
| * Sends the current batch and resets `this._bulkCommitBatch`. | ||
| * | ||
| * @param flush - If provided, keeps re-sending operations until no more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove -.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| return this._rateLimiter; | ||
| op.run(this._bulkCommitBatch); | ||
| this._bulkCommitBatch.processLastOperation(op); | ||
| this._lastOp = this._lastOp.then(() => silencePromise(op.promise)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a comment here about advancing the pointer? It took me a bit to realize what was happening
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
dev/src/bulk-writer.ts
Outdated
| pendingBatch._opCount | ||
| ); | ||
| logger( | ||
| 'BulkWriter.sendNextBatch', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/sendNextBatch/sendCurrentBatch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks for catching.
schmidt-sebastian
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review!
dev/src/bulk-writer.ts
Outdated
| /** | ||
| * @param ref The document reference being written to. | ||
| * @param type The type of operation that created this write. | ||
| * @param op A closure that encapsulates the API call which adds this write to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
dev/src/bulk-writer.ts
Outdated
| constructor( | ||
| public ref: firestore.DocumentReference<unknown>, | ||
| private type: 'create' | 'set' | 'update' | 'delete', | ||
| private op: (bulkCommitBatch: BulkCommitBatch) => void, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done (it's applicable everywhere)
| // TODO(b/158502664): Use actual delete timestamp. | ||
| const DELETE_TIMESTAMP_SENTINEL = Timestamp.fromMillis(0); | ||
|
|
||
| const status = (response.status || [])[i]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the code is undefined, we pass it as undefined to the customer. That seems to be the reasonable thing to do, since we don't want to make up our own status code. But as you said, the status code should always be set.
dev/src/bulk-writer.ts
Outdated
| * when the batch's response is received. This includes batches from both the | ||
| * batchQueue and retryBatchQueue. | ||
| * The batch that is currently used to schedule operations. Once this batch | ||
| * reached maximum capacity, a new batch is created. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
dev/src/bulk-writer.ts
Outdated
| * reference, or creates one if no eligible batches are found. | ||
| * Sends the current batch and resets `this._bulkCommitBatch`. | ||
| * | ||
| * @param flush - If provided, keeps re-sending operations until no more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
dev/src/bulk-writer.ts
Outdated
| pendingBatch._opCount | ||
| ); | ||
| logger( | ||
| 'BulkWriter.sendNextBatch', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks for catching.
| ref, | ||
| type, | ||
| op, | ||
| this._sendFn.bind(this), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I was just hoping prettier wouldn't span this across 6 lines, but I was wrong. Now I am too lazy to go back.
dev/src/bulk-writer.ts
Outdated
| /** | ||
| * Sets the maximum number of allowed operations in a batch. | ||
| * Schedules the provided operations on current BulkCommitBatch. | ||
| * Sends the BulkCommitBatch if it reach maximum capacity. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| return this._rateLimiter; | ||
| op.run(this._bulkCommitBatch); | ||
| this._bulkCommitBatch.processLastOperation(op); | ||
| this._lastOp = this._lastOp.then(() => silencePromise(op.promise)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
I can no longer create a new BulkWriter |
|
@miketierce This PR hasn't been released yet. If the issue persists, can you file an issue in this repo? |
This is my attempt to simplify some of the callback logic in BulkWriter, and will hopefully allow us add Backoff handling for retries pretty easily. The overall changes are: - There are no more batch queues. Instead, there is only ever a single batch. If this batch reaches capacity, it is send. - To allow 'flush' to block, I am instead keeping a list of pending operations. One behavior change is that it is now possible that a retry will get scheduled after other operations in a flush. Let's say we have the calls: - Op1 - Op2 - Flush - Op3 And Op1 fails. The write order could now be: Op1, Op2, Op3, Op1. This breaks a single unit test, but I think it is sane behavior as it allows us to make progress while we wait for more reponses. Port of googleapis/nodejs-firestore#1379
This is my attempt to simplify some of the callback logic in BulkWriter, and will hopefully allow us add Backoff handling for retries pretty easily.
The overall changes are:
One behavior change is that it is now possible that a retry will get scheduled after other operations in a flush. Let's say we have the calls:
And Op1 fails. The write order could now be: Op1, Op2, Op3, Op1. This breaks a single unit test, but I think it is sane behavior as it allows us to make progress while we wait for more reponses.