Skip to content

Conversation

@thebrianchen
Copy link

No description provided.

@thebrianchen thebrianchen self-assigned this Mar 10, 2021
@thebrianchen thebrianchen requested review from a team as code owners March 10, 2021 23:20
@product-auto-label product-auto-label bot added the api: firestore Issues related to the googleapis/nodejs-firestore API. label Mar 10, 2021
@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Mar 10, 2021
@codecov
Copy link

codecov bot commented Mar 10, 2021

Codecov Report

Merging #1447 (dd00707) into master (a36bb09) will increase coverage by 0.01%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1447      +/-   ##
==========================================
+ Coverage   98.20%   98.22%   +0.01%     
==========================================
  Files          32       32              
  Lines       19509    19565      +56     
  Branches     1276     1283       +7     
==========================================
+ Hits        19159    19217      +58     
+ Misses        346      344       -2     
  Partials        4        4              
Impacted Files Coverage Δ
dev/src/backoff.ts 98.29% <100.00%> (ø)
dev/src/bulk-writer.ts 100.00% <100.00%> (ø)
dev/src/rate-limiter.ts 100.00% <0.00%> (+1.14%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a36bb09...dd00707. Read the comment docs.

// An array of pending write operations. Only contains writes that have not
// been resolved.
private pendingOps: Array<BulkWriterOperation> = [];
private _pendingOps: Array<BulkWriterOperation> = [];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can just be renamed to pendingOps and made readonly.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Comment on lines 793 to 794
if (failureCount === 0) return 0;
if (useMaxDelay) return DEFAULT_BACKOFF_MAX_DELAY_MS;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this is correct for our use case, this reads a bit strange. I would flip these two statements. That means that we would end up using useMaxDelay no matter what the failure count is, which I think is easier to understand.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

return Math.min(
DEFAULT_BACKOFF_MAX_DELAY_MS,
backoffMs + Math.random() * 0.3 * backoffMs
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applying the jitter to the entire duration of the backoff creates a pretty strong variance, and is different from how we compute the backoff jitter elsewhere. Can we use the last backoff and then only apply the jitter to the next section? This would either be:

DEFAULT_BACKOFF_INITIAL_DELAY_MS *
      Math.pow(DEFAULT_BACKOFF_FACTOR, failureCount-1) + (DEFAULT_BACKOFF_FACTOR * DEFAULT_BACKOFF_INITIAL_DELAY_MS  * jitter); (Maybe?)

or a simple backoff time in Delayed operation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, the ExponentialBackoff class has even more variance. We use currentBaseMs + (Math.random-0.5)*currentBaseMs, where currentBaseMs = initialDelay * pow(1.5, numAttempts).

I modeled the BulkWriter approach after ExponentialBackoff, and used 0.3 instead of 0.5 to ensure that the backoff increases each time.

This is bit more of a nit (and it probably won't affect performance), but I feel that adding the delay with jitter rather than multiplying it goes against the spirit of exponential backoff, because 100x^k != 100x^(k-1)+x

get failedAttempts(): number {
return this._failedAttempts;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some thoughts/ideas:

  • Could each operation calculate its own backoff? We can then use a simpler calculation where each backoff depends on the previous backoff time (including jitter). You would just keep a simple "last_backoff_duration" in BulkWriterOperation and multiple it per failed attempt.

  • Would it be possible to use the Backoff class? We should just need to expose a method to return the backoff duration.

  • I think it would be cleaner if we stored a lastError instead of the more specific resourceExhausted. It is cleaner and more versatile for later modifications.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could each operation calculate its own backoff?

Done.

Would it be possible to use the Backoff class?

See my below comment on how the backoff used here is different than ExponentialBackoff. I don't think there is much to reuse in terms of logic here.

I think it would be cleaner if we stored a lastError instead of the more specific resourceExhausted.

Done.

});
});

it('applies maximum backoff on retries for RESOURCE_EXHAUSTED', async () => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also add a test that uses multiple different backoffs in a batch and a test that verifies that we send a second batch before the first batch if the first batch has a backoff.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

/*!
* The default jitter to apply. For example, a factor of 0.3 means a 30% jitter
* is applied.
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should mention that this relates to backoff. It needs some context in this module.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

private deferred = new Deferred<WriteResult>();
private failedAttempts = 0;
private _failedAttempts = 0;
private _lastError?: Status;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lastStatus?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

return this.deferred.promise;
}

get lastError(): Status | undefined {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am surprised this need to be exported.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Me too, removed.

const delayMs = this._rateLimiter.getNextRequestDelayMs(
pendingBatch._opCount
);
const finalDelayMs = Math.max(backoffMsWithJitter, delayMs);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can move out of the if statement and then use it in the if(...) check itself.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

const jitter =
Math.random() *
DEFAULT_JITTER_FACTOR *
(Math.round(Math.random()) ? 1 : -1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Math.random() is not cheap. You can remove one call: Math.random() * 2 * DEFAULT_JITTER_FACTOR - DEFAULT_JITTER_FACTOR.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooh that's some nice mathing! Refactored it slightly to make it more understandable (at least to me).

(1 - DEFAULT_JITTER_FACTOR) * expected[timeoutHandlerCounter],
(1 + DEFAULT_JITTER_FACTOR) * expected[timeoutHandlerCounter]
);
timeoutHandlerCounter++;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

class BulkWriterOperation {
private deferred = new Deferred<WriteResult>();
private failedAttempts = 0;
private _failedAttempts = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can rename this back to its old name since there is no getter anymore.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api: firestore Issues related to the googleapis/nodejs-firestore API. cla: yes This human has signed the Contributor License Agreement.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants