changefeedccl: avoid telemetry double count on txn retry#91812
changefeedccl: avoid telemetry double count on txn retry#91812stevendanna wants to merge 1 commit intocockroachdb:masterfrom
Conversation
| } | ||
|
|
||
| func (a *telemetryAggregator) CountBucketed(path string, value int64) { | ||
| a.txn.AddCommitTrigger(func(context.Context) { |
There was a problem hiding this comment.
wouldn't that add multiple triggers if txn retries?
There was a problem hiding this comment.
We clear out the transaction when we prepare for retry. But, your comment prompted me to go look and there is a code comment that points to #18170 which indicates maybe we might have a problem still along some code paths.
Perhaps we can install a single callback and use something like sync.Once to make sure we only call it once even if it doesn't get reset.
There was a problem hiding this comment.
I've pushed a test that checks that we don't double count on a retry-like event (I rollback to a SAVEPOINT at the start of the txn).
Still not sure I love using AddCommitTrigger for this though. I suppose one question is what we want the semantics of these counters to be. For instance, it seems reasonable that an automatic txn retry isn't double counted. But the case in the test I added is less clear. If the user interactively typed the statement twice and just happened to rollback. Using AddCommitTrigger also have the side-effect of only counting it in the case of a successful commit. Which, might be a bit odd if we are just trying to get a sense of what features are being used.
There was a problem hiding this comment.
I agree re txn.AddCommitTrigger -- not a fan of it.
I was about to suggest sync.Once;
What do you think about something like https://gist.github.com/miretskiy/2800e81ea4ce779eb02a86d21e143879 instead? Just use withTelemetryUpdate fn in alter statement
This sounds like a bigger problem though, right? Every hook based statement that updates telemetry (which is every hook based statement) will be impacted? Do we have an issue for that?
There was a problem hiding this comment.
This sounds like a bigger problem though, right?
Yes, my guess is it affects all of the hook based statements.
There was a problem hiding this comment.
What do you think about something like gist.github.com/miretskiy/2800e81ea4ce779eb02a86d21e143879 instead? Just use withTelemetryUpdate fn in alter statement
I'm not sure that works because I believe the entire plan hook gets re-run on a retry.
Previously, if the ALTER CHANGEFEED txn retried, we would increment these counters again. In cockroachdb#91563 we are making a change that made it more likely that this transaction may retry during one of the test, revealing the issue. Release note: None
ab6604c to
e1b2b67
Compare
|
If we do something here, it won't be this. |
Previously, if the ALTER CHANGEFEED txn retried, we would increment these counters again.
In #91563 we are making a change that made it more likely that this transaction may retry during one of the test, revealing the issue.
Epic: None
Release note: None