-
Notifications
You must be signed in to change notification settings - Fork 24.4k
Fixed tracking of command duration for multi/eval/module/wait #11970
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
good catch - I missed it (after struggling to adjust lua to start calling resetClient) |
|
maybe we can also zero the duration when we are updating stats? |
|
i think i agree, with Ran, it'll look better if we zero it right after using it (explicitly in |
I moved it to a serverAssert(), maybe it will catch a place where we miss. Missed one case related to |
classic case to place a "debugAssert" which we do not have :) |
oranagra
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think i'd prefer a simpler approach of clearing it in obvious places:
- creteClient, resetClient
- and after using the value (i.e.
updateStatsOnUnblockandcall)
not sure we need the assertion...
The problem is that the "obvious places" aren't really obvious anymore.
I would have agreed if it didn't find another issue with wait. It wasn't critical, but still |
|
You mean the condition that ended up in Anyway. OK. Go ahead and merge it.. |
|
|
||
| /* Deallocate structures used to block on blocking ops. */ | ||
| /* If there is any in-flight command, we don't don't record their duration. */ | ||
| c->duration = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I started running into issues related to module APIs that had resetClient() being called when a client with a pending blocking RM_Call() was getting freed, which triggered the resetClient() call path (which crashed since it had an unrecorded command). There might be a more elegant way to fix it, but I ended up just zero'ing the duration when the client was being freed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't understand. are we calling resetClient from within / after freeClient?
same thing for moduleReleaseTempClient?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, resetClient is being called from within freeClient, because freeClient calls unbockClient which calls resetClient here: https://github.com/redis/redis/blob/unstable/src/blocked.c#L211.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok.
but now that we fixed the problem with WAIT, can we (at least in theory), replace that assert with a =0, and drop many other changes?
i'm ok with to keep the assert, just curious since i'm not looking at this PR with much care..
ranshid
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - thank you for fixing this!
oranagra
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please mention the change about unblockClientWaitingReplicas in the top comment.
In #11012, we changed the way command durations were computed to handle the same command being executed multiple times. This commit fixes some misses from that commit.
This commit also adds an assert if the duration is not properly reset, potentially indicating that a report to call statistics was missed. The assert potentially be removed in the future, as it's mainly intended to detect misses in tests.
Before:
After: