Skip to content

Conversation

@huangzhw
Copy link
Contributor

@huangzhw huangzhw commented Aug 29, 2021

Tracking invalidation messages were sometimes sent in inconsistent order, before the command's reply rather than after.
In addition to that, they were sometimes embedded inside other commands responses, like MULTI-EXEC and MGET.
Fix #8206 and #8935

Copy link
Member

@oranagra oranagra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would like to ask @madolson to take a look too.
and i would like to ask to perform some trivial benchmark to check the impact of this.

@oranagra
Copy link
Member

@huangzhw what's the content of that last force-push? is it just a rebase from unstable or anything else?

p.s. we usually merge PRs with squash-merge, so it's more convenient to review if there are no force-pushes, and i'd rather merge unstable into the PR instead of a rebase.

@huangzhw
Copy link
Contributor Author

I just rebase unstable as there are conflicts. Only the last commit is new.

Copy link
Member

@oranagra oranagra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
@madolson @yossigo @soloestoy does one of you want to run a second pair of eyes on this?
in theory there's a potential for a temp memory spike, since we flush that list quite frequently, i don't think that's a realistic concern.
There's also an additional lookupClientByID, but that's kinda trivial too i think.

@oranagra oranagra added release-notes indication that this issue needs to be mentioned in the release notes state:to-be-merged The PR should be merged soon, even if not yet ready, this is used so that it won't be forgotten approval-needed Waiting for core team approval to be merged labels Sep 19, 2021
@oranagra
Copy link
Member

oranagra commented Sep 19, 2021

triggered daily CI: https://github.com/redis/redis/actions/runs/1250884079 (failures are unrelated to this PR)

Copy link
Contributor

@madolson madolson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose I'm also not clear about the direction where we are deferring the sending of the invalidation, and not deferring the invalidation itself until outside of the command execution, which seems a little more straight forward of an implementation.

@oranagra
Copy link
Member

I suppose I'm also not clear about the direction where we are deferring the sending of the invalidation, and not deferring the invalidation itself until outside of the command execution, which seems a little more straight forward of an implementation.

@madolson assuming i understand what you mean, and i remember the details:
deferring the invalidation (the decision of whatever to send a message or not), rather than just the message itself would cause us to make the wrong decisions (skip sending messages that we should have sent)

huangzhw and others added 2 commits October 3, 2021 20:19
Co-authored-by: Oran Agra <oran@redislabs.com>
Copy link
Member

@oranagra oranagra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is ready to be merged.
any last concerns?

Copy link
Contributor

@madolson madolson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no concerns, took another pass through and it LGTM.

@oranagra oranagra merged commit fd135f3 into redis:unstable Oct 7, 2021
@huangzhw huangzhw deleted the tracking branch October 7, 2021 12:44
@oranagra
Copy link
Member

oranagra commented Nov 2, 2021

@huangzhw one of the new tests in this PR seems to have a rare timing issue:
https://github.com/redis/redis/runs/4068817198?check_suite_focus=true

*** [err]: Tracking invalidation message of eviction keys should be before response in tests/unit/tracking.tcl
Expected '0' to be equal to 'invalidate volatile-key' (context: type eval line 21 cmd {assert_equal $res {invalidate volatile-key}} proc ::test)

can you please take a look

@oranagra
Copy link
Member

oranagra commented Nov 2, 2021

interestingly, it looks like the failure is semi consistent now, happening a lot in one of the external tests (either clustered or not clustered)
https://github.com/redis/redis/runs/4078424493?check_suite_focus=true

@huangzhw
Copy link
Contributor Author

huangzhw commented Nov 2, 2021

I have a question. In external test between every start_server is the data set clean?

@oranagra
Copy link
Member

oranagra commented Nov 2, 2021

yes, look for flushall in server.tcl

@oranagra
Copy link
Member

oranagra commented Nov 2, 2021

looking at the test code i think i may realized the problem.
the line that does set used [s used_memory] at the beginning of the test may get the wrong idea if there's some lazy eviction going on in the background.

so maybe we need to copy this code from lazyfree.tcl:

        # make the previous test is really done before sampling used_memory
        wait_for_condition 50 100 {
            [s lazyfree_pending_objects] == 0
        } else {
            fail "lazyfree isn't done"
        }

or maybe even promote that to a function in util.tcl to be shared between them and maybe others in the future...

@huangzhw
Copy link
Contributor Author

huangzhw commented Nov 2, 2021

I check the code before this test in tracking.tcl. I think no tests cause this. I runned this test with --loop and external, I don't get failures. So I suspect other tests cause this.

@oranagra
Copy link
Member

oranagra commented Nov 2, 2021

yes, the fact it only happens in external mode, is also a good indication it depends on other tests.
i posted a hypothesis above, and a PR to attempt to fix it: #9722

@oranagra
Copy link
Member

oranagra commented Nov 2, 2021

Looks like my hypothesis was wrong https://github.com/redis/redis/actions/runs/1412769681
Still failing...
I guess we must add some prints and reproduce it so we know what's going on..

@oranagra
Copy link
Member

oranagra commented Nov 2, 2021

found the problem: #9726

This was referenced Apr 27, 2022
oranagra pushed a commit that referenced this pull request Apr 27, 2022
Tracking invalidation messages were sometimes sent in inconsistent order,
before the command's reply rather than after.
In addition to that, they were sometimes embedded inside other commands
responses, like MULTI-EXEC and MGET.

(cherry picked from commit fd135f3)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approval-needed Waiting for core team approval to be merged release-notes indication that this issue needs to be mentioned in the release notes state:to-be-merged The PR should be merged soon, even if not yet ready, this is used so that it won't be forgotten

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

Expiration based invalidation PUSH messages sometimes embedded within other replies. Tracking invalidations order is not consistent across commands

4 participants