-
Notifications
You must be signed in to change notification settings - Fork 24.4k
Sentinel: release the previous connection with the instance when setting password #8710
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thank you @yz1509, do you think you can add a test for this case? |
|
@yossigo
Script def monitor_master(sentinel_host, sentinel_port, master_host, master_port):
r = redis.Redis(sentinel_host, sentinel_port)
r.execute_command('SENTINEL', 'monitor', 'test', master_host, str(master_port), '2')
# Adding sleep makes it easier to reproduce
# make sure https://github.com/redis/redis/blob/6.2.1/src/sentinel.c#L2413 is executed before `SENTINEL set test auth-pass kumiko`
time.sleep(1)
r.execute_command('SENTINEL', 'set', 'test', 'auth-pass', 'kumiko')
# ...
# ...Sentinel log // 3.2.11
50509:X 14 Apr 20:26:44.205 # +monitor master test 127.0.0.147 6309 quorum 2
50509:X 14 Apr 20:26:45.254 # +set master test 127.0.0.147 6309 auth-pass kumiko
50509:X 14 Apr 20:26:45.302 - Client closed connection
// 15 seconds later
50509:X 14 Apr 20:26:59.324 * +slave slave 127.0.0.73:6309 127.0.0.73 6309 @ test 127.0.0.147 6309
50509:X 14 Apr 20:26:59.324 * +slave slave 127.0.0.50:6309 127.0.0.50 6309 @ test 127.0.0.147 6309
// 6.2.1
49270:X 14 Apr 2021 20:27:20.955 # +monitor master test 127.0.0.147 6309 quorum 2
49270:X 14 Apr 2021 20:27:22.003 # +set master test 127.0.0.147 6309 auth-pass kumiko
49270:X 14 Apr 2021 20:27:22.004 . Rewritten config file (/home/yz1509/redis/6.2.1/sentinel.conf) successfully
49270:X 14 Apr 2021 20:27:22.052 - Client closed connection
49270:X 14 Apr 2021 20:27:24.167 * +sentinel sentinel 9be34ec15fdecfbe0aadee37d7985ac942a8d838 127.0.0.73 13211 @ test 127.0.0.147 6309
49270:X 14 Apr 2021 20:27:24.167 . Rewritten config file (/home/yz1509/redis/6.2.1/sentinel.conf) successfully
49270:X 14 Apr 2021 20:27:26.262 - Accepted 127.0.0.73:47265
// sdown
49270:X 14 Apr 2021 20:27:50.996 # +sdown master test 127.0.0.147 6309
// current
33521:X 15 Apr 2021 16:13:26.710 . Rewritten config file (/home/yz1509/redis/current/sentinel.conf) successfully
33521:X 15 Apr 2021 16:13:26.710 # +monitor master test 127.0.0.147 6309 quorum 2
33521:X 15 Apr 2021 16:13:27.763 # +set master test 127.0.0.147 6309 auth-pass kumiko
33521:X 15 Apr 2021 16:13:27.763 . Rewritten config file (/home/yz1509/redis/current/sentinel.conf) successfully
33521:X 15 Apr 2021 16:13:27.801 * +slave slave 127.0.0.73:6309 127.0.0.73 6309 @ test 127.0.0.147 6309
33521:X 15 Apr 2021 16:13:27.802 . Rewritten config file (/home/yz1509/redis/current/sentinel.conf) successfully
33521:X 15 Apr 2021 16:13:27.802 * +slave slave 127.0.0.50:6309 127.0.0.50 6309 @ test 127.0.0.147 6309
33521:X 15 Apr 2021 16:13:27.803 . Rewritten config file (/home/yz1509/redis/current/sentinel.conf) successfully |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @yz1509 , I have three comments for this PR:
- Instead of only disconnecting with masters, I think you also need to disconnect with all slaves(but not sentinels) since the group of redis instances for a single master and a several replica are sharing with the same credenticals when sentinel trying to do authentication.You might need to do similar logic like this: https://github.com/redis/redis/blob/unstable/src/sentinel.c#L3194
- I think setting the auth-pass you also need to disconnect with the master/replicas.
- Can you provide some simple unit test for this? please check: https://github.com/redis/redis/tree/unstable/tests/sentinel for some existing examples.
@yossigo do you also have other comments? thanks
|
@hwware Thanks for your comments.
|
|
@hwware @yz1509
What do you mean? Force replication to restart? |
@yossigo sorry i type it wrong, I mean auth-user :)
@yz1509 @yossigo This is true when sentinel cannot get authenticated with master at first time, but I think we need to pay attention to some edge cases, for example, when master does not have a password, but replica has, sentinel can get replica info and connect with masters but not able connect with replicas, later when we set auth-pass to be replica password in sentinel at run time, in order to let sentinel connect with replica, we need to drop the previous replica connections. |
|
@hwware Thanks for your guidance, I made some changes and added unit tests based on it. |
| set ::user "testuser" | ||
| set ::password "secret" | ||
|
|
||
| proc setup_sever_auth {} { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a small typo in the proc name
| repl_ri = dictGetVal(de); | ||
| instanceLinkCloseConnection(repl_ri->link, repl_ri->link->cc); | ||
| instanceLinkCloseConnection(repl_ri->link, repl_ri->link->pc); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here is a potential memory leak, consider add dictReleaseIterator(di);
|
pinging @yossigo for a second review. |
| assert_equal {OK} [S $id SENTINEL SET mymaster auth-pass $::password] | ||
| } | ||
|
|
||
| after 20000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a particular reason that such a long delay is needed? Can the test be optimized instead to have a much shorter down-after-milliseconds and a smaller delay here, or even use wait_for_condition already in this level?
Sentinel tests are already very slow and we should avoid making them slower if it's not absolutely necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yossigo When the value of down-after-milliseconds is greater than SENTINEL_MIN_LINK_RECONNECT_PERIOD, the sentinel with a version lower than 5.0 can pass "auth test".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yossigo I agree with you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yossigo Yes, I totally agree with you, and I will create a separated PR for this configuration.
|
Fixed by #10400 |
I always use the following commands to initialize the sentinel configuration:
It worked well before 8631e64, because the send
INFOcommand blocks sendingPING, the condition of this case (https://github.com/redis/redis/blob/6.2.1/src/sentinel.c#L4178) will be satisfied in 15 seconds, the unauthorized connection will be released and the authenticated connection will be reestablished between sentinel and the instance.However, in the current implementation, because of the refresh of
link->last_pong_time, the above case cannot be satisfied, unauthorized connection cannot be released, sentinel thinks the instance issdown.