-
Notifications
You must be signed in to change notification settings - Fork 24.4k
A new propagation mechanism for Lua scripts #4966
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
repl_scriptcache_dict 字典记录了自己已经将哪些脚本传播给了所有服务器 |
@cqyisbug if (evalsha && !server.lua_replicate_commands) {
if (!replicationScriptCacheExists(c->argv[1]->ptr)) {
/* This script is not in our script cache, replicate it as
* EVAL, then add it into the script cache, as from now on
* slaves and AOF know about it. */
robj *script = dictFetchValue(server.lua_scripts,c->argv[1]->ptr);
replicationScriptCacheAdd(c->argv[1]->ptr);
serverAssertWithInfo(c,NULL,script != NULL);
rewriteClientCommandArgument(c,0,
resetRefCount(createStringObject("EVAL",4)));
rewriteClientCommandArgument(c,1,script);
forceCommandPropagation(c,PROPAGATE_REPL|PROPAGATE_AOF);
}
} |
|
@soloestoy thanks for your advice. ^_^ |
|
Moreover , if we apply But it's a waste of time that a slave execute |
|
Ping @antirez |
After RDB version 9, the RDB file not only contains replication information, but alos contains the script cache as well, so we can guarantee that no matter PSYNC after a restart or FULLRESYNC a new slave connect, the slave can hold all the same scripts with master, so we can remove the replication script cache. Before this, redis only propagate scripts by SCRIPT LOAD, and by EVAL command implicitly(only when EVAL update data). Now we wanna make the script propagated explicitly, in other words we propagate all the scripts by SCRIPT LOAD even the script is in EVAL command. That can make master, slave and AOF be consistent, redis can execute EVALSHA on master or slave or restart from AOF.
d3265d9 to
1af44c1
Compare
|
Hello @soloestoy, I've to study the PR, just had time to scan it when you posted it initially: my impression is that this PR attempts to prevent users from doing Very Silly Things (TM) :-) Like going and poke with the scripting cache of a slave, by doing SCRIPT FLUSH/LOAD in a slave. In general as you can see in all the Redis design, we don't try to avoid that users can shoot themselves in the feet... So that would not be a good fit. However I've to read the code and evaluate the PR with more care, this is just to give a general feedback about this kind of changes. Thanks. |
|
Hi @antirez , thanks for your reply.
I totally agree with you, I don't care users' wrong and silly behavior neither. My main idea is to implement a new propagation for Lua scripts, the current Lua scripts propagation is a bit complicated and there is a waste of memory and cpu time. Firstly, let's see the current Lua scripts propagation mechanism(here we ignore
But remember that we already persist all Lua scripts into RDB file after Redis 4.0, so I'm thinking can we implement a new propagation for Lua scripts to make it simple and effective? I think maybe Lua scripts should be regarded as a part of state of the instance as well. We can take the Lua scripts as data that not belong to any DB but belong to the whole instance. And then what I did is that:
|
We can guarantee that the script in EVAL command has already been propagated.
It is useful to get the original script which represented by sha1 in slowlog.
9f24cdf to
042e660
Compare
|
Hello @soloestoy, I examined your proposal. In general those are good ideas, however given that we now have a plan to avoid all this complexity anyway, and switch to a feature where the only propagation mechanism is effects replication of scripts, people that want to use scripts in slaves will have to follow the protocol as usually... that is: EVALSHA and if the server replies with a no script error, use EVAL instead. So all the instances, including slaves, are unrelated to all the other instances from the POV of the scripts they have in memory. This is in general a very good simplification. However, there is one of the points in your proposal that IMHO is a serious bug, and we should address it ASAP, regardless of the fact we are going to remove all this code in the future (hopefully), and is that read-only scripts, or aborted scripts (scripts with writes that abort before the first write is ever executed, so they are effectively read-only), are propagated anyway. I believe that we should instead translate such scripts as The above code, does not attempt to do any differentiation between scripts that produced changes, and scripts that instead were completely read-only. So my suggestion is that we do such change for now ignoring all the rest, and go forward with our plan to enable effects replication by default (with some hidden config to revert it if really needed). If such plan fails, and we see in the practice that users really need verbatim scripts replication, then we can re-examine the issue and understand what's the best approach to propagate the scripts, starting from your proposal. Does this make sense to you? Thanks for all your great work in this issue (and otherwise!). |
|
Thanks your reply @antirez , I will analyze the bug you pointed carefully. I agree with that we can go forward to enable effects replication by default, at the same time I'm trying to find a good way to propagate the verbatim scripts, it's not finished yet, maybe can finish these days. |
|
no longer relevant, script propagation was deprecated in #9812 |
Hi, @antirez
Happy to see 5.0 RC1 released, it's a big step, redis becomes more strong and interesting, look forward to the GA version.
From the last commit, we are aligned that RDB file does no longer represent just a persistence file, it represents also the state of the instance, so we persist all keys even expired to RDB file.
It looks good after the fix, but I just remember that we have met a similar problem about lua scripts once. After psync slave may lose some scripts because the rebooting RDB file doesn't contains, and we fix it by persisting all lua scripts to RDB file when FULL RESYNC. Lua scripts should be regarded as a part of state of the instance as well I think.
But there still have something bad which may result in inconsistency between master and slave:
If slave accept
script flushfrom a normal client, slave cannot executeEVALSHAcorrectly.It's easy to understand and reproduce, because after
script flushslave delete all lua scripts.SCRIPT LOADon middle slave may lead to inconsistency.This scenario is a little complex but serious, imagine that:
script load "redis.call('set','foo','bar')"on middle slave B, we can get the sha1a3862cfaeddaf7c3473803cba58cdd0bdca53397evalsha a3862cfaeddaf7c3473803cba58cdd0bdca53397 0on slave B, we will get an error because this script contains write command. And slave B will add this script toserver.repl_scriptcache_dict, but not replicate to slave C, because all data in replication stream are generated by master.evalsha a3862cfaeddaf7c3473803cba58cdd0bdca53397 0again. Now master B has keyfoobut slave C doesn't.These bugs appear because
script flush/loadandevalcommand break the lua scripting state, we can fix them like this:Disable
script flush/loadandevalcommand on slave.Then slave has no chance to break lua scripting state.
For
EVALcommand, we can propagate scripts explicitly.It means that we can split
EVALcommand toSCRIPT LOADandEVAL, if it's the first time we meet this script.Because some readonly scripts or scripts with
redis.replicate_command()cannot be propagate to slave and aof, so after this all slaves can be consist with master.After these two above finished, we can remove replication script cache.
And consider aofrewrite without replication script cache, we should persist all lua scripts in AOF and RDB file.