Skip to content

Conversation

@ShooterIT
Copy link
Member

Now we have RDB channel in #13732, child process can transfer RDB in a background method, instead of handled by main thread. So when redis-cli gets RDB from server, we can adopt this way to reduce the main thread load.

@tezc
Copy link
Collaborator

tezc commented Feb 17, 2025

Okay, this is interesting. Normally, slave will send replconf rdb-channel-repl to indicate it's capable of rdbchannel repl. Then, when it opens rdb channel connection, it sends replconf rdb-channel over the new connection. Now, slave is bypassing first step and just sending rdb-channel to trigger delivery in the fork.

Two things come to my mind,

  • We are missing some sort of validation for replconf rdb-channel on master. Probably, we should check if server.repl_rdb_channel and server.repl_diskless_sync enabled, similar to replconf rdb-channel-repl. Otherwise, slave will bypass the config. I think we should prevent it.

  • Probably, master can decide itself whether it wants to deliver rdb in the fork if slave is rdb-only. Master doesn't need to receive repl rdb-channel. If server.repl_rdb_channel and server.repl_diskless_sync are enabled, when slave is rdb-only, then we may deliver in the fork. There is no change from slave POV. Not sure if this requires a bit of work. Just an idea, we may ignore it for now and consider in the future.

@ShooterIT wdyt?

@ShooterIT
Copy link
Member Author

@tezc thank you

  • yes, i found i missed server.repl_rdb_channel. only if both server.repl_rdb_channel and server.repl_diskless_sync are enabled, we allow rdb channel. But i have a question, these configs are changed after replying RDBCHANNELSYNC to main channel, it seems we don't check when replicas ask the rdb channel, right?

  • makes sense, initially i want to avoid changing core code, but after second thoughts, we may also consider replconf rdb-channel is set when receiving replconf rdb-only, just set some flags, it should be simple.

@ShooterIT
Copy link
Member Author

slave will bypass the config. I think we should prevent it.

it seems we don't have mechanism to block this way(slave sends replconf rdb-channel directly). i saw
Starting BGSAVE for SYNC with target: disk (rdb-channel) log when i disabled server.repl_diskless_sync but sent replconf rdb-channel 1

@tezc
Copy link
Collaborator

tezc commented Feb 18, 2025

But i have a question, these configs are changed after replying RDBCHANNELSYNC to main channel, it seems we don't check when replicas ask the rdb channel, right?

You mean after replying +RDBCHANNELSYNC, if config changes to "disabled", master will still accept "replconf rdb-channel" ? Yes, currently, this is how it works. We are missing config checks.

Starting BGSAVE for SYNC with target: disk (rdb-channel) log when i disabled server.repl_diskless_sync but sent replconf rdb-channel 1

Okay, good point. Looks like we need to revisit this. Also, config change in between makes it a bit complicated.

One extra scenario: After receiving +RDBCHANNELSYNC, slave sends replconf rdb-channel 1. Let's say we added config check and configs are enabled. So, master accepted replconf rdb-channel 1. While waiting bgsave to start on master, server.repl_diskless_sync gets disabled before entering startBgsaveForReplication(). Now, we still have the same problem. A rdbchannel slave waiting bgsave but we are not going to write directly to the sockets. Though, not sure if there is any harm in doing that from POV of slave. Logs on master side are confusing but from functionality POV, I guess it will work fine. Do you see any issues around here?

@ShooterIT
Copy link
Member Author

I don't find issues, both master and replicas works well when disk-based.

slave will bypass the config. I think we should prevent it.

maybe we don't need to prevent this way, we don't change protocol on rdb channel which is the same as full resynchronization as before. rdb channel means we deliver rdb on this connection, not must diskless, right? of course, diskless is more efficient.

@tezc
Copy link
Collaborator

tezc commented Feb 18, 2025

I just have one concern, currently it applies to diskless repl. Assume for some reason (maybe we realized a bug) and we decide not to use delivery from fork. So, we set "repl-rdb-channel" config disabled on master. Now, if slave directly sends "replconf rdb-channel 1", master will still deliver it from fork. It feels like slave is controlling master's behavior, basically it overrides config on master. So, for this reason, maybe adding a simple config check to "replconf rdb-channel 1" is a good idea. Maybe, we should not care at all.. Wdyt?

@ShooterIT
Copy link
Member Author

It feels like slave is controlling master's behavior, basically it overrides config on master.

I think your concern makes sense. when we check server.repl_rdb_channel, we should close the connection if server.repl_rdb_channel is disabled and replica asks replconf rdb-channel 1, right?

@tezc
Copy link
Collaborator

tezc commented Feb 19, 2025

I think master can just ignore, it wouldn't set SLAVE_REQ_RDB_CHANNEL for the slave. No need to close the connection. Server can still deliver the rdb.

@ShooterIT
Copy link
Member Author

ok, added check.

now we try best to avoid disk (rdb-channel), in theory we do not prevent this possibility, but even if it occurs, it won’t lead to any issue, right?

@tezc
Copy link
Collaborator

tezc commented Feb 20, 2025

but even if it occurs, it won’t lead to any issue, right?

Yes, it should work.

Btw, still, I feel like we better let master to decide whether it wants to deliver RDB in the fork (not requiring client to send "replconf rdbchannel 1"). This way, we also avoid error message when cli is run against older redis versions. Currently, older redis versions will return error to "replconf rdbchannel 1". I see it is not fatal but we'll print an error message, it may be confusing.

@ShooterIT
Copy link
Member Author

so actually we don't need to provide replconf rdbchannel 0|1, right?

@tezc
Copy link
Collaborator

tezc commented Feb 20, 2025

so actually we don't need to provide replconf rdbchannel 0|1, right?

Yes.

Master can decide itself whether it wants to use this optimization. e.g. if "rdb-channel-repl" is enabled, it delivers directly from fork when slave is "rdb-only". Slave does not have to declare "replconf rdbchannel 1".

@ShooterIT
Copy link
Member Author

Hi @tezc WDYT 3419fd3?

Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com>
@ShooterIT ShooterIT added the state:to-be-merged The PR should be merged soon, even if not yet ready, this is used so that it won't be forgotten label Mar 29, 2025
@ShooterIT ShooterIT merged commit 6a436b6 into redis:unstable May 8, 2025
18 checks passed
@ShooterIT ShooterIT deleted the cli-rdb branch May 8, 2025 00:47
@collinfunk
Copy link

@ShooterIT Seems like there was an empty rdb file /tmp accidentally committed in this?

@ShooterIT
Copy link
Member Author

oh, yes, @collinfunk thank you, let me remove it

ShooterIT added a commit that referenced this pull request May 9, 2025
#13809 accidentally submitted a tmp rdb file, so delete it
@collinfunk
Copy link

@ShooterIT Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

state:to-be-merged The PR should be merged soon, even if not yet ready, this is used so that it won't be forgotten

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants