Skip to content

Conversation

@MeirShpilraien
Copy link

@MeirShpilraien MeirShpilraien commented Dec 9, 2021

The issue this PR comes to solve is the ability to flush the entire data set (including functions which was introduce on #9780).

Why flushing Redis functions can not be added to FLUSHALL?

Flushing the functions on FLUSHALL command can be dangerous because FLUSHALL can be called from within a function and cause the function structures to be freed while the function is running. In addition, Disallow FLUSHALL inside a function will differentiate functions and eval.

The solution

Introducing FLUSHEVERYTHING command that flushes the entire data set, including functions and eval cache. This command is not allowed inside a script (function or eval). The command arguments is just like the FLUSHALL command (sync or async).

The issue this PR comes to solve is the ability to flush the entire
dataset (including functions).

Flushing the functions on FLUSHALL command can be dangerous because
FLUSHALL can be called from within a function and cause the function
structures to be freed while the function is running. In addition,
Disallow FLUSHALL inside a function will differentiate functions and eval.

The only option left is to introducing FLUSHEVERYTHING command that
will flush the entire dataset, including functions.
Copy link
Member

@oranagra oranagra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@redis/core-team please take a look as share your thoughts

src/db.c Outdated
void flusheverythingCommand(client *c) {
if (flushCommandsInternal(c) == C_ERR) return; /* Error was already sent to the client */
/* Clear functions */
functionsCtxClear(functionsCtxGetCurrent());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i suppose we want to flush the EVAL scripts too? in which case maybe we wanna move that into flushAllDataAndResetRDB and control it via some flag?
or alternatively make a dedicated function, and use it also when rdbLoadRio fails?

Copy link
Author

@MeirShpilraien MeirShpilraien Dec 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, pushed a fix and also updated the top comment.

@oranagra oranagra added approval-needed Waiting for core team approval to be merged release-notes indication that this issue needs to be mentioned in the release notes state:major-decision Requires core team consensus labels Dec 9, 2021
meir added 2 commits December 9, 2021 14:52
The issue this PR comes to solve is the ability to flush the entire
dataset (including functions).

Flushing the functions on FLUSHALL command can be dangerous because
FLUSHALL can be called from within a function and cause the function
structures to be freed while the function is running. In addition,
Disallow FLUSHALL inside a function will differentiate functions and eval.

The only option left is to introducing FLUSHEVERYTHING command that
will flush the entire dataset, including functions.
@zuiderkwast
Copy link
Contributor

Flushing the functions on FLUSHALL command can be dangerous because FLUSHALL can be called from within a function and cause the function structures to be freed while the function is running.

I don't think it would be that hard to work around this, e.g. if called from withing a function, we can postpone the freeing of the function until the function has returned.

@MeirShpilraien
Copy link
Author

I don't think it would be that hard to work around this

Not sure it will be that simple and not sure it worth it because if we make flushall clean scripts (eval and function) then it could be considered as breaking change (today flushall do not clean the eval scripts cache). If we make flushall clean only functions then it will differentiate eval and functions. So I think its better to introduce this flusheverything command. @oranagra WDYT?

@oranagra
Copy link
Member

I'm not sure.. i do think it's a bit odd to add another flush command where the user can just issue two (flush both data and scripts separately).
On the other hand, it's odd that there's no command that mimics the "flush" that happens when a replica re-syncs.
And one more thing to consider is RM_ResetDataset. which one of these does it represent?

I'm not yet sure which one is better..
I suppose that we'll one day have more things out of the keyspace to flush.
but actually FLUSHALL and RM_ResetDataset are already doing extra things (like saving a new empty RDB file)

@yossigo
Copy link
Collaborator

yossigo commented Dec 12, 2021

I think having FLUSHALL that doesn't flush all and FLUSHEVERYTHING that also flushes scripts and functions but maybe not everything (in the future) is pretty confusing. I'm OK with a dedicated FUNCTION FLUSH command.

If anyone feels strongly about flushing more with a single command, I think adding extra arguments to FLUSHALL to indicate additional elements to flush makes more sense, but it should be explicit to avoid more ambiguity in the future.

The other topic here is flushing from inside a function - I agree with @zuiderkwast we should just delay that and flush when exiting. I guess we'll be blocking the ability to create a function from within a function right?

@madolson
Copy link
Contributor

I also really don't like flusheverything either because it's unclear what it does compared to FLUSHALL, and would also prefer to add a separate command that just handles flushing of functions.

I'm indifferent to adding any type of command that flushes data ands functions, I know we logically are grouping them together but I would expect people to not understand that. For example, someone from the relational world might view functions like schema and would view that independently from data. I would vote for having just FUNCTION FLUSH and if you want to delete everything you call FLUSHALL and FUNCTION FLUSH.

@MeirShpilraien
Copy link
Author

I am Ok with FUNCTION FLUSH. @oranagra @zuiderkwast WDYT? If you agree I will modify the PR.

@oranagra
Copy link
Member

Ok, let it be.. My main concern is that I think the user should have a single command that does the same thing as what the replica does before loading the rdb file, I.e. A reset of the entire data (function are part of the data).

I agree that FLUSHEVRERTHING is confusing and it's hard to understand what's the difference from FLUSHALL, but even if we don't add that new command, it could be confusing, people will wonder if FLUSHALL resets function or not..

@zuiderkwast
Copy link
Contributor

FUNCTION FLUSH 👍

FLUSHALL deletes all keys (like FLUSHDB). ALL refers to the DB numbers here. Simple.

EVAL scripts cache is not data. Users shouldn't assume anything about when it's cleared IMO.

Is pubsub data? (I think messages are data and channels are client state, but) one could imagine that FLUSHEVERYTHING cancels ongoing subscriptions... or even unblock clients blocked on keys, delete ACL users, LATENCY RESET, SLOWLOG RESET, CONFIG RESETSTAT, MEMORY PURGE, unload modules...

@oranagra
Copy link
Member

Ok, we seem to have a consensus on scrapping FLUSHEVERYTHING and adding FUNCTION FLUSH.
@MeirShpilraien please either update the PR and the top comment (keeping the old comment with a crossover?), or maybe in this case open a new one which refers here?

@MeirShpilraien
Copy link
Author

Will open a new PR and refer to this one, closing.

@oranagra oranagra removed state:major-decision Requires core team consensus release-notes indication that this issue needs to be mentioned in the release notes approval-needed Waiting for core team approval to be merged 7.0-not-after-rc1 labels Dec 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants