Scripts that declare the `no-writes` flag are implicitly `allow-oom` too. #10699

oranagra · 2022-05-09T13:35:34Z

Scripts that have the no-writes flag, cannot execute write commands, and since all deny-oom commands are write commands, we now act as if the allow-oom flag is implicitly set for scripts that set the no-writes flag.
this also implicitly means that the EVAL*_RO and FCALL_RO commands can never fails with OOM error.

Note about a bug that's no longer relevant:
There was an issue with EVAL*_RO using shebang not being blocked correctly in OOM state:
When an EVAL script declares a shebang, it was by default not allowed to run in OOM state.
but this depends on a flag that is updated before the command is executed, which was not updated in case of the _RO variants.
the result is that if the previous cached state was outdated (either true or false), the script will either unjustly fail with OOM, or unjustly allowed to run despite the OOM state.
It doesn't affect scripts without a shebang since these depend on the actual commands they run, and since these are only read commands, they don't care for that cached oom state flag.
it did affect scripts with shebang and no allow-oom flag, bug after the change in this PR, scripts that are run with eval_ro would implicitly have that flag so again the cached state doesn't matter.

todo:

update the documentation, and also add a note in the original PR which may be referred to by the release notes. Doc PR: Doc PR: script flags update redis-doc#1954

p.s. this isn't a breaking change since all it does is allow scripts to run when they should / could rather than blocking them.

madolson · 2022-05-09T19:44:35Z

tests/unit/scripting.tcl

+        # Fail to execute regardless of RO script content when we use default flags in OOM condition
+        assert_error {OOM allow-oom flag is not set on the script, can not run it when used memory > 'maxmemory'} {
+            r eval_ro {#!lua flags=
+                return 1
+            } 0
+        }


All write commands are implicitly use-memory, and since they are disallowed in EVAL*_RO I would expect that an EVAL*_RO to never fail on OOM conditions.

that's right. but the shebang flags can define that the script should not be run in OOM condition.
do you think we should ignore that flag in EVAL_RO, FCALL_RO and let the script run even when we're in OOM state?
@MeirShpilraien @yoav-steinberg WDYT?

thinking about it more, a script that declares a shebang, has to declare the no-writes in order to be used in EVAL_RO.
that means that the entity that defines when the script will fail, is the script author, not the caller (i.e. the script code not which command was used).
this is certainly true for functions where the caller doesn't necessarily knows what the script really does.

so i think we should focus on two things:

it's quite clear what should be done with functions, and i suppose we want eval to be consistent with that.

i suppose the only debate here should be about eval scripts that don't declare a shebang at all.

i think these work just fine in both EVAL and EVAL_RO. i'll add a test to demonstrate that.

OOM doesn't really make sense in the context of EVAL_RO scripts though. It seems like an unexpected failure mode to have a script, which doesn't use memory, to fail on OOM conditions if you add a shebang. Can't we just document that these two modes are different?

document what?
that in both EVAL_RO and FCALL_RO, the default flags are ignored and the script is executed as if it declared allow-oom?

would you consider doing the same for no-writes?
i.e. user will be able to call a function that didn't declare no-writes by using FCALL_RO?
the script would start running, and then abort with error if it uses any write command?

you mean that any attempt to run a function with FCALL_RO should fail unless it is marked with allow-oom (even when we're not in OOM state)?

i think that would be suitable for a command named FCALL_OOM, buy luckily we don't have / need one.
i kinda regret adding that command in the first place (EVAL_RO). if it's sole purpose is for client routing to replicas, and the grantee that no matter the script it cannot do any writes, i don't see why it should require the allow-oom flag.

can you explain again what's the benefit of doing that?

Maybe a simpler idea is that no-write implies allow-oom? I'm not really aware of a case where a non-write command would consume additional memory, so there isn't really a reason to block in OOM scenarios. Since OOM scenarios are infrequent, I would rather guide users to setting the right flags so their application works as expected in OOM scenarios.

we discussed that in a core-team meeting and concluded we wanna proceed with this idea (no-write to imply allow-oom), i'll make a separate PR for that and we can further discuss it there.

on a second thought, although that change is completely unrelated to the other fix in this PR, effects and the tests might be related, so let's keep the discussion here.

So some background:
in #10066 we added the no-write and allow-oom flags, i seem to remember that we did consider setting allow-oom implicitly when no-write is set, it was discussed here #10066 (comment), and was rejected because i wanted similar flags and behavior as normal redis commands and module commands (to avoid confusion).
but later we negated the flags so we have different defaults, so maybe we should reconsider that (despite being a breaking change comparing to redis 7.0.0)

@MeirShpilraien do you remember any other detail?

For the record, i discussed it with Meir.
It seems we clearly stated in the original PR top comment that:

if the function will not turn on this flag it will not be possible to run it if OOM reached (even if the function declares no-writes and even if fcall_ro is used). If this flag is set, any command will be allow on OOM (even those that is marked with CMD_DENYOOM)

But looking back at the original discussions, we can't spot any reason not to revert that decision and have no-writes implicitly count as allow-oom too.

i decided to keep the flags separate and test both instead of set one when the other is set. The tests i added for the bug in server.c not updating the server.script_oom flag are no longer relevant since that state is now impossible to test.

oranagra · 2022-05-16T10:14:10Z

updated the code, and top comment.
note that the original bug in this PR is no longer reachable (see top comment)

madolson

LGTM

…too. (redis#10699) Scripts that have the `no-writes` flag, cannot execute write commands, and since all `deny-oom` commands are write commands, we now act as if the `allow-oom` flag is implicitly set for scripts that set the `no-writes` flag. this also implicitly means that the EVAL*_RO and FCALL_RO commands can never fails with OOM error. Note about a bug that's no longer relevant: There was an issue with EVAL*_RO using shebang not being blocked correctly in OOM state: When an EVAL script declares a shebang, it was by default not allowed to run in OOM state. but this depends on a flag that is updated before the command is executed, which was not updated in case of the `_RO` variants. the result is that if the previous cached state was outdated (either true or false), the script will either unjustly fail with OOM, or unjustly allowed to run despite the OOM state. It doesn't affect scripts without a shebang since these depend on the actual commands they run, and since these are only read commands, they don't care for that cached oom state flag. it did affect scripts with shebang and no allow-oom flag, bug after the change in this PR, scripts that are run with eval_ro would implicitly have that flag so again the cached state doesn't matter. p.s. this isn't a breaking change since all it does is allow scripts to run when they should / could rather than blocking them.

fix issue with EVAL*_RO using shebang not being blocked in OOM state

76ca648

madolson reviewed May 9, 2022

View reviewed changes

oranagra added 3 commits May 10, 2022 09:52

add a test that sohws eval_ro being wrongly blocked too

f246b62

more tests for shebang-less scripts in OOM state

cd26689

The no-writes flag now implies allow-oom

f09a256

i decided to keep the flags separate and test both instead of set one when the other is set. The tests i added for the bug in server.c not updating the server.script_oom flag are no longer relevant since that state is now impossible to test.

oranagra changed the title ~~fix issue with EVAL*_RO using shebang not being blocked correctly in OOM state~~ Scripts that declare the no-writes flag are implicitly allow-oom too. May 16, 2022

oranagra requested review from MeirShpilraien and madolson May 16, 2022 10:08

update outdated functions test

ce04587

oranagra added release-notes indication that this issue needs to be mentioned in the release notes state:major-decision Requires core team consensus labels May 16, 2022

madolson approved these changes May 16, 2022

View reviewed changes

eduardobr mentioned this pull request May 17, 2022

[DOC] Misleading info about execution of scripts against read-only replicas redis/redis-doc#1948

Closed

This was referenced May 22, 2022

Function Flags support (no-writes, no-cluster, allow-state, allow-oom) #10066

Merged

script flags update redis/redis-doc#1954

Merged

oranagra merged commit b0e18f8 into redis:unstable May 22, 2022

oranagra deleted the eval_ro_oom branch May 22, 2022 13:03

oranagra mentioned this pull request Jun 8, 2022

Release 7.0.1 #10829

Merged

Scripts that declare the no-writes flag are implicitly allow-oom too. #10699

Scripts that declare the no-writes flag are implicitly allow-oom too. #10699

Uh oh!

Conversation

oranagra commented May 9, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

todo:

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

oranagra May 16, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

oranagra commented May 16, 2022

Uh oh!

madolson left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Scripts that declare the `no-writes` flag are implicitly `allow-oom` too. #10699

Scripts that declare the `no-writes` flag are implicitly `allow-oom` too. #10699

oranagra commented May 9, 2022 •

edited

Loading

oranagra May 16, 2022 •

edited

Loading