Skip to content

Conversation

@guybe7
Copy link
Collaborator

@guybe7 guybe7 commented Jan 13, 2021

Edit: the interfaces here later changed, see #10056

Fix #7297

The problem:

Today, there is no way for a client library or app to know the key name indexes for commands such as ZUNIONSTORE/EVAL and others with "numkeys", since COMMAND INFO returns no useful info for them.

For cluster-aware redis clients, this requires to 'patch' the client library code specifically for each of these commands or to resolve each execution of these commands with COMMAND GETKEYS.

The solution:

Introducing key specs other than the legacy "range" (first,last,step)

The 8th element of the command info array, if exists, holds an array of key specs. The array may be empty, which indicates the command doesn't take any key arguments or may contain one or more key-specs, each one may leads to the discovery of 0 or more key arguments.

A client library that doesn't support this key-spec feature will keep using the first,last,step and movablekeys flag which will obviously remain unchanged.

A client that supports this key-specs feature needs only to look at the key-specs array. If it finds an unrecognized spec, it must resort to using COMMAND GETKEYS if it wishes to get all key name arguments, but if all it needs is one key in order to know which cluster node to use, then maybe another spec (if the command has several) can supply that, and there's no need to use GETKEYS.

Each spec is an array of arguments, first one is the spec name, the second is an array of flags, and the third is an array containing details about the spec (specific meaning for each spec type)
The initial flags we support are "read" and "write" indicating if the keys that this key-spec finds are used for read or for write. clients should ignore any unfamiliar flags.

In order to easily find the positions of keys in a given array of args we introduce keys specs. There are two logical steps of key specs:

  1. start_search: Given an array of args, indicate where we should start searching for keys
  2. find_keys: Given the output of start_search and an array of args, indicate all possible indices of keys.

start_search step specs

  • index: specify an argument index explicitly
    • index: 0 based index (1 means the first command argument)
  • keyword: specify a string to match in argv. We should start searching for keys just after the keyword appears.
    • keyword: the string to search for
    • start_search: an index from which to start the keyword search (can be negative, which means to search from the end)

Examples:

  • SET has start_search of type index with value 1
  • XREAD has start_search of type keyword with value [“STREAMS”,1]
  • MIGRATE has start_search of type keyword with value [“KEYS”,-2]

find_keys step specs

  • range: specify [count, step, limit].
    • lastkey: index of the last key. relative to the index returned from begin_search. -1 indicating till the last argument, -2 one before the last
    • step: how many args should we skip after finding a key, in order to find the next one
    • limit: if count is -1, we use limit to stop the search by a factor. 0 and 1 mean no limit. 2 means ½ of the remaining args, 3 means ⅓, and so on.
  • “keynum”: specify [keynum_index, first_key_index, step].
    • keynum_index: is relative to the return of the start_search spec.
    • first_key_index: is relative to keynum_index.
    • step: how many args should we skip after finding a key, in order to find the next one

Examples:

  • SET has range of [0,1,0]
  • MSET has range of [-1,2,0]
  • XREAD has range of [-1,1,2]
  • ZUNION has start_search of type index with value 1 and find_keys of type keynum with value [0,1,1]
  • AI.DAGRUN has start_search of type keyword with value [“LOAD“,1] and find_keys of type keynum with value [0,1,1] (see https://oss.redislabs.com/redisai/master/commands/#aidagrun)

Note: this solution is not perfect as the module writers can come up with anything, but at least we will be able to find the key args of the vast majority of commands.
If one of the above specs can’t describe the key positions, the module writer can always fall back to the getkeys-api option.

Some keys cannot be found easily (KEYS in MIGRATE: Imagine the argument for AUTH is the string “KEYS” - we will start searching in the wrong index).
The guarantee is that the specs may be incomplete (incomplete will be specified in the spec to denote that) but we never report false information (assuming the command syntax is correct).
For MIGRATE we start searching from the end - startfrom=-1 - and if one of the keys is actually called "keys" we will report only a subset of all keys - hence the incomplete flag.
Some incomplete specs can be completely empty (i.e. UNKNOWN begin_search) which should tell the client that COMMAND GETKEYS (or any other way to get the keys) must be used (Example: For SORT there is no way to describe the STORE keyword spec, as the word "store" can appear anywhere in the command).

We will expose these key specs in the COMMAND command so that clients can learn, on startup, where the keys are for all commands instead of holding hardcoded tables or use COMMAND GETKEYS in runtime.

Comments:

  1. Redis doesn't internally use the new specs, they are only used for COMMAND output.
  2. In order to support the current COMMAND INFO format (reply array indices 4, 5, 6) we created a synthetic range, called legacy_range, that, if possible, is built according to the new specs.
  3. Redis currently uses only getkeys_proc or the legacy_range to get the keys indices (in COMMAND GETKEYS for example).

"incomplete" specs:
the command we have issues with are MIGRATE, STRALGO, and SORT
for MIGRATE, because the token KEYS, if exists, must be the last token, we can search in reverse. it one of the keys is actually the string "keys" will return just a subset of the keys (hence, it's "incomplete")
for SORT and STRALGO we can use this heuristic (the keys can be anywhere in the command) and therefore we added a key spec that is both "incomplete" and of "unknown type"

if a client encounters an "incomplete" spec it means that it must find a different way (either COMMAND GETKEYS or have its own parser) to retrieve the keys.
please note that all commands, apart from the three mentioned above, have "complete" key specs

@guybe7 guybe7 force-pushed the keys_spec branch 2 times, most recently from e153acf to 7b73d7a Compare January 13, 2021 13:08
@oranagra
Copy link
Member

@guybe7 for the 3 bugfixes you found (ZDIFFSTORE etc, mentioned above), i think it would be a good idea to issue a separate PR, so we can cherry pick that fix to 6.0.

Copy link
Member

@oranagra oranagra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@guybe7 i assume you compared the first 7 fields of COMMAND output to the previous version and saw that their identical for all command (except for the few which were buggy). right?

p.s. did you run valgrind?

@oranagra
Copy link
Member

@redis/core-team please approve the new COMMAND output fields, see example in the issue, and the new module API

@oranagra oranagra added state:major-decision Requires core team consensus state:needs-doc-pr requires a PR to redis-doc repository release-notes indication that this issue needs to be mentioned in the release notes labels Jan 17, 2021
itamarhaber
itamarhaber previously approved these changes Jan 20, 2021
Copy link
Member

@itamarhaber itamarhaber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I approve of the approach - haven't done a CR though

@oranagra oranagra added approval-needed Waiting for core team approval to be merged state:to-be-merged The PR should be merged soon, even if not yet ready, this is used so that it won't be forgotten labels Jan 20, 2021
@guybe7
Copy link
Collaborator Author

guybe7 commented Jan 22, 2021

last commit handles the above comments by @yossigo + add the "startfrom" arg to the keyword spec(see top comment)

@oranagra
Copy link
Member

@redis/core-team note the changes:

  1. a new argument for the keyword spec, named startfrom
  2. a new keyspec flag named incomplete.
    see the top comment about MIGRATE command.

Copy link
Contributor

@madolson madolson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a high level of question of why do we have flags related to the keyspecs? I was meandering through some clients and I couldn't figure out where you would differentiate the read/write path. I thought about client side caching, but this doesn't give us enough introspection to use it for that case. It doesn't add a lot of overhead, but unclear why we want to maintain it.

I'm good with the change as long as we've thought through why we're sending extra information to clients.

@oranagra
Copy link
Member

@madolson i don't have a clear use case for the read/write flags for the keyspecs, but i think we should make that distinction, i'm sure it'll come in handy, even inside redis one day.
this is in part related to the lookupKeyRead vs lookupKeyWrite saga, and in part lesson learned from Redis On Flash.
is is good to know which key is gonna be only read from, which one will be just overwritten, and which one will be modified (both read and write).

maintaining this info in the command table is not a huge overhead, and since for each command we always know what each key argument is gonna be used for, there's never any doubt as to what to fill in that field.

@madolson
Copy link
Contributor

Sure, I guess my concern is we're optimistically building out this functionality without a clear understanding of how it will be used. The key location use case, which is well understand for cluster mode and ACLs, doesn't really need it.

The one way decision is exposing this to clients. Even if we have use cases inside Redis, it may not be relevant to clients. The API is solid though, we could always introduce new specs/deprecate old ones. The risk is small is small in the end, I'm just not convinced we should release in for 6.2.

@yossigo
Copy link
Collaborator

yossigo commented Jan 28, 2021

@oranagra @guybe7 and myself got to discuss this a bit more. In a nutshell, we conclude that:

  • We can go through another round of improvements to the keyspec proposal above to make it much more flexible and support more command patterns, without increasing complexity much
  • Considering all the different modules out there, we won't get 100% coverage unless we come up with full command grammar. This will require full parsing - much more complex
  • There is still some uncertainty about the value of this, we need to hear clearer voices from the client ecosystem about how useful this is going to be

Calling @mgravell @andymccurdy @sazzad16 @mp911de @luin and other Redis client library developers to provide their perspective. TL;DR the idea is to extend COMMAND INFO significantly to describe more/most/all command patterns so clients could extract key names from command arguments autonomously, and we need to iterate the design and get some validation of the value in doing that.

@mp911de
Copy link

mp911de commented Jan 28, 2021

Lettuce doesn't use COMMAND INFO for key/argument patterns at all. We actually use it only to determine which commands are available (command names). The key/argument pattern is so complicated that it's easier for us to just declare methods statically instead of assembling commands from COMMAND INFO.

@oranagra
Copy link
Member

@mp911de thanks.

  1. so that means that each time a new command is added you have to implement a to extract the keyname arguments?
  2. forgive me for my ignorance (and laziness), does Lettuce supports running arbitrary commands (like execute_command in redis-py)?
    if that's the case, then you want to let users use commands that Lettuce doesn't support yet (or module commands), so using COMMAND GETKEYS or the new keyspecs idea can solve that.
  3. would you consider using the new keyspecs when available? do they look usable to you?
    it can simplify your work, but you'll need to keep the old code in case the redis server that's used doesn't support them.

@mp911de
Copy link

mp911de commented Jan 28, 2021

  1. so that means that each time a new command is added you have to implement a to extract the keyname arguments?

Exactly. This more due to the fact that with Java you kind of expect a method declaration to assist users. Aside from that, we have a method to dispatch plain commands (dispatch("SET", "KEY", "VALUE")-style). Based on that, the user decides what arguments come into a command and we don't have the immediate need to know which argument is a key or a keyword.

  1. would you consider using the new keyspecs when available? do they look usable to you?

We have a parser for the COMMAND response that returns a value object to the caller. We would certainly add the keyspec there for the sake of having complete support for the response but we don't have any use for it.

@oranagra
Copy link
Member

@mp911de so with dispatch (which would be needed to use new command or module commands) how do you know which cluster node to forward the command? (using either COMMAND GETKEYS or relying on the -MOVED would be an overhead).

@mp911de
Copy link

mp911de commented Jan 28, 2021

Excuse my oversimplified representation of our API. The actual code to use dispatch is:

dispatch(MyCommands.SET, new StatusOutput<>(StringCodec.UTF8), new CommandArgs<>(StringCodec.UTF8).addKey(key).addValue(value));

CommandArgs captures which argument is a key and which one is a value, mostly to apply the correct encoding. That also allows us to reason about the command argument ordering.

@guybe7 guybe7 requested a review from oranagra September 2, 2021 16:29
@oranagra
Copy link
Member

oranagra commented Sep 2, 2021

@redis/core-team please approve again. quite a lot have changed since last time.
i think the main difference is that we've split the specs into two steps (start_search and find_keys), and changed the module apis. the top comment was updated so please have a look at it.

i suggest to focus on reviewing the module api, and the output of COMMAND command.

note that the changes to COMMAND command are gonna be reverted soon (that's why we don't document it yet), and instead will be probably put into a new COMMANDS command (the old command will only be kept for backwards compatibility, and the one one will have info on sub-commands).
also note that the command table is gonna be changed soon to a json file, so you can ignore the format there.

@oranagra oranagra added the approval-needed Waiting for core team approval to be merged label Sep 2, 2021
@oranagra oranagra added the state:to-be-merged The PR should be merged soon, even if not yet ready, this is used so that it won't be forgotten label Sep 9, 2021
@itamarhaber
Copy link
Member

Approving the concept and APIs.

Copy link
Collaborator

@yossigo yossigo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Re-reading this after several months, I realize we may need to include pseudo code for the key spec handling, or better yet - provide a reference implementation.

@oranagra
Copy link
Member

running daily CI before merging: https://github.com/redis/redis/actions/runs/1232825354

@oranagra
Copy link
Member

@guybe7 valgrind found some issues in the module API: https://github.com/redis/redis/runs/3596355820?check_suite_focus=true
and some test failure on alpine linux: https://github.com/redis/redis/runs/3596356192?check_suite_focus=true

@oranagra oranagra merged commit 03fcc21 into redis:unstable Sep 15, 2021
oranagra added a commit that referenced this pull request Jan 18, 2022
…DELETE) (#10122)

The new ACL key based permissions in #9974 require the key-specs (#8324) to have more
explicit flags rather than just READ and WRITE. See discussion in #10040

This PR defines two groups of flags:
One about how redis internally handles the key (mutually-exclusive).
The other is about the logical operation done from the user's point of view (3 mutually exclusive
write flags, and one read flag, all optional).
In both groups, if we can't explicitly flag something as explicit read-only, delete-only, or
insert-only, we flag it as `RW` or `UPDATE`.
here's the definition from the code:
```
/* Key-spec flags *
 * -------------- */
/* The following refer what the command actually does with the value or metadata
 * of the key, and not necessarily the user data or how it affects it.
 * Each key-spec may must have exaclty one of these. Any operation that's not
 * distinctly deletion, overwrite or read-only would be marked as RW. */
#define CMD_KEY_RO (1ULL<<0)     /* Read-Only - Reads the value of the key, but
                                  * doesn't necessarily returns it. */
#define CMD_KEY_RW (1ULL<<1)     /* Read-Write - Modifies the data stored in the
                                  * value of the key or its metadata. */
#define CMD_KEY_OW (1ULL<<2)     /* Overwrite - Overwrites the data stored in
                                  * the value of the key. */
#define CMD_KEY_RM (1ULL<<3)     /* Deletes the key. */
/* The follwing refer to user data inside the value of the key, not the metadata
 * like LRU, type, cardinality. It refers to the logical operation on the user's
 * data (actual input strings / TTL), being used / returned / copied / changed,
 * It doesn't refer to modification or returning of metadata (like type, count,
 * presence of data). Any write that's not INSERT or DELETE, would be an UPADTE.
 * Each key-spec may have one of the writes with or without access, or none: */
#define CMD_KEY_ACCESS (1ULL<<4) /* Returns, copies or uses the user data from
                                  * the value of the key. */
#define CMD_KEY_UPDATE (1ULL<<5) /* Updates data to the value, new value may
                                  * depend on the old value. */
#define CMD_KEY_INSERT (1ULL<<6) /* Adds data to the value with no chance of,
                                  * modification or deletion of existing data. */
#define CMD_KEY_DELETE (1ULL<<7) /* Explicitly deletes some content
                                  * from the value of the key. */
```

Unrelated changes:
- generate-command-code.py is only compatible with python3 (modified the shabang)
- generate-command-code.py print file on json parsing error
- rename `shard_channel` key-spec flag to just `channel`.
- add INCOMPLETE flag in input spec of SORT and SORT_RO
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approval-needed Waiting for core team approval to be merged release-notes indication that this issue needs to be mentioned in the release notes state:major-decision Requires core team consensus state:needs-doc-pr requires a PR to redis-doc repository state:to-be-merged The PR should be merged soon, even if not yet ready, this is used so that it won't be forgotten

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

A better approach for COMMAND INFO for movablekeys commands

7 participants