Redis as the SSOT of its commands #9359

@guybe7

Intro

Currently, the information Redis provides in COMMAND is limited and incomplete, causing proxy/client writers to seek missing information in other places.
The goal is to make Redis the single source of truth when it comes to information about its commands.

Steps:

  1. Some commands have no purpose without specifying a sub-command (e.g. CONFIG, CLIENT, MEMORY, etc.). We would like to treat such sub-commands as independent commands. It’ll help with the other issues and will make everything cleaner in general (for example, different loading flag for CONFIG GET vs. CONFIG SET, different ACL categories for CLIENT KILL vs. CLIENT SETNAME). PR: Treat subcommands as commands #9504
  2. Redis clients (e.g. redis-py) need to know, when parsing a command, where the key arguments are (to know to which cluster node to send a command in cluster mode). Currently, COMMAND can help only if the key positions are simple (basically a range of indexes, with a key-step). While that works for some commands (SET, LPUSH, etc.), it doesn’t work for commands that have a more complicated scheme (XADD, ZUNION), so clients either hardcode the handling of “complex” commands or have to call COMMAND GETKEYS (an extra round trip). PR: A better approach for COMMAND INFO for movablekeys commands #8324
  3. Proxies need to know some information about the command (e.g. PING should be sent to all shards while GET is sent only to one shard). More information in the “Command tips” section.
  4. Both redis.io and help.h rely on commands.json because Redis doesn’t provide some information (command usage, first version it was released, short description, complexity, etc.)
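
To make step 2 concrete, here is a minimal sketch of how a client can use the simple (first key, last key, step) triple that COMMAND reports today. The function name `legacy_keys` is hypothetical, and the sketch assumes that a first-key value of 0 means "no keys described" and that a negative last-key value counts from the end of the argument array.

```python
from typing import List


def legacy_keys(argv: List[str], first: int, last: int, step: int) -> List[int]:
    """Return key indices in argv from a (first, last, step) triple.

    Assumption: first == 0 means the command has no keys that this
    simple scheme can describe (the client must find them another way).
    """
    if first == 0:
        return []
    if last < 0:
        # A negative 'last' counts from the end: -1 is the last argument.
        last = len(argv) + last
    return list(range(first, last + 1, step))


# SET key value: a single key at index 1.
print(legacy_keys(["SET", "k", "v"], 1, 1, 1))
# MSET k1 v1 k2 v2: every second argument is a key.
print(legacy_keys(["MSET", "a", "1", "b", "2"], 1, -1, 2))
```

A command such as ZUNION, where the number of keys is itself an argument, cannot be expressed with this triple, which is what motivates the key specs described later.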

The goal is to have the per-command JSON as the single source of truth, containing everything there is to know about a command. We will generate code (commands.json for redis.io, the command table for Redis, etc.) using these files.

Subcommands as commands

If a command is meaningless without a subcommand (CONFIG, MEMORY, and others) it may suggest that we’re actually talking about a few separate logical commands that happen to have a common theme.
These subcommands need to have a separate entry in the command table with their own unique flags and ACL categories.

A list of such commands:

CONFIG
COMMAND
MEMORY
SLOWLOG
OBJECT
SCRIPT
DEBUG
XGROUP
XINFO
CLIENT
CLUSTER
ACL
STRALGO
MODULE
LATENCY

COMMAND command

COMMAND has the following issues:

  1. All the commands above, excluding COMMAND, have no meaning without a subcommand. That means we will need to represent both COMMAND and COMMAND INFO, etc. in the command table.
  2. We would like to eliminate the above commands (except COMMAND) from the command table altogether (i.e. CONFIG by itself is meaningless). We can’t do that because we have to maintain backward-compatibility of COMMAND (i.e. COMMAND INFO config should work)

COMMAND’s output (with no args) will keep the old format, with entries for all commands, for backward compatibility, but we will try to deprecate it.
We will add a new command COMMAND LIST, that’ll return the list of commands in a new format, which will include all the features described in this document.

The idea is that each such command will have an array of sub-commands. Once the server finds a hit on argv[0] as the command name it needs to check whether this command has subcommands and try to match argv[1]. It means that we are no longer talking about a command table, we’re talking about a command tree.
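
The two-level lookup described above can be sketched as follows. This is only an illustration, not the server's implementation: the `Command` class, the `lookup` function, and the choice to return the container command when argv[1] matches no subcommand are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Command:
    """A node in the (hypothetical) command tree."""
    name: str
    subcommands: Dict[str, "Command"] = field(default_factory=dict)


def lookup(tree: Dict[str, Command], argv: List[str]) -> Optional[Command]:
    """Match argv[0] as the command, then try argv[1] as a subcommand."""
    if not argv:
        return None
    cmd = tree.get(argv[0].upper())
    if cmd is None:
        return None
    if cmd.subcommands and len(argv) > 1:
        sub = cmd.subcommands.get(argv[1].upper())
        if sub is not None:
            return sub
    # No subcommand hit: return the container and let the caller decide
    # (COMMAND alone is valid, CONFIG alone would be an error).
    return cmd


tree = {
    "GET": Command("GET"),
    "CONFIG": Command("CONFIG", {
        "GET": Command("CONFIG GET"),
        "SET": Command("CONFIG SET"),
    }),
    "COMMAND": Command("COMMAND", {"INFO": Command("COMMAND INFO")}),
}

print(lookup(tree, ["config", "set", "maxmemory", "1gb"]).name)
print(lookup(tree, ["COMMAND"]).name)
```

Because each subcommand is its own node, it can carry its own flags and ACL categories, which is the point of treating subcommands as commands.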

Key positions

In order to easily find the positions of keys in a given array of args we introduce key specs. A key spec has two logical steps:

  1. start_search: Given an array of args, indicate where we should start searching for keys
  2. find_keys: Given the output of start_search and an array of args, indicate all possible indices of keys.

start_search step specs

  • index: specify an index explicitly
  • keyword: specify a string to match in argv. We should start searching for keys just after the keyword appears. Another property for this spec is an index from which to start the keyword search (can be negative, which means to search from the end)

Examples:

  • SET has start_search of type index with value 1
  • XREAD has start_search of type keyword with value [“STREAMS”,1]
  • MIGRATE has start_search of type keyword with value [“KEYS”,-2]
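
A minimal sketch of the start_search step, under one reading of the description above: a non-negative start index means a forward search for the keyword, and a negative one means a backward search starting that many positions from the end. The function name and the tuple encoding of the spec are hypothetical.

```python
from typing import List, Optional, Tuple, Union

# Hypothetical encoding: ("index", pos) or ("keyword", (keyword, start_from)).
StartSpec = Tuple[str, Union[int, Tuple[str, int]]]


def start_search(argv: List[str], spec: StartSpec) -> Optional[int]:
    """Return the index at which the key search should begin, or None."""
    kind, value = spec
    if kind == "index":
        return value if value < len(argv) else None
    if kind == "keyword":
        keyword, start_from = value
        if start_from >= 0:
            candidates = range(start_from, len(argv))
        else:
            # Negative start: begin that many positions from the end and
            # walk backwards, never reaching argv[0] (the command name).
            candidates = range(len(argv) + start_from, 0, -1)
        for i in candidates:
            if argv[i].upper() == keyword.upper():
                return i + 1  # keys start just after the keyword
        return None
    raise ValueError(f"unknown start_search type: {kind}")


print(start_search(["SET", "k", "v"], ("index", 1)))
print(start_search(
    ["XREAD", "COUNT", "2", "STREAMS", "s1", "s2", "0", "0"],
    ("keyword", ("STREAMS", 1)),
))
print(start_search(
    ["MIGRATE", "host", "6379", "", "0", "5000", "KEYS", "k1", "k2"],
    ("keyword", ("KEYS", -2)),
))
```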

find_keys step specs

  • range: specify [count, step, limit].
    • count: number of expected keys. -1 means all keys up to the last argument, -2 up to one before the last, and so on
    • step: how many args should we skip after finding a key, in order to find the next one
    • limit: if count is -1, we use limit to stop the search by a factor. 0 and 1 mean no limit. 2 means ½ of the remaining args, 3 means ⅓, and so on.
  • “keynum”: specify [keynum_index, first_key_index, step]. Note: keynum_index is relative to the return of the start_search spec. first_key_index is relative to keynum_index.

Examples:

  • SET has range of [1,1,0]
  • MSET has range of [-1,2,0]
  • XREAD has range of [-1,1,2]
  • ZUNION has start_search of type index with value 1 and find_keys of type keynum with value [0,1,1]
  • AI.DAGRUN has start_search of type keyword with value [“LOAD”,1] and find_keys of type keynum with value [0,1,1] (see https://oss.redislabs.com/redisai/master/commands/#aidagrun)

Note: this solution is not perfect as the module writers can come up with anything, but at least we will be able to find the key args of the vast majority of commands.
If one of the above specs can’t describe the key positions, the module writer can always fall back to the getkeys-api option.
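
The find_keys specs above can be sketched as two small functions that take the index returned by start_search. This is one interpretation of the proposal, not the final implementation: the function names are hypothetical, step is assumed to be at least 1, and indices that would fall outside argv are simply dropped.

```python
from typing import List


def find_keys_range(argv: List[str], start: int,
                    count: int, step: int, limit: int) -> List[int]:
    """range spec: [count, step, limit], searching from 'start'."""
    n = len(argv)
    if count >= 0:
        indices = [start + i * step for i in range(count)]
        return [i for i in indices if i < n]
    if count == -1 and limit > 1:
        # limit=2 keeps the first half of the remaining args, 3 a third...
        last = start + (n - start) // limit - 1
    else:
        last = n + count  # -1: last argument, -2: one before the last
    return list(range(start, last + 1, step))


def find_keys_keynum(argv: List[str], start: int, keynum_index: int,
                     first_key_index: int, step: int) -> List[int]:
    """keynum spec: [keynum_index, first_key_index, step]."""
    keynum_pos = start + keynum_index       # relative to start_search
    if keynum_pos >= len(argv):
        return []
    numkeys = int(argv[keynum_pos])         # raises ValueError if malformed
    first = keynum_pos + first_key_index    # relative to keynum_index
    indices = [first + i * step for i in range(numkeys)]
    return [i for i in indices if i < len(argv)]


# MSET k1 v1 k2 v2 with range [-1,2,0], start index 1
print(find_keys_range(["MSET", "a", "1", "b", "2"], 1, -1, 2, 0))
# XREAD ... STREAMS s1 s2 0 0 with range [-1,1,2], start index 4
print(find_keys_range(
    ["XREAD", "COUNT", "2", "STREAMS", "s1", "s2", "0", "0"], 4, -1, 1, 2))
# ZUNION 2 z1 z2 WEIGHTS 1 2 with keynum [0,1,1], start index 1
print(find_keys_keynum(
    ["ZUNION", "2", "z1", "z2", "WEIGHTS", "1", "2"], 1, 0, 1, 1))
```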

Some keys cannot be found easily (KEYS in MIGRATE: imagine the argument for AUTH is the string “KEYS”; we would start searching at the wrong index). The guarantee is that a spec may be incomplete (it will be marked as incomplete to denote that), but it never reports false information.

We will expose these key specs in the COMMAND command so that clients can learn, on startup, where the keys are for all commands instead of holding hardcoded tables or using COMMAND GETKEYS at runtime.

Details: #7297 (comment) (possibly outdated).
PR: #8324

Needless to say, when a module registers a command it should provide all that info as well. Old modules that don't will only expose a basic index and range.

Command tips

The redisCommand structure should have a list of arbitrary “tips” (strings). Unlike the existing command flags (saved as a bitfield), which are internal by nature (they tell Redis how to handle commands), the command tips are “external”: they are used by proxies and clients to learn how to handle the command (for example, to which shard to send the command, among other things).
There will be a predefined list of common tips but we will have to support arbitrary tips as well.
We will need to expose a module API for adding tips to commands.

Needless to say, modules can add commands with tips similar to those of built-in commands, or with arbitrary ones.

Examples:

  • requesting_policy: to which shard(s) do we send the command?
    Example: GET goes to a single shard (depends on the slot of the key) while PING goes to all shards.
  • replying_policy: what do we do with all the replies we got from the shards? (relevant when the requesting_policy targets more than one shard).
    Example: For CONFIG SET the proxy needs to make sure all shards returned OK. For DBSIZE the proxy needs to aggregate the output from all shards and synthesize a reply of its own.
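
As an illustration of how a proxy might act on a replying_policy tip, here is a minimal sketch. The policy values "all_succeeded" and "agg_sum" are hypothetical names chosen for this example (the proposal does not fix a vocabulary here), and replies are modeled as plain Python values rather than protocol frames.

```python
from typing import Any, List


def merge_replies(replying_policy: str, replies: List[Any]) -> Any:
    """Combine per-shard replies into the single reply sent to the client."""
    if replying_policy == "all_succeeded":
        # e.g. CONFIG SET: succeed only if every shard answered OK;
        # otherwise surface the first non-OK reply.
        for reply in replies:
            if reply != "OK":
                return reply
        return "OK"
    if replying_policy == "agg_sum":
        # e.g. DBSIZE: the proxy synthesizes a reply by summing the shards.
        return sum(replies)
    raise ValueError(f"unsupported replying_policy: {replying_policy}")


print(merge_replies("all_succeeded", ["OK", "OK", "OK"]))
print(merge_replies("agg_sum", [120, 75, 5]))
```

Keeping the policy as a string tip, rather than a per-command special case in the proxy, is what lets module commands opt in to the same behavior.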

Commands' JSON

We want Redis to have all necessary command information, so we need to save all information currently exposed in redis.io's commands.json:

  • command structure, including all args and their types, what’s an enum, what’s optional, etc.
  • complexity information
  • short description
  • which version introduced the command
  • datatype (could be “general”)

We want Redis to have the ability to generate the old commands.json, making it the only source of truth regarding command information.

We will have a JSON file per command.
At the top level, each per-command .json file contains a JSON object (hashmap), mapping command fields to their values.
Possible fields:

  • summary: string describing the command's functionality
  • complexity: string describing the command's performance complexity
  • since: version-string with the first version number supporting the command
  • group: string with the name of the command group the command belongs to
  • arity: number of args this command has (where negative values mean “at least”)
  • return_summary: string representing the return type(s)
  • return_types: map of resp_version -> list(return-type-string)
  • command_flags: list of strings
  • acl_categories: list of strings
  • history: string representing behavior changes in the history of the command
  • key_specs: a list of objects (two keys each, start_search, find_keys - each of them is an object)
  • arguments: list of argument specification dictionaries (if undefined, command takes no arguments)
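
For instance, the arity field can be checked against an incoming argument array as in the sketch below. It assumes, as in the existing command table, that the count includes the command name itself; the function name `arity_ok` is hypothetical.

```python
from typing import List


def arity_ok(arity: int, argv: List[str]) -> bool:
    """Positive arity: exact argument count. Negative arity: at least |arity|.

    Assumption: argv[0] (the command name) is included in the count.
    """
    if arity >= 0:
        return len(argv) == arity
    return len(argv) >= -arity


# SET has arity -3: "SET key value" plus any number of options.
print(arity_ok(-3, ["SET", "k", "v"]))
print(arity_ok(-3, ["SET", "k", "v", "NX", "EX", "10"]))
print(arity_ok(-3, ["SET", "k"]))
```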

Each argument dictionary can have the following fields:

  • name: String. The name of the argument.
  • description: String. Short description of the argument (optional)
  • type: String. The type of argument. Possible values:
    • "string": a string-valued argument
    • "integer": an integer-valued argument
    • "double": a floating-point argument
    • "key": a string-valued argument representing a key in the datastore
    • "pattern": a string representing a glob-style pattern
    • "unix_time": integer-valued argument is a Unix timestamp value in seconds
    • "oneof": multiple options that mutually exclude each other. in this case the field "value" is a list of arguments
    • "block": not an individual argument, but a block of multiple arguments. in this case, the field "value" is a list of arguments
  • value: String or List. Either the name to display or a list of dictionaries (each is an argument, so arguments can be nested)
  • token: String. Name of the preceding token if exists (optional)
  • optional: Boolean. True if this argument is optional. (optional)
  • multiple: Boolean. True if this argument can be repeated multiple times. (optional)
  • since: String. The first version that introduced this argument. (optional)
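
To show how these argument dictionaries could drive generated documentation (such as the usage strings in help.h), here is a minimal sketch that renders a synopsis line. The rendering rules (square brackets for optional, "|" between oneof alternatives, "..." for multiple) are assumptions of this example, and `SET_ARGS` is a trimmed-down copy of the SET specification with JSON null written as Python None.

```python
from typing import Any, Dict, List

Arg = Dict[str, Any]


def render(arg: Arg) -> str:
    """Render one argument dictionary as a piece of a usage string."""
    kind = arg.get("type")
    if kind == "oneof":
        body = "|".join(render(a) for a in arg["value"])
    elif kind == "block":
        body = " ".join(render(a) for a in arg["value"])
    else:
        body = arg.get("value") or ""  # pure-token args have no value
    if arg.get("token"):
        body = f"{arg['token']} {body}".strip()
    if arg.get("multiple"):
        body = f"{body} [{body} ...]"
    if arg.get("optional"):
        body = f"[{body}]"
    return body


def synopsis(name: str, arguments: List[Arg]) -> str:
    return " ".join([name] + [render(a) for a in arguments])


SET_ARGS: List[Arg] = [
    {"name": "key", "type": "key", "value": "key"},
    {"name": "value", "type": "string", "value": "value"},
    {"name": "expire", "optional": True, "type": "oneof", "value": [
        {"name": "ex", "type": "integer", "token": "EX", "value": "seconds"},
        {"name": "px", "type": "integer", "token": "PX",
         "value": "milliseconds"},
        {"name": "keepttl", "type": None, "token": "KEEPTTL", "value": None},
    ]},
    {"name": "existence", "optional": True, "type": "oneof", "value": [
        {"name": "nx", "type": None, "token": "NX", "value": None},
        {"name": "xx", "type": None, "token": "XX", "value": None},
    ]},
    {"name": "get", "optional": True, "type": None, "token": "GET",
     "value": None},
]

print(synopsis("SET", SET_ARGS))
# SET key value [EX seconds|PX milliseconds|KEEPTTL] [NX|XX] [GET]
```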

Examples:

{
    "SET": {
        "summary": "Set the string value of a key",
        "complexity": "O(1)",
        "since": "1.0.0",
        "group": "string",
        "arity": -3,
        "return_summary": "Simple string reply: OK if SET was executed correctly."
                "Null reply: (nil) if the SET operation was not performed because the user specified the NX or XX option but the condition was not met."
                "If the command is issued with the GET option, the above does not apply. It will instead reply as follows, regardless if the SET was actually performed:"
                "Bulk string reply: the old string value stored at key."
                "Null reply: (nil) if the key did not exist.",
        "return_types": {
            "2": ["+OK", "<bulk-string>", "<null-bulk-string>"],
            "3": ["+OK", "<bulk-string>", "<null>"],
        },
        "command_flags": [
            "write",
            "use-memory",
        ],
        "acl_categories": [
            "string",
        ],
        "key_specs": [
            {
                "start_search": {
                    "index": {
                        "pos": 1,
                    }
                },
                "find_keys": {
                    "range": {
                        "count": 1,
                        "step": 1,
                        "limit": 0,
                    }
                },
                "flags": [
                    "write",
                ],
            }
        ],
        "history": ">= 7.0: Allowed the NX and GET options to be used together.",
        "arguments": [
            {
                "name": "key",
                "type": "key",
                "value": "key",
            },
            {
                "name": "value",
                "type": "string",
                "value": "value",
            },
            {
                "name": "expire",
                "optional": true,
                "type": "oneof",
                "value": [
                    {
                        "name": "ex",
                        "since": "2.6.12",
                        "type": "integer",
                        "token": "EX",
                        "value": "seconds",
                    },
                    {
                        "name": "px",
                        "since": "2.6.12",
                        "type": "integer",
                        "token": "PX",
                        "value": "milliseconds",
                    },
                    {
                        "name": "keepttl",
                        "since": "6.0.0",
                        "type": null,
                        "token": "KEEPTTL",
                        "value": null,
                    },
                ],
            },
            {
                "name": "existence",
                "optional": true,
                "type": "oneof",
                "value": [
                    {
                        "name": "nx",
                        "type": null,
                        "token": "NX",
                        "value": null,
                    },
                    {
                        "name": "xx",
                        "type": null,
                        "token": "XX",
                        "value": null,
                    },
                ],
            },
            {
                "name": "get",
                "since": "6.2.0",
                "optional": true,
                "type": null,
                "token": "GET",
                "value": null,
            },
        ],
    }
}
{
    "XADD": {
        "summary": "Appends a new entry to a stream",
        "complexity": "O(1) when adding a new entry, O(N) when trimming, where N is the number of entries evicted.",
        "since": "5.0.0",
        "group": "stream",
        "arity": -5,
        "return_summary": "Bulk string reply, specifically:"
            "The command returns the ID of the added entry. The ID is the one auto-generated if * is passed as ID argument, otherwise the command just returns the same ID specified by the user during insertion."
            "The command returns a Null reply when used with the NOMKSTREAM option and the key doesn't exist.",
        "return_types": {
            "2": ["<bulk-string>", "<null-bulk-string>"],
            "3": ["<bulk-string>", "<null>"],
        },
        "command_flags": [
            "write",
            "use-memory",
            "fast",
            "random",
        ],
        "acl_categories": [
            "stream",
        ],
        "key_specs": [
            {
                "start_search": {
                    "index": {
                        "pos": 1,
                    }
                },
                "find_keys": {
                    "range": {
                        "count": 1,
                        "step": 1,
                        "limit": 0,
                    }
                },
                "flags": [
                    "write",
                ],
            }
        ],
        "arguments": [
            {
                "name": "key",
                "type": "key",
                "value": "key",
            },
            {
                "name": "trimming",
                "optional": true,
                "type": "block",
                "value": [
                    {
                        "name": "strategy",
                        "optional": false,
                        "type": "oneof",
                        "value": [
                            {
                                "name": "maxlen",
                                "type": null,
                                "token": "MAXLEN",
                                "value": null,
                            },
                            {
                                "name": "minid",
                                "since": "6.2.0",
                                "type": null,
                                "token": "MINID",
                                "value": null,
                            },
                        ],
                    },
                    {
                        "name": "operator",
                        "optional": true,
                        "type": "oneof",
                        "value": [
                            {
                                "name": "exact",
                                "type": null,
                                "token": "=",
                                "value": null,
                            },
                            {
                                "name": "inexact",
                                "type": null,
                                "token": "~",
                                "value": null,
                            },
                        ],
                    },
                    {
                        "name": "threshold",
                        "type": "integer",
                        "value": "threshold",
                    },
                    {
                        "name": "limit",
                        "optional": true,
                        "since": "6.2.0",
                        "type": "integer",
                        "token": "LIMIT",
                        "value": "limit",
                    },
                ],
            },
            {
                "name": "nomakestream",
                "optional": true,
                "since": "6.2.0",
                "type": null,
                "token": "NOMKSTREAM",
                "value": null,
            },
            {
                "name": "id",
                "type": "oneof",
                "value": [
                    {
                        "name": "auto",
                        "type": null,
                        "token": "*",
                        "value": null,
                    },
                    {
                        "name": "specific",
                        "type": null,
                        "token": "ID",
                        "value": null,
                    },
                ],
            },
            {
                "name": "fieldandvalues",
                "type": "block",
                "multiple": true,
                "value": [
                    {
                        "name": "field",
                        "type": "string",
                        "value": "field",
                    },
                    {
                        "name": "value",
                        "type": "string",
                        "value": "value",
                    },
                ],
            },
        ],
    }
}

Updates
21.8.12: Command tips should be configurable and not hardcoded: the user can provide tips.json on startup, so that every user can supply their own tips. It should support module commands as well (which means we need to parse tips.json only after all modules have been loaded). What about loading modules at runtime?
