-
Notifications
You must be signed in to change notification settings - Fork 24.4k
Add reply_schema to command json files (internal for now) #10273
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
oranagra
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
please add a better top comment before we call others to take a look.
it should include some description of the purpose and design / decisions we already took.
you can maybe copy some from this comment.
also, in order to better judge this, it would be nice to look at what the user facing documentation is gonna look like, but maybe a detailed top comment can fill that place for now.
please also add some todo list which states the status of the task and the remaining issues (like completing the data for all commands, providing module API, and writing docs.)
|
@redis/core-team please have a look and comment. |
|
Just my 2 cents from a first glance. Overall I think this will help to generate type safe interfaces for client libraries. Variants
To get a clean high-level API I would generate one method for each command (maybe this would be considered a medium-level API)) and if a subset of arguments causes a specific reply schema to be used, I would create a method for that too (this would be the high-level API). Getting to my favorite example: CLIENT KILL While annotating the argument for the CLIENT KILL command seems to be the easier solution as you can simply say e.g. "yields: variant a of response_schema top level oneOf", this could get messy if the dependency is not as clear as with CLIENT KILL. E.g. there is no top level oneof argument type. Thus, I guess augmenting the reply schema would be the better (more general) solution as you can directly annotate which arguments in which combination yield the specific variant. I mean in the end we (as in client library authors) can aggregate these information ourselves and feed that into the code generator but I think having this upstream would be a benefit to all client lib generators. Unique idsAnother thing I would like to see would be |
|
I agree we better solve this mapping of arguments to reply types. |
|
i see we have an issue with 32bit build and one of the new GEO tests. |
revert coverage test for LATENCY GRAPH
|
with fix to GEO on 32bit: https://github.com/guybe7/redis/actions/runs/4374580524 |
|
@redis/core-team FYI, i would like to merge this one tomorrow.. let me know if you have any concerns. |
|
I have no concerns, the actual change to the core seems small. |
|
didn't CR, only a question about the test part: does it validate the key_specs and other sections? |
|
@soloestoy no, it only validates the reply schema |
|
We can log key lookups of each command and then try to match these. But it's a topic for a different campaign.. |
Redis build runs `utils/generate-command-code.py` if there is a change in `src/commands/*.json` files. In #10273, we used f-string format in this script. f-string feature was introduced in python3.6. If a system has an earlier python version, build might fail. Added some changes to make that script compatible with earlier python versions.
The reason is in reply-schemas-validator, the resp of the
client we create will be client_default_resp (currently 3):
```
client *createClient(connection *conn) {
client *c = zmalloc(sizeof(client));
#ifdef LOG_REQ_RES
reqresReset(c, 0);
c->resp = server.client_default_resp;
#else
c->resp = 2;
#endif
}
```
But current_resp3 in redis-cli will be inconsistent with it,
the test adds a simple hello 3 to avoid this failure, test
was added in redis#11873.
Added help descriptions for log-req-res, force-resp3 and
dont-pre-clean options, options was added in redis#10273
The reason is in reply-schemas-validator, the resp of the
client we create will be client_default_resp (currently 3):
```
client *createClient(connection *conn) {
client *c = zmalloc(sizeof(client));
#ifdef LOG_REQ_RES
reqresReset(c, 0);
c->resp = server.client_default_resp;
#else
c->resp = 2;
#endif
}
```
But current_resp3 in redis-cli will be inconsistent with it,
the test adds a simple hello 3 to avoid this failure, test
was added in #11873.
Added help descriptions for dont-pre-clean option, it was
added in #10273
Work in progress towards implementing a reply schema as part of COMMAND DOCS, see redis#9845 Since ironing the details of the reply schema of each and every command can take a long time, we would like to merge this PR when the infrastructure is ready, and let this mature in the unstable branch. Meanwhile the changes of this PR are internal, they are part of the repo, but do not affect the produced build. ### Background In redis#9656 we add a lot of information about Redis commands, but we are missing information about the replies ### Motivation 1. Documentation. This is the primary goal. 2. It should be possible, based on the output of COMMAND, to be able to generate client code in typed languages. In order to do that, we need Redis to tell us, in detail, what each reply looks like. 3. We would like to build a fuzzer that verifies the reply structure (for now we use the existing testsuite, see the "Testing" section) ### Schema The idea is to supply some sort of schema for the various replies of each command. The schema will describe the conceptual structure of the reply (for generated clients), as defined in RESP3. Note that the reply structure itself may change, depending on the arguments (e.g. `XINFO STREAM`, with and without the `FULL` modifier) We decided to use the standard json-schema (see https://json-schema.org/) as the reply-schema. Example for `BZPOPMIN`: ``` "reply_schema": { "oneOf": [ { "description": "Timeout reached and no elements were popped.", "type": "null" }, { "description": "The keyname, popped member, and its score.", "type": "array", "minItems": 3, "maxItems": 3, "items": [ { "description": "Keyname", "type": "string" }, { "description": "Member", "type": "string" }, { "description": "Score", "type": "number" } ] } ] } ``` #### Notes 1. It is ok that some commands' reply structure depends on the arguments and it's the caller's responsibility to know which is the relevant one. this comes after looking at other request-reply systems like OpenAPI, where the reply schema can also be oneOf and the caller is responsible to know which schema is the relevant one. 2. The reply schemas will describe RESP3 replies only. even though RESP3 is structured, we want to use reply schema for documentation (and possibly to create a fuzzer that validates the replies) 3. For documentation, the description field will include an explanation of the scenario in which the reply is sent, including any relation to arguments. for example, for `ZRANGE`'s two schemas we will need to state that one is with `WITHSCORES` and the other is without. 4. For documentation, there will be another optional field "notes" in which we will add a short description of the representation in RESP2, in case it's not trivial (RESP3's `ZRANGE`'s nested array vs. RESP2's flat array, for example) Given the above: 1. We can generate the "return" section of all commands in [redis-doc](https://redis.io/commands/) (given that "description" and "notes" are comprehensive enough) 2. We can generate a client in a strongly typed language (but the return type could be a conceptual `union` and the caller needs to know which schema is relevant). see the section below for RESP2 support. 3. We can create a fuzzer for RESP3. ### Limitations (because we are using the standard json-schema) The problem is that Redis' replies are more diverse than what the json format allows. This means that, when we convert the reply to a json (in order to validate the schema against it), we lose information (see the "Testing" section below). The other option would have been to extend the standard json-schema (and json format) to include stuff like sets, bulk-strings, error-string, etc. but that would mean also extending the schema-validator - and that seemed like too much work, so we decided to compromise. Examples: 1. We cannot tell the difference between an "array" and a "set" 2. We cannot tell the difference between simple-string and bulk-string 3. we cannot verify true uniqueness of items in commands like ZRANGE: json-schema doesn't cover the case of two identical members with different scores (e.g. `[["m1",6],["m1",7]]`) because `uniqueItems` compares (member,score) tuples and not just the member name. ### Testing This commit includes some changes inside Redis in order to verify the schemas (existing and future ones) are indeed correct (i.e. describe the actual response of Redis). To do that, we added a debugging feature to Redis that causes it to produce a log of all the commands it executed and their replies. For that, Redis needs to be compiled with `-DLOG_REQ_RES` and run with `--reg-res-logfile <file> --client-default-resp 3` (the testsuite already does that if you run it with `--log-req-res --force-resp3`) You should run the testsuite with the above args (and `--dont-clean`) in order to make Redis generate `.reqres` files (same dir as the `stdout` files) which contain request-response pairs. These files are later on processed by `./utils/req-res-log-validator.py` which does: 1. Goes over req-res files, generated by redis-servers, spawned by the testsuite (see logreqres.c) 2. For each request-response pair, it validates the response against the request's reply_schema (obtained from the extended COMMAND DOCS) 5. In order to get good coverage of the Redis commands, and all their different replies, we chose to use the existing redis test suite, rather than attempt to write a fuzzer. #### Notes about RESP2 1. We will not be able to use the testing tool to verify RESP2 replies (we are ok with that, it's time to accept RESP3 as the future RESP) 2. Since the majority of the test suite is using RESP2, and we want the server to reply with RESP3 so that we can validate it, we will need to know how to convert the actual reply to the one expected. - number and boolean are always strings in RESP2 so the conversion is easy - objects (maps) are always a flat array in RESP2 - others (nested array in RESP3's `ZRANGE` and others) will need some special per-command handling (so the client will not be totally auto-generated) Example for ZRANGE: ``` "reply_schema": { "anyOf": [ { "description": "A list of member elements", "type": "array", "uniqueItems": true, "items": { "type": "string" } }, { "description": "Members and their scores. Returned in case `WITHSCORES` was used.", "notes": "In RESP2 this is returned as a flat array", "type": "array", "uniqueItems": true, "items": { "type": "array", "minItems": 2, "maxItems": 2, "items": [ { "description": "Member", "type": "string" }, { "description": "Score", "type": "number" } ] } } ] } ``` ### Other changes 1. Some tests that behave differently depending on the RESP are now being tested for both RESP, regardless of the special log-req-res mode ("Pub/Sub PING" for example) 2. Update the history field of CLIENT LIST 3. Added basic tests for commands that were not covered at all by the testsuite ### TODO - [x] (maybe a different PR) add a "condition" field to anyOf/oneOf schemas that refers to args. e.g. when `SET` return NULL, the condition is `arguments.get||arguments.condition`, for `OK` the condition is `!arguments.get`, and for `string` the condition is `arguments.get` - redis#11896 - [x] (maybe a different PR) also run `runtest-cluster` in the req-res logging mode - [x] add the new tests to GH actions (i.e. compile with `-DLOG_REQ_RES`, run the tests, and run the validator) - [x] (maybe a different PR) figure out a way to warn about (sub)schemas that are uncovered by the output of the tests - redis#11897 - [x] (probably a separate PR) add all missing schemas - [x] check why "SDOWN is triggered by misconfigured instance replying with errors" fails with --log-req-res - [x] move the response transformers to their own file (run both regular, cluster, and sentinel tests - need to fight with the tcl including mechanism a bit) - [x] issue: module API - redis#11898 - [x] (probably a separate PR): improve schemas: add `required` to `object`s - redis#11899 Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com> Co-authored-by: Hanna Fadida <hanna.fadida@redislabs.com> Co-authored-by: Oran Agra <oran@redislabs.com> Co-authored-by: Shaya Potter <shaya@redislabs.com>
|
do we need to add reply_schema for command.json? (the same reply as COMMAND INFO) |
|
i guess that technically we do. but:
@guybe7 maybe you remember some other reason why we didn't do that? |
Work in progress towards implementing a reply schema as part of COMMAND DOCS, see #9845
Since ironing the details of the reply schema of each and every command can take a long time, we would like to merge this PR when the infrastructure is ready, and let this mature in the unstable branch.
Meanwhile the changes of this PR are internal, they are part of the repo, but do not affect the produced build.
Background
In #9656 we add a lot of information about Redis commands, but we are missing information about the replies
Motivation
Schema
The idea is to supply some sort of schema for the various replies of each command.
The schema will describe the conceptual structure of the reply (for generated clients), as defined in RESP3.
Note that the reply structure itself may change, depending on the arguments (e.g.
XINFO STREAM, with and without theFULLmodifier)We decided to use the standard json-schema (see https://json-schema.org/) as the reply-schema.
Example for
BZPOPMIN:Notes
ZRANGE's two schemas we will need to state that one is withWITHSCORESand the other is without.ZRANGE's nested array vs. RESP2's flat array, for example)Given the above:
unionand the caller needs to know which schema is relevant). see the section below for RESP2 support.Limitations (because we are using the standard json-schema)
The problem is that Redis' replies are more diverse than what the json format allows. This means that, when we convert the reply to a json (in order to validate the schema against it), we lose information (see the "Testing" section below).
The other option would have been to extend the standard json-schema (and json format) to include stuff like sets, bulk-strings, error-string, etc. but that would mean also extending the schema-validator - and that seemed like too much work, so we decided to compromise.
Examples:
[["m1",6],["m1",7]]) becauseuniqueItemscompares (member,score) tuples and not just the member name.Testing
This commit includes some changes inside Redis in order to verify the schemas (existing and future ones) are indeed correct (i.e. describe the actual response of Redis).
To do that, we added a debugging feature to Redis that causes it to produce a log of all the commands it executed and their replies.
For that, Redis needs to be compiled with
-DLOG_REQ_RESand run with--reg-res-logfile <file> --client-default-resp 3(the testsuite already does that if you run it with--log-req-res --force-resp3)You should run the testsuite with the above args (and
--dont-clean) in order to make Redis generate.reqresfiles (same dir as thestdoutfiles) which contain request-response pairs.These files are later on processed by
./utils/req-res-log-validator.pywhich does:Notes about RESP2
ZRANGEand others) will need some special per-command handling (so the client will not be totally auto-generated)Example for ZRANGE:
Other changes
TODO
SETreturn NULL, the condition isarguments.get||arguments.condition, forOKthe condition is!arguments.get, and forstringthe condition isarguments.get- Add a "condition" field to anyOf/oneOf schemas in the commands' reply schema #11896runtest-clusterin the req-res logging mode-DLOG_REQ_RES, run the tests, and run the validator)requiredtoobjects - Add support in required properties concept for reply schema #11899