RFC: Support a format description of our JSON-RPC interface#34683
RFC: Support a format description of our JSON-RPC interface#34683willcl-ark wants to merge 14 commits intobitcoin:masterfrom
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please copy-paste ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
|
Concept ACK |
janb84
left a comment
There was a problem hiding this comment.
Concept ACK.
This was already useful for bitcoin-tui
test/functional/rpc_openrpc.py
Outdated
| committed = json.loads(openrpc_path.read_text(encoding="utf-8")) | ||
|
|
||
| if generated != committed: | ||
| self.log.info("Generated doc/openrpc.json:\n%s", |
There was a problem hiding this comment.
I guess it could be distracting to have to see every rpc change twice. Maybe the test could just check that generating the json does not crash?
In theory, there could be a diff output like #26706 (comment) somewhere for each pull request, but that seems better to be optional and hidden.
the doc/openrpc.json could have the same This is a placeholder file. content like the manpages.
(Of course it is fine to keep both commits in this pull for now, so that the json is visible here, and only drop them before merge)
There was a problem hiding this comment.
Thank you for this comment.
Having the json checked in as a skeleton (like manpages) is a nice approach. It's kind of the opposite to stickies thoughts, so I guess I'll see which folks seem to generally prefer (and will leave the file in this PR for now for reference).
I guess it could be distracting to have to see every rpc change twice. Maybe the test could just check that generating the json does not crash?
This is also a nice idea/possibility. If we are not worrying about having a live artifact in this repo, then I agree this is almost certainly how it should work.
There was a problem hiding this comment.
I am also thinking about devs on platforms that do not support a full build (IIRC external signer was not present on Windows for some time?). I guess they could manually undo the hunk that removes the $feature-related RPC from the json, or copy-paste the full json output from the CI log somehow.
In any case, either way seems fine and should be trivial to adjust post-merge. This is just a nit.
|
Concept ACK |
|
Big concept ACK, very nice work. If we want to cram less functionality in Bitcoin Core, we need to make it easier for people to build on top of it. A change like this will make it much easier for consumers to use the RPC. In addition, I think it will also help with our RPC stability guarantees and testing, and to make it easier for consumers to quickly find and incorporate any changes that have been made to the interface.
This seems like a feature, not a bug? As long as devs can compile the binaries they need to test/dev, I think this doesn't impose any unnecessary overhead?
If OpenRPC is the best choice for our repo (from quick glance, it seems to work well), and we have enough support to commit to keeping our RPC OpenRPC compatible, I think it absolutely should live in this repo, so that we can quickly and strictly enforce continued compatibility with the spec. I'm less opinionated on whether the output json should live in this repo. I think it's fine to do so - the changes should be a LOT less frequent than e.g. doxygen changes, and it's nice having it easily accessible. However, I think it's also acceptable for users to generate this on-the-fly, or use any other hosted mirror of this documentation, which should be trivial to do. |
I don't really see why it wouldn't be better to have a If people want to grab the artifact from the web instead of their node, then providing a url on bitcoincore.org (like https://bitcoincore.org/en/doc/30.0.0/rpc/network/disconnectnode/ eg) seems plausible enough. Having the json output in the repo just seems annoying to me; more data, more conflicts, more diffs to check, more things you have to manually mess around with that could be automated. For the test suite, picking a rarely updated but somewhat interesting RPC, and checking that the "getopenrpcinfo" result for that command matches a fixed value, and that the commands listed match the output of "help" is probably fine, or at least seems like it would be to me. OpenRPC seems fine to me; doesn't seem like there's much benefit in worrying about the details, beyond getting something that's moderately machine readable. |
Good question, sorry I didn't answer in OP initially, I had meant to provide a comparison of sorts... I tried this approach, and basically my reasoning was that you end up with a much larger c++ change (which I thought might be less desirable), although you get the nice benefits you note. I think it could be useful to codify the possible approaches I see, and their impact on this codebase. So roughly in ascending order of LoC needed in this repo:
I am going to rework my branch which outputs the json directly from RPC, and will post an update here with a comparison.
agree, this would be fine.
Yeah I think I agree here too. I did test breaking an RPC to see how annoying it was to detect and fix. Quite annoying, was the answer.
👍🏼 Thank you for articulating this, as I did spend a fair bit of time "worrying" about which format to select, but I see now you're correct; as long as it's structured (i.e valid json) and machine readable, it should be OK, and people can easily transform it into their own formats. |
|
Hey @willcl-ark, have you checked my comment on corepc? Is something like that you wanted to build? |
2846fab to
a3d6298
Compare
|
🚧 At least one of the CI tasks failed. HintsTry to run the tests locally, according to the documentation. However, a CI failure may still
Leave a comment here, if you need help tracking down a confusing failure. |
5f58a98 to
0857f2b
Compare
@ajtowns I have updated the draft here now to work like this. It was not as much extra c++ code in sever.cpp as I thought, which seems nice, and I agree it's a better approach. Comparing the python output to the c++ output only differed on formatting. New output file from a
@natobritto I had not heard of this one, no. I knew there were one or two? unmaintained rust bitcoin core json rpc libs, but pretty sure neither was this one. I will check it out. |
|
I like the new C++ approach! I found one conundrum: As per openRPC output the command Details {
"name": "getopenrpcinfo",
"description": "Returns an OpenRPC document for currently available RPC commands.",
"params": [
],
"result": {
"name": "result",
"schema": {
"type": "object",
"properties": {
"openrpc": {
"type": "string",
"description": "OpenRPC specification version."
},
"info": {
"type": "object",
"properties": {
},
"additionalProperties": false,
"description": "Metadata about this JSON-RPC interface."
},
"methods": {
"type": "array",
"items": {
"type": "object",
"properties": {
},
"additionalProperties": false
},
"description": "Documented RPC methods."
}
},
"additionalProperties": false,
"required": [
"openrpc",
"info",
"methods"
]
}
},
"x-bitcoin-category": "control"
},The output is enriched with an info object filled (so it has additional properties) : "info": {
"title": "Bitcoin Core JSON-RPC",
"version": "v30.99.0-dev",
"description": "Autogenerated from Bitcoin Core RPC metadata."
},This invalidates the generated openRPC spec |
0857f2b to
726260c
Compare
Doh! You're right that the getopenrpcinfo schema was indeed self-invalidating! I've updated the RPCResult to fully describe the info fields (title, version, description) and the method object structure (name, description, params, result, x-bitcoin-category), including the param and result Content Descriptor shapes. Dynamic JSON Schema fields within those use unconstrained schemas since their structure varies per method. I don't think there's anything we can do about that. There are also a few bits related to optional args before required, but those are legacy RPC artefacts. I also added a sample helper script I used to validate that it's now actually valid, per the meta-schema (you can run this yourself with |
|
🚧 At least one of the CI tasks failed. HintsTry to run the tests locally, according to the documentation. However, a CI failure may still
Leave a comment here, if you need help tracking down a confusing failure. |
726260c to
d4a7546
Compare
hodlinator
left a comment
There was a problem hiding this comment.
Concept ACK
My gut says we should be maintaining a shared spec and generate stub C++ binding code from it, as is done for .capnp files. However, the current reverse approach of generating the spec from the C++ declarations has the following benefits:
- Doesn't add more codegen complexity to the build.
- Doesn't add complexity of keeping generated C++ code and version controlled code in sync.
- Doesn't touch existing C++ RPC declarations.
- If we were to decide we want to switch to non-C++ spec + codegen, the current approach gives us the raw material for that spec.
doc/openrpc.json
Outdated
| "info": { | ||
| "title": "Bitcoin Core JSON-RPC", | ||
| "version": "v30.99.0-dev", | ||
| "description": "Placeholder OpenRPC skeleton. Regenerate this file for release candidates using `bitcoin-cli getopenrpcinfo`." |
There was a problem hiding this comment.
Not clear to me if this should be in this repo, or done like the rpc help at https://github.com/bitcoin-core/bitcoincore.org/blob/master/contrib/doc-gen/generate.go
There was a problem hiding this comment.
Yes, I think the final 4 commits here are up for conceptual-inclusion review; we could... drop them all. The other extreme is to vendor the schema into this repo and use that in the functional test. Or we could keep something like the basic sanity test I have here and drop the final 3.
IMO if we want to distribute in guix tarballs keeping the skeleton file checked in isn't too bad. But I'm not strongly wedded to the idea either. If we didn't want to distribute a static file, we could drop both the skeleton file and guix commits.
There was a problem hiding this comment.
I think it's very useful to be able to interrogate bitcoind for its schema, it makes the schema much more discoverable, and makes it much more likely that the code will be kept up to date. It's also probably the path to integrate it with the least code, since there's minimal additional code for a new API call, and the C++ objects which represent the RPC API are all available to the schema generator.
|
Concept ACK |
|
Concept ACK Glad to see this moving forward. I recently experimented with a similar OpenRPC generation approach, so I have some context here. I was able to use the generated spec to codegen types for Rust successfully, which gives me confidence that the overall approach is sound. This review covers the spec output only. I will review the code separately.
"oneOf": [
{ "type": "string" },
{ "type": "string" }
]
Dummy/ignored params should carry some annotation, like There are no Similarly, many hex string fields use The spec currently inlines everything. It may be worth leveraging OpenRPC reuse via
"schema": {
"type": "string",
"enum": ["unset", "economical", "conservative"]
}This likely affects other params too ( On the spec file: I agree with @ajtowns that checking the generated JSON into the repo creates ongoing friction for contributors touching RPCs. Consumers who want diffs between releases can run Happy to help later with follow-up patches for any of these. |
d4a7546 to
c312382
Compare
CRPCTable::help() takes std::string_view but server.h relies on transitive includes for it. Add the direct include (probably makes iwyu happier too?)
After removing the last CRPCCommand pointer for a given name, erase the now-empty vector from mapCommands. Without this, listCommands() returns the name of a fully removed command because it iterates mapCommands keys unconditionally. For example, when unloading the wallet the RPCs are deregistered, and this prevents getopenrpcinfo from returning non-existant RPCs.
RPCResult::Type::ANY triggers NONFATAL_UNREACHABLE() in ToSections(), which crashes the help() RPC when a command uses Type::ANY in a nested result field. Previously this was never hit because Type::ANY was only used as a top-level alternate result type, filtered out before ToSections() is called. getopenrpcinfo() will use this result type, so render it like other types allowing it to be used in nested result definitions like schema.
DecodeTxDoc() uses positional bool arguments to control which
optional fields appear in decoded transaction result documentation.
As more call sites need different combinations of prevout, fee, and
hex fields, this becomes harder to read and more error-prone.
Replace it with TxDoc(), which takes a TxDocOptions struct with
named fields defaulting to false. Callers can then opt into the
needed fields explicitly, e.g.
TxDoc("The transaction id", TxDocOptions{.prevout = true, .fee = true})
This is a pure refactor. Help output is unchanged. Move the helper
to rpc/util.{h,cpp} so it can be reused across subsystems in
follow-up commits.
decodepsbt describes its "tx" and "non_witness_utxo" result fields using ELISION prose referring to decoderawtransaction output. This works for human-readable help, but schema generators cannot resolve those references into structured field definitions. Replace both ELISION entries with TxDoc() so the transaction layout is represented directly in RPCHelpMan metadata.
The verbosity=2 result for getrawtransaction contains ELISION prose
both at the top level ("Same output as verbosity = 1") and within
each vin entry. These references are readable for humans but do not
provide structured result metadata.
Replace them with the full result structure using Cat() and TxDoc()
with prevout enabled.
c312382 to
6ae1e8d
Compare
Thanks for the review! I have picked your changes from #34764 (thanks) and stacked them (with a small fixup) in here, with additional fixes for the following from your review:
I have also dropped the skeleton document, and Guix packaging commits. |
|
If we are generally Concept ACK in here, it would probably make sense to split this up as:
|
getblock defines four verbosity levels. The verbosity=2 and verbosity=3 results use ELISION entries to refer to lower verbosity layouts, even though the only material difference between levels is the shape of the "tx" field. Introduce GetBlockFields(RPCResult tx_result) to build the full block result structure while substituting the appropriate "tx" definition in the correct position. This avoids ELISION prose and also avoids the duplicate "tx" field a naive Cat() approach would produce. Remove the now-unused getblock_vin helper.
The "removed" array in listsinceblock is described with an ELISION entry referring to the "transactions" array layout. This is clear in human-readable help, but it leaves the result structure implicit. Extract ListSinceBlockTxFields() and use it for both arrays so the transaction layout is defined once and represented explicitly in RPCHelpMan metadata.
This is a basic smoke test for the generated openrpc document. Verify the RPC returns a serializable document (valid JSON) and that the top-level shape is ~ correct: - openrpc is a string - info is an object - methods is an array
6ae1e8d to
eebb30f
Compare
Ref #29912.
This draft PR adds a machine-readable OpenRPC 1.3.2 specification of out JSON-RPC interface, auto-generated from existing
RPCHelpManmetadata.There is currently no formal, machine-readable specification of the RPC API. As discussed in #29912, this has knock-on consequences:
This draft builds on prior art by @casey and the observations by @laanwj, @stickies-v, @kilianmh, @hodlinator, and @cdecker in #29912. Casey's work demonstrated that RPCHelpMan already contains all the structured information needed, which makes this feasible without duplicating any API definitions.
This differs from Casey's branches in that it uses the OpenRPC standard rather than an ad-hoc format or raw JSON Schema.
Why OpenRPC
I seletced OpenRPC for a number of reasons:
Approach
RPCHelpMan metadata → getopenrpcinfo RPC → OpenRPC JSONTradeoffs
vs an ad-hoc format OpenRPC gives us interoperability with the (admittedly surprisingly limited) tooling, documentation generators, code generators, and validators, at the cost of needing x-bitcoin-* extensions for Bitcoin-specific concepts. As Casey noted after trying both approaches, JSON Schema "is probably not a great fit". OpenRPC's method-level framing on top of JSON Schema addresses the ergonomic issues while keeping the schema benefits. After testing both, I think I agree.
Types: JSON Schema cannot natively express units like "BTC/kvB" or semantic distinctions like "this hex string is a txid." These are preserved in description text and x-bitcoin-type-str extensions, but are not machine-enforceable from the schema alone. This was recognized as a fundamental limitation by several commenters in the issue. Future work could enrich the x-bitcoin-* extensions with a more structured unit vocabulary.
Some RPCs return different types depending on argument values (e.g. verbosity levels). These are represented as
oneOfin the result schema with free-text condition descriptions. This is accurate but not fully machine-parseable — a code generator cannot automatically determine which result variant corresponds to which argument value without parsing the description. I still we have enough information to satisfy humans an agents alike though.Regenerating the spec
The current functional test simply tests that the RPC runs, and produces valid JSON. We might want to consider extending this..
The RPC output documents which RPCs are available for any given built binary.
Discussion questions
My personal thoughts are that this is very nice to have.