OpenApi Export and Swagger UI Binding#1671

Merged
vigoo merged 4 commits into golemcloud:main from Nanashi-lab:openapi
Aug 7, 2025

Conversation

@Nanashi-lab
Contributor

@Nanashi-lab Nanashi-lab commented May 30, 2025

/closes #1178
/claim #1178

Docs are still left to do.
Golem-Cli has also been updated.

I am ready for a review, @afsalthaj. I have added the explanation comment on content-type and status. Do tell me whether I have that right and, if so, whether the current way of handling it is robust.

Link to Videos and Files -

Video One - Link
This covers the Export Endpoint, and the Export CLI Command, and Round-Trip Test

  1. Add a bunch of APIs
  2. Export the APIs in OpenApi yaml format
  3. Delete all the original APIs
  4. Import the OpenApi yaml APIs
  5. Check out the /export endpoint

Video Two - Link
This covers the Swagger CLI Command, and Swagger UI Binding

  1. Add an API for test:llm from golem-llm
  2. Deploy the API, which includes the Swagger UI binding
  3. Access the Swagger UI binding endpoint and run through the invoke tests
  4. Showcase the swagger command in the CLI for both undeployed and deployed APIs

If you want to check the actual conversion between AnalysedType and OpenApi Spec, you can either

  1. Get the wasm files, api files and openapi conversion files from the repo link
  2. Or find the detailed conversion in the file api_oas_convert_tests.rs

@Nanashi-lab Nanashi-lab marked this pull request as draft May 30, 2025 06:04
@Nanashi-lab
Contributor Author

Nanashi-lab commented May 31, 2025

The Changes Requested

Done

  • Improved the deserialization for GatewayBindingType; we no longer use the custom deserialization that was previously used for wit-worker
  • export_api_definition has been renamed to export_openapi_spec
  • Removed the pub from name and version in ResolvedGatewayBindingComponent
  • We now use CompiledHttpApiDefinition instead of HttpApiDefinitionResponseData
  • Replaced the macro that maps integer analysed types to OpenApi with a function
  • If all of {header, status, body} are missing, it tries to match the whole record to AnalysedType; if there is no record, it tries to match whatever is available to AnalysedType
  • default_object_schema was a mistake; now, if there is no request body or no response, it sends None, so Swagger UI understands there is no request or no response
  • Instead of _ = "Response", we now use the http crate to figure out the description
  • We now use CompiledRoute instead of RouteResponseData, as part of the move to CompiledHttpApiDefinition
  • request.header values are now parsed and are part of the OpenApi schema
  • We now parse header, body and status into separate parts and use this data to generate the yaml
  • We reuse the deserialization for names like "swagger-ui", "default" and "file-server" from golem-common
  • The CORS response is now calculated directly using from_http_cors, which converts into a valid response
  • The Swagger UI binding now carries the openapi_yaml spec. When the compiled HTTP API definition is created, we compile with openapi_yaml set to None; once it is compiled, we generate the OpenAPI spec and update every Swagger UI route. The Swagger UI handler now uses this directly to generate the Swagger UI, which is a cleaner solution
  • workerNameInput is no longer in the tests, since we moved to CompiledHttpApiDefinition
  • On top of all the previous tests, I added one API with 10 routes, each testing a wide variety of analysed types.

Minor Unresolved

  • route.path is AllPathPatterns, hence the .to_string()
  • For query, path and header, I haven't implemented an is_primitive check. I was not sure what to do in the else branch of the statement.

Major Unresolved

  • Content-Type, json vs text. Because RibOutputMapping only has the type value, and not the literal value, this is currently not possible. headers and status are values which come from runtime. There will be a comment with more detailed explanation below.

@Nanashi-lab
Contributor Author

Nanashi-lab commented May 31, 2025

@afsalthaj this is the explanation. I have addressed almost all comments from the old PR, with the major exception of content-type in the response.

Content-Type and Status

RibOutputTypeInfo only contains the AnalysedType, and not the actual value of status and header (Content-Type). The exact values are calculated at runtime.

Both of the following are valid, working Rib scripts. In the first, status is set at runtime; in the second, content-type is set at runtime.

let input = request.body.input;
let worker = instance("worker");
let result = worker.echo("Test");
{status: input: u64, body: result}

let input: string = request.body.input;
let worker = instance("worker-static");
let result = worker.echo("Test", 3);
{headers: {Content-Type: input, userid: "foo"}, status: 200: u64, body: result}

For the Status Case -

I determine the status from the method (POST -> 201, GET -> 200, and so on). For all other cases (where Rib may output multiple types, e.g. {200, "Success"} and {400, "Error"}), we use the default response.

  /v0.0.1/test1:
    post:
      responses:
        default:
          description: Created
          content:
            application/json:
              schema:
                type: string
        '201':
          description: Created
          content:
            application/json:
              schema:
                type: string
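
The method-to-status rule described above could be sketched roughly as follows. This is an illustration only: `default_status_for_method` and `response_keys` are hypothetical names, not functions from this PR, and mapping every method other than POST and DELETE to 200 is an assumption.

```rust
/// Hypothetical helper: pick a status code from the HTTP method alone.
/// POST -> 201, GET -> 200 and DELETE -> 204 come from this thread;
/// treating every other method as 200 is an assumption.
fn default_status_for_method(method: &str) -> u16 {
    match method.to_ascii_uppercase().as_str() {
        "POST" => 201,
        "DELETE" => 204,
        _ => 200,
    }
}

/// Response keys emitted for one route: the method-derived status
/// plus the catch-all `default` entry.
fn response_keys(method: &str) -> Vec<String> {
    vec![
        default_status_for_method(method).to_string(),
        "default".to_string(),
    ]
}
```

For a POST route this yields the two keys seen in the yaml above, '201' and default.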

For the Content-Type Case -

Current Golem-Rib behavior for String and other primitive types: if you set application/text, application/octet-stream or any other valid name, including image/png, it will try to output the value in that form. For application/json, which is the default, it outputs JSON.

If the AnalysedType is complex, any other content-type produces a message along the lines of "So and so analysed-type could not be translated to content-type". For application/json it outputs the JSON.

In the first case, api-oas-convert currently passes application/json; if Swagger UI receives application/text, it handles it gracefully.
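
To make the behavior described above concrete, here is a small sketch of the decision table. It is not Golem's implementation: `Body` and `render` are hypothetical, the record is reduced to string fields, and the JSON quoting does no escaping.

```rust
// Hypothetical stand-in for a response body; not a golem type.
enum Body {
    // A primitive value, e.g. a string result.
    Text(String),
    // A complex value, reduced here to string-valued fields.
    Record(Vec<(String, String)>),
}

/// Render a body for a given content-type, mirroring the rules above:
/// primitives are emitted for any content-type, complex values only as JSON.
fn render(body: &Body, content_type: &str) -> Result<String, String> {
    match (body, content_type) {
        // Naive JSON quoting without escaping; enough for the sketch.
        (Body::Text(s), "application/json") => Ok(format!("\"{}\"", s)),
        (Body::Text(s), _) => Ok(s.clone()),
        (Body::Record(fields), "application/json") => {
            let parts: Vec<String> = fields
                .iter()
                .map(|(k, v)| format!("\"{}\":\"{}\"", k, v))
                .collect();
            Ok(format!("{{{}}}", parts.join(",")))
        }
        (Body::Record(_), other) => Err(format!(
            "record could not be translated to content-type {}",
            other
        )),
    }
}
```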

@afsalthaj
Contributor

Thanks @Nanashi-lab for raising the PR again

Content-Type, json vs text. Because RibOutputMapping only has the type value, and not the literal value, this is currently not possible. headers and status are values which come from runtime. There will be a comment with more detailed explanation below.

Could you please explain this in detail? All information should still be available in the rib output type info.

Also, I hope this PR addresses the concerns I raised in this PR: #1454

@Nanashi-lab Nanashi-lab marked this pull request as ready for review May 31, 2025 08:54
@Nanashi-lab
Contributor Author

Nanashi-lab commented Jun 1, 2025

More thoughts on RibOutputTypeInfo and figuring out the exact values of status and content-type.
Below is a static Rib script, with no calls to workers and predefined status, body and header values.

{headers: {Content-Type: \"application/text\", userid: \"foo\"}, status: 200: u64, body: \"Success\"}

RibOutputTypeInfo

          "analysedType": {
            "fields": [
              {
                "name": "headers",
                "typ": {
                  "fields": [
                    {
                      "name": "Content-Type",
                      "typ": {
                        "type": "Str"
                      }
                    },
                    {
                      "name": "userid",
                      "typ": {
                        "type": "Str"
                      }
                    }
                  ],
                  "type": "Record"
                }
              },
              {
                "name": "status",
                "typ": {
                  "type": "U64"
                }
              },
              {
                "name": "body",
                "typ": {
                  "type": "Str"
                }
              }
            ],
            "type": "Record"
          }

This is the RibOutputTypeInfo structure. The actual values for status, body and header are in the Rib expression, and the exact value is calculated when the API is called. RibOutputTypeInfo contains NameTypePair and not ValueAndType.

We cannot extract content-type, status, etc. from the Rib expression, because they can be set by the user at runtime. String scanning might also pick up false information from comments, for example // application/text.
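
A tiny sketch of why the literal is lost once only types are kept. `SketchType` and `SketchValue` are simplified stand-ins invented for this illustration; they are not the real AnalysedType or ValueAndType definitions.

```rust
// Simplified stand-ins for illustration; not the real golem types.
#[derive(Debug, PartialEq)]
enum SketchType {
    Str,
    U64,
    Record(Vec<(String, SketchType)>),
}

enum SketchValue {
    Str(String),
    U64(u64),
    Record(Vec<(String, SketchValue)>),
}

/// Erase a value down to its type, the way a type-only description does.
fn type_of(value: &SketchValue) -> SketchType {
    match value {
        SketchValue::Str(_) => SketchType::Str,
        SketchValue::U64(_) => SketchType::U64,
        SketchValue::Record(fields) => SketchType::Record(
            fields
                .iter()
                .map(|(name, v)| (name.clone(), type_of(v)))
                .collect(),
        ),
    }
}
```

Two records such as {status: 200, body: "Success"} and {status: 400, body: "Error"} erase to the same type, so the concrete status cannot be recovered from the type alone.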

@Nanashi-lab
Contributor Author

Nanashi-lab commented Jun 1, 2025

Another way to handle this: application/text and application/xml both only handle primitive types like string.

If the body is primitive, we can add all possible outcomes.

I am going to move forward with this solution. @jdegoes & @afsalthaj, if you have any comments on the status and content-type issue and how I am handling it, please tell me. (See the comment above for more information.)

responses:
  '200':
    description: Success
    content:
      application/json:
        type: string 
      application/xml:
        type: string
      text/plain:
        type: string
      application/text:
        type: string 

harshtech123 pushed a commit to harshtech123/golem-openapi that referenced this pull request Jun 3, 2025
* 1.2.2 RC1

* Following oss changes
@jdegoes
Contributor

jdegoes commented Jun 3, 2025

@afsalthaj Perhaps Rib can return "singleton types" when possible.

It's not likely the content type or status code is a variable: it's likely it's a literal, embedded into the Rib script. If Rib had any kind of support for singleton types, e.g. SingletonStr(String), which implies a weak form of subtyping (because the literal type "foo" is a subtype of String, meaning it can be substituted anywhere a String expression is expected), then OpenAPI export would have all the information it needs to precisely specify status code and headers.
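
As a rough sketch of what singleton types could look like, assuming nothing about how Rib would actually implement them: `Ty`, `is_subtype` and `literal_content_type` are hypothetical names, and `SingletonU64` is added here by analogy with the `SingletonStr(String)` mentioned above.

```rust
// Hypothetical type representation with literal ("singleton") types.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Str,
    U64,
    SingletonStr(String),
    SingletonU64(u64),
}

/// Weak subtyping: a literal type can stand in wherever its base type is expected,
/// but not the other way round.
fn is_subtype(sub: &Ty, sup: &Ty) -> bool {
    match (sub, sup) {
        (a, b) if a == b => true,
        (Ty::SingletonStr(_), Ty::Str) => true,
        (Ty::SingletonU64(_), Ty::U64) => true,
        _ => false,
    }
}

/// An exporter could read the literal straight from the type when it is a singleton.
fn literal_content_type(ty: &Ty) -> Option<&str> {
    match ty {
        Ty::SingletonStr(s) => Some(s.as_str()),
        _ => None,
    }
}
```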

@Nanashi-lab
Contributor Author

Nanashi-lab commented Jun 3, 2025

Content-Type will mostly be a literal. Status can have a few possible literal values, for result and variant.
For example, for result:
status = match result { err(_) => 400: u32, ok(_) => 200: u32 }. If RibOutputTypeInfo could carry a list of all possible values, that would be great.

In OpenAPI v3, */* is a valid content-type, so it will be possible to do something like this

responses:
  '200':
    description: Success
    content:
      application/json:
        type: string
      '*/*':
        type: string   

This also works for complex types, because for a complex type with application/text or another content-type, the output is a string error message.

@Nanashi-lab Nanashi-lab marked this pull request as draft June 8, 2025 23:09
@Nanashi-lab
Contributor Author

Nanashi-lab commented Jun 8, 2025

Most of the code is done. Once the merge between cloud and oss happens, I will rebase, and fix any leftover issues.

@afsalthaj
Contributor

afsalthaj commented Jul 1, 2025

@jdegoes @Nanashi-lab

The core issue, that "Content-Type" is not known statically, can be solved to some extent with the following idea.

This may not be perfect (meaning, I am still thinking), but this is my initial thought.

We can inspect the original Rib program's last expression (the return value), assuming it's always a Record type.

This implies we mandate that the last expression in a Rib in the context of the API gateway should be a Record with optional status code and headers.

Currently (in the main branch) the last expression can be "anything" (for example, a string), and it can also be a record of status code, body and headers (with status code and headers being optional). If it's the former (i.e. a string), the API gateway treats the value as the HTTP response body, with a default status code of 200 and no headers. The latter (a record) is directly mapped to HTTP body, status code and content-type headers.

And this flexibility that we currently have is more of a confusion than a feature. Example: what happens if I want to return an HTTP body which itself contains a status field? There are n possibilities and it's hard to teach users what's going on.

Hence we mandate statically that the last expression should be a Record, always. Yes, it reduces some flexibility, but it doesn't limit the user from doing anything.

The last expression in a Rib in an API gateway http definition is always

{body: ...}

or

{status: 200, body: ..}

or

{status: 200, body:.., headers: {...}}

Here we can reliably inspect the values in headers and pick the Content-Type, which is probably an Expr::string(..)

An intentionally complex example:

let x = request.body.user;
let worker = instance();
let result = worker.foo(x);

let status_and_body = match result {
   ok(value) => {status: 200, body: "${value}"},
   err(value) => {status: 400, body : "failed"}
}

{status: status_and_body.status, body: status_and_body.body, headers: {ContentType:...}} 
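
The inspection idea could be sketched like this. The `Expr` enum below is a toy AST made up for the illustration and is not Rib's real expression type; `static_content_type` is likewise a hypothetical name.

```rust
// Toy expression tree; a stand-in for Rib's AST, not its real definition.
enum Expr {
    Str(String),
    Num(u64),
    Ident(String),
    Record(Vec<(String, Expr)>),
}

/// Look up a field by name in a record's field list.
fn field<'a>(fields: &'a [(String, Expr)], name: &str) -> Option<&'a Expr> {
    fields.iter().find(|(k, _)| k.as_str() == name).map(|(_, v)| v)
}

/// Return the Content-Type only when the last expression is a record whose
/// headers carry it as a string literal; anything dynamic yields None.
fn static_content_type(last: &Expr) -> Option<&str> {
    let top = match last {
        Expr::Record(fields) => fields,
        _ => return None,
    };
    let headers = match field(top, "headers")? {
        Expr::Record(fields) => fields,
        _ => return None,
    };
    match field(headers, "Content-Type")? {
        Expr::Str(s) => Some(s.as_str()),
        _ => None,
    }
}
```

When the header is bound to a variable, as with `Content-Type: input` in the earlier script, the function returns None and the exporter would need a fallback.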

@afsalthaj
Contributor

@Nanashi-lab if we need to track down the types of response for each type of status (which is something I overlooked in terms of the details when the problem was originally described) then I believe John's suggestion of singleton types is the way out.

@Nanashi-lab
Contributor Author

Nanashi-lab commented Jul 1, 2025

@afsalthaj I like your suggestion of the Rib output always having {status: 200, body:.., headers: {...}}. I can always extract the line and do a simple check against the type (since we know that). We don't need a response type for each status, as a variant is converted into a oneOf type. We don't need to associate the status value with the actual response for that value, just the variant.

Example 1 -
{status: 200, body: ...}

I should be able to extract 200 from the rib script, and use it in openapi as

 responses:
  '200':
    description: Success

Example 2 -
{status: output_status, body: ...}

I should be able to figure out that status does not have an exact match and fall back to the default response shown below, in addition to a likely default of 200; together these should cover all the other outputs.

responses:
  default-response:
   description: Success

Example 3 -
The header can be figured out similarly, but on the rare occasion that the header is dynamic, I can use this in addition to application/json as a fallback

  content:
      */*:
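
The three examples boil down to a small decision on the status expression. A minimal sketch, assuming a hypothetical `StatusExpr` that only distinguishes literal from dynamic; treating a missing status as 200 is also an assumption.

```rust
// Hypothetical view of the `status` field in the final record.
enum StatusExpr {
    // e.g. {status: 200, ...}
    Literal(u16),
    // e.g. {status: output_status, ...}; holds the variable name
    Dynamic(String),
}

/// Choose the key under `responses:` for a route.
fn response_key(status: Option<&StatusExpr>) -> String {
    match status {
        Some(StatusExpr::Literal(code)) => code.to_string(),
        Some(StatusExpr::Dynamic(_)) => "default".to_string(),
        // Assumption: a record without a status is served as 200.
        None => "200".to_string(),
    }
}
```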

@Nanashi-lab Nanashi-lab force-pushed the openapi branch 3 times, most recently from 125c3f5 to 6df8940 Compare July 11, 2025 13:52
@Nanashi-lab Nanashi-lab marked this pull request as ready for review July 11, 2025 14:52
@Nanashi-lab
Contributor Author

Nanashi-lab commented Jul 12, 2025

I am finished with the merge, and I have also completed the golem-cli changes.

Testing -

You can use the single golem binary from golem-cli to test Swagger UI and the OpenAPI spec export end to end.
Compile the golem binary from my openapi branch of golem-cli.

golem server run

Clone Test components
Navigate to each app folder (e.g., shopping-cart/, todo-list/, llm/)
golem app deploy

You can access the Swagger UI for each app at the endpoints listed below.

Shopping Cart
API:

    shopping-cart/0.0.1
        localhost:9006/v0.0.1/swagger-shopping-cart

Todo List
APIs:

    todo-list1/0.0.1
        localhost:9006/v0.0.1/swagger-todo
    todo-list2/0.0.2
        localhost:9006/v0.0.2/swagger-todo
    simple-todo-list/0.0.1
        localhost:9006/v0.0.1/swagger-simple

LLM
APIs:

    llm/0.0.1
        localhost:9006/v0.0.1/swagger-llm

Setup for LLM

  • The LLM test component is from golem-llm. LLM expects the wasm files to be under the folder components/debug or components/release
  • You have to pre-create the worker with the environment variable, with the worker name being test-llm

cc @jdegoes @afsalthaj

@Nanashi-lab
Contributor Author

More Notes -
Since it isn't possible to determine the exact status and content-type of the response,
for each input we add status: integer, with '200' determined by the method, plus default; for content-type we use application/json with the schema, and */* with string.
If the content-type is anything other than application/json and the schema is string, it will output the string; otherwise it will output, as a string, the error that the schema cannot be converted.
Example for a Rib which outputs u32:

responses:
  '200':
    description: Success
    content:
      application/json:
        type: u32
      '*/*':
        type: string  
  default:
    description: Success
    content:
      application/json:
        type: u32
      '*/*':
        type: string  
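
A sketch of generating that shape with plain string building. `responses_yaml` is a hypothetical helper, not the PR's generator. It wraps each media type in a `schema:` key, as the earlier example in this thread does, and the caller passes an OpenAPI type name such as `integer` for a u32 body.

```rust
/// Build a `responses:` fragment with one method-derived status entry and a
/// `default` entry, each offering application/json and a `*/*` string fallback.
fn responses_yaml(status: u16, description: &str, json_type: &str) -> String {
    let mut out = String::from("responses:\n");
    let keys = vec![format!("'{}'", status), "default".to_string()];
    for key in keys {
        out.push_str(&format!("  {}:\n", key));
        out.push_str(&format!("    description: {}\n", description));
        out.push_str("    content:\n");
        out.push_str("      application/json:\n");
        out.push_str("        schema:\n");
        out.push_str(&format!("          type: {}\n", json_type));
        out.push_str("      '*/*':\n");
        out.push_str("        schema:\n");
        out.push_str("          type: string\n");
    }
    out
}
```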

@Nanashi-lab
Contributor Author

@afsalthaj whenever you are free, let's get this PR reviewed

DeleteApiDefinition = 14,
DeleteProject = 15,
ViewProject = 161,
ViewProject = 16,
Contributor

:D

return (schema, "application/json".to_string());
}
}
(None, "application/json".to_string())
Contributor

I understand this logic is due to the complexity that we discussed previously. I am ok for this for now.

};

// Only add content if we have a response schema and it's not a 204 response
if status_code != 204 {
Contributor

👍

Contributor

What happens if the user specifies status code 204 and then also some content? No need to handle it in this PR, but I think a proper solution requires literal types. We will deal with it later.

Contributor Author

Because of how we currently determine status code, 204 happens only when the method is DELETE.
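
The guard under discussion can be reduced to a few lines. This is a simplified sketch with a hypothetical `response_content` helper, and the schema is represented as a plain string rather than a real OpenAPI schema object.

```rust
/// Return the (content-type, schema) pair to emit, or None when the response
/// must not carry content: either no schema is known or the status is 204.
fn response_content(status_code: u16, schema: Option<&str>) -> Option<(String, String)> {
    if status_code == 204 {
        return None;
    }
    schema.map(|s| ("application/json".to_string(), s.to_string()))
}
```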

if let Some(worker_name) = data.worker_name {
binding_info.insert(
"worker-name".to_string(),
serde_json::Value::String(worker_name.worker_name.to_string()),
Contributor

This should hardly ever occur, as worker name is None with first-class worker support. I know it will happen for one of the bindings. Just noting it here so you can say whether you agree or not.

Contributor Author

@Nanashi-lab Nanashi-lab Aug 6, 2025

This is only for file-server and http-handler, both of which can accept a worker name. We must do this because the original ticket's requirement was a round trip (httpapi -> openapi -> httpapi via import).
̶I̶n̶ ̶d̶e̶f̶a̶u̶l̶t̶ ̶b̶i̶n̶d̶i̶n̶g̶-̶t̶y̶p̶e̶ ̶t̶h̶i̶s̶ ̶s̶h̶o̶u̶l̶d̶ ̶r̶a̶r̶e̶l̶y̶ ̶o̶c̶c̶u̶r̶,̶ ̶g̶i̶v̶e̶n̶ ̶t̶h̶e̶ ̶f̶i̶r̶s̶t̶ ̶c̶l̶a̶s̶s̶ ̶w̶o̶r̶k̶e̶r̶ ̶s̶u̶p̶p̶o̶r̶t̶.̶ ̶
Edit - in the default binding-type this will never happen, as WorkerBindingCompiled doesn't have a worker name. If someone uses a worker name in default, there is information loss and the round trip will fail. (A rare case, since we have moved to first-class workers.)

            component_id: Some(&w.component_id),
            worker_name: None, // WorkerBindingCompiled doesn't have worker_name_compiled
            response: Some(&w.response_compiled),
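
A simplified sketch of the round-trip concern. `BindingData` and `binding_info` here are hypothetical and use string values instead of serde_json; the "binding-type" key is an assumption, while "worker-name" matches the key in the reviewed snippet.

```rust
use std::collections::BTreeMap;

// Hypothetical, reduced view of the data exported for one binding.
struct BindingData<'a> {
    binding_type: &'a str,
    worker_name: Option<&'a str>,
}

/// Export binding metadata; the worker name is written only when present,
/// so it survives an export followed by an import.
fn binding_info(data: &BindingData) -> BTreeMap<String, String> {
    let mut info = BTreeMap::new();
    info.insert("binding-type".to_string(), data.binding_type.to_string());
    if let Some(name) = data.worker_name {
        info.insert("worker-name".to_string(), name.to_string());
    }
    info
}
```

With `worker_name: None`, as in the default binding above, the key is simply absent from the exported metadata.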

Contributor

@afsalthaj afsalthaj left a comment

Looks much better than previous PR.
I think this is a decent version to be merged, with possibilities of improvement as we go.
This work is valuable already to get merged.

Thanks @Nanashi-lab

@afsalthaj
Contributor

afsalthaj commented Aug 6, 2025

@Nanashi-lab Please resolve conflicts as soon as possible, and @vigoo can merge.
I saw the tests, but I will be testing this implementation when I get some time. But good to get this merged before that.

@Nanashi-lab
Contributor Author

Nanashi-lab commented Aug 7, 2025

@vigoo I have resolved all the conflicts and rebased to latest.

I am unsure about atomic deployment, so currently there is no stub for export API in the registry service
(similar to PR #1884). I only made minor changes to golem-common/src/api/api_definition (no OpenAPI-related structs added).

Also, the code that converted binding types to gRPC types is no longer there, so changes to those parts have been left out of this rebase.

@vigoo vigoo merged commit c10907e into golemcloud:main Aug 7, 2025
23 checks passed
@Nanashi-lab Nanashi-lab deleted the openapi branch August 18, 2025 04:38

Development

Successfully merging this pull request may close these issues.

Develop OpenAPI Export for API Definition

4 participants