Differing Interpretations of Protobuf Spec in Java and Go SDKs  #759

@zach-robinson

Regarding the use of the text_data and binary_data fields, the Protobuf format spec says:

When the type of the data is text, the value MUST be stored in the text_data property.
datacontenttype SHOULD be populated with the appropriate media-type.
When the type of the data is binary the value MUST be stored in the binary_data property.
datacontenttype SHOULD be populated with the appropriate media-type.

The Java SDK's serializer always uses the text_data field when the datacontenttype is application/json, application/xml or text/*. Otherwise it uses the binary_data field. The Go SDK's deserializer, however, takes binary_data literally, while text_data is encoded with the serializer that matches the datacontenttype.

As a result, when a CloudEvent with a datacontenttype of application/json is serialized into protobuf by the Java SDK and then deserialized by a service using the Go SDK, the JSON in the text_data field is interpreted as a JSON string when it should be treated as a JSON object. So if I serialize this CloudEvent into protobuf using the Java SDK:

{
    "specversion" : "1.0",
    "type" : "com.example.someevent",
    "source" : "/mycontext",
    "subject": null,
    "id" : "C234-1234-1234",
    "time" : "2018-04-05T17:31:00Z",
    "comexampleextension1" : "value",
    "comexampleothervalue" : 5,
    "datacontenttype" : "application/json",
    "data" : {
        "appinfoA" : "abc",
        "appinfoB" : 123,
        "appinfoC" : true
    }
}

and then send the resulting protobuf object to a service using the Go SDK and deserialize it there back into JSON format, the result is:

{
    "specversion" : "1.0",
    "type" : "com.example.someevent",
    "source" : "/mycontext",
    "subject": null,
    "id" : "C234-1234-1234",
    "time" : "2018-04-05T17:31:00Z",
    "comexampleextension1" : "value",
    "comexampleothervalue" : 5,
    "datacontenttype" : "application/json",
    "data" : "{ \"appinfoA\" : \"abc\", \"appinfoB\" : 123, \"appinfoC\" : true }"
}

This seems to be unintended behavior, but as far as I can tell, neither the Java nor the Go SDK implementation of the Protobuf format is technically wrong according to the way the spec is written. I therefore decided to raise this issue here to get clarification. Is this a bug in one or both of these libraries? Should the Protobuf format spec be more specific about how the binary_data and text_data fields should be used? Or is this behavior somehow intended?
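
The difference between the two interpretations can be reproduced without either SDK. The following is a minimal Go sketch using only the standard library. The helper names (usesTextData, encodeTextAsString, encodeTextAsRawJSON) are hypothetical and are not taken from the Java or Go SDK; they only model the behavior described above, under the assumption that text_data arrives as a plain string containing the JSON document.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// usesTextData models the Java SDK rule described in this issue:
// application/json, application/xml and text/* payloads go into text_data,
// everything else into binary_data. Hypothetical helper, not SDK code.
func usesTextData(contentType string) bool {
	return contentType == "application/json" ||
		contentType == "application/xml" ||
		strings.HasPrefix(contentType, "text/")
}

// encodeTextAsString treats the text_data payload as an ordinary string
// value and JSON-encodes it. This yields a JSON string, which matches the
// escaped "data" value observed after the round trip.
func encodeTextAsString(text string) ([]byte, error) {
	return json.Marshal(text)
}

// encodeTextAsRawJSON treats the text_data payload as JSON that is already
// encoded and embeds it unchanged, which yields a JSON object.
func encodeTextAsRawJSON(text string) ([]byte, error) {
	if !json.Valid([]byte(text)) {
		return nil, fmt.Errorf("text_data is not valid JSON")
	}
	return json.Marshal(json.RawMessage(text))
}

func main() {
	contentType := "application/json"
	textData := `{"appinfoA":"abc","appinfoB":123,"appinfoC":true}`

	fmt.Println("stored in text_data:", usesTextData(contentType))

	asString, err := encodeTextAsString(textData)
	if err != nil {
		panic(err)
	}
	fmt.Println("as JSON string:", string(asString))

	asObject, err := encodeTextAsRawJSON(textData)
	if err != nil {
		panic(err)
	}
	fmt.Println("as JSON object:", string(asObject))
}
```

The first encoding prints the payload wrapped in quotes with its inner quotes escaped, as in the output above. The second prints the nested object that the original event contained. Which of the two a deserializer should produce for text_data is exactly what the spec leaves open.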
