Add multimodal uri, file, and blob parts to GenAI JSON Schemas #2754
lmolkova merged 20 commits into open-telemetry:main
Noting the questions I asked in #1556 (comment), especially:
You've chosen two types:

    class BlobPart(BaseModel):
        type: Literal["blob"] = Field(description="The type of the content captured in this part.")
        mime_type: str = Field(description="The IANA MIME type of the attached data.")
        data: bytes = Field(description="base64 encoded bytes of the attached data.")

        class Config:
            extra = "allow"

    class FileDataPart(BaseModel):
        type: Literal["file_data"] = Field(description="The type of the content captured in this part.")
        mime_type: str = Field(description="The IANA MIME type of the attached data.")
        file_uri: str = Field(description="A URI referencing to reference attached data. Should be recorded without modification, as it was sent to the model.")

        class Config:
            extra = "allow"

The only difference between these part types is the
Hadn't considered this before, mainly because I'm more familiar with the Gemini format that inspired this. But a few thoughts
I see bytes in the protobuf, but not the spec, although byte arrays are allowed: https://opentelemetry.io/docs/specs/otel/logs/data-model/#type-any
Is |
I guess that must be right and I misinterpreted. Do we have any precedent/guidance for bytes that can end up inside a complex attribute on a span? Right now it has to convert to JSON, which means base64 encoding, and then to the backend the field will just look like a string. The backend then has to know that it should decode that based on semconv if it wants to end up with the same thing as if it had received the attribute in a log body using protobuf instead of JSON.
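The round trip described above can be sketched with the standard library alone. This is a minimal illustration, not the instrumentation's actual code: the field names (`type`, `mime_type`, `data`) follow the `BlobPart` class quoted earlier in the thread, and the helper names `encode_blob_part` and `decode_blob_part` are hypothetical.

```python
import base64
import json


def encode_blob_part(mime_type: str, raw: bytes) -> str:
    """Serialize a blob-like part to JSON (hypothetical helper)."""
    part = {
        "type": "blob",
        "mime_type": mime_type,
        # json.dumps cannot serialize bytes, so they are sent as base64 text.
        "data": base64.b64encode(raw).decode("ascii"),
    }
    return json.dumps(part)


def decode_blob_part(serialized: str) -> bytes:
    """What a backend has to do to recover the original bytes (hypothetical helper)."""
    part = json.loads(serialized)
    # After parsing, "data" is an ordinary str; only the convention says it is base64.
    return base64.b64decode(part["data"])


raw = b"\x89PNG\r\n\x1a\n"
serialized = encode_blob_part("image/png", raw)
roundtrip = decode_blob_part(serialized)
```

Nothing in the JSON itself marks `data` as binary, which is the concern raised here: the decode step only happens if the backend knows the convention.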
cc @lmolkova for the above question
It's temporary that we serialize to a JSON string; a byte array is a good long-term solution and would work nicely with spec changes that are in flight (open-telemetry/opentelemetry-specification#4651), specifically
One other thing to note, we have this code for non-complex attributes which attempts to interpret python
Also updated the ipynb to directly write to JSON Schemas to make it easier to update things. This might be easier to convert to a script though and would be easy to add to the Makefile
Co-authored-by: Liudmila Molkova <neskazu@gmail.com>
c8311e6 to 5075167
How should this be instrumented? https://platform.openai.com/docs/guides/images-vision?api-mode=responses&format=base64-encoded#analyze-images

    response = client.responses.create(
        model="gpt-4.1",
        input=[
            {
                "role": "user",
                "content": [
                    { "type": "input_text", "text": "what's in this image?" },
                    {
                        "type": "input_image",
                        "image_url": f"data:image/jpeg;base64,{base64_image}",
                    },
                ],
            }
        ],
    )

A
My preference would be to generate a @alexmojaki do you have a strong preference for using data URLs?
Not strongly, but it feels like something should be stated in the PR.
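One way to handle the `image_url` case raised above is to split the data URL into its MIME type and base64 payload and record those as a blob-style part. The sketch below is an assumption about how that could look, not something the PR prescribes: `split_data_url` is a hypothetical helper, the field names follow the `BlobPart` class quoted earlier in the thread, and only the `;base64` form of data URLs is handled.

```python
import base64


def split_data_url(url: str) -> dict:
    """Split a base64 data URL into a blob-like dict (hypothetical helper)."""
    if not url.startswith("data:"):
        raise ValueError("not a data URL")
    # Everything before the first comma is the header, the rest is the payload.
    header, sep, payload = url[len("data:"):].partition(",")
    if not sep:
        raise ValueError("data URL has no comma separator")
    media_type, *params = header.split(";")
    if "base64" not in params:
        raise ValueError("only base64 data URLs are handled in this sketch")
    # Decode once to check the payload is valid base64; keep the text form.
    base64.b64decode(payload, validate=True)
    return {"type": "blob", "mime_type": media_type, "data": payload}


base64_image = base64.b64encode(b"fake-jpeg-bytes").decode("ascii")
part = split_data_url(f"data:image/jpeg;base64,{base64_image}")
```

Keeping the payload as base64 text avoids a decode and re-encode on the way to JSON; recording the data URL unchanged as a URI would be the alternative discussed in this exchange.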
- `FileData.file_uri` -> `FileData.uri`
- `Blob.data` -> `Blob.content`
- `mime_type` fields optional
- added `file_id` field
alexmojaki left a comment
Going offline now until the 20th, leaving my approval to unblock this PR technically but please get approval from @Kludex in my place if not another approver's.
@aabmass
It's what we really need. 👍
However, what's the meaning of 'file id' of 'FilePart'? Is it related to the 'file id' in openai and the 'video id' in openai?
Fixes #1556
Changes
Added two new types to the `MessagePart` union for capturing multimodal prompt/response data:
- `BlobPart`, which contains inline base64.
- `FileDataPart`, which contains a URI referencing data.

Also updated the ipynb to directly write to the JSON Schemas for simpler updating.
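For illustration, the two part types described above might serialize as follows. This is a sketch under assumptions: the field names are taken from the `BlobPart` and `FileDataPart` classes quoted in the review thread (a later commit message in the thread renames some of them), and the bucket URI and byte content are made up.

```python
import base64
import json

# Inline variant: the bytes travel inside the part as base64 text.
blob_part = {
    "type": "blob",
    "mime_type": "image/png",
    "data": base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii"),
}

# Reference variant: only a URI is recorded; this example URI is invented.
file_data_part = {
    "type": "file_data",
    "mime_type": "image/png",
    "file_uri": "gs://example-bucket/cat.png",
}

# Both parts can sit side by side in the same list of message parts.
parts_json = json.dumps([blob_part, file_data_part], indent=2)
print(parts_json)
```

The `type` field acts as the discriminator, so a consumer can tell inline data from a reference without inspecting the other keys.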
Prototypes
opentelemetry-instrumentation-google-genai
Merge requirement checklist
[chore]