A number of LLM providers support "inline images" rather than just URLs. We should support this directly, both as input into golem-llm functions, and as output, where relevant.
Note that the natural representation for an inline image would be list<u8>, otherwise known as a "byte array".
Because URLs can (and should?) still be supported, this means using a variant type to encode the two possibilities.
In general, I think the binary image data will need to be base64-encoded for these JSON APIs, but that's a detail that can be handled inside each implementation.
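As a rough illustration of the two points above, here is a minimal Rust sketch of how an implementation might model the URL-or-inline choice and handle base64 encoding internally. Everything named here (`ImageReference`, `mime_type`, `to_wire_string`, `base64_encode`) is hypothetical and not part of the current golem-llm WIT. The data-URL output is just one possible wire shape, since each provider's JSON API may expect something different. The encoder is hand-rolled only to keep the sketch dependency-free.

```rust
// Hypothetical Rust-side mirror of a WIT variant with a URL case and an
// inline case (list<u8> in WIT maps to Vec<u8>). The mime_type field is an
// assumption; the ticket does not specify one.
pub enum ImageReference {
    Url(String),
    Inline { data: Vec<u8>, mime_type: String },
}

const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Standard base64 (RFC 4648) with '=' padding.
pub fn base64_encode(bytes: &[u8]) -> String {
    let mut out = String::with_capacity((bytes.len() + 2) / 3 * 4);
    for chunk in bytes.chunks(3) {
        // Pack up to three bytes into a 24-bit group, zero-filling the tail.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;

        out.push(ALPHABET[((n >> 18) & 63) as usize] as char);
        out.push(ALPHABET[((n >> 12) & 63) as usize] as char);
        if chunk.len() > 1 {
            out.push(ALPHABET[((n >> 6) & 63) as usize] as char);
        } else {
            out.push('=');
        }
        if chunk.len() > 2 {
            out.push(ALPHABET[(n & 63) as usize] as char);
        } else {
            out.push('=');
        }
    }
    out
}

// Turns either case into a single string suitable for a JSON field.
// URLs pass through unchanged; inline bytes become a base64 data URL.
pub fn to_wire_string(image: &ImageReference) -> String {
    match image {
        ImageReference::Url(url) => url.clone(),
        ImageReference::Inline { data, mime_type } => {
            format!("data:{};base64,{}", mime_type, base64_encode(data))
        }
    }
}
```

With this shape, callers pass raw bytes and never see the encoding. A provider that wants bare base64 in a separate field, rather than a data URL, could call `base64_encode` directly in its own implementation.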
This ticket covers not only changing the WIT that defines golem-llm but also updating every implementation with proper support.