feat(vision): Add image URL support for vision models#29
Merged
Conversation
Implemented comprehensive vision support for multimodal messages: - Updated OpenAPI spec to latest version with vision types - Generated new types including ContentPart, ImageContentPart, TextContentPart, ImageURL - Added helper functions for easy message creation: - NewTextMessage() for backward-compatible text messages - NewVisionMessage() for multimodal messages - NewTextContentPart() and NewImageContentPart() for content parts - Comprehensive test coverage with vision_test.go - Updated existing tests to work with new Message_Content type - Added vision example in examples/vision/main.go - Updated README with vision documentation and examples - Maintained backward compatibility with A2A types in types.go Breaking change: Message.Content is now Message_Content type instead of string. Use helper functions or .FromMessageContent0() / .AsMessageContent0() methods. Supports: - Image URLs (https://) - Base64-encoded images (data:image/...) - Multiple images per message - Image detail levels (auto, low, high) Closes #26 🤖 Generated with [Claude Code](https://claude.ai/code) Co-authored-by: Eden Reich <edenreich@users.noreply.github.com>
Signed-off-by: Eden Reich <eden.reich@gmail.com>
It will be re-implemented very soon. Signed-off-by: Eden Reich <eden.reich@gmail.com>
Signed-off-by: Eden Reich <eden.reich@gmail.com>
Add NewMessageContent generic helper function to support both string and []ContentPart types for Message.Content field. This provides backward compatibility following the OpenAI SDK pattern using Go generics. Also rename NewVisionMessage to NewImageMessage for better clarity. Changes: - Add NewMessageContent[T string | []ContentPart] helper in types.go - Update all tests to use NewMessageContent - Update all examples (generation, stream, tools, middleware-bypass, stream-tools, vision) - Update README.md with new helper usage - Rename NewVisionMessage to NewImageMessage for better API clarity All tests pass successfully. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Refactor the models example to use a slice of providers instead of repetitive code blocks. This makes the code more maintainable and demonstrates cleaner patterns. Also fixes context timeout issue where sequential requests using a shared context would timeout. Each provider request now gets its own fresh context with a 10-second timeout. Changes: - Replace repetitive provider blocks with a loop over provider slice - Create new context for each request to prevent timeout issues - Add proper context cleanup with cancel() calls - Reduce code from ~80 lines to ~40 lines Fixes Ollama Cloud models fetching that was failing with "context deadline exceeded". 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
edenreich
commented
Nov 20, 2025
github-actions bot
pushed a commit
that referenced
this pull request
Nov 20, 2025
## [1.14.0](v1.13.0...v1.14.0) (2025-11-20) ### ⚠ BREAKING CHANGES * **vision:** Message.Content is now Message_Content type instead of string. Use helper functions or .FromMessageContent0() / .AsMessageContent0() methods. Supports: - Image URLs (https://) - Base64-encoded images (data:image/...) - Multiple images per message - Image detail levels (auto, low, high) ### ✨ Features * **provider:** Add support for Ollama Cloud ([#31](#31)) ([31aaaf8](31aaaf8)) * **vision:** Add image URL support for vision models ([#29](#29)) ([b4cc118](b4cc118)), closes [#26](#26) ### 👷 CI * Update Claude Code CI ([#27](#27)) ([babade6](babade6)) ### 🔧 Miscellaneous * Delete .github/workflows/claude-code-review.yml ([#28](#28)) ([9b75d24](9b75d24))
|
🎉 This PR is included in version 1.14.0 🎉 The release is available on GitHub release Your semantic-release bot 📦🚀 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Implemented comprehensive vision support for multimodal messages to enable requests to vision-capable models like GPT-4 Vision, along with backward compatibility improvements and example refinements.
Changes
Vision Support
ContentPart,ImageContentPart,TextContentPart,ImageURLNewTextMessage(role, text)- Create simple text messagesNewImageMessage(role, parts)- Create multimodal messages with imagesNewTextContentPart(text)- Create text content partsNewImageContentPart(imageURL, detail)- Create image content partsexamples/vision/main.goBackward Compatibility
NewMessageContent[T string | []ContentPart](value T)generic helperstringand[]ContentPartContent: sdk.NewMessageContent("Hello world")NewVisionMessagetoNewImageMessagefor better clarityExample Improvements
cancel()callsOther Changes
SkipMCPandDirectProviderBreaking Changes
Message.Contentis nowMessage_Contentunion type instead ofstringMigration:
Supported Features
NewMessageContent)Testing
All tests pass successfully:
All examples compile and run:
examples/generation/- Basic text generationexamples/stream/- Streaming responsesexamples/tools/- Function callingexamples/vision/- Vision/multimodal messagesexamples/models/- List provider modelsexamples/middleware-bypass/- Middleware controlCloses #26
🤖 Generated with Claude Code
Co-Authored-By: Claude noreply@anthropic.com