feat(vision): Add image URL support for vision models #29

Merged
edenreich merged 7 commits into main from claude/issue-26-20251119-1323 on Nov 20, 2025
@edenreich commented on Nov 19, 2025

Summary

Adds vision support for multimodal messages so the SDK can send image content to vision-capable models such as GPT-4 Vision. Also adds a NewMessageContent helper to ease migration of existing string content, and refactors the examples.

Changes

Vision Support

  • Updated OpenAPI spec to latest version with vision types
  • Generated new types: ContentPart, ImageContentPart, TextContentPart, ImageURL
  • Added helper functions for easy message creation:
    • NewTextMessage(role, text) - Create simple text messages
    • NewImageMessage(role, parts) - Create multimodal messages with images
    • NewTextContentPart(text) - Create text content parts
    • NewImageContentPart(imageURL, detail) - Create image content parts
  • Comprehensive test coverage with vision_test.go
  • Added vision example in examples/vision/main.go
  • Updated README with vision documentation
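As a rough illustration of what the helpers above assemble, the sketch below builds a multimodal message by hand. The types and the lowercase functions are simplified, hypothetical stand-ins and not the generated SDK code; the JSON field names assume the OpenAI-style chat format (`type`, `text`, `image_url`), and the image URL is a placeholder.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ImageURL and ContentPart are simplified stand-ins for the generated SDK
// types; the real definitions come from the OpenAPI spec and may differ.
type ImageURL struct {
	URL    string `json:"url"`
	Detail string `json:"detail,omitempty"` // "auto", "low", or "high"
}

type ContentPart struct {
	Type     string    `json:"type"` // "text" or "image_url"
	Text     string    `json:"text,omitempty"`
	ImageURL *ImageURL `json:"image_url,omitempty"`
}

type Message struct {
	Role    string        `json:"role"`
	Content []ContentPart `json:"content"`
}

// newTextPart is a hypothetical analogue of NewTextContentPart.
func newTextPart(text string) ContentPart {
	return ContentPart{Type: "text", Text: text}
}

// newImagePart is a hypothetical analogue of NewImageContentPart.
func newImagePart(url, detail string) ContentPart {
	return ContentPart{Type: "image_url", ImageURL: &ImageURL{URL: url, Detail: detail}}
}

// newImageMessage is a hypothetical analogue of NewImageMessage.
func newImageMessage(role string, parts []ContentPart) Message {
	return Message{Role: role, Content: parts}
}

func main() {
	msg := newImageMessage("user", []ContentPart{
		newTextPart("What is in this image?"),
		newImagePart("https://example.com/cat.png", "low"), // placeholder URL
	})
	out, err := json.MarshalIndent(msg, "", "  ")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(out))
}
```

The real helpers may return errors or pointer types; check vision_test.go and examples/vision/main.go for the actual signatures.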

Backward Compatibility

  • Added NewMessageContent[T string | []ContentPart](value T) generic helper
    • Uses Go generics to accept both string and []ContentPart
    • Follows OpenAI SDK pattern for intuitive API
    • Example: Content: sdk.NewMessageContent("Hello world")
  • Updated all tests and examples to use the new helper
  • Renamed NewVisionMessage to NewImageMessage for better clarity
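For readers unfamiliar with union type constraints, here is a minimal sketch of how a helper shaped like NewMessageContent could be written. It assumes the union stores its value as raw JSON; `messageContent`, its `raw` field, and the lowercase function are hypothetical and not the SDK's generated code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ContentPart is a trimmed-down stand-in for the generated type.
type ContentPart struct {
	Type string `json:"type"`
	Text string `json:"text,omitempty"`
}

// messageContent is a hypothetical union that keeps the JSON encoding of
// either a plain string or a slice of content parts.
type messageContent struct {
	raw json.RawMessage
}

// MarshalJSON emits the stored encoding, so a string serializes as a JSON
// string and a slice serializes as a JSON array.
func (c messageContent) MarshalJSON() ([]byte, error) {
	if c.raw == nil {
		return []byte("null"), nil
	}
	return c.raw, nil
}

// newMessageContent accepts exactly the two types named in the constraint;
// any other argument type is rejected at compile time.
func newMessageContent[T string | []ContentPart](value T) messageContent {
	raw, err := json.Marshal(value)
	if err != nil {
		// Marshalling these two shapes is not expected to fail; an empty
		// union is returned rather than panicking in this sketch.
		return messageContent{}
	}
	return messageContent{raw: raw}
}

func main() {
	text := newMessageContent("Hello world")
	parts := newMessageContent([]ContentPart{{Type: "text", Text: "Hello world"}})

	for _, c := range []messageContent{text, parts} {
		out, err := json.Marshal(c)
		if err != nil {
			fmt.Println("marshal failed:", err)
			continue
		}
		fmt.Println(string(out))
	}
}
```

The constraint is what keeps call sites short: callers pass either shape directly and never call a From* method themselves.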

Example Improvements

  • Refactored models example to use provider loop instead of repetitive code
  • Fixed context timeout issues by creating fresh contexts for each request
  • Reduced models example from ~80 lines to ~40 lines
  • Proper context cleanup with cancel() calls
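The per-request context fix can be sketched as follows. `listModels` is a hypothetical stand-in that only simulates latency, and the timeout is shortened from the 10 seconds used in the example so the sketch runs quickly. With these numbers a single shared context would expire before the third request finishes, while a fresh context per iteration does not.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

const (
	perRequestTimeout = 200 * time.Millisecond // the real example uses 10 seconds
	simulatedLatency  = 80 * time.Millisecond  // made-up latency for the sketch
)

// listModels is a hypothetical stand-in for an SDK call. It waits for the
// simulated latency or returns early if the context is done.
func listModels(ctx context.Context, provider string) error {
	select {
	case <-time.After(simulatedLatency):
		return nil
	case <-ctx.Done():
		return fmt.Errorf("%s: %w", provider, ctx.Err())
	}
}

// fetchAll gives every provider its own deadline, so time spent on one
// request does not eat into the budget of the next.
func fetchAll(providers []string) map[string]error {
	results := make(map[string]error, len(providers))
	for _, p := range providers {
		ctx, cancel := context.WithTimeout(context.Background(), perRequestTimeout)
		results[p] = listModels(ctx, p)
		// Cancel right away instead of deferring: a defer inside the loop
		// would hold every context until fetchAll returns.
		cancel()
	}
	return results
}

func main() {
	for provider, err := range fetchAll([]string{"openai", "ollama", "groq"}) {
		if err != nil {
			fmt.Println("failed:", err)
			continue
		}
		fmt.Println("ok:", provider)
	}
}
```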

Other Changes

  • Removed A2A-related middleware (replaced with MCP)
  • Updated middleware options to use SkipMCP and DirectProvider
  • Kept existing code patterns working where possible (the exception is listed under Breaking Changes)

Breaking Changes

⚠️ Message.Content is now the Message_Content union type instead of string, so code that assigns a plain string to it no longer compiles.

Migration:

// Old (no longer compiles):
sdk.Message{Role: sdk.User, Content: "Hello"}

// New (works):
sdk.Message{Role: sdk.User, Content: sdk.NewMessageContent("Hello")}

// Or use the helper:
msg, err := sdk.NewTextMessage(sdk.User, "Hello")
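Code that reads Content back as a string is affected too. The sketch below shows the From/As round trip in the style the commit messages mention (.FromMessageContent0() / .AsMessageContent0()). The `content` type and its method names here are hypothetical simplifications; the generated SDK methods may have different names and signatures.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// content is a hypothetical union that stores its value as raw JSON.
type content struct {
	raw json.RawMessage
}

// FromString stores a plain string in the union.
func (c *content) FromString(s string) error {
	b, err := json.Marshal(s)
	if err != nil {
		return err
	}
	c.raw = b
	return nil
}

// AsString reads the union back as a string. It returns an error when the
// stored value is not a JSON string, for example a content-part array.
func (c content) AsString() (string, error) {
	var s string
	if err := json.Unmarshal(c.raw, &s); err != nil {
		return "", err
	}
	return s, nil
}

func main() {
	var c content
	if err := c.FromString("Hello"); err != nil {
		fmt.Println("store failed:", err)
		return
	}
	s, err := c.AsString()
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Println(s)
}
```

Callers that only ever send text can avoid this entirely by using NewTextMessage or NewMessageContent as shown in the migration snippet.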

Supported Features

  • ✅ Image URLs (https://)
  • ✅ Base64-encoded images (data:image/...)
  • ✅ Multiple images per message
  • ✅ Image detail levels (auto, low, high)
  • ✅ String content (via NewMessageContent)
  • ✅ Multimodal content arrays
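For the base64 case, the image is passed as a data URL in place of an https:// URL. A minimal sketch of building one from raw bytes is shown below; `dataURL` is a hypothetical helper and not part of the SDK, and the bytes used are a placeholder rather than a complete image.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// dataURL builds a "data:<mime>;base64,<payload>" string from raw bytes.
// The caller is responsible for passing the correct MIME type.
func dataURL(mimeType string, data []byte) string {
	return "data:" + mimeType + ";base64," + base64.StdEncoding.EncodeToString(data)
}

func main() {
	// Placeholder bytes (only the start of a PNG signature). In practice
	// these would come from os.ReadFile or an upload.
	img := []byte{0x89, 'P', 'N', 'G'}
	fmt.Println(dataURL("image/png", img))
}
```

The resulting string would then go wherever an image URL is accepted, such as the imageURL argument of NewImageContentPart.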

Testing

All tests pass:

task test

All examples compile and run:

  • examples/generation/ - Basic text generation
  • examples/stream/ - Streaming responses
  • examples/tools/ - Function calling
  • examples/vision/ - Vision/multimodal messages
  • examples/models/ - List provider models
  • examples/middleware-bypass/ - Middleware control

Closes #26

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>

claude bot and others added 6 commits November 19, 2025 13:34
Implemented comprehensive vision support for multimodal messages:

- Updated OpenAPI spec to latest version with vision types
- Generated new types including ContentPart, ImageContentPart, TextContentPart, ImageURL
- Added helper functions for easy message creation:
  - NewTextMessage() for backward-compatible text messages
  - NewVisionMessage() for multimodal messages
  - NewTextContentPart() and NewImageContentPart() for content parts
- Comprehensive test coverage with vision_test.go
- Updated existing tests to work with new Message_Content type
- Added vision example in examples/vision/main.go
- Updated README with vision documentation and examples
- Maintained backward compatibility with A2A types in types.go

Breaking change: Message.Content is now Message_Content type instead of string.
Use helper functions or .FromMessageContent0() / .AsMessageContent0() methods.

Supports:
- Image URLs (https://)
- Base64-encoded images (data:image/...)
- Multiple images per message
- Image detail levels (auto, low, high)

Closes #26

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Eden Reich <edenreich@users.noreply.github.com>
Signed-off-by: Eden Reich <eden.reich@gmail.com>

It will be re-implemented very soon.

Signed-off-by: Eden Reich <eden.reich@gmail.com>
Signed-off-by: Eden Reich <eden.reich@gmail.com>

Add NewMessageContent generic helper function to support both string
and []ContentPart types for Message.Content field. This provides
backward compatibility following the OpenAI SDK pattern using Go generics.

Also rename NewVisionMessage to NewImageMessage for better clarity.

Changes:
- Add NewMessageContent[T string | []ContentPart] helper in types.go
- Update all tests to use NewMessageContent
- Update all examples (generation, stream, tools, middleware-bypass, stream-tools, vision)
- Update README.md with new helper usage
- Rename NewVisionMessage to NewImageMessage for better API clarity

All tests pass successfully.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Refactor the models example to use a slice of providers instead of
repetitive code blocks. This makes the code more maintainable and
demonstrates cleaner patterns.

Also fixes context timeout issue where sequential requests using a
shared context would timeout. Each provider request now gets its own
fresh context with a 10-second timeout.

Changes:
- Replace repetitive provider blocks with a loop over provider slice
- Create new context for each request to prevent timeout issues
- Add proper context cleanup with cancel() calls
- Reduce code from ~80 lines to ~40 lines

Fixes Ollama Cloud models fetching that was failing with
"context deadline exceeded".

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

@edenreich merged commit b4cc118 into main on Nov 20, 2025
1 check passed
@edenreich deleted the claude/issue-26-20251119-1323 branch on November 20, 2025 at 10:45
github-actions bot pushed a commit that referenced this pull request Nov 20, 2025
## [1.14.0](v1.13.0...v1.14.0) (2025-11-20)

### ⚠ BREAKING CHANGES

* **vision:** Message.Content is now Message_Content type instead of string.
Use helper functions or .FromMessageContent0() / .AsMessageContent0() methods.

Supports:
- Image URLs (https://)
- Base64-encoded images (data:image/...)
- Multiple images per message
- Image detail levels (auto, low, high)

### ✨ Features

* **provider:** Add support for Ollama Cloud ([#31](#31)) ([31aaaf8](31aaaf8))
* **vision:** Add image URL support for vision models ([#29](#29)) ([b4cc118](b4cc118)), closes [#26](#26)

### 👷 CI

* Update Claude Code CI ([#27](#27)) ([babade6](babade6))

### 🔧 Miscellaneous

* Delete .github/workflows/claude-code-review.yml ([#28](#28)) ([9b75d24](9b75d24))
@github-actions

🎉 This PR is included in version 1.14.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

Linked issue: [FEATURE] Add image url type of message content to support requests to vision models