refactor: Gracefully handle images sent to non-vision models #223

Merged
edenreich merged 2 commits into main from refactor/image-handling-non-vision-models
Dec 12, 2025
@edenreich

Summary

Changes image handling behavior from returning an error to stripping image content and proceeding with text-only messages. This provides better UX by allowing requests to continue when users accidentally include images with non-vision models.

Changes

1. Behavior Change

  • Previously: Non-vision models would return a 400 error when receiving messages with images
  • Now: Non-vision models strip image content and continue with text-only messages

2. New Method: StripImageContent()

  • Added to providers/message_helpers.go
  • Removes image content from messages while preserving text parts
  • Handles various content formats:
    • String content remains unchanged
    • Array with only text becomes a single string
    • Array with text and images keeps only text
    • Array with only images becomes empty string
    • Array with multiple text parts and images keeps all text parts
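
The content-format rules above can be sketched as follows. This is a minimal illustration, not the code from providers/message_helpers.go: the `Message` and `ContentPart` types are simplified, hypothetical stand-ins for the real message types, and two behaviors are assumptions made for the example, namely that every array form collapses to a string and that multiple text parts are joined with a single space.

```go
package main

import (
	"fmt"
	"strings"
)

// ContentPart is a hypothetical stand-in for one element of array-style
// message content: either a text part or an image part.
type ContentPart struct {
	Type     string // "text" or "image_url"
	Text     string // set when Type == "text"
	ImageURL string // set when Type == "image_url"
}

// Message is a hypothetical, simplified message. Content holds either a
// string or a []ContentPart.
type Message struct {
	Role    string
	Content any
}

// StripImageContent drops image parts and collapses the remaining text
// parts into a single string. String content is left untouched.
func (m *Message) StripImageContent() {
	parts, ok := m.Content.([]ContentPart)
	if !ok {
		return // string content remains unchanged
	}
	texts := make([]string, 0, len(parts))
	for _, p := range parts {
		if p.Type == "text" {
			texts = append(texts, p.Text)
		}
	}
	// Assumption: text parts are joined with a single space. An array that
	// held only images yields an empty string.
	m.Content = strings.Join(texts, " ")
}

func main() {
	msg := Message{Role: "user", Content: []ContentPart{
		{Type: "text", Text: "Describe this picture"},
		{Type: "image_url", ImageURL: "https://example.com/cat.png"},
	}}
	msg.StripImageContent()
	fmt.Printf("%q\n", msg.Content) // prints "Describe this picture"
}
```

Mutating the message in place keeps the call site short; a variant that returns a copy would work equally well if the original request needs to be preserved.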

3. Updated Route Handler

  • Modified api/routes.go to use the new stripping logic
  • Changed error logging to info logging for image filtering
  • Added image count tracking for better observability
  • Added debug logging when images are stripped

4. Comprehensive Tests

  • Added TestMessage_StripImageContent in tests/multimodal_test.go
  • Tests cover each content format handled by StripImageContent(), including the image-only and multiple-text-part cases
  • Verifies that image parts are removed and text parts are preserved

Benefits

  1. Better User Experience: Users don't get errors when accidentally including images with non-vision models
  2. Backward Compatible: Existing text-only requests work exactly the same
  3. Observable: Logs at INFO level when images are filtered, with image counts
  4. Tested: Comprehensive test coverage ensures reliability

Technical Details

  • When EnableVision is true and a non-vision model receives image content:

    • Logs INFO message with provider, model, and image count
    • Strips image content from all messages
    • Continues processing with text-only content
    • Logs DEBUG message confirming images were stripped
  • The StripImageContent() method handles the OpenAPI message format where content can be:

    • A string (text only)
    • An array of content parts (text and/or images)
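
The handler flow described above might look roughly like the sketch below. It is illustrative only and does not reproduce api/routes.go: `filterImages`, `countImages`, `stripImages`, the `logf` callback, and the log field names are all hypothetical, and returning early when EnableVision is false is an assumption, since the description only covers the enabled case.

```go
package main

import (
	"log"
	"strings"
)

// ContentPart and Message are simplified, hypothetical stand-ins for the
// real message types.
type ContentPart struct {
	Type string // "text" or "image_url"
	Text string
}

type Message struct {
	Role    string
	Content any // string or []ContentPart
}

// countImages returns the number of image parts across all messages.
func countImages(msgs []Message) int {
	n := 0
	for _, m := range msgs {
		parts, ok := m.Content.([]ContentPart)
		if !ok {
			continue
		}
		for _, p := range parts {
			if p.Type == "image_url" {
				n++
			}
		}
	}
	return n
}

// stripImages replaces array content with its joined text parts.
func stripImages(m *Message) {
	parts, ok := m.Content.([]ContentPart)
	if !ok {
		return
	}
	var texts []string
	for _, p := range parts {
		if p.Type == "text" {
			texts = append(texts, p.Text)
		}
	}
	m.Content = strings.Join(texts, " ")
}

// filterImages strips images for non-vision models and reports how many
// images were removed. Logging goes through logf so the sketch does not
// depend on any particular logger.
func filterImages(enableVision, modelSupportsVision bool, provider, model string,
	msgs []Message, logf func(level, msg string, kv ...any)) int {
	// Assumption: nothing is done when vision is disabled or supported.
	if !enableVision || modelSupportsVision {
		return 0
	}
	images := countImages(msgs)
	if images == 0 {
		return 0
	}
	logf("INFO", "filtering images for non-vision model",
		"provider", provider, "model", model, "image_count", images)
	for i := range msgs {
		stripImages(&msgs[i])
	}
	logf("DEBUG", "images stripped from messages")
	return images
}

func main() {
	logf := func(level, msg string, kv ...any) { log.Println(level, msg, kv) }
	msgs := []Message{{Role: "user", Content: []ContentPart{
		{Type: "text", Text: "What is in this image?"},
		{Type: "image_url"},
	}}}
	removed := filterImages(true, false, "example-provider", "example-model", msgs, logf)
	log.Println("removed:", removed, "content:", msgs[0].Content)
}
```

Counting before stripping is what makes the image count available to the INFO log line; once the content has been collapsed to text the count can no longer be recovered.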

Testing

Run the tests with:

task test

The new tests verify:

  • String content remains unchanged
  • Text-only arrays become strings
  • Images are properly filtered out
  • Mixed content keeps only text parts
  • Empty content when only images exist

@edenreich edenreich merged commit 06fb970 into main Dec 12, 2025
1 check passed
@edenreich edenreich deleted the refactor/image-handling-non-vision-models branch December 12, 2025 15:04
ig-semantic-release-bot bot added a commit that referenced this pull request Dec 12, 2025
## [0.22.8](v0.22.7...v0.22.8) (2025-12-12)

### ♻️ Improvements

* Gracefully handle images sent to non-vision models ([#223](#223)) ([06fb970](06fb970))

### 👷 CI

* Setup infer workflow ([#222](#222)) ([2c235be](2c235be))

### 🔧 Miscellaneous

* **deps:** Bump claude code to its latest for development ([9c4a7ee](9c4a7ee))
@ig-semantic-release-bot

🎉 This PR is included in version 0.22.8 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀
