Releases: Trans-N-ai/swama

v2.1.0

03 Mar 06:49
a7021a4

Full Changelog: v2.0.1...v2.1.0

v2.0.1

19 Jan 03:44
621ac43

What's Changed

  • initialize cache limit configuration in ModelPool by @sxy-trans-n in #98
  • Add 1M context limit option to Mac app menu by @Copilot in #97
  • Add URI normalization for request handling in HTTPHandler by @sxy-trans-n in #99

New Contributors

  • @Copilot made their first contribution in #97

Full Changelog: v2.0.0...v2.0.1

v2.0.0

31 Dec 14:09
e0fa75a

What's Changed

  • Replace mlx-swift-example with mlx-swift-lm
  • Replace whisper-kit with mlx-swift-audio
  • Migrate mlx_embeddings to MLXEmbedders
  • Add experimental TTS endpoint
  • Add context length limit (configurable via UI or CLI)
  • Add support for more models (e.g., Qwen3-VL)

Breaking Change (whisper-kit)

If you previously downloaded Whisper models, delete them using Swama 1.5.x and re-download them with Swama 2.0.0. This avoids compatibility issues caused by the switch from whisper-kit to mlx-swift-audio.

Full Changelog: v1.5.0...v2.0.0

v1.5.0

19 Sep 09:01
a200193

Full Changelog: v1.4.3...v1.5.0

v1.4.3

30 Jul 09:23
fe0675b

What's Changed

  • feat: add 'rm' command to remove models from local storage by @nova28 in #65
  • fix: Make ModelPaths.customModelsDirectory dynamic for tests by @sxy-trans-n in #66
  • fix: improve OpenAI API compatibility for streaming responses by @sxy-trans-n in #67
  • Run: add auto-download when model is missing by @kakiloki in #52
  • feat: add a feature create model from local path by @Rin-Li in #63
  • add qwen3 30b 2507 alias by @sxy-trans-n in #68

Full Changelog: v1.4.2...v1.4.3

v1.4.2

24 Jul 07:13
2b76c3a

What's Changed

  • Fixed Gemma 3 family support
  • fix readme problem by @Rin-Li in #58
  • Modelpool vlm detection optimization by @sxy-trans-n in #61

Full Changelog: v1.4.1...v1.4.2

v1.4.1

15 Jul 08:57
0c13aaa

Full Changelog: v1.4.0...v1.4.1

v1.4.0

27 Jun 06:08
4168df6

Swama v1.4.0 Release Notes

🆕 What's New

OpenAI-Compatible Tool Calling Support

  • Function calling API - Full OpenAI-compatible tool calling functionality for AI models to interact with external functions
  • Flexible tool selection - Support for all tool choice modes: "none", "auto", "required", and function-specific selection
  • Streaming & non-streaming - Unified tool call handling for both response modes with real-time tool call chunks via SSE
  • Complete message support - Support for all message roles including system, user, assistant, and tool
  • MLX integration - Seamless conversion between OpenAI tool specs and MLX ToolSpec format with automatic parameter handling

Gemma3 Vision-Language Model Support

  • New model alias - Added gemma3 alias for mlx-community/gemma-3-27b-it-4bit vision-language model
  • Multimodal inference - Native support for both text and image inputs with easy CLI usage
  • Server-first architecture - CLI now prioritizes HTTP API calls to Swama.app backend for improved performance
  • Auto-launch capability - Silently launches Swama.app if not running, with graceful fallback to direct execution
  • Enhanced CLI options - Added --server-host and --server-port configuration for flexible deployment

ModelScope Registry Support

  • Dual registry support - Support for both Hugging Face and ModelScope model downloads
  • Environment configuration - Set SWAMA_REGISTRY=MODEL_SCOPE to use ModelScope; the default is HUGGING_FACE
  • China-friendly access - Provides Chinese users with faster and more accessible model downloads
  • Seamless switching - Same CLI commands work with both registries without code changes
  • Unified model management - Models from both registries appear in swama list with proper identification

🚀 Usage

Tool Calling

# Tool calling via HTTP API (OpenAI compatible)
curl -X POST http://localhost:28100/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3",
    "messages": [{"role": "user", "content": "What is the weather in Tokyo?"}],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get current weather",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {"type": "string", "description": "City name"}
            },
            "required": ["location"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  }'
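
On the client side, the model's reply carries a `tool_calls` array instead of plain text when it decides to call a function. The sketch below parses such a reply; the response body shown is a hypothetical example in the standard OpenAI response shape, not captured Swama output.

```python
import json

# Hypothetical response body, shaped like the OpenAI chat completions
# schema that Swama's /v1/chat/completions endpoint emulates.
response = {
    "choices": [{
        "message": {
            "role": "assistant",
            "content": None,
            "tool_calls": [{
                "id": "call_0",
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "arguments": "{\"location\": \"Tokyo\"}",
                },
            }],
        },
        "finish_reason": "tool_calls",
    }]
}

message = response["choices"][0]["message"]
for call in message.get("tool_calls") or []:
    name = call["function"]["name"]
    # Arguments arrive as a JSON-encoded string and must be parsed.
    args = json.loads(call["function"]["arguments"])
    print(name, args["location"])  # get_weather Tokyo
```

After running the function locally, the client would append a `{"role": "tool", ...}` message with the result and send the conversation back for a final answer.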

Gemma3 Multimodal Inference

# Vision-language model with image input
swama run gemma3 "What's in this image?" -i /path/to/image.jpg

ModelScope Registry

# Use ModelScope registry for Chinese users
export SWAMA_REGISTRY=MODEL_SCOPE
swama pull qwen3

# Or use default Hugging Face registry
export SWAMA_REGISTRY=HUGGING_FACE
swama pull qwen3

# Registry setting persists for the session
swama list  # Shows models from configured registry

📦 Download

Download Swama v1.4.0

Available formats:

  • Homebrew - brew install swama
  • DMG installer - Easy drag-and-drop installation for macOS
  • ZIP archive - Direct application bundle

🔄 Upgrade Notes

  • If upgrading from a previous version: After installing the new version, open Swama from the menu bar and click "Install Command Line Tool…" to update the CLI tools
  • Tool calling support: The /v1/chat/completions endpoint now supports OpenAI-compatible tool calling
  • Gemma3 model: New vision-language model alias available immediately after upgrade
  • ModelScope support: Set SWAMA_REGISTRY=MODEL_SCOPE environment variable to use ModelScope registry
  • Server architecture: CLI now uses server-first approach for better performance (existing workflows continue to work)

🔧 Requirements

  • macOS 14.0+
  • Apple Silicon (M1/M2/M3/M4)
  • For vision models: Compatible image formats (JPEG, PNG, etc.)
  • For ModelScope: Internet connection to ModelScope platform

Full Changelog: v1.3.0...v1.4.0

v1.3.0

19 Jun 09:06
da109de

Swama v1.3.0 Release Notes

🆕 What's New

OpenAI-Compatible Audio API

  • /v1/audio/transcriptions endpoint - Full OpenAI API compatibility for seamless integration
  • Multipart form data support - Proper file upload handling with audio format validation
  • Multiple response formats - JSON, text, and verbose JSON output options
  • Robust error handling - Comprehensive error responses with proper HTTP status codes

Enhanced CLI with Audio Commands

  • New transcribe command - Comprehensive audio transcription with customizable options
  • Enhanced pull command - Unified downloading for both MLX and WhisperKit models
  • Rich CLI options - Support for model selection, language, temperature, prompts, and output formats
  • Intelligent model validation - Automatic detection and validation of WhisperKit models

🚀 Usage

Audio Transcription

# Basic transcription
swama transcribe audio.wav

# Specify model and language
swama transcribe audio.wav -m whisper-base -l en

# Get detailed output with timestamps
swama transcribe audio.wav --verbose

# JSON output for programmatic use
swama transcribe audio.wav -f json

# Fine-tune with temperature and prompt
swama transcribe audio.wav -t 0.2 -p "Technical discussion about AI"

WhisperKit Model Management

# Download WhisperKit models
swama pull whisper-tiny
swama pull whisper-base 
swama pull whisper-small
swama pull whisper-large

# List all available models (includes WhisperKit)
swama list

Audio API Integration

# Transcribe via HTTP API (OpenAI compatible)
curl -X POST http://localhost:28100/v1/audio/transcriptions \
  -F "file=@meeting.wav" \
  -F "model=whisper-large" \
  -F "language=en" \
  -F "response_format=verbose_json"

# Simple text response
curl -X POST http://localhost:28100/v1/audio/transcriptions \
  -F "file=@audio.wav" \
  -F "model=whisper-large" \
  -F "response_format=text"
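
For clients that cannot shell out to curl, the same upload can be assembled by hand. This sketch builds a multipart/form-data body equivalent to the `-F` flags above using only the Python standard library; the helper name and the placeholder audio bytes are illustrative.

```python
import io
import uuid

def multipart_body(fields: dict, filename: str, file_bytes: bytes):
    """Build a multipart/form-data body like the curl -F calls above."""
    boundary = uuid.uuid4().hex
    buf = io.BytesIO()
    # Plain text fields (model, language, response_format, ...).
    for name, value in fields.items():
        buf.write(f"--{boundary}\r\nContent-Disposition: form-data; "
                  f"name=\"{name}\"\r\n\r\n{value}\r\n".encode())
    # The audio file part.
    buf.write(f"--{boundary}\r\nContent-Disposition: form-data; "
              f"name=\"file\"; filename=\"{filename}\"\r\n"
              f"Content-Type: audio/wav\r\n\r\n".encode())
    buf.write(file_bytes)
    buf.write(f"\r\n--{boundary}--\r\n".encode())
    return buf.getvalue(), f"multipart/form-data; boundary={boundary}"

body, content_type = multipart_body(
    {"model": "whisper-large", "language": "en", "response_format": "text"},
    "audio.wav",
    b"\x00" * 16,  # placeholder bytes, not a real WAV file
)
```

The resulting `body` and `content_type` can then be sent with any HTTP client as a POST to http://localhost:28100/v1/audio/transcriptions.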

📦 Download

Download Swama v1.3.0

Available formats:

  • DMG installer - Easy drag-and-drop installation for macOS
  • ZIP archive - Direct application bundle

🔄 Upgrade Notes

  • If upgrading from a previous version: After installing the new version, open Swama from the menu bar and click "Install Command Line Tool…" to update the CLI tools
  • New audio capabilities: The transcribe command and audio API are immediately available after upgrade
  • Model storage: WhisperKit models are stored in ~/.swama/models/whisperkit to keep them organized separately from MLX models
  • API compatibility: Existing /v1/chat/completions and /v1/embeddings endpoints continue to work unchanged

🔧 Requirements

  • macOS 14.0+
  • Apple Silicon (M1/M2/M3/M4)
  • For audio transcription: Compatible audio formats (WAV recommended, other formats auto-converted)

🎯 Key Benefits

  • Privacy-focused: All audio processing happens locally on your device
  • Cost-effective: No API calls to external services for transcription
  • High performance: Optimized for Apple Silicon with intelligent memory management
  • Developer-friendly: OpenAI-compatible API for easy integration into existing workflows

Full Changelog: v1.2.0...v1.3.0

v1.2.0

12 Jun 09:45
fdb4622

Swama v1.2.0 Release Notes

🆕 What's New

Enhanced Model Path Management

  • New centralized model storage - Models now downloaded to ~/.swama/models by default
  • Backward compatibility maintained - Existing models in ~/Documents/huggingface/models still work
  • Intelligent model discovery - Automatically finds models across multiple storage locations
  • Improved organization - Better structure for different model types

Expanded Vision-Language (VL) Model Support

  • Comprehensive VL model detection - Smart pattern-based recognition (-VL-, vision, Visual, etc.)
  • Enhanced model registry - Improved caching and lookup performance for VL models
  • Dual-path loading - Support for both registry-based and locally stored VL models
  • Better offline capabilities - Enhanced model discovery for air-gapped environments
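
The pattern-based detection mentioned above can be sketched in a few lines. This is an illustrative guess at the approach, using only the markers named in these notes (-VL-, vision, Visual); the real marker list and helper name may differ.

```python
# Illustrative sketch of pattern-based VL model detection.
# Markers taken from the examples in these release notes.
VL_MARKERS = ("-vl-", "vision", "visual")

def looks_like_vlm(model_id: str) -> bool:
    lowered = model_id.lower()
    return any(marker in lowered for marker in VL_MARKERS)

print(looks_like_vlm("mlx-community/Qwen2-VL-7B-Instruct-4bit"))   # True
print(looks_like_vlm("mlx-community/Llama-3.1-8B-Instruct-4bit"))  # False
```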

Robust Offline Model Support

  • Local-first workflows - Load models from local directories without network dependency
  • Multi-location discovery - Intelligent detection across preferred and legacy paths
  • Air-gapped compatibility - Full offline model resolution for network-constrained environments
  • Improved local model configuration - Streamlined process for locally stored models

Performance & Memory Optimizations

  • Smart model type caching - Avoid repeated registry lookups with intelligent caching
  • Lazy-loaded VLM registry - Better startup performance with on-demand loading
  • Optimized filesystem operations - Reduced overhead for offline model discovery
  • Enhanced concurrent loading - Type-aware caching for parallel model processing
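
The model-type caching idea can be sketched as a memoized lookup: pay the registry cost once per model, then answer repeated checks from cache. Everything here is illustrative; `registry_lookup` stands in for whatever (unspecified) registry query Swama actually performs.

```python
from functools import lru_cache

def registry_lookup(model_id: str) -> str:
    # Stand-in for an expensive registry query; the real logic differs.
    return "vlm" if "vl" in model_id.lower() else "llm"

@lru_cache(maxsize=None)
def model_type(model_id: str) -> str:
    # Repeated calls for the same model_id hit the cache instead of
    # re-running the registry lookup.
    return registry_lookup(model_id)

print(model_type("qwen2-vl"))  # vlm
print(model_type("qwen3"))     # llm
```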

🚀 Usage

Load Local Models (Offline)

# Swama automatically discovers models in:
# ~/.swama/models/
# ~/Documents/huggingface/models/
swama list  # Shows all discovered models
swama run your-local-model  # Works without internet

Improved Model Organization

# Models are now organized in the new preferred location
ls ~/.swama/models/
# But legacy models still work
ls ~/Documents/huggingface/models/
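
The multi-location discovery described above amounts to checking the preferred directory before falling back to the legacy one. A minimal sketch, assuming only the two directories named in these notes (the function name is hypothetical):

```python
from pathlib import Path

# Search order mirrors the discovery described in these notes:
# new preferred location first, then the legacy location.
SEARCH_PATHS = [
    Path.home() / ".swama" / "models",
    Path.home() / "Documents" / "huggingface" / "models",
]

def find_model(name: str, roots=SEARCH_PATHS):
    for root in roots:
        candidate = root / name
        if candidate.exists():
            return candidate
    return None  # not found locally; a real client might download it
```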

📦 Download

Download Swama v1.2.0

Available formats:

  • DMG installer - Easy drag-and-drop installation for macOS
  • ZIP archive - Direct application bundle

🔄 Upgrade Notes

  • If upgrading from a previous version: After installing the new version, open Swama from the menu bar and click "Install Command Line Tool…" to update the CLI tools
  • Model storage: New models will be downloaded to ~/.swama/models by default, but existing models in the legacy location continue to work seamlessly
  • VL model users: Enhanced support means better compatibility and performance for vision-language models

🔧 Requirements

  • macOS 14.0+
  • Apple Silicon (M1/M2/M3/M4)

What's Changed

  • Fix the model name in the script by @Rin-Li in #12
  • Enhanced Model Path Management and VL Model Support by @sxy-trans-n in #20

Full Changelog: v1.1.0...v1.2.0