Releases: Trans-N-ai/swama

v2.1.0

03 Mar 06:49
a7021a4

Full Changelog: v2.0.1...v2.1.0

v2.0.1

19 Jan 03:44
621ac43

What's Changed

  • initialize cache limit configuration in ModelPool by @sxy-trans-n in #98
  • Add 1M context limit option to Mac app menu by @Copilot in #97
  • Add URI normalization for request handling in HTTPHandler by @sxy-trans-n in #99

New Contributors

  • @Copilot made their first contribution in #97

Full Changelog: v2.0.0...v2.0.1

v2.0.0

31 Dec 14:09
e0fa75a

What's Changed

  • Replace mlx-swift-example with mlx-swift-lm
  • Replace whisper-kit with mlx-swift-audio
  • Migrate mlx_embeddings to MLXEmbedders
  • Add experimental TTS endpoint
  • Add context length limit (configurable via UI or CLI)
  • Add support for more models (e.g., Qwen3-VL)

Breaking Change (whisper-kit)

If you previously downloaded Whisper models, delete them using Swama 1.5.x and re-download them with Swama 2.0.0. This avoids compatibility issues caused by the switch from whisper-kit to mlx-swift-audio.

Full Changelog: v1.5.0...v2.0.0

v1.5.0

19 Sep 09:01
a200193

Full Changelog: v1.4.3...v1.5.0

v1.4.3

30 Jul 09:23
fe0675b

What's Changed

  • feat: add 'rm' command to remove models from local storage by @nova28 in #65
  • fix: Make ModelPaths.customModelsDirectory dynamic for tests by @sxy-trans-n in #66
  • fix: improve OpenAI API compatibility for streaming responses by @sxy-trans-n in #67
  • Run: add auto-download when model is missing by @kakiloki in #52
  • feat: add a feature create model from local path by @Rin-Li in #63
  • add qwen3 30b 2507 alias by @sxy-trans-n in #68

Full Changelog: v1.4.2...v1.4.3

v1.4.2

24 Jul 07:13
2b76c3a

What's Changed

  • Fixed Gemma 3 family support
  • fix readme problem by @Rin-Li in #58
  • Modelpool vlm detection optimization by @sxy-trans-n in #61

Full Changelog: v1.4.1...v1.4.2

v1.4.1

15 Jul 08:57
0c13aaa

Full Changelog: v1.4.0...v1.4.1

v1.4.0

27 Jun 06:08
4168df6

Swama v1.4.0 Release Notes

🆕 What's New

OpenAI-Compatible Tool Calling Support

  • Function calling API - Full OpenAI-compatible tool calling functionality for AI models to interact with external functions
  • Flexible tool selection - Support for all tool choice modes: "none", "auto", "required", and function-specific selection
  • Streaming & non-streaming - Unified tool call handling for both response modes with real-time tool call chunks via SSE
  • Complete message support - Support for all message roles including system, user, assistant, and tool
  • MLX integration - Seamless conversion between OpenAI tool specs and MLX ToolSpec format with automatic parameter handling

Gemma3 Vision-Language Model Support

  • New model alias - Added gemma3 alias for mlx-community/gemma-3-27b-it-4bit vision-language model
  • Multimodal inference - Native support for both text and image inputs with easy CLI usage
  • Server-first architecture - CLI now prioritizes HTTP API calls to Swama.app backend for improved performance
  • Auto-launch capability - Silently launches Swama.app if not running, with graceful fallback to direct execution
  • Enhanced CLI options - Added --server-host and --server-port configuration for flexible deployment

ModelScope Registry Support

  • Dual registry support - Support for both Hugging Face and ModelScope model downloads
  • Environment configuration - Set SWAMA_REGISTRY=MODEL_SCOPE to use ModelScope; the default is HUGGING_FACE
  • China-friendly access - Provides Chinese users with faster and more accessible model downloads
  • Seamless switching - Same CLI commands work with both registries without code changes
  • Unified model management - Models from both registries appear in swama list with proper identification

🚀 Usage

Tool Calling

# Tool calling via HTTP API (OpenAI compatible)
curl -X POST http://localhost:28100/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3",
    "messages": [{"role": "user", "content": "What is the weather in Tokyo?"}],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get current weather",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {"type": "string", "description": "City name"}
            },
            "required": ["location"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  }'
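
On the client side, the model's reply carries a `tool_calls` array instead of plain text when it decides to call a function. The sketch below parses such a reply; the response body shown is a hypothetical example in the standard OpenAI response shape, not captured Swama output.

```python
import json

# Hypothetical response body, shaped like the OpenAI chat completions
# schema that Swama's /v1/chat/completions endpoint emulates.
response = {
    "choices": [{
        "message": {
            "role": "assistant",
            "content": None,
            "tool_calls": [{
                "id": "call_0",
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "arguments": "{\"location\": \"Tokyo\"}",
                },
            }],
        },
        "finish_reason": "tool_calls",
    }]
}

message = response["choices"][0]["message"]
for call in message.get("tool_calls") or []:
    name = call["function"]["name"]
    # Arguments arrive as a JSON-encoded string and must be parsed.
    args = json.loads(call["function"]["arguments"])
    print(name, args["location"])  # get_weather Tokyo
```

After running the function locally, the client would append a `{"role": "tool", ...}` message with the result and send the conversation back for a final answer.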

Gemma3 Multimodal Inference

# Vision-language model with image input
swama run gemma3 "What's in this image?" -i /path/to/image.jpg

ModelScope Registry

# Use ModelScope registry for Chinese users
export SWAMA_REGISTRY=MODEL_SCOPE
swama pull qwen3

# Or use default Hugging Face registry
export SWAMA_REGISTRY=HUGGING_FACE
swama pull qwen3

# Registry setting persists for the session
swama list  # Shows models from configured registry

📦 Download

Download Swama v1.4.0

Available formats:

  • Homebrew - brew install swama
  • DMG installer - Easy drag-and-drop installation for macOS
  • ZIP archive - Direct application bundle

🔄 Upgrade Notes

  • If upgrading from a previous version: After installing the new version, open Swama from the menu bar and click "Install Command Line Tool…" to update the CLI tools
  • Tool calling support: The /v1/chat/completions endpoint now supports OpenAI-compatible tool calling
  • Gemma3 model: New vision-language model alias available immediately after upgrade
  • ModelScope support: Set SWAMA_REGISTRY=MODEL_SCOPE environment variable to use ModelScope registry
  • Server architecture: CLI now uses server-first approach for better performance (existing workflows continue to work)

🔧 Requirements

  • macOS 14.0+
  • Apple Silicon (M1/M2/M3/M4)
  • For vision models: Compatible image formats (JPEG, PNG, etc.)
  • For ModelScope: Internet connection to ModelScope platform

Full Changelog: v1.3.0...v1.4.0

v1.3.0

19 Jun 09:06
da109de

Swama v1.3.0 Release Notes

🆕 What's New

OpenAI-Compatible Audio API

  • /v1/audio/transcriptions endpoint - Full OpenAI API compatibility for seamless integration
  • Multipart form data support - Proper file upload handling with audio format validation
  • Multiple response formats - JSON, text, and verbose JSON output options
  • Robust error handling - Comprehensive error responses with proper HTTP status codes

Enhanced CLI with Audio Commands

  • New transcribe command - Comprehensive audio transcription with customizable options
  • Enhanced pull command - Unified downloading for both MLX and WhisperKit models
  • Rich CLI options - Support for model selection, language, temperature, prompts, and output formats
  • Intelligent model validation - Automatic detection and validation of WhisperKit models

🚀 Usage

Audio Transcription

# Basic transcription
swama transcribe audio.wav

# Specify model and language
swama transcribe audio.wav -m whisper-base -l en

# Get detailed output with timestamps
swama transcribe audio.wav --verbose

# JSON output for programmatic use
swama transcribe audio.wav -f json

# Fine-tune with temperature and prompt
swama transcribe audio.wav -t 0.2 -p "Technical discussion about AI"

WhisperKit Model Management

# Download WhisperKit models
swama pull whisper-tiny
swama pull whisper-base 
swama pull whisper-small
swama pull whisper-large

# List all available models (includes WhisperKit)
swama list

Audio API Integration

# Transcribe via HTTP API (OpenAI compatible)
curl -X POST http://localhost:28100/v1/audio/transcriptions \
  -F "file=@meeting.wav" \
  -F "model=whisper-large" \
  -F "language=en" \
  -F "response_format=verbose_json"

# Simple text response
curl -X POST http://localhost:28100/v1/audio/transcriptions \
  -F "file=@audio.wav" \
  -F "model=whisper-large" \
  -F "response_format=text"
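
For clients that cannot shell out to curl, the same upload can be assembled by hand. This sketch builds a multipart/form-data body equivalent to the `-F` flags above using only the Python standard library; the helper name and the placeholder audio bytes are illustrative.

```python
import io
import uuid

def multipart_body(fields: dict, filename: str, file_bytes: bytes):
    """Build a multipart/form-data body like the curl -F calls above."""
    boundary = uuid.uuid4().hex
    buf = io.BytesIO()
    # Plain text fields (model, language, response_format, ...).
    for name, value in fields.items():
        buf.write(f"--{boundary}\r\nContent-Disposition: form-data; "
                  f"name=\"{name}\"\r\n\r\n{value}\r\n".encode())
    # The audio file part.
    buf.write(f"--{boundary}\r\nContent-Disposition: form-data; "
              f"name=\"file\"; filename=\"{filename}\"\r\n"
              f"Content-Type: audio/wav\r\n\r\n".encode())
    buf.write(file_bytes)
    buf.write(f"\r\n--{boundary}--\r\n".encode())
    return buf.getvalue(), f"multipart/form-data; boundary={boundary}"

body, content_type = multipart_body(
    {"model": "whisper-large", "language": "en", "response_format": "text"},
    "audio.wav",
    b"\x00" * 16,  # placeholder bytes, not a real WAV file
)
```

The resulting `body` and `content_type` can then be sent with any HTTP client as a POST to http://localhost:28100/v1/audio/transcriptions.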

📦 Download

Download Swama v1.3.0

Available formats:

  • DMG installer - Easy drag-and-drop installation for macOS
  • ZIP archive - Direct application bundle

🔄 Upgrade Notes

  • If upgrading from a previous version: After installing the new version, open Swama from the menu bar and click "Install Command Line Tool…" to update the CLI tools
  • New audio capabilities: The transcribe command and audio API are immediately available after upgrade
  • Model storage: WhisperKit models are stored in ~/.swama/models/whisperkit to keep them organized separately from MLX models
  • API compatibility: Existing /v1/chat/completions and /v1/embeddings endpoints continue to work unchanged

🔧 Requirements

  • macOS 14.0+
  • Apple Silicon (M1/M2/M3/M4)
  • For audio transcription: Compatible audio formats (WAV recommended, other formats auto-converted)

🎯 Key Benefits

  • Privacy-focused: All audio processing happens locally on your device
  • Cost-effective: No API calls to external services for transcription
  • High performance: Optimized for Apple Silicon with intelligent memory management
  • Developer-friendly: OpenAI-compatible API for easy integration into existing workflows

Full Changelog: v1.2.0...v1.3.0

v1.2.0

12 Jun 09:45
fdb4622

Swama v1.2.0 Release Notes

🆕 What's New

Enhanced Model Path Management

  • New centralized model storage - Models now downloaded to ~/.swama/models by default
  • Backward compatibility maintained - Existing models in ~/Documents/huggingface/models still work
  • Intelligent model discovery - Automatically finds models across multiple storage locations
  • Improved organization - Better structure for different model types

Expanded Vision-Language (VL) Model Support

  • Comprehensive VL model detection - Smart pattern-based recognition (-VL-, vision, Visual, etc.)
  • Enhanced model registry - Improved caching and lookup performance for VL models
  • Dual-path loading - Support for both registry-based and locally stored VL models
  • Better offline capabilities - Enhanced model discovery for air-gapped environments
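
The pattern-based detection mentioned above can be sketched in a few lines. This is an illustrative guess at the approach, using only the markers named in these notes (-VL-, vision, Visual); the real marker list and helper name may differ.

```python
# Illustrative sketch of pattern-based VL model detection.
# Markers taken from the examples in these release notes.
VL_MARKERS = ("-vl-", "vision", "visual")

def looks_like_vlm(model_id: str) -> bool:
    lowered = model_id.lower()
    return any(marker in lowered for marker in VL_MARKERS)

print(looks_like_vlm("mlx-community/Qwen2-VL-7B-Instruct-4bit"))   # True
print(looks_like_vlm("mlx-community/Llama-3.1-8B-Instruct-4bit"))  # False
```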

Robust Offline Model Support

  • Local-first workflows - Load models from local directories without network dependency
  • Multi-location discovery - Intelligent detection across preferred and legacy paths
  • Air-gapped compatibility - Full offline model resolution for network-constrained environments
  • Improved local model configuration - Streamlined process for locally stored models

Performance & Memory Optimizations

  • Smart model type caching - Avoid repeated registry lookups with intelligent caching
  • Lazy-loaded VLM registry - Better startup performance with on-demand loading
  • Optimized filesystem operations - Reduced overhead for offline model discovery
  • Enhanced concurrent loading - Type-aware caching for parallel model processing
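
The model-type caching idea can be sketched as a memoized lookup: pay the registry cost once per model, then answer repeated checks from cache. Everything here is illustrative; `registry_lookup` stands in for whatever (unspecified) registry query Swama actually performs.

```python
from functools import lru_cache

def registry_lookup(model_id: str) -> str:
    # Stand-in for an expensive registry query; the real logic differs.
    return "vlm" if "vl" in model_id.lower() else "llm"

@lru_cache(maxsize=None)
def model_type(model_id: str) -> str:
    # Repeated calls for the same model_id hit the cache instead of
    # re-running the registry lookup.
    return registry_lookup(model_id)

print(model_type("qwen2-vl"))  # vlm
print(model_type("qwen3"))     # llm
```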

🚀 Usage

Load Local Models (Offline)

# Swama automatically discovers models in:
# ~/.swama/models/
# ~/Documents/huggingface/models/
swama list  # Shows all discovered models
swama run your-local-model  # Works without internet

Improved Model Organization

# Models are now organized in the new preferred location
ls ~/.swama/models/
# But legacy models still work
ls ~/Documents/huggingface/models/
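
The multi-location discovery described above amounts to checking the preferred directory before falling back to the legacy one. A minimal sketch, assuming only the two directories named in these notes (the function name is hypothetical):

```python
from pathlib import Path

# Search order mirrors the discovery described in these notes:
# new preferred location first, then the legacy location.
SEARCH_PATHS = [
    Path.home() / ".swama" / "models",
    Path.home() / "Documents" / "huggingface" / "models",
]

def find_model(name: str, roots=SEARCH_PATHS):
    for root in roots:
        candidate = root / name
        if candidate.exists():
            return candidate
    return None  # not found locally; a real client might download it
```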

📦 Download

Download Swama v1.2.0

Available formats:

  • DMG installer - Easy drag-and-drop installation for macOS
  • ZIP archive - Direct application bundle

🔄 Upgrade Notes

  • If upgrading from a previous version: After installing the new version, open Swama from the menu bar and click "Install Command Line Tool…" to update the CLI tools
  • Model storage: New models will be downloaded to ~/.swama/models by default, but existing models in the legacy location continue to work seamlessly
  • VL model users: Enhanced support means better compatibility and performance for vision-language models

🔧 Requirements

  • macOS 14.0+
  • Apple Silicon (M1/M2/M3/M4)

What's Changed

  • Fix the model name in the script by @Rin-Li in #12
  • Enhanced Model Path Management and VL Model Support by @sxy-trans-n in #20

Full Changelog: v1.1.0...v1.2.0