Segment Anything Model 3 – Native Apple Silicon Implementation
A high-performance MLX port of Meta's SAM3 for interactive image segmentation on Mac
> 📖 New to SAM3? Check out the accompanying blog post where I explain the SAM3 architecture, how it works, and what makes it special: Understanding SAM3 →
- 🚀 Native Apple Silicon – Optimized for M1/M2/M3/M4 chips using MLX
- 📝 Text Prompts – Segment objects by describing them ("car", "person", "dog")
- 📦 Box Prompts – Draw bounding boxes to include or exclude regions
- 🎨 Interactive Studio – Beautiful web interface for real-time segmentation
- 🐍 Python API – Simple programmatic access for scripting and integration
- ⬇️ Auto Model Download – Weights automatically fetched from HuggingFace
*(Demo images: object detection with a "car" prompt; semantic segmentation with a "coat" prompt.)*
SAM3 Studio – Interactive segmentation with text and box prompts
| Requirement | Version | Notes |
|---|---|---|
| macOS | 13.0+ | Apple Silicon required (M1/M2/M3/M4) |
| Python | 3.13+ | Required for MLX compatibility |
| Node.js | 18+ | For the web interface |
| uv | Latest | Optional but recommended – Install uv |
⚠️ Apple Silicon Only: This project uses MLX, Apple's machine learning framework optimized exclusively for Apple Silicon.
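If you're unsure whether your machine qualifies, a quick check is possible from Python. This is a minimal sketch: it only inspects the CPU architecture and OS, not the macOS version or whether MLX is installed.

```python
import platform

def is_apple_silicon() -> bool:
    """True on a Mac running natively on an Apple Silicon (arm64) CPU."""
    return platform.system() == "Darwin" and platform.machine() == "arm64"

if __name__ == "__main__":
    print("Apple Silicon:", is_apple_silicon())
```

Note that a Python interpreter running under Rosetta 2 may report `x86_64` even on an Apple Silicon Mac.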
If you have uv installed:
# Clone the repository
git clone https://github.com/Deekshith-Dade/mlx-sam3.git
cd mlx-sam3
# Install project dependencies
uv sync
# Launch the app (backend + frontend)
cd app && ./run.sh

The first run will automatically download MLX weights from mlx-community/sam3-image (~3.5 GB).
Access the app:
- 🌐 Frontend: http://localhost:3000
- 🔌 API: http://localhost:8000
- 📚 API Docs: http://localhost:8000/docs
Press Ctrl+C to stop all servers.
Click to expand manual setup instructions
# Clone the repository
git clone https://github.com/your-username/mlx-sam3.git
cd mlx-sam3
# Create and activate virtual environment
python3 -m venv .venv
source .venv/bin/activate
# Install the package
pip install -e .

# Start the backend
cd app/backend
pip install -r requirements.txt
python main.py

The backend will start on http://localhost:8000.
cd app/frontend
npm install
npm run dev

The frontend will start on http://localhost:3000.
Use SAM3 directly in your Python scripts:
from PIL import Image
from sam3 import build_sam3_image_model
from sam3.model.sam3_image_processor import Sam3Processor
# Load model (auto-downloads MLX weights from mlx-community/sam3-image)
model = build_sam3_image_model()
processor = Sam3Processor(model, confidence_threshold=0.5)
# Load and process an image
image = Image.open("your_image.jpg")
state = processor.set_image(image)
# Segment with text prompt
state = processor.set_text_prompt("person", state)
# Access results
masks = state["masks"] # Binary segmentation masks
boxes = state["boxes"] # Bounding boxes [x0, y0, x1, y1]
scores = state["scores"] # Confidence scores
print(f"Found {len(scores)} objects")

# Add a box prompt (normalized coordinates: center_x, center_y, width, height)
# label=True for inclusion, label=False for exclusion
state = processor.add_geometric_prompt(
box=[0.5, 0.5, 0.3, 0.3], # Center of image, 30% width/height
label=True,
state=state
)

# Clear all prompts while keeping the image
processor.reset_all_prompts(state)
# Try a different prompt
state = processor.set_text_prompt("car", state)

mlx-sam3/
├── sam3/                     # Core MLX SAM3 implementation
│   ├── model/                # Model components
│   │   ├── sam3_image.py     # Main model architecture
│   │   ├── vitdet.py         # Vision Transformer backbone
│   │   ├── text_encoder_ve.py  # Text encoder
│   │   └── ...
│   ├── model_builder.py      # Model construction utilities
│   ├── convert.py            # Weight conversion from PyTorch
│   └── utils.py              # Helper utilities
├── app/                      # Web application
│   ├── backend/              # FastAPI server
│   ├── frontend/             # Next.js React app
│   └── run.sh                # One-command launcher
├── assets/                   # Static assets & test images
├── examples/                 # Jupyter notebook examples
└── pyproject.toml            # Project configuration
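The masks returned by the Python API above are easiest to sanity-check visually. Below is a minimal sketch for tinting masked pixels on the source image; it assumes each entry of `state["masks"]` can be converted to a boolean NumPy array matching the image's height and width (check the actual shape and dtype `Sam3Processor` returns before relying on this).

```python
import numpy as np
from PIL import Image

def overlay_mask(image: Image.Image, mask: np.ndarray,
                 color=(255, 0, 0), alpha=0.5) -> Image.Image:
    """Blend a translucent color over the pixels where mask is True."""
    base = np.asarray(image.convert("RGB")).astype(np.float32)
    tint = np.array(color, dtype=np.float32)
    blended = base.copy()
    # Alpha-blend only the masked pixels; the rest stay untouched.
    blended[mask] = (1 - alpha) * base[mask] + alpha * tint
    return Image.fromarray(blended.astype(np.uint8))
```

Under that assumption, something like `overlay_mask(image, np.asarray(masks[0], dtype=bool))` would highlight the first detected object in red.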
| Endpoint | Method | Description |
|---|---|---|
| `/health` | GET | Check if the model is loaded and ready |
| `/upload` | POST | Upload an image and create a session |
| `/segment/text` | POST | Segment using a text prompt |
| `/segment/box` | POST | Add a box prompt (include/exclude) |
| `/reset` | POST | Clear all prompts for a session |
| `/session/{id}` | DELETE | Delete a session and free memory |
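The same endpoints can be driven from Python. Here is a standard-library sketch for the text-prompt route; the URL and JSON field names mirror the curl example below, but this is not an official client — check `app/backend` for the authoritative request schemas.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"

def text_prompt_payload(session_id: str, prompt: str) -> dict:
    """Build the JSON body for POST /segment/text."""
    return {"session_id": session_id, "prompt": prompt}

def segment_text(session_id: str, prompt: str) -> dict:
    """POST a text prompt for an existing session and decode the JSON response."""
    body = json.dumps(text_prompt_payload(session_id, prompt)).encode()
    req = urllib.request.Request(
        f"{BASE_URL}/segment/text",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

With the backend running, `segment_text("abc-123", "car")` would correspond to the second curl call below.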
# Upload an image
curl -X POST "http://localhost:8000/upload" \
-F "file=@your_image.jpg"
# Response: {"session_id": "abc-123", "width": 1920, "height": 1080, ...}
# Segment with text
curl -X POST "http://localhost:8000/segment/text" \
-H "Content-Type: application/json" \
  -d '{"session_id": "abc-123", "prompt": "car"}'

Jupyter notebooks are available in the examples/ directory:
- `sam3_image_predictor_example.ipynb` – Basic image segmentation
- `sam3_image_interactive.ipynb` – Interactive prompting workflows
Run them with:
cd examples
jupyter notebook

| Component | Technology |
|---|---|
| ML Framework | MLX |
| Backend | FastAPI, Uvicorn |
| Frontend | Next.js 16, React 19, Tailwind CSS 4 |
| Model | SAM3 MLX |
Contributions are welcome! Please feel free to submit a Pull Request.
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
This project is licensed under the Apache 2.0 License β see the LICENSE file for details.
- Meta AI for the original SAM3 model
- MLX Team at Apple for the incredible ML framework
- The open-source community for continuous inspiration
Built with ❤️ for Apple Silicon

