Authors: Jing Zhang, Zhikai Li✉, Xuewen Liu, Qingyi Gu✉
(✉ denotes corresponding author.)
This repository contains the official implementation for the ICLR 2026 paper "Efficient-SAM2: Accelerating SAM2 with Object-Aware Visual Encoding and Memory Retrieval".
SAM2's perception pattern exhibits computational redundancy: i) the mask decoder's focused attention versus the image encoder's broad attention span reveals unnecessary background computation; ii) in the memory bank, only a small subset of tokens contributes significantly to memory attention, and these salient regions exhibit temporal consistency.

For the image encoder, we introduce object-aware Sparse Window Routing (SWR), which assigns object-irrelevant background windows to a lightweight shortcut branch based on the spatial-temporal consistency and perceptual saliency of the object, thus reducing encoding redundancy. For memory attention, we propose object-aware Sparse Memory Retrieval (SMR), which builds a FIFO mask queue to retrieve the most salient memory tokens, reusing the saliency patterns from their first recollection and thereby reducing computational cost.
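As a rough illustration of the window-routing idea above, here is a minimal sketch, not the paper's implementation: windows whose object saliency exceeds a threshold go through the full encoder branch, while background windows take a lightweight shortcut. The function name, `threshold` parameter, and branch callables are all hypothetical.

```python
import torch

def sparse_window_route(windows, saliency, heavy_branch, light_branch, threshold=0.5):
    # Hypothetical sketch of Sparse Window Routing (SWR):
    # route object-relevant windows through the full encoder branch and
    # object-irrelevant background windows through a cheap shortcut.
    keep = saliency > threshold  # (num_windows,) boolean routing mask
    out = torch.empty_like(windows)
    if keep.any():
        out[keep] = heavy_branch(windows[keep])    # full attention encoding
    if (~keep).any():
        out[~keep] = light_branch(windows[~keep])  # lightweight shortcut branch
    return out
```

With a high background ratio, most windows skip the heavy branch, which is where the encoding savings come from.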

Efficient-SAM2 achieves a well-balanced accuracy–speed trade-off.
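The sparse memory retrieval described above can likewise be sketched in a few lines. This is a hedged toy version, assuming a per-frame saliency mask kept in a FIFO queue and reused after its first recollection; the class name, `queue_len`, and `drop_ratio` default are illustrative, not the repository's API.

```python
import torch
from collections import deque

class SparseMemoryRetrieval:
    """Toy sketch of Sparse Memory Retrieval (SMR): keep a FIFO queue of
    per-frame saliency masks and prune memory tokens to the top-k most
    attended ones, reusing each mask after its first recollection."""

    def __init__(self, queue_len=7, drop_ratio=0.95):
        self.mask_queue = deque(maxlen=queue_len)  # FIFO eviction of old masks
        self.drop_ratio = drop_ratio

    def compute_mask(self, attn_scores):
        # attn_scores: (num_tokens,) aggregated memory-attention weights.
        # Keep only the (1 - drop_ratio) fraction with the highest scores.
        n = attn_scores.numel()
        k = max(1, int(n * (1.0 - self.drop_ratio)))
        mask = torch.zeros(n, dtype=torch.bool)
        mask[torch.topk(attn_scores, k).indices] = True
        return mask

    def retrieve(self, memory_tokens, attn_scores=None, reuse_idx=None):
        # Reuse the saliency mask recorded at the tokens' first recollection
        # when available; otherwise compute a fresh mask and enqueue it.
        if reuse_idx is not None and reuse_idx < len(self.mask_queue):
            mask = self.mask_queue[reuse_idx]
        else:
            mask = self.compute_mask(attn_scores)
            self.mask_queue.append(mask)
        return memory_tokens[mask]
```

A `drop_ratio` of 0.95 (as in the inference command below) would keep only 5% of memory tokens, so memory attention operates on a far smaller key/value set.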

The code requires python>=3.10, as well as torch>=2.5.1 and torchvision>=0.20.1. Please follow the instructions here to install the PyTorch and TorchVision dependencies. You can install Efficient-SAM2 on a GPU machine using:
git clone https://github.com/jingjing0419/Efficient-SAM2.git
cd Efficient-SAM2
pip install -e .

To use the SAM 2 predictor and run the example notebooks, jupyter and matplotlib are required and can be installed by:
pip install -e ".[notebooks]"

All the model checkpoints can be downloaded by running:
cd checkpoints && \
./download_ckpts.sh && \
cd ..

The checkpoints can also be downloaded individually from the links in the repository.

To train the bypass branch with tools/train_bypass_all.py, run:
python tools/train_bypass_all.py \
--apply_bypass \
--apply_WB \
--use_wandb \
--train_epoch=5 \
--train_step=32 \
--lr=1e-4 \
--base_video_dir=<PATH-TO-TRAINING-IMAGES> \
--input_mask_dir=<PATH-TO-TRAINING-ANNOTATION> \
--video_list_file=./train_sel_v1.txt \
--output_mask_dir=./outputs/SAV_train/sav_train_pred_pngs \
--dataset='sav_train' \
--sam2_model='base+' \
--bypass_type='bottleneck'

The vos_inference_main.py script can be used to generate predictions for semi-supervised video object segmentation (VOS) evaluation on datasets such as DAVIS, MOSE, or SA-V.
After installing Efficient-SAM2 and its dependencies, the script can be run as follows (the example below uses the SA-V test set). It saves the prediction PNG files to the --output_mask_dir.
Run Efficient-SAM2 inference:
python tools/vos_inference_main.py \
--sam2_model='base+' --Mem_stride=1 --dataset='SAV_test' \
--apply_bypass --apply_WB --dilate_mask --WB_theta=0.7 \
--bypass_ckpt_base='./bypass/ckpt/bypass_bottleneck_base.pth' \
--prune_memory --topk_mask --set_drop_ratio=0.95 \
--output_mask_dir='./outputs2/'

Run SA-V evaluation:
python sav_evaluator.py \
--gt_root <PATH-TO-SAV-TEST/VAL-DATASET-GROUNDTRUTH> \
--pred_root <PATH-TO-MODEL-OUTPUT>

Star this repository if you find it helpful!