Torchpipe

torchpipe is an alternative to Triton Inference Server, offering similar functionality such as shared memory, Ensemble, and a BLS (Business Logic Scripting) mechanism.

For serving scenarios, TorchPipe is designed to support multi-instance deployment, pipeline parallelism, adaptive batching, GPU-accelerated operators, and reduced head-of-line (HOL) blocking. It acts as a bridge between lower-level acceleration libraries (e.g., TensorRT, OpenCV, CVCUDA) and RPC frameworks (e.g., Thrift). At its core, it is an engine that enables programmable scheduling.

Updates

  • [20260104] We switched to tvm_ffi to provide clearer C++-Python interaction.

Usage

Below are some usage examples; for more, check out the examples.

Initialize and Prepare Pipeline

from torchpipe import pipe
import torch

from torchvision.models.resnet import resnet101

# create some regular pytorch model...
model = resnet101(pretrained=True).eval().cuda()

# export the model to ONNX with a dynamic batch dimension
model_path = "./resnet101.onnx"
x = torch.ones((1, 3, 224, 224)).cuda()
torch.onnx.export(model, x, model_path, opset_version=17,
                  input_names=['input'], output_names=['output'],
                  dynamic_axes={'input': {0: 'batch_size'},
                                'output': {0: 'batch_size'}})

thread_safe_pipe = pipe({
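    # Two nodes: 'preprocessor' (GPU decode/resize/color conversion) and
    # 'model' (TensorRT inference); 'next' routes each request between them.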
    "preprocessor": {
        "backend": "S[DecodeTensor,ResizeTensor,CvtColorTensor,SyncTensor]",
        # "backend": "S[DecodeMat,ResizeMat,CvtColorMat,Mat2Tensor,SyncTensor]",
        'instance_num': 2,
        'color': 'rgb',
        'resize_h': '224',
        'resize_w': '224',
        'next': 'model',
    },
    "model": {
        "backend": "SyncTensor[TensorrtTensor]",
        "model": model_path,
        "model::cache": model_path.replace(".onnx", ".trt"),
        "max": '4',
        'batching_timeout': 4,  # ms, timeout for batching
        'instance_num': 2,
        'mean': "123.675, 116.28, 103.53",
        'std': "58.395, 57.120, 57.375",  # merged into trt
    }}
)

Execute

We can execute the returned thread_safe_pipe just like the original PyTorch model, but in a thread-safe manner.

data = {'data': open('/path/to/img.jpg', 'rb').read()}
thread_safe_pipe(data) # <-- this is thread-safe
result = data['result']
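
The pipe writes its output back into the input dict, so each thread should pass its own dict. A minimal sketch of concurrent calls, assuming the thread_safe_pipe built above and an example image at /path/to/img.jpg:

from concurrent.futures import ThreadPoolExecutor

def infer(img_bytes):
    inp = {'data': img_bytes}
    thread_safe_pipe(inp)   # safe to call from multiple threads
    return inp['result']

img_bytes = open('/path/to/img.jpg', 'rb').read()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(infer, [img_bytes] * 8))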

Setup

Note: compiling torchpipe depends on the TensorRT C++ API. Please follow the TensorRT Installation Guide. You may also try installing torchpipe inside one of the NGC PyTorch docker containers (e.g., nvcr.io/nvidia/pytorch:25.05-py3).

Installation

To install the torchpipe Python library, run the following.

Inside NGC Docker Containers

Tested on NGC PyTorch containers 25.05, 24.05, 23.05, and 22.12.

git clone https://github.com/torchpipe/torchpipe.git
cd torchpipe/

img_name=nvcr.io/nvidia/pytorch:25.05-py3

docker run --rm --gpus all -it --network host \
    -v $(pwd):/workspace/ --privileged \
    -w /workspace/ \
    $img_name \
    bash

# pip config set global.index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple

cd /workspace/plugins/torchpipe && python setup.py install --cv2

The same steps apply to other NGC docker containers.
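
After installation, a quick smoke test confirms the package imports (a minimal sketch; the __version__ attribute is assumed to be exposed):

import torchpipe
print(torchpipe.__version__)  # assumed attribute; prints the installed version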

How does it work?

See Basic Usage.

How to add (or override) a backend

WIP

Version Migration Notes

TorchPipe (v1, this version) is a collection of deep learning computation backends built on the Omniback library. Not all computation backends from TorchPipe (v0) have been ported to TorchPipe (v1) yet.