
[diffusion] Postprocess: implement frame interpolation using RIFE #19384

Merged
mickqian merged 14 commits into sgl-project:main from yyy1000:frame-interpolation
Feb 28, 2026

Conversation

@yyy1000
Contributor

@yyy1000 yyy1000 commented Feb 26, 2026

Motivation

Part of #18327: implement frame interpolation using RIFE v4.22 lite.
Users can run it with:
sglang generate --model-path Wan-AI/Wan2.2-T2V-A14B-Diffusers --prompt "a dog running through a park" --num-frames 81 --enable-frame-interpolation --frame-interpolation-exp 1

Modifications

Add a new module under the runtime directory, port the RIFE implementation, and run it after generation.
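
For reference, the --frame-interpolation-exp flag follows the usual RIFE convention: each pass inserts a model-predicted midpoint frame between every adjacent pair, so exp passes turn N frames into (N - 1) * 2**exp + 1. Below is a minimal sketch of that loop; interpolate_pair is a stand-in for the actual RIFE v4.22 lite forward pass, and the ported module is of course more involved than this.

```python
from typing import Callable, List, Sequence

import torch


def interpolate_frames(
    frames: Sequence[torch.Tensor],
    interpolate_pair: Callable[[torch.Tensor, torch.Tensor], torch.Tensor],
    exp: int = 1,
) -> List[torch.Tensor]:
    """Each pass inserts a predicted midpoint between adjacent frames: N -> 2N - 1."""
    out = list(frames)
    for _ in range(exp):
        doubled: List[torch.Tensor] = []
        for f0, f1 in zip(out[:-1], out[1:]):
            doubled.append(f0)
            doubled.append(interpolate_pair(f0, f1))  # stand-in for RIFE's t=0.5 prediction
        doubled.append(out[-1])  # keep the final frame, no trailing duplicate
        out = doubled
    return out
```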

Accuracy Tests

root@a20053551888:/data/junhao/sglang# ffprobe -v error -select_streams v:0 -show_entries stream=r_frame_rate,nb_frames,duration -of default=noprint_wrappers=1 outputs/a_dog_running_through_a_park_20260226-015722_1cfb09dc.mp4
r_frame_rate=16/1
duration=5.062500
nb_frames=81
root@a20053551888:/data/junhao/sglang# ffprobe -v error -select_streams v:0 -show_entries stream=r_frame_rate,nb_frames,duration -of default=noprint_wrappers=1 outputs/a_dog_running_through_a_park_20260226-013402_27b16648.mp4 
r_frame_rate=32/1
duration=5.031250
nb_frames=161
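
The two probes are consistent with one interpolation pass at exp=1: the frame count goes from 81 to 161 (a midpoint between every adjacent pair, no trailing duplicate), the frame rate doubles from 16 to 32 fps, and the duration stays essentially unchanged. A quick sanity check of that arithmetic, assuming the usual exp semantics:

```python
def expected_frames(n_frames: int, exp: int) -> int:
    # each interpolation pass inserts one frame between every adjacent pair
    return (n_frames - 1) * 2 ** exp + 1

assert expected_frames(81, exp=1) == 161  # matches nb_frames in the second probe
assert 81 / 16 == 5.0625                  # original: 81 frames at 16 fps
assert 161 / 32 == 5.03125                # interpolated: 161 frames at 32 fps
```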

Before frame interpolation

before-frame-interpolation.mp4

After frame interpolation

after-frame-interploation.mp4

Benchmarking and Profiling

N/A

Checklist

Review Process

  1. Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
  4. After green CI and required approvals, ask Merge Oncalls to merge.

@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@github-actions github-actions Bot added the diffusion SGLang Diffusion label Feb 26, 2026
Comment thread python/sglang/multimodal_gen/runtime/entrypoints/utils.py Outdated
Comment thread python/sglang/multimodal_gen/runtime/frame_interpolation/rife_interpolator.py Outdated
Comment thread python/sglang/multimodal_gen/runtime/frame_interpolation/rife_interpolator.py Outdated
Comment thread python/sglang/multimodal_gen/runtime/frame_interpolation/rife_interpolator.py Outdated
Comment thread python/sglang/multimodal_gen/runtime/frame_interpolation/rife_interpolator.py Outdated
Comment thread python/sglang/multimodal_gen/runtime/entrypoints/http_server.py
@Prozac614
Contributor

You need to add a test to the CI

@Prozac614
Contributor

Could you add a side-by-side video to this PR?

Collaborator

@mickqian mickqian left a comment


Well done! A few questions though:

  1. Is this GPU-intensive? If so, would it be faster as a separate PipelineStage?
  2. If it's CPU-intensive, will it increase CPU overhead dramatically, thus degrading throughput?

Comment thread python/sglang/multimodal_gen/runtime/entrypoints/utils.py Outdated
Comment thread python/sglang/multimodal_gen/runtime/postprocess/__init__.py
@yyy1000
Contributor Author

yyy1000 commented Feb 26, 2026

Thank you so much for the review! @Prozac614 I've addressed your comments.

For this one:

You need to add a test to the CI

Do we need to add CI for now, or shall we add it after this feature is stable?

@yyy1000
Contributor Author

yyy1000 commented Feb 26, 2026

Thank you @mickqian for the review!

For this comment:

Well done! A few questions though:

  1. Is this GPU-intensive? If so, would it be faster as a separate PipelineStage?
  2. If it's CPU-intensive, will it increase CPU overhead dramatically, thus degrading throughput?

I think it's GPU-intensive. It could probably be faster as a separate PipelineStage, but compared to the whole video generation the frame-interpolation step is very fast (roughly 5 s for interpolation vs. around 1,000 s for the DenoisingStage), so the benefit is not too large in my opinion.

@Prozac614
Contributor

Thank you so much for the review! @Prozac614 I've addressed your comments.

For this one:

You need to add a test to the CI

Do we need to add CI for now, or shall we add it after this feature is stable?

Add CI before we merge this PR

Comment thread python/sglang/multimodal_gen/runtime/postprocess/rife_interpolator.py Outdated
Comment thread python/sglang/multimodal_gen/runtime/postprocess/rife_interpolator.py Outdated
Comment thread python/sglang/multimodal_gen/configs/sample/sampling_params.py Outdated
Comment thread python/sglang/multimodal_gen/configs/sample/sampling_params.py Outdated
@github-actions github-actions Bot added the documentation Improvements or additions to documentation label Feb 26, 2026
@ping1jing2
Collaborator

/tag-and-rerun-ci

Comment thread docs/diffusion/api/cli.md Outdated
Comment thread python/sglang/multimodal_gen/configs/sample/sampling_params.py Outdated
Comment thread python/sglang/multimodal_gen/test/server/testcase_configs.py Outdated
Comment thread python/sglang/multimodal_gen/test/server/testcase_configs.py Outdated
@A1c0r-Z

A1c0r-Z commented Feb 27, 2026

I noticed that the RIFE model is currently executed on the API server side. While RIFE is relatively lightweight and should be fine for now, if we plan to introduce heavier post-processing models in the future, would it be more appropriate to integrate them into the actual GPU worker backend?

Additionally, by keeping model execution on the API server, is there a potential risk of triggering OOM under extreme concurrency scenarios?

UPDATE: RIFE model execution on the API server only occurs when return_file_paths_only == False; should we integrate this into the GPU worker?

@yyy1000
Contributor Author

yyy1000 commented Feb 27, 2026

I noticed that the RIFE model is currently executed on the API server side. While RIFE is relatively lightweight and should be fine for now, if we plan to introduce heavier post-processing models in the future, would it be more appropriate to integrate them into the actual GPU worker backend?

Additionally, by keeping model execution on the API server, is there a potential risk of triggering OOM under extreme concurrency scenarios?

Thank you @A1c0r-Z for the review! I just checked: RIFE actually runs on the GPU-worker side in the default (and almost all practical) configurations, not on the API server.
The key is return_file_paths_only, which defaults to True. When enabled, the GPU worker calls save_outputs() → post_process_sample() (which includes RIFE interpolation) → saves the video to disk, then returns only the file path string over ZMQ. The API server simply reads the file and encodes it to base64 — it never touches RIFE. The API-server-side save_outputs() calls are fallback paths.
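
To make that split easier to follow, here is a rough sketch of the flow described above. The names post_process_sample and save_outputs come from this thread, but the signatures and control flow are illustrative assumptions, not the actual sglang code under python/sglang/multimodal_gen/runtime/:

```python
import base64
from pathlib import Path
from typing import Any

# Placeholder stubs named after the functions mentioned in this thread;
# the real implementations are not reproduced here.
def post_process_sample(frames: Any, params: Any) -> Any: ...  # RIFE interpolation happens here
def save_outputs(frames: Any, output_dir: Path) -> Path: ...   # encodes and writes the mp4


def worker_handle_request(frames: Any, params: Any, output_dir: Path) -> str:
    """GPU-worker side (default path, return_file_paths_only=True)."""
    frames = post_process_sample(frames, params)  # runs on the worker's GPU
    path = save_outputs(frames, output_dir)
    return str(path)                              # only the path string crosses ZMQ


def api_server_build_response(file_path: str) -> str:
    """API-server side: read the finished file and base64-encode it; no model execution."""
    return base64.b64encode(Path(file_path).read_bytes()).decode()
```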

@mickqian mickqian merged commit 53c767d into sgl-project:main Feb 28, 2026
64 of 66 checks passed
@mickqian
Collaborator

nicely done
