Publications
Decoupling Complexity from Scale in Latent Diffusion Model
arXiv preprint arXiv
VFRTok: Variable Frame Rates Video Tokenizer with Duration-Proportional Information Assumption
Conference and Workshop on Neural Information Processing Systems (NeurIPS), 2025
VIVID-10M: A Dataset and Baseline for Versatile and Interactive Video Local Editing
arXiv preprint arXiv
QaVA: Query-aware Video Analysis Framework Based on Data Access Pattern
IEEE International Conference on Data Engineering (ICDE), 2025
TVM: A Tile-based Video Management Framework
International Conference on Very Large Data Bases (VLDB), 2024
Learning based Multi-modality Image and Video Compression
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
GTMS: A Gradient-driven Tree-guided Mask-free Referring Image Segmentation Method
European Conference on Computer Vision (ECCV), 2024
Preprocessing Enhanced Image Compression for Machine Vision
IEEE Transactions on Circuits and Systems for Video Technology
A Transformer based deep conditional video compression
Journal of Beijing University of Aeronautics and Astronautics (JBUAA)
