Introducing Mochi 1 preview. A new SOTA in open-source video generation. Apache 2.0.
magnet:?xt=urn:btih:441da1af7a16bcaa4f556964f8028d7113d21cbb&dn=weights&tr=udp://tracker.opentrackr.org:1337/announce
What if video AI finally felt real?
For years, we’ve seen “AI video” that looked more like animated slideshows — stilted motion, broken physics, and prompts that miss the mark.
That just changed.
Mochi 1 from @genmoai is an open-source breakthrough in video generation — and it
Mochi from @genmoai is back with an updated model.
They called it a small update, but I think it's huge! ✨
The video quality is excellent, with natural, realistic movement.
The barista's milk pour is 99% accurate to real latte art, and the foam's movement, pushing the crema
Do more with Mochi 🎬
We've pushed a small update to Mochi, allowing your creativity to flourish with this tiny tune-up!
Post your new generations in the quotes or comments 👀 We love to see them!
SURPRISE! GENMO Mochi is back y'all!
A big improvement over their older model.
bright social media video, roller coaster car that says "MOCHI" plunging down a steep drop, then whipping around sharp turns, with two attractive adult female women in intricate outfits and their
fine-grained editing of videos is hard. if I use a Video Diffusion Transformer to make my videos, just adding "red" to the prompt totally changes the video. in our new paper, we dive deep into the attention maps of VDiTs and find a way to do fine-grained editing, and other stuff!
Text-to-video models are silent🔇, but does that mean they don't know music, beat, and tempo🎶?
I'm excited to present MusicInfuser🎹, an adapter network which aligns silent dancing videos to music.
Check out our paper, examples, code, and weights here: susunghong.github.io/MusicInfuser