MBZ (@babaeizadeh) / X

1,368 posts

MBZ

@babaeizadeh

Senior Staff Research Scientist at @GoogleDeepMind. Gemini Omni, Veo3, Veo2, Veo, Phenaki.

Mountain View, CA

Joined June 2009

Pinned
MBZ
@babaeizadeh
May 20
How many edits is enough? 😅 #Gemini #Omni Flash. See thread for editing steps.
Made with AI
810
MBZ
@babaeizadeh
May 20, 2025
#Veo3 further blurs the lines between reality and imagination with audio, stronger text adherence, and richer visual details.
740K
MBZ
@babaeizadeh
Dec 16, 2024
How good is #Veo2? Let's look at some samples. "a sitcom tv show about potatoes" #Veo
253K
MBZ
@babaeizadeh
Dec 18, 2024
#Veo2 excels at generating videos that feel remarkably "real". Dynamic living backgrounds, fluid motion, and finely rendered details in faces, hands, and bodies together create truly natural-looking videos. Here are some videos with "mundane life" as the prompt. No cherry-picking. #veo
43K
MBZ
@babaeizadeh
Dec 10, 2020
Is predicting future rewards sufficient for achieving success in visual model-based reinforcement learning? We experimentally demonstrate that this is usually *not* the case in the online setting, and that the key is to predict future images too. 1/5
MBZ
@babaeizadeh
Dec 17, 2024
Replying to @babaeizadeh
Many people asked for anime, but this is a potato thread, so here we go: "anime style footage of two potatoes having a sword fight. cinematic, fastpaced with a lot of shotcuts"
54K
MBZ
@babaeizadeh
Oct 5, 2022
It's hard to promote your work when your alma mater is under siege, but here we go. Introducing Phenaki, a model that can generate minutes of videos given a story. Hopefully, it will be used for some good somewhere.
Dumitru Erhan
@doomie
Oct 5, 2022
1/ Today we are excited to introduce Phenaki: phenaki.github.io, short-link-to-paper, a model for generating videos from text, with prompts that can change over time, and that is able to generate videos that can be as long as multiple minutes!
MBZ
@babaeizadeh
Dec 16, 2024
Replying to @babaeizadeh
"Cinematic fast paced shot of a muscle sport car drifting around a corner. It's evening. The headlights of the car is cutting through the heavy fog. The license plate is "Veo". The car is moving so fast the the pedestrians are blurry. The driver is a potato."
8.4K
MBZ
@babaeizadeh
Dec 17, 2024
Replying to @babaeizadeh
Since chippings are viral now! 😅
10K
MBZ
@babaeizadeh
Feb 3, 2021
Blog post on our latest experiments on visual model-based reinforcement learning (arxiv.org/abs/2012.04603). One of the most stable and flexible libraries that I have ever worked on: github.com/google-researc… With @msaffar3 @danijarh @harinidkannan @chelseabfinn @svlevine @doomie
Google AI
@GoogleAI
Feb 3, 2021
Introducing the World Models Library, an open-source, platform-agnostic suite of tasks and tools for examination of world model design and performance in visual model-based reinforcement learning. Learn more and grab the code at goo.gle/36EXY1t
MBZ
@babaeizadeh
Dec 16, 2024
Replying to @babaeizadeh
"a high energy music video. the singer is a potato and the dancers are other vegetables."
10K
MBZ
@babaeizadeh
Dec 17, 2024
Replying to @babaeizadeh
"a documentary about the famous potato shaped casino being built in Las Vegas blvd"
3.9K
MBZ
@babaeizadeh
Jun 25, 2021
Introducing FitVid, a variational video prediction model that is capable of severe overfitting on the common video prediction benchmarks, while having a parameter count similar to that of the current SOTA models. With @msaffar3 @SurajNair_1 @svlevine @chelseabfinn @doomie
MBZ
@babaeizadeh
Dec 16, 2024
Replying to @babaeizadeh
"a training montage of a potato training hard for Olympics."
9.4K