Xiaolong Wang
@xiaolonw
Research Director, @Meta Superintelligence Labs; Co-founder of ARI; Associate Professor @UCSDJacobs; Postdoc @berkeley_ai; PhD @CMU_Robotics
San Diego, CA
Joined March 2016
Posts
  • Pinned
    Excited to share that Assured Robot Intelligence (ARI) has joined @Meta to help build the future of humanoid intelligence! When we started ARI one year ago, our mission was clear: achieve physical AGI. Through deep customer engagements and real-world deployments, it became clear
    Meta Platforms Inc. has acquired Assured Robot Intelligence, a startup developing artificial intelligence models for robots, as part of a major initiative to build humanoid technology. bloomberg.com/news/articles/…
  • Cannot believe this finally happened! Over the last 1.5 years, we have been developing a new LLM architecture, with linear complexity and expressive hidden states, for long-context modeling. The following plots show our model trained from Books scale better (from 125M to 1.3B)
  • Let’s think about humanoid robots outside carrying the box. How about having the humanoid come out the door, interact with humans, and even dance? Introducing Expressive Whole-Body Control for Humanoid Robots: expressive-humanoid.github.io See how our robot performs rich, diverse,
  • Test-Time Training (TTT) is now on Video! And not just a 5-second video. We can generate a full 1-min video! TTT module is an RNN module that provides an explicit and efficient memory mechanism. It models the hidden state of an RNN with a machine learning model, which is updated
  • Got my tenure! Very grateful to my students and collaborators.
  • Stable Diffusion generates beautiful images, but can it be used for open-world recognition? Try Demo! huggingface.co/spaces/xvjiaru… Our #CVPR2023 paper shows that the pre-trained diffusion model indeed is a good image parser, allows for open-vocabulary segmentation and detection.
  • Is 3D scene generation much closer to being solved all of a sudden? It has been a few days since the release of @OpenAI Sora. We run our COLMAP-Free 3D Gaussian Splatting on the released videos. Our method does not need to pre-process cameras and it seems we can directly just get
  • I have been cleaning my daughter's mess for more than two years now. Last weekend our robot came home to do the job for me. 🤖 wholebody-b1.github.io Our new work on visual whole-body control learns a policy to coordinate the robot legs and arms for mobile manipulation. See
  • Introducing #CVPR2022 GroupViT: Semantic Segmentation Emerges from Text Supervision 👨‍👩‍👧 jerryxu.net/GroupViT/ Without any pixel label ever, our Grouping ViT can group pixels bottom-up to open vocabulary semantic segments. The only training data is 30M noisy image-text pairs.
  • Very sad to see this. Just reminds me some random dude in the bay drove by and shouted to me “go back to your own country”. Racism real. Very sad to see the language in NeurIPS announcement and the actions of different organizations after are more like trying to tune this whole
    Mitigating racial bias from LLMs is a lot easier than removing it from humans! Can’t believe this happened at the best AI conference @NeurIPSConf We have ethical reviews for authors, but missed it for invited speakers? 😡
  • Our humanoid dancing outside #GTC2024
  • New work with Yinbo Chen, one of my first PhD students: Learning Continuous Image Representation with Local Implicit Image Function. Check our video showing images in arbitrary resolutions. proj: yinboc.github.io/liif/ code: github.com/yinboc/liif @YinboChen @SifeiL (1/n)
  • Tesla Optimus can arrange batteries in their factories, ours can do skincare (on @QinYuzhe)! We open-source Bunny-VisionPro, a teleoperation system for bimanual hand manipulation. The users can control the robot hands in real time using VisionPro, flexible like a bunny. 🐇
  • Introducing Category-Level 6D Object Pose Estimation in the Wild. 🏞️ We release Wild6D: an in-the-wild object-centric RGBD dataset with 5000+ videos over 1700+ objects. We perform semi-supervised 6D object pose estimation on it without manual annotations. oasisyang.github.io/semi-pose/