Sieve

Sieve · 2025-06-10T17:44:14.795Z

Transparency in how AI capabilities work and how they're evaluated helps developers build trust in using them. At Sieve, we're constantly developing internal evaluation systems that enable us to ship the highest quality AI video capabilities in the world. Tomorrow, we'll be sharing a behind-the-scenes look at how we evaluate one of our most nuanced solutions and the specific optimizations we made to score highly.

Software Development

San Francisco, CA 3,149 followers

Video datasets for frontier AI.


Discover all 29 employees

About us

Sieve is the only AI research lab exclusively focused on video data. Video already makes up 80% of internet traffic and has become the dominant medium driving creativity, communication, gaming, AR/VR, and robotics. Unlocking the ability to truly model video is the key to breakthroughs across all of these domains but progress has been bottlenecked by one thing: high-quality training data. That’s where Sieve comes in. We bring together exabyte-scale video infrastructure, novel video understanding techniques, and dozens of diverse data sources to create datasets that push the frontier of video modeling. This unique combination allows us to deliver data with unmatched precision, quality, and speed which has earned the trust of frontier AI labs, Fortune 100 companies, and fast-growing generative AI startups.

Website: https://sievedata.com/
Industry: Software Development
Company size: 11-50 employees
Headquarters: San Francisco, CA
Type: Privately Held
Founded: 2022

Locations

Primary

San Francisco, CA, US



Updates

Sieve

1w
We just hosted an amazing multimodal AI lunch at Y Combinator HQ, bringing together tons of multimodal AI researchers from many different frontier AI labs. We talked about everything from benchmarking creativity to robot policies to computer use agents. Thanks to Diana Hu, Audrey Pe, and Arthita Ghosh for making it a successful event. If you're a researcher or builder working in the multimodal space, drop us a DM or send us a message here: https://sieve.ai/contact. We'd love to chat and invite you to the next one!
Diana Hu
1w

we hosted a multimodal researcher lunch at YC with the Sieve team, who've been doing really good work in multimodal data. we got researchers from several frontier labs in a room together and the conversation ranged from robotics to world models to computer use agents. more of these coming soon!
Sieve

1mo
Sieve hosted a multimodal brunch at Copra SF last Sunday, bringing together a small group of researchers, engineers, and builders all deep in the multimodal space. Great people, great food, and some of the best back-and-forth we've had! Thanks to Arthita Ghosh, Auriel W., and Ishan Dhawan for organizing this, and thanks to everyone who joined us and helped make it such a huge success. If you're a researcher or builder working in the multimodal space, drop us a DM or send a message here: https://sieve.ai/contact. We'd love to chat and hope to see you at the next one!
Sieve

2mo
This Wednesday, we’re hosting our first Multimodal Dinner at Berkeley with Launchpad (UC Berkeley). If you’re building multimodal systems (research or production) or thinking deeply about multimodal data, come meet others working at the frontier and swap notes on what’s working, what’s broken, and what’s next. If you’re in the Bay Area and want to join, comment or DM for details/RSVP.
Sieve reposted this
Mokshith Voodarla
4mo
Sieve is headed to NeurIPS next week! If you're around, I'd love to meet and chat about video models and where they're headed in the next year. Our team spends all of our time thinking about 1) how to create, aggregate, or acquire as much useful raw video as possible and 2) how to build the best possible search and understanding systems on top of all that raw content to deliver the highest-quality video datasets in the world at petabyte-scale (we've processed hundreds of PBs in the last few months!). Shoot me a message or hit us up at neurips@sievedata.com if you want to meet! Ai Ishiguro Gaurang Bharti
Sieve reposted this
Mokshith Voodarla
9mo Edited
Simulation and human videos are an exciting data well for robot world models because real-world data is particularly difficult to collect, but Sergey Levine (UC Berkeley professor + co-founder of Physical Intelligence) just published an incredibly sobering take on the danger of venturing too deeply in this direction. Some researchers have gotten so excited about other data sources that they're treating them as a replacement for the real thing. But just as LLMs use lots of text data and VLMs use text-image pairs, VLA (vision-language-action) models in robotics need a lot of data of robots performing real-world tasks. Instead of treating simulation or human video (i.e. FPV videos posted online) as a complete replacement, we should treat it the same way we treat internet data in LLM and VLM pre-training: something less relevant to the ultimate goals of the model but still relevant enough to provide useful world knowledge. At Sieve, we're excited to be contributing to this problem area through our early work with robotics labs making use of human videos for VLA pre-training. If you're interested in learning more about Sergey's take or our human video offering, check out the links in comments.
Sieve reposted this
Mokshith Voodarla
9mo
The last two months have been insane. I randomly came into the office this morning and decided to record this video on why there has literally never been a more exciting time to join Sieve. We're working with leading research teams pushing the frontier of creative, robotics, VR, gaming, and so much more. It gets me giddy thinking about the fact that we get to work with such teams given how I got into a lot of this stuff doing robotics in high school. If you’re an engineer that wants to push the frontier of these industries and finds the technical challenges around internet-scale video processing interesting, please reach out! You can learn more about our work through the document linked in comments and check out our open roles.

Sieve reposted this
Mokshith Voodarla
10mo
Stoked to welcome Ahi to the team at Sieve. Ahi's background is unique - from building the world's smallest batteries to training high quality TTS and diffusion models in the Computational Image Group at Rice. If you're interested in joining a fast-growing team working on internet-scale computer vision problems, DM me :)
Ahitagni D

ahidas.com
10mo

Just completed week 1 at Sieve, and the energy is unreal. I'm working on the applied ML team, shipping next-gen video understanding APIs that go from whiteboard to production in weeks. Massive thanks to Mokshith & Abhinav for bringing me onboard, and to Jacob for the mentorship that’s already next level. Stoked for what’s ahead!
Sieve reposted this
Mokshith Voodarla
10mo Edited
Today we’re launching “The Dubbing Rubric,” a proven method for evaluating AI dubbing systems. AI dubbing is complex to evaluate because of how nuanced it is, and we’ve worked with many teams that get hung up on figuring out the right process. Our rubric lays out the important categories to evaluate: linguistics, speech & voice, timing, audio quality, and multi-speaker handling. This is a human evaluation rubric, since naturalness is difficult to judge with older automated metrics such as BLEU scores. We use these categories internally to evaluate our system, and now we’re sharing that methodology with the world. We’re also releasing human eval results where 10 native speakers of each language rated various well-known solutions based on this rubric. We plan to increase the number of benchmarked providers and release de-anonymized results in the coming weeks.
Check out the website to explore results yourself: https://lnkd.in/gfp56vG9
Check out the detailed blog post: https://lnkd.in/gMzk2bFi
Amazing work by Ahmed Hanzala in spearheading this effort.

Sieve reposted this
Joséphine Parquet
10mo Edited
Last weekend, I showed up at a hackathon for the free pizza (okay, maybe a bit more than that). I left with a trophy, mild sleep deprivation, and three convictions about building products:
- A lovable product beats a technically perfect one
- With limited time, a bug can become your best feature
- Using your own app obsessively might be market research… or self-delusion. TBD

Alright, now the story: I teamed up with George Profenza and Daniel Jiang to build MoodBomb: an app that turns your selfies into very fun and creative lip-synced video messages. We used APIs from fal, VEED.IO, Sieve, and ElevenLabs, had an absolute blast... and ended up winning first place!

Huge thanks to the VEED.IO team for hosting such a fun and well-run event (Grace Greer, Ivelina Stamenova!), to the sponsors Sieve, fal, ElevenLabs, and Photoroom for the support and generous prizes, and to the amazing judges Sabba Keynejad, Abhinav A., Timur Mamedov! Grateful to have built alongside such a sharp and creative group. Can’t wait to see what comes next, for MoodBomb and for everyone else!


Funding

Sieve 2 total rounds

Last Round

Seed Dec 14, 2022

US$ 4.0M

Investors

Matrix + 10 Other investors



Employees at Sieve

Paul Scheid

Arthita Ghosh

Abhinav A.

Mokshith Voodarla


Similar pages

sieve (YC X25)

Giga

Icon

sync.

toona

Mintlify

Taama

Mercor

Scale AI

VEED
