Shawn Lewis (@shawnup) / X

645 posts

Founder & CTO @wandb. Building tools for AI. Going even bigger @CoreWeave.

Joined March 2011

Pinned
Shawn Lewis
@shawnup
Jan 16, 2025
My o1-based AI programming agent is now state of the art on SWE-Bench Verified! It resolves 64.6% of issues. This is the first fully o1-driven agent we know of. And we learned a ton building it.
Shawn Lewis
@shawnup
Jan 20, 2025
Our SWE-Bench submission has been accepted and is officially SOTA! Thanks SWE-Bench team for making such an important benchmark.
Shawn Lewis
@shawnup
Jan 16, 2025
Replying to @shawnup
How it works:
• o1 with reasoning_mode high for all agent step and editing logic
• a gpt4o based memory component that compresses the agent’s step history
• a custom built python code editor toolset designed to efficiently use model context
• the ability to register
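The post above mentions a memory component that compresses the agent's step history but gives no detail on how. One common shape for such a component is sketched below as an assumption, not as the author's implementation: older steps are folded into a single summary while the most recent steps stay verbatim. The names `Step`, `compress_history`, `summarize`, and `keep_recent` are hypothetical, and `summarize` stands in for whatever model call would produce the summary.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One agent step: what the agent did and what it observed (hypothetical type)."""
    action: str
    observation: str


def compress_history(
    steps: List[Step],
    summarize: Callable[[str], str],
    keep_recent: int = 3,
) -> List[str]:
    """Return history lines: one summary of older steps, then recent steps verbatim.

    `summarize` is an assumed stand-in for a model call; it receives the
    older steps as plain text and returns a shorter description of them.
    """
    if keep_recent < 0:
        raise ValueError("keep_recent must be non-negative")

    # Split the history into the part to compress and the part to keep as-is.
    cut = max(len(steps) - keep_recent, 0)
    older, recent = steps[:cut], steps[cut:]

    lines: List[str] = []
    if older:
        transcript = "\n".join(f"{s.action} -> {s.observation}" for s in older)
        lines.append("SUMMARY: " + summarize(transcript))
    lines.extend(f"{s.action} -> {s.observation}" for s in recent)
    return lines
```

With five steps and `keep_recent=2`, the result is one summary line followed by the last two steps unchanged, so the prompt grows with the size of the summary rather than with the full step count.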
Shawn Lewis
@shawnup
Jan 13, 2025
New result for my pure o1-based agent: 57.4% pass@1 on SWE-Bench Verified! Avg cost: $7.5 per instance. Avg time: 13.5 minutes per instance. Pass@3 is 67.8%. Now I'm working on "test time compute scaling", i.e. combining/choosing the best trajectories, to push closer to this mark.
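To put the percentages above in absolute terms, here is a small arithmetic sketch. It assumes the benchmark's 500-instance size, which the post does not state, and it assumes instances run one after another when estimating wall-clock time. The helper `resolved_count` is a hypothetical name.

```python
TOTAL_INSTANCES = 500  # size of SWE-Bench Verified; an assumption not stated in the post


def resolved_count(rate_percent: float, total: int = TOTAL_INSTANCES) -> int:
    """Convert a resolve rate in percent into a whole number of instances."""
    return round(rate_percent / 100 * total)


pass_at_1 = resolved_count(57.4)   # instances resolved on a single attempt
pass_at_3 = resolved_count(67.8)   # instances resolved by at least one of three attempts
headroom = pass_at_3 - pass_at_1   # what perfect trajectory selection could add

full_run_cost = 7.5 * TOTAL_INSTANCES          # dollars for one pass over the benchmark
full_run_hours = 13.5 * TOTAL_INSTANCES / 60   # hours if instances run serially

print(pass_at_1, pass_at_3, headroom, full_run_cost, full_run_hours)
```

Under these assumptions, the gap between pass@1 and pass@3 is the headroom that choosing the best of three trajectories could recover at most.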
Shawn Lewis
@shawnup
Apr 4, 2024
I'm very excited to announce Weave, our new tools to track and evaluate your LLM apps. Use Weave to:
🍩 log and version LLM interactions and surrounding data, from development to production
🍩 experiment with prompting techniques, model changes, and parameters
🍩 evaluate your
Shawn Lewis
@shawnup
Jun 24, 2022
Coming soon to a notebook near you: This is our Table visualizer, powered by a new technology for building composable applications called Weave.
Shawn Lewis
@shawnup
Jan 16, 2025
Replying to @shawnup
Here's my writeup on the solution and how we did it: medium.com/@shawnup/the-b…. Read for o1 tips and lots of other nuggets.
Creating a state-of-the-art AI programming agent with OpenAI’s o1
From medium.com
Shawn Lewis
@shawnup
Aug 13, 2024
Replying to @iruletheworldmo
You said: "attention isn't all you need, new architecture announcement, august 13th @ 10am pt the singularity begins"
Shawn Lewis
@shawnup
Jan 16, 2025
Replying to @shawnup
o1 is a different beast. It's better at doing exactly what you say. It's better at solving hard coding problems. And the advice others have given to specify the outcome you want and give it room to operate is spot on.
Shawn Lewis
@shawnup
Aug 13, 2024
Replying to @iruletheworldmo
For readers: just over 2,000 people, many of them well-respected folks in the space, spent an hour in a Twitter Space where @iruletheworldmo had promised to speak. 🍓 did not speak. Conclusion: do not waste your time.
Shawn Lewis
@shawnup
Jul 19, 2021
We’ve been hard at work building Tables, a new way to organize, understand and improve your data. Today we're opening it up to everyone! Try it here:
wandb.ai
Announcing W&B Tables: Iterate on Your Data
Today, we're excited to launch W&B Tables, a new tool for data iteration and model evaluation. Here's how it works. Made by Shawn Lewis using W&B
Shawn Lewis
@shawnup
Jan 16, 2025
Replying to @shawnup
And I built a new TypeScript-based agent framework called phaseshift that's deeply integrated with Weave. I'm excited to polish it up and release it to the world!
Shawn Lewis
@shawnup
Aug 13, 2024
Replying to @iruletheworldmo and @ChatGPTapp
here's your receipt: "attention isn't all you need new architecture announcement august 13th @ 10am pt the singularity begins"
Shawn Lewis
@shawnup
Mar 5, 2025
I’m incredibly proud of everything our team at @wandb has accomplished, and excited to keep building with the amazing folks from @CoreWeave!
Weights & Biases
@wandb
Mar 5, 2025
Today we announced that we are being acquired by @CoreWeave, the AI Hyperscaler. 🪄🐝 We could not be prouder or more excited to join forces with this team. Our CEO, @l2k, wrote a blog post with more details: wandb.ai/wandb/wb-annou…