Ivan Leo (@ivanleomk) / X

Ivan Leo

8,470 posts

@ivanleomk

Developer Experience @GoogleDeepMind prev @manusai. I write at ivanleo.com. Tweets are my own views!

San Francisco, CA

Joined March 2011

Ivan Leo
@ivanleomk
Dec 1, 2024
You can beat OpenAI embeddings with just 200 examples. We proved this with bge-base-en earlier this year, but bge-M3 takes it even further. It reaches the same peak performance as BGE-base with ~70,000 fewer examples (70% less data) plus better performance out of the box. With
Beating proprietary models with a quick fine-tune
From modal.com
71K
Ivan Leo
@ivanleomk
Nov 15, 2024
Taught gemini how to highlight PDF text in a document :) Dropping in a bit in our docs
80K
Ivan Leo
@ivanleomk
Jan 5, 2025
Ever struggled to understand how users use your product? I just built an open source implementation of Anthropic's internal clustering algorithm - CLIO. With Gemini Flash, you can generate human readable labels which are clustered and grouped together to spot usage patterns.
83K
Ivan Leo
@ivanleomk
Dec 4, 2024
When I started doing LLM evaluations, I kept running into the same question: how do I know that my 2% improvement was meaningful? Anthropic's recent paper is spot on. When working with large language models, we need to factor uncertainty into our benchmarks. This isn't
30K
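As a rough illustration of the question in the post above (not taken from the post or the paper it mentions), one common way to judge whether a small gain is meaningful is to score both models on the same questions and put a confidence interval around the mean paired difference. The function name `paired_diff_ci`, the normal approximation with z = 1.96, and the toy scores are all assumptions made for this sketch.

```python
import math
from statistics import mean, stdev


def paired_diff_ci(scores_a, scores_b, z=1.96):
    """Mean paired difference (b - a) with an approximate 95% interval.

    Assumes both lists hold per-question scores for the SAME questions,
    in the same order, and uses a normal approximation.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must cover the same questions")
    if len(scores_a) < 2:
        raise ValueError("need at least two questions")
    diffs = [b - a for a, b in zip(scores_a, scores_b)]
    diff = mean(diffs)
    # Standard error of the mean difference across questions.
    se = stdev(diffs) / math.sqrt(len(diffs))
    return diff, (diff - z * se, diff + z * se)


# Hypothetical per-question correctness (1 = correct, 0 = wrong).
baseline = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
candidate = [1, 1, 1, 1, 0, 1, 0, 1, 1, 0]

diff, (low, high) = paired_diff_ci(baseline, candidate)
print(f"mean difference: {diff:.3f}, interval: ({low:.3f}, {high:.3f})")
```

If the interval includes zero, the observed improvement could plausibly be noise at that sample size. With only a handful of questions the normal approximation is crude, so treat this as a starting point.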
Ivan Leo
@ivanleomk
Jan 20, 2024
I've had some people ask me for advice on ML, so I compiled some of my thoughts into 3 tips that I wish I had thought of/done more of when I first started. I'm honestly still a beginner, but I figured I'd offer some thoughts as to how someone can juggle this with a full-time job
128K
Ivan Leo
@ivanleomk
Dec 9, 2024
We improved our LLM's recall from 0.86 to 1.0 with a single sentence added to the prompt. By looking at our failure cases, we found that our model was being overly specific with the categories we were applying. For instance, if users asked for denim bottoms, it would
45K
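For readers unfamiliar with the metric in the post above, recall is the fraction of relevant items that the model actually returned. The sketch below is a minimal illustration; the function name `recall` and the category labels are invented here and are not the data or code behind the post.

```python
def recall(predicted, relevant):
    """Fraction of relevant labels that appear among the predicted labels."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("recall is undefined when there are no relevant labels")
    return len(set(predicted) & relevant) / len(relevant)


# Hypothetical labels: an overly specific prediction misses valid categories.
relevant_categories = ["jeans", "denim shorts", "denim skirts"]
too_specific = ["jeans"]
broader = ["jeans", "denim shorts", "denim skirts", "chinos"]

print(recall(too_specific, relevant_categories))
print(recall(broader, relevant_categories))
```

Recall ignores extra predictions such as the invented "chinos" label, so it is usually read alongside precision.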
Ivan Leo
@ivanleomk
Jul 28, 2025
Life update: got a new perm and joined @manusai to build general agents
14K
Ivan Leo
@ivanleomk
Oct 21, 2024
Structured Outputs with audio files have never been easier. Just define a pydantic model, use the Audio object to read it in, and you're good to go
17K
Ivan Leo
@ivanleomk
Nov 13, 2023
Literally got into machine learning because of @jeremyphoward's FastAI course a few months ago. This is surreal. This just made my entire week.
31K
Ivan Leo
@ivanleomk
Sep 23, 2024
Using whisper is so 2023. Just use gemini, pass in the raw audio directly and prompt the model directly with all the questions you have. With instructor, we can get the exact mispronounced word, the timestamp when we did it, and advice on how to do better. Flash truly is the
90K
Ivan Leo
@ivanleomk
Apr 6, 2025
For those interested in llama 4 being multimodal, I would like to point out @rasbt's great walkthrough on multimodal LLMs again
Understanding Multimodal LLMs
From magazine.sebastianraschka.com
13K
Ivan Leo
@ivanleomk
Mar 16, 2025
I got @willccbb's script to work on @modal_labs for GRPO! This is a minimal version that will run on a single GPU; for some reason the trainer doesn't seem to work nicely for me when I have multiple GPUs. But a good first step :)
modal-grpo/grpo_simple.py at main · ivanleomk/modal-grpo
From github.com
10K
Ivan Leo
@ivanleomk
Dec 30, 2024
One of the most interesting discoveries in the past few days has been musicgen-small (large if you have a GPU). I've been using it to generate short snippets that are honestly... not too bad?
facebook/musicgen-small · Hugging Face
From huggingface.co
20K
Ivan Leo
@ivanleomk
Jul 22, 2024
Save 50% on your OpenAI bill (even on fine-tuned models) with batch jobs using Instructor. Just use our new BatchJobs object :)
13K