Jimmy Lin (@lintool) / X

4,472 posts

I profess CS-ly at @UWaterloo. Previously, I monkeyed code for @Twitter, slides for @Cloudera, and scienced for @yupp_ai.

Nearby data lake

cs.uwaterloo.ca/~jimmylin/

Joined February 2010

Jimmy Lin
@lintool
Aug 8, 2020
Reviewers automatically assume that simple is not novel. This is sheer laziness. Yes, it may be simple and obvious in retrospect, but someone had to have that insight first. Simple is good. Simple is robust, easy to implement and reproduce, broadly applicable, etc.
Jimmy Lin
@lintool
Jul 11, 2025
It’s been 36 hours since Grok 4 launched and we have an early verdict based on 6K+ preferences of @yupp_ai users globally on real use cases. ‼️ Grok 4 is worse than other leading models: OpenAI o3, Claude Opus 4, and Gemini 2.5 Pro. Grok 4 is liked even less than Grok 3. 🧵
Jimmy Lin
@lintool
Oct 11, 2022
DAAM... You saw it here first! Attribution maps for Stable Diffusion based on upscaling and aggregating cross-attention activations in the latent denoising subnetwork: arxiv.org/abs/2210.04885 example for "an angry, bald man doing research" below - demo at daam.ralphtang.com:8080
Jimmy Lin
@lintool
Mar 20, 2023
GPT-4 and its ilk are awesome for rapid prototyping and one-offs, but at the end of the day, enterprises will deploy far smaller distilled models in production. Here's my contrarian take -
Jimmy Lin
@lintool
May 19, 2021
So, CV researchers are looking at transformers and NLP researchers are looking at CNNs (again). What a strange world.
Jimmy Lin
@lintool
Jul 21, 2020
Still cropping and modifying BERT diagrams from Devlin et al. (2019)? I spent several hours redrawing BERT in PowerPoint so you don't have to... Perfect for use in presentations, papers, etc.! Happy to share under CC BY 4.0 cs.uwaterloo.ca/~jimmylin/BERT…
Jimmy Lin
@lintool
Feb 20, 2018
Following the AI Residency program by Google, Facebook, Microsoft, Uber, etc., I'd like to start the Waterloo AI Residency program. It's called grad school.
Jimmy Lin
@lintool
Nov 24, 2021
Look what came in the mail!
Jimmy Lin
@lintool
Jul 27, 2020
"NLP makes IR interesting and IR makes NLP useful!" - slides from my #sigir2020 summer school talk at: cs.uwaterloo.ca/~jimmylin/publ… Get your rotten tomatoes and eggs out!
Jimmy Lin
@lintool
Oct 11, 2021
BERT is three years old today!
arxiv.org
BERT: Pre-training of Deep Bidirectional Transformers for Language...
We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is...
Jimmy Lin
@lintool
Oct 14, 2020
Happy to share an early draft of "Pretrained Transformers for Text Ranking: BERT and Beyond", our forthcoming book (tentatively, early 2021) by @lintool @rodrigfnogueira @andrewyates
arxiv.org
Pretrained Transformers for Text Ranking: BERT and Beyond
The goal of text ranking is to generate an ordered list of texts retrieved from a corpus in response to a query. Although the most common formulation of text ranking is search, instances of the...
Jimmy Lin
@lintool
Sep 11, 2024
For vector search, practitioners kinda know that for small corpora, don't bother with HNSW indexing, just brute-force it. However, guidance is mostly hand wavy... until now. I ran some experiments for you on BEIR and wrote it up. arxiv.org/abs/2409.06464 You're welcome.
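To make "just brute-force it" concrete, here is a minimal sketch of exhaustive vector search: score the query against every vector and keep the top k. It is not the setup from the linked paper; the function names, the toy corpus, and the choice of cosine similarity are all assumptions for illustration.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as matching nothing
    return dot / (norm_a * norm_b)


def brute_force_search(query, corpus, k=3):
    """Score every document against the query and return the k best.

    corpus is a dict of doc_id -> vector (hypothetical toy format).
    No index is built: cost is one similarity computation per document.
    """
    scored = ((cosine(query, vec), doc_id) for doc_id, vec in corpus.items())
    return heapq.nlargest(k, scored)


# Toy data, made up for this example.
corpus = {"d1": [1.0, 0.0], "d2": [0.0, 1.0], "d3": [1.0, 1.0]}
hits = brute_force_search([1.0, 0.1], corpus, k=2)
for score, doc_id in hits:
    print(doc_id, round(score, 3))
```

The scan is linear in corpus size, which is why it is only attractive for small corpora; where the crossover to an HNSW index lies is the empirical question the post refers to.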
Jimmy Lin
@lintool
Feb 8, 2023
My (contrarian?) take: prompt engineering is programming in natural language. We've tried this before, with attempts dating back decades. Recent advances do not change the fact that natural languages are ambiguous, imprecise, under-specified, highly contextual, etc.
Jimmy Lin
@lintool
Jan 16, 2023
Recently, @CohereAI boasted "3X better performance" in multilingual text understanding. We tested that claim by evaluating Cohere embeddings on MIRACL: tl;dr - We weren't able to replicate the 3X claim, but we did observe a 38% improvement over BM25. github.com/castorini/pyse…