Projects
Competition finishes
- 1st place at the iMaterialist Challenge (Fashion) at FGVC5 (Kaggle)
- 2nd place, Looks Like Grain competition (Unearthed)
- 5th place, Google AI Open Images - Visual Relationship Track (Kaggle solo gold)
Podcast interviews
- Chai Time Data Science
- ML Engineered
Publications
- Nemotron ColEmbed V2: Top-Performing Late Interaction embedding models for Visual Document Retrieval (Gabriel de Souza P. Moreira, Ronay Ak, Mengyao Xu, Oliver Holworthy, Benedikt Schifferer, Zhiding Yu, Yauhen Babakhin, Radek Osmulski, Jiarui Cai, Ryan Chesler, Bo Liu, Even Oldridge)
- Llama-Embed-Nemotron-8B: A Universal Text Embedding Model for Multilingual and Cross-Lingual Tasks (Yauhen Babakhin, Radek Osmulski, Ronay Ak, Gabriel Moreira, Mengyao Xu, Benedikt Schifferer, Bo Liu, Even Oldridge)
- Omni-Embed-Nemotron: A Unified Multimodal Retrieval Model for Text, Image, Audio, and Video (Mengyao Xu, Wenfei Zhou, Yauhen Babakhin, Gabriel Moreira, Ronay Ak, Radek Osmulski, Bo Liu, Even Oldridge, Benedikt Schifferer)
- Llama Nemoretriever Colembed: Top-Performing Text-Image Retrieval Model (Mengyao Xu, Gabriel Moreira, Ronay Ak, Radek Osmulski, Yauhen Babakhin, Zhiding Yu, Benedikt Schifferer, Even Oldridge)
- MIRACL-VISION: A Large, multilingual, visual document retrieval benchmark (Radek Osmulski, Gabriel de Souza P. Moreira, Ronay Ak, Mengyao Xu, Benedikt Schifferer, Even Oldridge)
- Enhancing Q&A Text Retrieval with Ranking Models: Benchmarking, fine-tuning and deploying Rerankers for RAG (Gabriel de Souza P. Moreira, Ronay Ak, Benedikt Schifferer, Mengyao Xu, Radek Osmulski, Even Oldridge)
- NV-Retriever: Improving text embedding models with effective hard-negative mining (Gabriel de Souza P. Moreira, Radek Osmulski, Mengyao Xu, Ronay Ak, Benedikt Schifferer, Even Oldridge)
Other projects
- [tutorial] Humpback Whale Identification Competition Starter Pack
- [pretrained weights] YOLOv3 weights trained on the Open Images dataset
- [web application] aiquizzes.com: a free online application for learning machine learning concepts
- [tutorial] Digital Signal Processing with Python: A Hands-on Introduction
- [tutorial] How to troubleshoot in Jupyter notebooks
- [tutorial] Quick, Draw! Kaggle Competition Starter Pack v2
- [learning materials] Anki cards (digital index cards) for the fast.ai v4 part 1 course
- [code] Implementation of 10 of Geoffrey Hinton's papers in NumPy
- [article] Overview of my solution to "Google AI Open Images - Visual Relationship Track" challenge
- [article] Observations from diving into tensorflow