Cover Image for Strange Evals - VLMs
24 Went

Strange Evals - VLMs

Hosted by Parth A. Patel & 4 others
San Francisco, United States
Past Event
About Event

A paper reading club, but we deep-dive into benchmarks.

---

Each session we dive deep into a battery of related, widely cited benchmarks so we can develop an intuition for what we're up against. This includes reading papers, but also eyeballing raw benchmark samples (a surprising number of them don't make any sense).

This week we are discussing the big VLM benchmarks to find out why almost all of them are already fully saturated.

---

Pre-reading:

Read the ZeroBench paper: https://zerobench.github.io

OR

The MMMU Pro paper: https://arxiv.org/abs/2409.02813

OR

Spend 30 minutes eyeballing samples from some of these benchmarks at https://github.com/Lewington-pitsos/strange-vlm and come up with something interesting.

---

Special thanks to HUD for hosting us: https://www.hud.ai

Attendance will be limited to keep discussion focused.

Food web image source: Kembhavi, Aniruddha, et al. "A Diagram Is Worth A Dozen Images." Computer Vision – ECCV 2016, Springer, 2016, pp. 235–251.
