bench: Add a benchmark for vLM: MMMU #3562
Hi @mickqian, I left some comments, all of them related to paths. I also tried to run bench_hf.py, but it seems to cause OOM; I am not sure whether that is expected.
Here are the qwen2vl and qwen2.5vl results in my env:
qwen2vl
{"Overall-Art and Design": {"num": 120, "acc": 0.317}, "Art": {"num": 30, "acc": 0.4}, "Art_Theory": {"num": 30, "acc": 0.367}, "Design": {"num": 30, "acc": 0.3}, "Music": {"num": 30, "acc": 0.2}, "Overall-Business": {"num": 150, "acc": 0.32}, "Accounting": {"num": 30, "acc": 0.333}, "Economics": {"num": 30, "acc": 0.3}, "Finance": {"num": 30, "acc": 0.2}, "Manage": {"num": 30, "acc": 0.267}, "Marketing": {"num": 30, "acc": 0.5}, "Overall-Science": {"num": 150, "acc": 0.333}, "Biology": {"num": 30, "acc": 0.367}, "Chemistry": {"num": 30, "acc": 0.167}, "Geography": {"num": 30, "acc": 0.333}, "Math": {"num": 30, "acc": 0.433}, "Physics": {"num": 30, "acc": 0.367}, "Overall-Health and Medicine": {"num": 150, "acc": 0.38}, "Basic_Medical_Science": {"num": 30, "acc": 0.433}, "Clinical_Medicine": {"num": 30, "acc": 0.433}, "Diagnostics_and_Laboratory_Medicine": {"num": 30, "acc": 0.133}, "Pharmacy": {"num": 30, "acc": 0.567}, "Public_Health": {"num": 30, "acc": 0.333}, "Overall-Humanities and Social Science": {"num": 120, "acc": 0.35}, "History": {"num": 30, "acc": 0.367}, "Literature": {"num": 30, "acc": 0.367}, "Sociology": {"num": 30, "acc": 0.267}, "Psychology": {"num": 30, "acc": 0.4}, "Overall-Tech and Engineering": {"num": 210, "acc": 0.267}, "Agriculture": {"num": 30, "acc": 0.233}, "Architecture_and_Engineering": {"num": 30, "acc": 0.3}, "Computer_Science": {"num": 30, "acc": 0.333}, "Electronics": {"num": 30, "acc": 0.167}, "Energy_and_Power": {"num": 30, "acc": 0.267}, "Materials": {"num": 30, "acc": 0.367}, "Mechanical_Engineering": {"num": 30, "acc": 0.2}, "Overall": {"num": 900, "acc": 0.323}}
qwen2.5vl
{"Overall-Art and Design": {"num": 120, "acc": 0.242}, "Art": {"num": 30, "acc": 0.2}, "Art_Theory": {"num": 30, "acc": 0.267}, "Design": {"num": 30, "acc": 0.3}, "Music": {"num": 30, "acc": 0.2}, "Overall-Business": {"num": 150, "acc": 0.3}, "Accounting": {"num": 30, "acc": 0.467}, "Economics": {"num": 30, "acc": 0.333}, "Finance": {"num": 30, "acc": 0.1}, "Manage": {"num": 30, "acc": 0.233}, "Marketing": {"num": 30, "acc": 0.367}, "Overall-Science": {"num": 150, "acc": 0.2}, "Biology": {"num": 30, "acc": 0.133}, "Chemistry": {"num": 30, "acc": 0.133}, "Geography": {"num": 30, "acc": 0.2}, "Math": {"num": 30, "acc": 0.3}, "Physics": {"num": 30, "acc": 0.233}, "Overall-Health and Medicine": {"num": 150, "acc": 0.267}, "Basic_Medical_Science": {"num": 30, "acc": 0.233}, "Clinical_Medicine": {"num": 30, "acc": 0.167}, "Diagnostics_and_Laboratory_Medicine": {"num": 30, "acc": 0.2}, "Pharmacy": {"num": 30, "acc": 0.367}, "Public_Health": {"num": 30, "acc": 0.367}, "Overall-Humanities and Social Science": {"num": 120, "acc": 0.242}, "History": {"num": 30, "acc": 0.167}, "Literature": {"num": 30, "acc": 0.333}, "Sociology": {"num": 30, "acc": 0.2}, "Psychology": {"num": 30, "acc": 0.267}, "Overall-Tech and Engineering": {"num": 210, "acc": 0.276}, "Agriculture": {"num": 30, "acc": 0.2}, "Architecture_and_Engineering": {"num": 30, "acc": 0.167}, "Computer_Science": {"num": 30, "acc": 0.233}, "Electronics": {"num": 30, "acc": 0.267}, "Energy_and_Power": {"num": 30, "acc": 0.367}, "Materials": {"num": 30, "acc": 0.433}, "Mechanical_Engineering": {"num": 30, "acc": 0.267}, "Overall": {"num": 900, "acc": 0.257}}
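As a quick sanity check on result dumps like the two above, the "Overall" accuracy can be recomputed as the sample-weighted mean of the six "Overall-<category>" entries. The snippet below is only a sketch, not part of the PR's evaluation code: the helper name `weighted_overall` is hypothetical, and the input is the subset of the qwen2vl numbers copied from the result above.

```python
import json

# The six "Overall-<category>" entries and the final "Overall" entry,
# copied from the qwen2vl result posted in this thread.
results_json = """
{"Overall-Art and Design": {"num": 120, "acc": 0.317},
 "Overall-Business": {"num": 150, "acc": 0.32},
 "Overall-Science": {"num": 150, "acc": 0.333},
 "Overall-Health and Medicine": {"num": 150, "acc": 0.38},
 "Overall-Humanities and Social Science": {"num": 120, "acc": 0.35},
 "Overall-Tech and Engineering": {"num": 210, "acc": 0.267},
 "Overall": {"num": 900, "acc": 0.323}}
"""


def weighted_overall(results):
    """Hypothetical helper: recompute the overall accuracy from category rows.

    Returns (total number of samples, sample-weighted accuracy).
    """
    # Keep only the per-category aggregates, not the final "Overall" row.
    cats = {k: v for k, v in results.items() if k.startswith("Overall-")}
    total = sum(v["num"] for v in cats.values())
    # acc is rounded to 3 decimals in the dump, so this is approximate.
    correct = sum(v["num"] * v["acc"] for v in cats.values())
    return total, correct / total


results = json.loads(results_json)
total, acc = weighted_overall(results)
print(total, round(acc, 3), "reported:", results["Overall"]["acc"])
```

Because the per-category accuracies are already rounded to three decimals, the recomputed value should only be expected to agree with the reported one up to rounding.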
Yes, it also leads to OOM in my case. It seems to me that it's not easy to apply tensor parallelism (tp) to hf models without introducing third-party libraries. Any suggestions?
Updated.
I think it is caused by too large
After the hf OOM has been solved, this PR can be merged. @zhaochenyang20, can you take a look at the docs?
Sure, on this now.
Will merge this today. @mickqian @yizhang2077 |