🚨 Introducing "ColPali: Efficient Document Retrieval with Vision Language Models"!
We combine vision LLMs with late interaction to improve document retrieval (RAG, search engines, etc.), using only the image representation of document pages! arxiv.org/abs/2407.01449
🧵(1/N)
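
For readers new to late interaction, here is a minimal sketch of the scoring idea, not the paper's implementation. It assumes a query is already encoded as a list of token vectors and each page as a list of image-patch vectors; the names `late_interaction_score` and `rank_pages` and the toy 2-D vectors are illustrative only. Each query vector is matched to its best page vector, and the matches are summed:

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def late_interaction_score(query_vecs: List[Vector], page_vecs: List[Vector]) -> float:
    """For each query vector, take its highest dot product with any
    page vector, then sum those maxima over the query vectors."""
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)


def rank_pages(query_vecs: List[Vector], pages: List[List[Vector]]) -> List[int]:
    """Return page indices ordered from best to worst score."""
    scores = [late_interaction_score(query_vecs, page) for page in pages]
    return sorted(range(len(pages)), key=lambda i: scores[i], reverse=True)


# Toy data (hypothetical): 2 query vectors, 2 pages with a few patch vectors each.
query = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
page_b = [[0.5, 0.5], [0.5, 0.5]]

print(rank_pages(query, [page_a, page_b]))
```

In a real system the vectors would come from the vision language model and the page vectors could be computed ahead of time, so only the cheap matching step runs per query.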