2025-07-17, 13:00, Martin Czygan -- UBL KI MEETUP #4, Applications / "Document Chat" / BIDC review
- Navigating AI ...
- Bibliothekskongress 2025 / AI TRACK, https://pdf-program.abstractserver.com/?congress=bid2025
- a desktop application to interact with LLM
- either a GPU or an API to access a remote model (available through GWDG, Leibniz, etc.)
Numerous applications to run chat with your documents locally:
- GPT4ALL ($17M, ...)
- ChatRTX
- AnythingLLM, GitHub, YCS22
- LMStudio
- Cherry Studio, Site
- ...
$ ./gpt4all/bin/chat
Index and embed a folder of PDFs.
Start dialogue.
- conventient, fast
- dependent of third party infra
- dynamische Entwicklung
- does GLAM not have enough images?
- ripgrep-all "KI"
Exploring a set of 22K docs and 500K pages with the help of AI
What can the machine learn from this corpus?
- let's cluster it into 2, 3, ..., 10 categories, what do we get?
- let's embed the documents into a vector database and see which records are similar
- let's embed paragraphs and see which documents are similar
Matching queries against documents.
- full text search
- natural language query; LLM and RAG
- natural language query to document query; fuzzy text, but exact query
- query in images
You can do this with any document set.
- expose a library catalog to a (local) chat interface
- make it so that we can find books, text, new items, but also images, digitized pages, and more (similarity search)
















