
javiervela/llm-information-extraction-workshop


LLM Information Extraction Workshop

🚀 Learn to extract structured information with LLMs locally and at scale on CESGA GPUs.

This hands-on workshop teaches you how to run LLMs with Ollama, design effective prompts, validate outputs with Pydantic, and execute remote batch jobs on the CESGA FinisTerrae III cluster.
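The core workflow above can be sketched in a few lines of Python. This is a minimal, illustrative sketch, assuming a local Ollama server on its default port (11434) and a model such as `llama3.2` already pulled; the extraction schema (`name`, `age`) is purely hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_extraction_request(text: str, model: str = "llama3.2") -> dict:
    """Build a request payload asking the model to return structured JSON."""
    prompt = (
        "Extract the person's name and age from the text below. "
        "Respond with a JSON object with keys 'name' and 'age'.\n\n" + text
    )
    return {
        "model": model,    # any model pulled with `ollama pull`
        "prompt": prompt,
        "format": "json",  # ask Ollama to constrain the output to valid JSON
        "stream": False,   # return one complete response instead of a token stream
    }

payload = build_extraction_request("Ana is 34 and lives in Santiago.")

# Sending the request requires a running `ollama serve` instance:
# req = urllib.request.Request(
#     OLLAMA_URL,
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```

The `"format": "json"` field nudges the model toward parseable output, which is what makes the validation step in Module 3 practical.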


🎯 Learning Outcomes

By the end of this workshop, you will:

  • ✅ Run and interact with LLMs locally using Ollama.
  • ✅ Design and test prompts to extract structured information.
  • ✅ Parse and validate responses programmatically.
  • ✅ Run batch jobs on CESGA’s GPU cluster.

📋 Prerequisites

  • Basic knowledge of Python programming.
  • Familiarity with command-line operations.
  • Modules 1–3 and 5 can be completed locally; Module 4 requires access to the CESGA FinisTerrae III cluster.

🚀 Quick Start

  1. Module 1 – Set up Ollama locally and configure CESGA access.
  2. Module 2 – Run your first extraction jobs with Ollama.
  3. Module 3 – Validate and save structured outputs.
  4. Module 4 – Run your scripts on CESGA GPUs.
  5. Module 5 – Analyze long texts and interview transcripts.
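The validation step in Module 3 boils down to parsing the model's response and checking it against a schema. The workshop uses Pydantic for this; the sketch below shows the same idea with only the standard library, using an illustrative `name`/`age` schema that is not taken from the workshop materials.

```python
import json

# Illustrative schema for one extraction record (Pydantic automates these checks).
REQUIRED_FIELDS = {"name": str, "age": int}

def validate_extraction(raw: str) -> dict:
    """Parse an LLM response and check it against the expected schema."""
    record = json.loads(raw)  # raises json.JSONDecodeError on malformed output
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in record:
            raise ValueError(f"missing field: {field}")
        if not isinstance(record[field], expected_type):
            raise ValueError(f"field {field!r} should be {expected_type.__name__}")
    return record

# A well-formed model response passes...
ok = validate_extraction('{"name": "Ana", "age": 34}')

# ...while a malformed one is rejected instead of silently corrupting results:
try:
    validate_extraction('{"name": "Ana", "age": "thirty-four"}')
except ValueError as err:
    print("rejected:", err)
```

Failing loudly on bad output matters most in batch jobs (Module 4), where a single unparseable response should not poison an entire run.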

📂 Repository Structure

| Folder | Description |
| --- | --- |
| `01_setup/` | Local LLM setup and CESGA access |
| `02_basic_llm_extraction/` | Basic local LLM queries & batch jobs |
| `03_structured_llm_extraction/` | Structured data extraction & validation |
| `04_cluster_execution/` | CESGA cluster job scripts |
| `05_text_analysis/` | Long text and interview analysis |
| `data/` | Sample texts and outputs |
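The `04_cluster_execution/` scripts target FinisTerrae III, which is scheduled with SLURM. A batch job there might look roughly like the sketch below; the module name, resource limits, and script name are placeholders, so check CESGA's documentation and the workshop's own job scripts for the exact values.

```shell
#!/bin/bash
#SBATCH --job-name=llm-extract
#SBATCH --gres=gpu:1            # request one GPU on the compute node
#SBATCH --time=01:00:00
#SBATCH --mem=32G
#SBATCH --output=extract_%j.log

# Placeholder module name; see CESGA's docs for what FinisTerrae III provides.
module load python

# Start the Ollama server in the background on the compute node,
# then run the extraction script against it.
ollama serve &
sleep 10                        # give the server a moment to come up
python run_extraction.py --input data/sample.txt
```

Submitted with `sbatch`, this runs the same local workflow from Modules 1–3, just on a cluster GPU instead of your laptop.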

🔗 Navigation

Start Here: Module 1 – Setup & Environment

