A curated list of tools, frameworks, platforms, and resources for Large Language Model Operations (LLMOps) — enabling production-ready, scalable, and reliable LLM applications.
LLMOps is the emerging practice of managing the lifecycle of large language models, including fine-tuning, deployment, monitoring, evaluation, versioning, and observability — similar to MLOps but optimized for LLMs and generative AI systems.
## Contents

- [Overview & Learning](#overview--learning)
- [Model Training & Fine-Tuning](#model-training--fine-tuning)
- [Evaluation & Benchmarking](#evaluation--benchmarking)
- [Serving & Inference](#serving--inference)
- [Monitoring & Observability](#monitoring--observability)
- [Prompt Engineering & Management](#prompt-engineering--management)
- [Data Management](#data-management)
- [Security & Safety](#security--safety)
- [Platforms & Frameworks](#platforms--frameworks)
- [Tooling Ecosystem](#tooling-ecosystem)
- [Related Awesome Lists](#related-awesome-lists)
## Overview & Learning

- LLMOps Guide (Weights & Biases) – High-level overview of LLMOps concepts and tools.
- LLMOps Field Guide (Fiddler) – A breakdown of the infrastructure stack for LLMOps.
- LangChain Cookbook – Recipes for building with LangChain and LLMs.
- Full Stack Deep Learning – Practical course covering the LLM lifecycle, from training to deployment.
## Model Training & Fine-Tuning

- Hugging Face Transformers – Leading library for pre-trained and fine-tunable LLMs.
- PEFT – Parameter-Efficient Fine-Tuning methods for LLMs.
- LoRA – Low-Rank Adaptation, a lightweight fine-tuning technique for large models (see the sketch after this list).
- Colossal-AI – Framework for efficient distributed LLM training.
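PEFT and LoRA pair naturally: LoRA injects small trainable low-rank matrices into a frozen base model, cutting the number of trainable parameters by orders of magnitude. A minimal sketch, assuming `transformers` and `peft` are installed; the model name and hyperparameters are illustrative only:

```python
# Minimal parameter-efficient fine-tuning setup with PEFT + LoRA.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")  # small model, for illustration

# Wrap the frozen base model with LoRA adapters: only the low-rank matrices train.
lora_config = LoraConfig(
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling factor applied to the update
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total weights
```

The wrapped model drops into a standard `transformers` training loop or `Trainer`; only the adapter weights need to be saved and versioned, which keeps fine-tuned variants cheap to store and deploy.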
## Evaluation & Benchmarking

- Open LLM Leaderboard – Benchmarking open LLMs.
- HELM – Stanford's Holistic Evaluation of Language Models framework for evaluating LLMs across scenarios.
- LM Evaluation Harness – EleutherAI's test harness for benchmarking LLMs on standard tasks (see the sketch after this list).
- TruLens – LLM observability and feedback tracking.
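A minimal sketch of running a benchmark programmatically with lm-evaluation-harness, assuming the `lm_eval` package (v0.4+ API) is installed; the model, task, and sample limit are placeholders for a quick smoke test:

```python
# Run a registered benchmark task against a Hugging Face model.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                    # Hugging Face model backend
    model_args="pretrained=gpt2",  # small model, illustrative only
    tasks=["hellaswag"],           # any registered task name works here
    limit=100,                     # subsample for a fast sanity check
)
print(results["results"]["hellaswag"])  # per-task metrics, e.g. accuracy
```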
## Serving & Inference

- vLLM – Fast, memory-efficient LLM inference built on PagedAttention and continuous batching (see the sketch after this list).
- TGI (Text Generation Inference) – High-performance inference server by Hugging Face.
- DeepSpeed MII – Low-latency inference for Hugging Face models.
- Ray Serve – Scalable model serving via Ray.
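A minimal offline-inference sketch with vLLM; the model name is illustrative and any Hugging Face-compatible causal LM works:

```python
# Batched text generation with vLLM.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # small model, for illustration
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

# vLLM schedules these prompts together via continuous batching.
prompts = ["The capital of France is", "LLMOps is the practice of"]
for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```

For production serving, the same engine is exposed as an OpenAI-compatible HTTP server, so existing client code can usually point at a vLLM deployment unchanged.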
## Monitoring & Observability

- PromptLayer – Log, monitor, and manage prompts across LLM providers (a hand-rolled telemetry sketch follows this list).
- Arize AI – LLM monitoring, evaluation, and prompt tracing.
- Opik – Open-source LLM observability, evaluation, and tracing platform.
- WhyLabs – Observability for ML and LLM deployments.
- TruLens – Feedback loop framework for evaluating and improving LLM apps.
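To make concrete what these tools capture, here is a hypothetical, hand-rolled sketch (not any vendor's API) of wrapping an LLM call to record the kind of telemetry that platforms like PromptLayer or Opik collect automatically:

```python
# Hand-rolled observability wrapper: logs latency, inputs, and outputs per call.
import functools
import time

def traced(fn):
    """Capture basic telemetry for each LLM call."""
    @functools.wraps(fn)
    def wrapper(prompt: str, **kwargs):
        start = time.perf_counter()
        response = fn(prompt, **kwargs)
        latency_ms = (time.perf_counter() - start) * 1000
        # In production, ship this record to an observability backend
        # instead of printing it.
        print({"prompt": prompt, "response": response,
               "latency_ms": round(latency_ms, 1)})
        return response
    return wrapper

@traced
def call_llm(prompt: str) -> str:
    return "stubbed model output"  # replace with a real provider call

call_llm("Summarize LLMOps in one sentence.")
```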
## Prompt Engineering & Management

- LangChain – Modular framework for chaining LLM calls and prompts (see the templating sketch after this list).
- Prompt Engineering Guide – Structured guide to writing effective prompts.
- promptfoo – Compare, test, and evaluate LLM prompts easily.
- Guidance – Prompt programming with structured control over model output.
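A minimal prompt-templating sketch using LangChain's core prompt API, assuming `langchain-core` is installed; the prompt text and variables are made up for illustration:

```python
# Templates keep prompts versionable and testable, separate from app code.
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a concise technical assistant."),
    ("user", "Summarize the following text in {n_sentences} sentences:\n{text}"),
])

# Render the template into concrete chat messages for any provider.
messages = prompt.format_messages(n_sentences=2, text="LLMOps covers ...")
for m in messages:
    print(m.type, ":", m.content)
```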
## Data Management

- Label Studio – Open-source data labeling for fine-tuning and RAG pipelines.
- Weaviate – Vector database for semantic search and hybrid retrieval.
- Pinecone – Managed vector DB for similarity search and retrieval-augmented generation.
- ChromaDB – Open-source embedding database built for LLM apps (see the sketch after this list).
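A minimal sketch of storing and querying documents with ChromaDB, using its default bundled embedding function; the collection name and documents are made up:

```python
# Store documents and run a semantic search with ChromaDB.
import chromadb

client = chromadb.Client()  # in-memory; use PersistentClient for disk storage
collection = client.create_collection("docs")

collection.add(
    ids=["doc1", "doc2"],
    documents=[
        "vLLM serves models with continuous batching.",
        "LoRA fine-tunes large models with low-rank adapters.",
    ],
)

# The query text is embedded and matched against stored documents.
results = collection.query(query_texts=["How do I serve an LLM?"], n_results=1)
print(results["documents"])
```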
## Security & Safety

- Guardrails AI – Framework for validating and correcting LLM outputs.
- Rebuff – Open-source framework for prompt injection defense.
- Giskard – Testing, debugging, and securing LLM applications.
- OpenAI Moderation API – API for detecting harmful or unsafe content (see the sketch after this list).
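A minimal sketch of screening text with the OpenAI Moderation API, assuming the `openai` Python SDK (v1+) is installed and `OPENAI_API_KEY` is set; the model name reflects current documentation and may change:

```python
# Flag unsafe content before it reaches (or leaves) your LLM.
from openai import OpenAI

client = OpenAI()
result = client.moderations.create(
    model="omni-moderation-latest",  # current default per OpenAI docs
    input="User-supplied text to screen.",
)

record = result.results[0]
print("flagged:", record.flagged)  # True if any safety category triggered
print(record.categories)           # per-category booleans
```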
## Platforms & Frameworks

- LangChain – Framework for building end-to-end LLM-powered applications.
- LlamaIndex – Data framework for connecting external data sources to LLMs via indexing and retrieval (see the sketch after this list).
- Haystack – Open-source framework for building retrieval-augmented generation (RAG) pipelines.
- FastChat – Open platform for serving and fine-tuning chat LLMs.
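A minimal RAG sketch with LlamaIndex, assuming `llama-index` (v0.10+) is installed and an embedding/LLM provider (by default, an OpenAI API key) is configured; the data directory and query are placeholders:

```python
# Index a directory of documents and answer questions over them.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("data/").load_data()  # your source files
index = VectorStoreIndex.from_documents(documents)      # embed and index

query_engine = index.as_query_engine()  # retrieval + synthesis pipeline
print(query_engine.query("What does this corpus say about deployment?"))
```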
## Tooling Ecosystem

- Weights & Biases – Track and visualize model training and performance.
- MLflow – Platform for managing the ML lifecycle, including experiment tracking and a model registry (see the sketch after this list).
- PromptLayer – Middleware for logging and versioning prompt inputs and outputs.
- OpenLLM – Open-source platform to deploy and manage LLMs in production.
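A minimal experiment-tracking sketch with MLflow; the experiment name, parameters, and metric values are illustrative only:

```python
# Log fine-tuning parameters and metrics to a local MLflow tracking store.
import mlflow

mlflow.set_experiment("llm-finetune-demo")

with mlflow.start_run():
    mlflow.log_param("base_model", "gpt2")
    mlflow.log_param("lora_rank", 8)
    mlflow.log_metric("eval_loss", 2.31)
    mlflow.log_metric("eval_loss", 2.07, step=1)  # metrics can be logged per step
```

Run `mlflow ui` afterwards to browse runs, compare hyperparameters, and promote models through the registry.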
## Contributing

Contributions are welcome. Please ensure your submission fully follows the requirements outlined in CONTRIBUTING.md, including formatting, scope alignment, and category placement. Pull requests that do not adhere to the contribution guidelines may be closed.