RAG

Optimizing the cost and latency of your LLM calls with Prompt Caching
12 min read

Why traditional RAG loses context and how contextual retrieval dramatically improves retrieval accuracy
10 min read

Understanding keyword search, TF-IDF, and BM25
10 min read

A practical guide to choosing between single-pass pipelines and adaptive retrieval loops based on your…
11 min read

Zero-Waste Agentic RAG: Designing Caching Architectures to Minimize Latency and LLM Costs at Scale
Reducing LLM costs by 30% with validation-aware, multi-tier caching
19 min read

Exploring the RAG pipeline in Cursor that powers code indexing and retrieval for coding agents
10 min read

NumPy or scikit-learn might meet all your retrieval needs
14 min read

Let’s make sense of the current state of retrieval-augmented generation
3 min read

