<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Dumebi Okolo</title>
    <description>The latest articles on DEV Community by Dumebi Okolo (@dumebii).</description>
    <link>https://dev.to/dumebii</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F941720%2Ff316bf93-ef0b-4bc5-aee2-5e062255d5f0.jpg</url>
      <title>DEV Community: Dumebi Okolo</title>
      <link>https://dev.to/dumebii</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/dumebii"/>
    <language>en</language>
    <item>
      <title>Demystifying RAG Architecture for Enterprise Data: A Technical Blueprint</title>
      <dc:creator>Dumebi Okolo</dc:creator>
      <pubDate>Fri, 10 Apr 2026 11:00:47 +0000</pubDate>
      <link>https://dev.to/dumebii/demystifying-rag-architecture-for-enterprise-data-a-technical-blueprint-393</link>
      <guid>https://dev.to/dumebii/demystifying-rag-architecture-for-enterprise-data-a-technical-blueprint-393</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;This article shows how to engineer a robust Retrieval-Augmented Generation (RAG) pipeline that unlocks LLM potential with proprietary information.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The advent of Large Language Models (LLMs) has ushered in a new era of AI-powered applications, promising to revolutionize how enterprises interact with information, automate tasks, and generate insights. From crafting marketing copy to summarizing complex legal documents, the capabilities of models like OpenAI's GPT series, Anthropic's Claude, and Meta's Llama have captured the imagination of developers and business leaders alike.&lt;/p&gt;

&lt;p&gt;However, the path from impressive public demos to practical, production-ready enterprise solutions is fraught with challenges. While LLMs excel at general knowledge tasks, their utility often diminishes when confronted with an organization's most valuable asset: its proprietary data.&lt;/p&gt;

&lt;p&gt;This is where Retrieval-Augmented Generation (RAG) architecture emerges as a critical enabler. RAG provides a robust, scalable, and cost-effective framework for connecting the immense generative power of LLMs with the specific, dynamic, and often sensitive knowledge locked within an enterprise's data silos. It addresses the inherent limitations of standalone LLMs, transforming them from general-purpose conversationalists into domain-specific experts.&lt;/p&gt;

&lt;p&gt;This article serves as a comprehensive technical blueprint for software engineers, data engineers, and technical product managers looking to build sophisticated AI features leveraging LLMs with private enterprise data. We will dissect the core problems LLMs face in an enterprise context, introduce the RAG paradigm, and meticulously walk through its three-step pipeline: ingestion and chunking, storage and semantic search, and context-aware generation. We'll also explore common pitfalls and provide actionable insights to ensure your RAG implementation is not just functional, but performant and reliable. By the end, you'll have a clear understanding of how to engineer a RAG solution that empowers your LLMs to speak with authority, accuracy, and relevance on your enterprise's terms.&lt;/p&gt;

&lt;h2&gt;The Problem with Standalone LLMs&lt;/h2&gt;

&lt;p&gt;Before diving into the solution, it's crucial to understand the fundamental limitations that prevent standard, off-the-shelf LLMs from being directly applicable to most enterprise use cases without significant augmentation.&lt;/p&gt;

&lt;h3&gt;The Knowledge Cutoff Problem&lt;/h3&gt;

&lt;p&gt;Large Language Models are trained on vast datasets of publicly available text and code. This training process is computationally intensive and takes a significant amount of time, meaning that once a model is released, its knowledge base is inherently static. This creates what's known as a knowledge cutoff. For example, an LLM released in early 2023 would have no inherent knowledge of events, products, or company policies that emerged later that year or in 2024.&lt;/p&gt;

&lt;p&gt;For enterprise applications, this limitation is critical. Organizations operate in dynamic environments where information changes constantly. An LLM relying solely on its pre-trained knowledge cannot answer questions like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  "What was our Q2 revenue performance for the current fiscal year?"&lt;/li&gt;
&lt;li&gt;  "What is the latest iteration of our employee expense policy?"&lt;/li&gt;
&lt;li&gt;  "Which customer accounts are currently in our new pilot program?"&lt;/li&gt;
&lt;li&gt;  "What are the technical specifications of our newly released product version 3.1?"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are questions that demand real-time, proprietary, and often granular data. A standalone LLM, without external context, simply doesn't have access to this information, rendering it largely ineffective for internal business intelligence or operational support.&lt;/p&gt;

&lt;h3&gt;The Hallucination Risk&lt;/h3&gt;

&lt;p&gt;Perhaps even more concerning than a lack of knowledge is the phenomenon of hallucination. LLMs are sophisticated pattern-matching machines, not factual databases. They are designed to predict the most statistically probable next token based on their training data. When an LLM encounters a query about information it doesn't possess, especially if the query's structure is similar to questions it can answer, it doesn't respond with "I don't know." Instead, it confidently generates plausible-sounding but entirely fabricated information.&lt;/p&gt;

&lt;p&gt;In an enterprise context, hallucinations are not merely an inconvenience; they pose significant risks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Misinformation and Bad Decisions:&lt;/strong&gt; An LLM providing incorrect financial figures, outdated compliance advice, or non-existent product features can lead to flawed business strategies, operational errors, and reputational damage.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Erosion of Trust:&lt;/strong&gt; If users repeatedly receive inaccurate information, their trust in the AI system, and by extension, the underlying business process, will quickly diminish.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Legal and Compliance Exposure:&lt;/strong&gt; In regulated industries, incorrect AI-generated responses could lead to severe compliance violations, legal liabilities, and financial penalties.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Security Risks:&lt;/strong&gt; While less direct, a hallucinating LLM might inadvertently reveal sensitive patterns or generate seemingly innocuous but misleading data that could be exploited.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The core issue is that LLMs are trained to be generative, not necessarily truthful. They prioritize fluency and coherence over factual accuracy when lacking concrete information. This fundamental characteristic makes them unsuitable for direct deployment on proprietary tasks without a mechanism to ground their responses in verifiable, up-to-date data. This mechanism is precisely what Retrieval-Augmented Generation provides.&lt;/p&gt;

&lt;h2&gt;What is Retrieval-Augmented Generation (RAG)?&lt;/h2&gt;

&lt;p&gt;Retrieval-Augmented Generation (RAG) is an architectural pattern designed to bridge the gap between the powerful generative capabilities of LLMs and the need for factual accuracy, recency, and domain-specificity in enterprise applications. At its heart, RAG is about providing an LLM with external, relevant, and verifiable information &lt;em&gt;at the time of inference&lt;/em&gt;, allowing it to generate responses that are grounded in truth rather than relying solely on its pre-trained, potentially outdated, or irrelevant knowledge.&lt;/p&gt;

&lt;p&gt;Think of RAG as giving an LLM an "open-book test." Instead of expecting the AI to answer purely from memory (its training data), we equip it with the ability to quickly look up the exact right documents or data snippets before formulating its answer. This fundamentally changes the LLM's role from a knowledge memorizer to a sophisticated knowledge synthesizer.&lt;/p&gt;

&lt;h3&gt;The Core Principle: Separate Retrieval from Generation&lt;/h3&gt;

&lt;p&gt;The genius of RAG lies in its modular approach. It separates the challenge of &lt;em&gt;finding&lt;/em&gt; relevant information from the challenge of &lt;em&gt;generating&lt;/em&gt; a coherent, human-like response. This separation offers several key advantages:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Factuality:&lt;/strong&gt; By providing specific, up-to-date context, RAG significantly reduces the likelihood of hallucinations, as the LLM is instructed to base its answer &lt;em&gt;only&lt;/em&gt; on the provided information.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Recency:&lt;/strong&gt; New information can be added to the external knowledge base in real-time, without needing to retrain or fine-tune the LLM. This makes RAG highly agile for dynamic enterprise data.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Domain Specificity:&lt;/strong&gt; The external knowledge base can be tailored precisely to an organization's proprietary data, enabling LLMs to become experts in niche domains where they previously had no knowledge.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Cost-Effectiveness:&lt;/strong&gt; RAG is generally far more cost-effective than repeatedly fine-tuning LLMs for new or updated information. Fine-tuning is expensive, time-consuming, and can lead to 'catastrophic forgetting' of general knowledge. RAG simply updates the knowledge base.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Interpretability/Attribution:&lt;/strong&gt; Because the LLM's response is grounded in retrieved documents, it's often possible to cite the sources, improving trust and auditability.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In essence, RAG transforms an LLM from a general-purpose oracle into a highly specialized, context-aware agent capable of interacting intelligently with an organization's most critical information assets. It allows enterprises to leverage the cutting-edge of generative AI without compromising on accuracy, relevance, or control over their data.&lt;/p&gt;

&lt;h2&gt;The Core RAG Architecture (The 3-Step Pipeline)&lt;/h2&gt;

&lt;p&gt;Building a robust RAG system involves a sequential, multi-component pipeline. While implementations can vary in complexity, the core architecture typically comprises three distinct, yet interconnected, stages:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Ingestion &amp;amp; Chunking:&lt;/strong&gt; Preparing your enterprise data for retrieval.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Storage &amp;amp; Semantic Search:&lt;/strong&gt; Efficiently storing and retrieving relevant data.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Generation (The Prompt Context):&lt;/strong&gt; Using retrieved data to inform the LLM's response.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Let's visualize this flow: A user submits a query. This query is used to search a specialized knowledge base (often a vector database) for relevant information. The retrieved information, alongside the original query, is then sent to the LLM, which synthesizes a grounded answer. This process ensures the LLM is always operating with the most relevant and up-to-date context available.&lt;/p&gt;

&lt;h3&gt;Step 1: Ingestion &amp;amp; Chunking&lt;/h3&gt;

&lt;p&gt;This initial phase is critical for preparing your raw enterprise data for efficient retrieval. It involves extracting information from various sources, processing it, and transforming it into a format suitable for semantic search.&lt;/p&gt;

&lt;h4&gt;Data Sources &amp;amp; Preprocessing&lt;/h4&gt;

&lt;p&gt;Your enterprise data can reside in a multitude of formats and locations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Documents:&lt;/strong&gt; PDFs, Word documents (.docx), Markdown files, HTML pages (e.g., Confluence, SharePoint).&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Databases:&lt;/strong&gt; SQL databases, NoSQL databases (e.g., customer records, product catalogs).&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Communication Platforms:&lt;/strong&gt; Slack archives, email threads, CRM notes.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Code Repositories:&lt;/strong&gt; Git repositories (for code documentation, internal libraries).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The first step is to extract the raw text content from these diverse sources. This often involves:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Parsing:&lt;/strong&gt; Using libraries (e.g., &lt;code&gt;PyPDF2&lt;/code&gt;, &lt;code&gt;python-docx&lt;/code&gt;, &lt;code&gt;BeautifulSoup&lt;/code&gt;) to extract text from structured and semi-structured documents.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Optical Character Recognition (OCR):&lt;/strong&gt; For scanned PDFs or image-based documents, OCR tools are essential to convert images of text into machine-readable text.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Cleaning:&lt;/strong&gt; Removing boilerplate text (headers, footers, navigation), irrelevant metadata, excessive whitespace, or corrupted characters.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Standardization:&lt;/strong&gt; Converting all text to a consistent encoding (e.g., UTF-8) and potentially normalizing capitalization or punctuation.&lt;/li&gt;
&lt;/ul&gt;
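&lt;p&gt;A minimal sketch of the cleaning and standardization steps, assuming plain text has already been extracted (the boilerplate patterns here are hypothetical; real pipelines need source-specific rules):&lt;/p&gt;

```python
import re
import unicodedata

# Hypothetical boilerplate patterns; tailor these to your actual sources.
BOILERPLATE_PATTERNS = [
    re.compile(r"^Page \d+ of \d+$", re.MULTILINE),
    re.compile(r"^Confidential - Internal Use Only$", re.MULTILINE),
]

def clean_text(raw):
    """Normalize encoding, strip boilerplate lines, and collapse whitespace."""
    # Standardize to a consistent Unicode form (assumes input is already a str)
    text = unicodedata.normalize("NFKC", raw)
    # Remove known boilerplate lines
    for pattern in BOILERPLATE_PATTERNS:
        text = pattern.sub("", text)
    # Collapse runs of spaces/tabs and excess blank lines
    text = re.sub(r"[ \t]+", " ", text)
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()

print(clean_text("Policy  overview\n\n\n\nPage 1 of 9\nRemote work rules."))
```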

&lt;h4&gt;Chunking Strategy: Breaking Down Knowledge&lt;/h4&gt;

&lt;p&gt;LLMs have a finite context window – the maximum number of tokens they can process in a single prompt. Enterprise documents can be lengthy, far exceeding these limits. Moreover, sending an entire document for every query is inefficient and often introduces noise. Therefore, the extracted text needs to be broken down into smaller, manageable units called chunks.&lt;/p&gt;

&lt;p&gt;Effective chunking is an art and a science. Poor chunking can lead to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Lost Context:&lt;/strong&gt; If chunks are too small, essential information might be split across multiple chunks, making it difficult for the LLM to understand the complete picture.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Irrelevant Information:&lt;/strong&gt; If chunks are too large, they might contain a lot of irrelevant text, diluting the signal and potentially confusing the LLM.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Common chunking strategies include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Fixed-Size Chunking:&lt;/strong&gt; Splitting text into chunks of a predefined character or token count (e.g., 500 characters) with a specified overlap (e.g., 50 characters). Overlap helps maintain context across chunk boundaries.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Sentence/Paragraph Chunking:&lt;/strong&gt; Splitting text at natural linguistic breaks (sentences, paragraphs). This often results in more semantically coherent chunks than fixed-size methods.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Recursive Character Text Splitter:&lt;/strong&gt; A common approach (found in libraries like LangChain) that attempts to split by paragraphs, then sentences, then words, until chunks fit a specified size, ensuring semantic boundaries are prioritized.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Semantic Chunking:&lt;/strong&gt; A more advanced technique where chunks are created based on semantic similarity. Text is embedded, and then a clustering algorithm or other method identifies natural breaks where the meaning shifts significantly.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Best Practice:&lt;/strong&gt; Experiment with different chunk sizes and overlap values. A chunk size of 200-1000 tokens with 10-20% overlap is a common starting point, but the optimal values depend heavily on your specific data and use case.&lt;/p&gt;
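&lt;p&gt;The first strategy above, fixed-size chunking with overlap, can be sketched in a few lines of Python (the sizes here are illustrative):&lt;/p&gt;

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into fixed-size character chunks; consecutive chunks
    share `overlap` characters so context survives the boundary."""
    step = chunk_size - overlap  # assumes overlap is smaller than chunk_size
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

doc = "A" * 120
print([len(c) for c in chunk_text(doc, chunk_size=50, overlap=10)])  # [50, 50, 40]
```

In practice you would apply the same idea at the token level (or use a library splitter), but the overlap mechanic is identical.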

&lt;h4&gt;Embedding Generation: The Language of Similarity&lt;/h4&gt;

&lt;p&gt;Once your data is chunked, the next crucial step is to transform each text chunk into a numerical representation called an embedding.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;What are Embeddings?&lt;/strong&gt; Embeddings are high-dimensional vectors (lists of numbers, e.g., 1536 dimensions for models like OpenAI's text-embedding-3-small or open-source alternatives) that capture the semantic meaning of text. Texts with similar meanings will have vectors that are numerically 'close' to each other in this high-dimensional space.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;How they are Generated:&lt;/strong&gt; An embedding model (e.g., OpenAI's text-embedding-3-small, various Sentence Transformers models from Hugging Face, Cohere Embed) takes a piece of text as input and outputs its corresponding vector.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Importance:&lt;/strong&gt; Embeddings are the backbone of semantic search. They allow us to move beyond keyword matching and find information based on conceptual similarity. For instance, a query about "remote work policy" could retrieve documents mentioning "telecommuting guidelines" because their embeddings are semantically close.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each chunk of text from your enterprise data is processed by an embedding model, and its resulting vector is stored. This collection of vectors, along with references to their original text chunks, forms the core of your searchable knowledge base.&lt;/p&gt;
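&lt;p&gt;To make "numerically close" concrete, here is cosine similarity computed over tiny hand-written vectors. Real embeddings come from an embedding model and have hundreds or thousands of dimensions; these 4-dimensional vectors are purely illustrative:&lt;/p&gt;

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional "embeddings" (hand-written for illustration only)
remote_work   = [0.9, 0.1, 0.8, 0.0]
telecommuting = [0.85, 0.15, 0.75, 0.05]
pizza_recipe  = [0.0, 0.9, 0.1, 0.8]

print(round(cosine_similarity(remote_work, telecommuting), 3))  # close to 1.0
print(round(cosine_similarity(remote_work, pizza_recipe), 3))   # much lower
```

This is why a query about "remote work policy" retrieves "telecommuting guidelines": their vectors point in nearly the same direction.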

&lt;h3&gt;Step 2: Storage &amp;amp; Semantic Search (The Vector DB)&lt;/h3&gt;

&lt;p&gt;With your enterprise data processed into chunks and vectorized, the next step is to store these embeddings efficiently and enable rapid, accurate semantic search. This is the domain of the Vector Database.&lt;/p&gt;

&lt;h4&gt;The Role of a Vector Database&lt;/h4&gt;

&lt;p&gt;A vector database is purpose-built for storing, indexing, and querying high-dimensional vectors. Unlike traditional relational databases that excel at structured queries (e.g., &lt;code&gt;SELECT * FROM users WHERE age &amp;gt; 30&lt;/code&gt;), vector databases specialize in 'similarity search' – finding vectors that are numerically closest to a given query vector.&lt;/p&gt;

&lt;h4&gt;How Semantic Search Works&lt;/h4&gt;

&lt;p&gt;When a user submits a query (e.g., "How do I request time off?"):&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Query Embedding:&lt;/strong&gt; The user's query is first sent to the &lt;em&gt;same embedding model&lt;/em&gt; that was used to embed your enterprise data chunks. This transforms the natural language query into a query vector.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Vector Similarity Search:&lt;/strong&gt; The query vector is then sent to the vector database. The database's indexing algorithms (e.g., Hierarchical Navigable Small World (HNSW), Inverted File Index (IVF), Locality-Sensitive Hashing (LSH)) efficiently compare the query vector to all stored document chunk vectors.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Distance Metrics:&lt;/strong&gt; This comparison typically uses distance metrics like:

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Cosine Similarity:&lt;/strong&gt; Measures the cosine of the angle between two vectors. A value of 1 indicates identical direction (perfect similarity), 0 indicates orthogonality (no similarity), and -1 indicates opposite direction.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Euclidean Distance:&lt;/strong&gt; Measures the straight-line distance between two points in Euclidean space. A smaller distance implies greater similarity.&lt;/li&gt;
&lt;/ul&gt;
The vector database returns the 'top-K' most similar document chunk vectors, where 'K' is a configurable parameter (e.g., retrieve the 5 most relevant chunks).
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Retrieval of Original Text:&lt;/strong&gt; Along with the similar vectors, the vector database also retrieves the original text content of the corresponding chunks.&lt;/li&gt;
&lt;/ol&gt;
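&lt;p&gt;The retrieval steps above can be sketched end to end with an in-memory stand-in for the vector database (the chunks and their embeddings are hand-written for illustration; a real system stores model-generated vectors):&lt;/p&gt;

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Stand-in for a vector DB: (chunk_text, embedding) pairs.
# Real embeddings must come from the same model used at ingestion time.
index = [
    ("Time-off requests are filed in the HR portal.", [0.9, 0.1, 0.2]),
    ("Expense reports are due by the 15th.",          [0.1, 0.9, 0.3]),
    ("Vacation days accrue monthly.",                 [0.8, 0.2, 0.1]),
]

def top_k(query_embedding, k=2):
    """Return the k chunks whose embeddings are most similar to the query."""
    scored = [(cosine(query_embedding, emb), text) for text, emb in index]
    scored.sort(reverse=True)
    return [text for score, text in scored[:k]]

# Pretend this vector is the embedded form of "How do I request time off?"
print(top_k([0.85, 0.15, 0.15]))
```

A production vector database replaces the linear scan with an approximate index such as HNSW, but the contract is the same: query vector in, top-K chunks out.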

&lt;h4&gt;Popular Vector Database Options&lt;/h4&gt;

&lt;p&gt;The choice of vector database depends on factors like scale, latency requirements, deployment model (managed vs. self-hosted), and ecosystem integration:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Managed Services:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Pinecone:&lt;/strong&gt; A cloud-native, fully managed vector database known for its scalability and ease of use.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Weaviate:&lt;/strong&gt; An open-source, cloud-native vector database that also offers a managed service, supporting GraphQL and semantic search.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Qdrant:&lt;/strong&gt; Another open-source vector search engine, available as self-hosted or managed, known for its speed and advanced filtering capabilities.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;  &lt;strong&gt;Self-Hosted/Open Source:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Milvus:&lt;/strong&gt; A widely adopted open-source vector database designed for massive-scale vector similarity search.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Chroma:&lt;/strong&gt; A lightweight, easy-to-use open-source embedding database, great for local development and smaller-scale applications.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;pgvector:&lt;/strong&gt; An extension for PostgreSQL that enables efficient vector similarity search directly within a relational database. Excellent for scenarios where you want to keep your vector data alongside your existing structured data.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h4&gt;Advanced Retrieval Strategies&lt;/h4&gt;

&lt;p&gt;Simple top-K retrieval is a good start, but for complex enterprise data, more sophisticated strategies can enhance relevance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Re-ranking:&lt;/strong&gt; After an initial retrieval of, say, 20 chunks, a smaller, more powerful re-ranking model (often a cross-encoder or a specialized LLM) can evaluate the relevance of these chunks more deeply against the query and re-order them, selecting the absolute best 'K' for the LLM.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Hybrid Search:&lt;/strong&gt; Combining semantic (vector) search with traditional keyword-based search (e.g., BM25) can provide a more robust retrieval system. Keyword search excels at finding exact matches or rare terms, while semantic search handles conceptual understanding.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Multi-query Retrieval:&lt;/strong&gt; Generating multiple slightly different queries from the original user query (e.g., using an LLM) and running parallel searches to broaden the retrieval scope.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Contextual Compression:&lt;/strong&gt; Filtering or summarizing retrieved documents to only include the most relevant sentences or paragraphs, reducing noise and optimizing token usage for the LLM.&lt;/li&gt;
&lt;/ul&gt;
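&lt;p&gt;One common way to fuse the keyword and vector result lists in hybrid search is Reciprocal Rank Fusion (RRF), sketched here over two hypothetical ranked lists of document IDs:&lt;/p&gt;

```python
def reciprocal_rank_fusion(ranked_lists, k=60):
    """Merge several ranked result lists: each document scores 1/(k + rank)
    per list, so items ranked highly in multiple lists rise to the top."""
    scores = {}
    for results in ranked_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical top results from each retriever
keyword_hits = ["doc_7", "doc_2", "doc_9"]   # e.g., from BM25
vector_hits  = ["doc_2", "doc_4", "doc_7"]   # e.g., from cosine similarity

print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

RRF needs no score normalization across the two retrievers, which is why it is a popular default for fusing rankings produced on incompatible scales.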

&lt;h3&gt;Step 3: Generation (The Prompt Context)&lt;/h3&gt;

&lt;p&gt;This is the final stage where the LLM synthesizes an answer, critically informed by the context retrieved from your vector database.&lt;/p&gt;

&lt;h4&gt;Constructing the Augmented Prompt&lt;/h4&gt;

&lt;p&gt;The core idea here is to inject the retrieved document chunks directly into the LLM's prompt. This creates an 'augmented prompt' that provides the LLM with all the necessary information to answer the user's question accurately and without hallucination.&lt;/p&gt;

&lt;p&gt;A typical augmented prompt structure, embedded in a minimal retrieval chain, looks like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Placeholder for a simplified LangChain-like RAG snippet
&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.prompts&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChatPromptTemplate&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.runnables&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;RunnablePassthrough&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.output_parsers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;StrOutputParser&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.documents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Document&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize the LLM (using a sample configuration)
# from langchain_openai import ChatOpenAI
# llm = ChatOpenAI(model="gpt-4-turbo", temperature=0)
&lt;/span&gt;
&lt;span class="c1"&gt;# A simple retriever mock for demonstration. In a real RAG system, this would
# embed the question, query a vector DB, and return Document objects.
&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MockRetriever&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_relevant_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="c1"&gt;# In a real scenario, this would query the vector DB
&lt;/span&gt;        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;remote work expenses&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
                &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page_content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The company&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s remote work expense policy allows reimbursement for internet and utilities up to $50/month.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page_content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Employees must submit expense reports by the 15th of the following month for remote work related costs.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page_content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No specific information found on that topic in the internal knowledge base.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

&lt;span class="n"&gt;mock_retriever&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MockRetriever&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# 1. Define the prompt template
# This template instructs the LLM on its role and how to use the provided context.
&lt;/span&gt;&lt;span class="n"&gt;template&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;You are an expert assistant for a large enterprise.
Answer the user&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s question based *only* on the provided context.
If the answer cannot be found in the context, politely state that you do not have enough information.

Context:
{context}

Question:
{question}
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ChatPromptTemplate&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_template&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;template&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 2. Format retrieved documents into a single context string
# This is crucial: the retriever returns Document objects, but the prompt expects a formatted string.
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;format_docs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Serialize retrieved documents into a single context string.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;page_content&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 3. Define the RAG chain (using LangChain's Runnable interface for clarity)
# The 'context' key is populated by the retriever and formatted into a string, 
# and 'question' by the user's input.
&lt;/span&gt;&lt;span class="n"&gt;rag_chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;context&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;format_docs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mock_retriever&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_relevant_documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])),&lt;/span&gt; 
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;RunnablePassthrough&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;
    &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;  &lt;span class="c1"&gt;# Your initialized LLM instance goes here (e.g., ChatOpenAI model above)
&lt;/span&gt;    &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nc"&gt;StrOutputParser&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# 4. Invoke the chain with a user query
# from langchain_openai import ChatOpenAI # Example LLM initialization
# llm = ChatOpenAI(model="gpt-4-turbo", temperature=0)
# response = rag_chain.invoke({"question": "What is the policy for remote work expenses?"})
# print(response)
# This would print: "The company's remote work expense policy allows reimbursement for internet and utilities up to $50/month. Employees must submit expense reports by the 15th of the following month for remote work related costs."
&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key elements of the prompt template:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;System Message/Role:&lt;/strong&gt; Sets the persona and instructions for the LLM (e.g., "You are an expert assistant...").&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Context Placeholder (&lt;code&gt;{context}&lt;/code&gt;):&lt;/strong&gt; This is where the retrieved document chunks are inserted. It's crucial to clearly delineate the context from the actual question.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Instruction for Context Usage:&lt;/strong&gt; Explicitly telling the LLM to &lt;em&gt;only&lt;/em&gt; use the provided context and to state if the answer is not found is vital to prevent hallucination.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Question Placeholder (&lt;code&gt;{question}&lt;/code&gt;):&lt;/strong&gt; The user's original query.&lt;/li&gt;
&lt;/ul&gt;
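
&lt;p&gt;Put together, these elements yield a template like the minimal sketch below. The wording is illustrative rather than canonical, and plain &lt;code&gt;str.format&lt;/code&gt; is used to keep it framework-agnostic:&lt;/p&gt;

```python
# A minimal RAG prompt template combining the four elements above.
# The wording is illustrative -- tune it for your domain and LLM.
RAG_TEMPLATE = """You are an expert assistant for internal company documentation.

Answer the question using ONLY the context below. If the answer is not
contained in the context, reply: "I don't know based on the provided documents."

Context:
---------
{context}
---------

Question: {question}
Answer:"""


def build_prompt(context: str, question: str) -> str:
    """Fill the placeholders; plain str.format keeps this framework-agnostic."""
    return RAG_TEMPLATE.format(context=context, question=question)


prompt = build_prompt("Remote work expenses are capped at $50/month.",
                      "What is the expense cap?")
```

&lt;p&gt;In a LangChain pipeline the same text would typically become a &lt;code&gt;ChatPromptTemplate&lt;/code&gt;, but the structure is identical.&lt;/p&gt;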

&lt;h4&gt;
  
  
  LLM Interaction and Synthesis
&lt;/h4&gt;

&lt;p&gt;Once the augmented prompt is constructed, it is sent to the chosen LLM (e.g., GPT-4 Turbo, Claude 3.5 Sonnet, or open-source alternatives like Llama 3). The LLM then processes this entire prompt, using the provided context to formulate a relevant and accurate answer. Because the context is explicitly given, the LLM acts more like a sophisticated summarizer and question-answering system over the provided text, rather than generating from its internal, general knowledge.&lt;/p&gt;

&lt;p&gt;This final step ensures that the LLM's response is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Grounded:&lt;/strong&gt; Directly supported by the retrieved enterprise data.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Relevant:&lt;/strong&gt; Addresses the user's specific query.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Accurate:&lt;/strong&gt; Minimizes hallucination by constraining the LLM's generation to the facts presented in the context.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By following this three-step pipeline, enterprises can transform generic LLMs into powerful, domain-specific AI assistants that deliver reliable and actionable intelligence from their most valuable data assets.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common Pitfalls in RAG Engineering
&lt;/h2&gt;

&lt;p&gt;While RAG offers a powerful solution, its effective implementation requires careful consideration and engineering rigor. Several common pitfalls can undermine the performance and reliability of a RAG system if not addressed proactively.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Suboptimal Chunking Strategies
&lt;/h3&gt;

&lt;p&gt;As discussed, chunking is foundational, and mistakes here cascade through the entire pipeline:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Chunks that are too small:&lt;/strong&gt; If chunks are excessively granular (e.g., single sentences), they might lack sufficient context to be meaningful on their own. The semantic meaning required to answer a complex question could be fragmented across multiple disparate chunks, making retrieval difficult or incomplete.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Chunks that are too large:&lt;/strong&gt; Conversely, chunks that are too long introduce noise. They might contain a lot of irrelevant information alongside the relevant bits, diluting the signal for the embedding model and increasing the chances of retrieving less precise context. Large chunks also consume more tokens in the LLM's context window, increasing inference cost and potentially hitting context limits prematurely.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Poor Overlap:&lt;/strong&gt; Insufficient overlap between sequential chunks can lead to critical information being split precisely at the boundary, making it hard for retrieval to capture the complete idea.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Mitigation:&lt;/strong&gt; Experimentation is key. Build an evaluation pipeline that tests different chunk sizes, overlap values, and chunking methods (e.g., fixed-size vs. recursive vs. semantic) against a diverse set of representative queries. Where documents have inherent structure, chunk along it (e.g., by headings or sections in a PDF). For highly structured data, 'parent-child' or 'summary' chunking links smaller chunks to larger, more contextual parent chunks or summaries used at different retrieval stages.&lt;/p&gt;
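
&lt;p&gt;As a concrete starting point for such experiments, here is a fixed-size character chunker with overlap. It is a baseline sketch to compare recursive or semantic splitters against, not a production implementation:&lt;/p&gt;

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Fixed-size character chunking with overlap -- the simplest baseline
    for chunking experiments. Overlap repeats the tail of each chunk at the
    head of the next, so an idea spanning a boundary survives intact in
    at least one chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

&lt;p&gt;Sweeping &lt;code&gt;chunk_size&lt;/code&gt; and &lt;code&gt;overlap&lt;/code&gt; over a grid and measuring retrieval quality on representative queries is usually the fastest way to find a good operating point for a given corpus.&lt;/p&gt;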

&lt;h3&gt;
  
  
  2. Irrelevant or Insufficient Retrieval
&lt;/h3&gt;

&lt;p&gt;Even with good chunking, the retriever component can fail to provide the LLM with the optimal context:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Poor Embedding Model Choice:&lt;/strong&gt; Not all embedding models are created equal, and some perform better on specific domains or languages. Using a generic embedding model for highly specialized enterprise terminology might lead to embeddings that don't accurately capture semantic similarity, resulting in irrelevant retrievals.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Noisy or Low-Quality Data in Vector DB:&lt;/strong&gt; If your ingested data contains outdated, contradictory, or simply poorly written information, the vector database will retrieve it, and the LLM will struggle to synthesize a coherent, accurate answer. 'Garbage in, garbage out' applies acutely here.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Suboptimal &lt;code&gt;k&lt;/code&gt; Value:&lt;/strong&gt; Retrieving too few chunks (&lt;code&gt;k&lt;/code&gt; is too low) might mean missing critical pieces of information. Retrieving too many chunks (&lt;code&gt;k&lt;/code&gt; is too high) introduces irrelevant information into the LLM's context, potentially confusing it or causing it to misinterpret the core question.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Mitigation:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Embedding Model Evaluation:&lt;/strong&gt; Test different embedding models for your specific domain. Consider fine-tuning an open-source embedding model on your proprietary data if off-the-shelf options underperform.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Data Quality Management:&lt;/strong&gt; Implement robust data cleansing, deduplication, and versioning strategies for your source documents. Only ingest high-quality, current, and relevant data into your RAG knowledge base.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Advanced Retrieval Techniques:&lt;/strong&gt; Employ re-ranking models to refine the initial top-K results. Utilize hybrid search (keyword + vector) to capture both exact matches and semantic similarity. Explore multi-query strategies to generate a more comprehensive set of retrieved documents.&lt;/li&gt;
&lt;/ul&gt;
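
&lt;p&gt;One simple way to combine keyword and vector results is Reciprocal Rank Fusion (RRF), which merges ranked lists without needing to normalize their scores. A minimal sketch:&lt;/p&gt;

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists of document IDs (e.g. one from keyword search,
    one from vector search) into a single ranking. Each document scores
    sum(1 / (k + rank)) over the lists it appears in; k=60 is the
    conventional default from the original RRF formulation."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


# Documents surfaced by both retrievers rise to the top:
fused = reciprocal_rank_fusion([
    ["doc_a", "doc_b", "doc_c"],   # keyword (e.g. BM25) ranking
    ["doc_b", "doc_d", "doc_a"],   # vector-similarity ranking
])
```

&lt;p&gt;A cross-encoder re-ranker can then be applied to the fused top-K for a final precision pass.&lt;/p&gt;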

&lt;h3&gt;
  
  
  3. Latency Issues
&lt;/h3&gt;

&lt;p&gt;RAG introduces additional steps in the query processing pipeline, which can impact response times:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Slow Query Embedding:&lt;/strong&gt; Converting the user's query into a vector can take time, especially if the embedding model is large or running on under-provisioned hardware.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Slow Vector Database Lookups:&lt;/strong&gt; As the size of your vector database grows (millions or billions of vectors), similarity search can become a bottleneck if indexing is inefficient or the database is not properly scaled.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;LLM Inference Latency:&lt;/strong&gt; Even with optimized context, the LLM's generation step can be slow, especially for larger, more capable models (e.g., GPT-4) or for very long responses.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Mitigation:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Optimize Embedding Models:&lt;/strong&gt; Choose embedding models that balance speed and accuracy. For query embedding, a smaller, faster model may be acceptable. Cache query embeddings (and, where possible, full responses) for frequently asked questions.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Vector DB Optimization:&lt;/strong&gt; Ensure your vector database is correctly indexed (e.g., using HNSW or IVF) and adequately resourced. Explore cloud-native managed vector databases that handle scalability automatically. Consider sharding your vector index for very large datasets.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;LLM Choice and Optimization:&lt;/strong&gt; Select an LLM that meets your latency and quality requirements. For internal applications where cost and speed are paramount, smaller open-source models might be preferable to larger, more expensive cloud models. Implement streaming responses from the LLM where possible to improve perceived latency.&lt;/li&gt;
&lt;/ul&gt;
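
&lt;p&gt;The embedding cache is one line with &lt;code&gt;functools.lru_cache&lt;/code&gt;. In this sketch, &lt;code&gt;embed_query&lt;/code&gt; is a deterministic stand-in for a real embedding API call; swap in your provider's client:&lt;/p&gt;

```python
from functools import lru_cache
import hashlib


def embed_query(query: str) -> list[float]:
    """Stand-in for a real embedding model call (replace with your
    provider's API). Deterministic so this caching demo is runnable."""
    digest = hashlib.sha256(query.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:8]]


@lru_cache(maxsize=10_000)
def embed_query_cached(query: str) -> tuple[float, ...]:
    """Repeated questions skip the embedding call entirely. Returns a
    tuple because cached values should be immutable and hashable."""
    return tuple(embed_query(query))
```

&lt;p&gt;For production traffic, a shared cache (e.g. Redis keyed on a hash of the normalized query) serves the same purpose across instances.&lt;/p&gt;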

&lt;h3&gt;
  
  
  4. Prompt Engineering Failures
&lt;/h3&gt;

&lt;p&gt;Even with perfect retrieval, a poorly constructed prompt can lead to suboptimal LLM responses:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Vague or Ambiguous Instructions:&lt;/strong&gt; If the prompt doesn't clearly define the LLM's role, desired output format, or constraints, the LLM might deviate from expectations.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Failure to Constrain to Context:&lt;/strong&gt; Forgetting to explicitly instruct the LLM to &lt;em&gt;only&lt;/em&gt; use the provided context (e.g., "Answer only from the context provided. If the answer is not in the context, state that you don't know.") is a common mistake that reintroduces hallucination risk.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Context Window Overflow:&lt;/strong&gt; If the combined length of the prompt, retrieved chunks, and the expected response exceeds the LLM's maximum context window, the model will truncate the input, leading to incomplete or erroneous answers.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Mitigation:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Clear and Concise System Prompts:&lt;/strong&gt; Define the LLM's persona and task unambiguously. Use clear delimiters for context and questions.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Explicit Guardrails:&lt;/strong&gt; Always include instructions to strictly adhere to the provided context and to admit when information is not available.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Dynamic Context Management:&lt;/strong&gt; Implement logic to truncate or summarize retrieved chunks if their combined length approaches the LLM's context window limit. Prioritize the most relevant chunks in such scenarios. Evaluate the impact of different context lengths on LLM performance.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Few-Shot Examples:&lt;/strong&gt; For specific response formats or nuanced tasks, providing one or two examples within the prompt can guide the LLM more effectively.&lt;/li&gt;
&lt;/ul&gt;
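
&lt;p&gt;Dynamic context management can be as simple as greedily packing chunks into a token budget. In this sketch the chunks are assumed to be pre-sorted by relevance, and splitting on whitespace is a rough stand-in for a real tokenizer such as tiktoken:&lt;/p&gt;

```python
def pack_context(chunks: list[str], max_tokens: int) -> str:
    """Greedily keep the highest-ranked chunks (input assumed sorted by
    relevance, best first) until the token budget is spent. Whitespace
    splitting approximates token counts; use a real tokenizer in practice."""
    selected, used = [], 0
    for chunk in chunks:
        cost = len(chunk.split())
        if used + cost > max_tokens:
            continue  # this chunk doesn't fit; a later, smaller one still might
        selected.append(chunk)
        used += cost
    return "\n\n".join(selected)
```

&lt;p&gt;Because the most relevant chunks are considered first, a too-small budget degrades gracefully: the least relevant context is what gets dropped.&lt;/p&gt;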

&lt;p&gt;Addressing these common pitfalls requires a holistic approach, combining careful data engineering, robust infrastructure, and iterative prompt design. Continuous monitoring and evaluation are essential to ensure your RAG system consistently delivers accurate and performant results.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion &amp;amp; Next Steps
&lt;/h2&gt;

&lt;p&gt;Retrieval-Augmented Generation paves the path from generic LLMs to powerful, domain-specific AI applications for enterprise data. RAG architecture is not merely an enhancement; it is a transformative paradigm that addresses the core limitations of pre-trained LLMs, namely their knowledge cutoff and propensity for hallucination, making them truly viable for critical business functions.&lt;/p&gt;

&lt;p&gt;By systematically ingesting and chunking proprietary data, transforming it into semantically rich embeddings, storing it in high-performance vector databases, and then intelligently augmenting LLM prompts with retrieved context, enterprises can unlock unprecedented capabilities. RAG offers a cost-effective, agile, and scalable alternative to expensive model fine-tuning, allowing organizations to keep their AI systems current with rapidly evolving internal knowledge.&lt;/p&gt;

&lt;p&gt;This article has provided a comprehensive technical blueprint, detailing the motivations, core components, and common challenges in engineering a robust RAG pipeline. The principles outlined here – from meticulous data preparation and strategic chunking to efficient vector search and precise prompt engineering – are the bedrock of successful RAG implementations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Ready to Build Your First RAG Application?
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Explore Frameworks:&lt;/strong&gt; Dive into open-source frameworks like &lt;a href="https://www.langchain.com/" rel="noopener noreferrer"&gt;LangChain&lt;/a&gt; and &lt;a href="https://www.llamaindex.ai/" rel="noopener noreferrer"&gt;LlamaIndex&lt;/a&gt;. These libraries provide high-level abstractions for building RAG pipelines, simplifying integration with various LLMs, embedding models, and vector databases.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Experiment with Vector Databases:&lt;/strong&gt; Set up a local instance of &lt;a href="https://www.trychroma.com/" rel="noopener noreferrer"&gt;Chroma&lt;/a&gt; or &lt;a href="https://github.com/pgvector/pgvector" rel="noopener noreferrer"&gt;pgvector&lt;/a&gt; to get hands-on experience, or explore managed services like &lt;a href="https://www.pinecone.io/" rel="noopener noreferrer"&gt;Pinecone&lt;/a&gt; for scalability.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Start Small, Iterate Fast:&lt;/strong&gt; Begin with a small, manageable dataset from your enterprise. Focus on getting a basic RAG pipeline operational, then iteratively refine your chunking, retrieval, and prompt strategies based on real-world queries and evaluation metrics.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Continuous Learning:&lt;/strong&gt; The RAG landscape is evolving rapidly. Stay updated with the latest research in retrieval techniques, embedding models, and multi-modal RAG. Consider exploring advanced topics like agentic RAG, where LLMs can dynamically decide when and how to retrieve information.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;RAG empowers you to transform LLMs from generalists into trusted, domain-expert collaborators, enabling your enterprise to harness the full potential of generative AI with confidence and accuracy. The future of enterprise AI is augmented, and RAG is your blueprint for building it.&lt;/p&gt;




&lt;h2&gt;
  
  
  Feedback &amp;amp; Community
&lt;/h2&gt;

&lt;p&gt;We believe in transparent, community-driven content creation. This article was generated using the &lt;a href="https://ozigi.app" rel="noopener noreferrer"&gt;Ozigi Dashboard&lt;/a&gt; – our advanced longform content generation platform – and has been thoroughly reviewed and refined by our engineering team.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Have feedback on this article?&lt;/strong&gt; We'd love to hear your thoughts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Leave a comment below or email us at &lt;a href="mailto:hello@ozigi.app"&gt;hello@ozigi.app&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Share your RAG architecture experiences and learnings with our community&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Interested in building your own enterprise AI content?&lt;/strong&gt; Longform article generation is available to users on the Organization tier, limited to 5 articles per day. &lt;a href="https://ozigi.app/pricing" rel="noopener noreferrer"&gt;Check our pricing details&lt;/a&gt; to learn more about what Ozigi can do for your content strategy.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>programming</category>
      <category>javascript</category>
    </item>
    <item>
      <title>Building a Robust Webhook Handler in Node.js: Validation, Queuing, and Retry Logic</title>
      <dc:creator>Dumebi Okolo</dc:creator>
      <pubDate>Tue, 07 Apr 2026 11:50:28 +0000</pubDate>
      <link>https://dev.to/dumebii/building-a-robust-webhook-handler-in-nodejs-validation-queuing-and-retry-logic-2fb6</link>
      <guid>https://dev.to/dumebii/building-a-robust-webhook-handler-in-nodejs-validation-queuing-and-retry-logic-2fb6</guid>
      <description>&lt;p&gt;Webhooks are everywhere. &lt;a href="https://stripe.com/docs/webhooks" rel="noopener noreferrer"&gt;Stripe&lt;/a&gt; fires one when a payment succeeds. &lt;a href="https://docs.github.com/en/webhooks" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; fires one when a PR is merged. &lt;a href="https://www.twilio.com/docs/usage/webhooks" rel="noopener noreferrer"&gt;Twilio&lt;/a&gt; fires one when an SMS lands. And when your handler is flaky — when it misses events, fails silently, or chokes under load — you lose data and trust.&lt;/p&gt;

&lt;p&gt;Most tutorials show you how to receive a webhook. Few show you how to handle it &lt;em&gt;properly&lt;/em&gt;. This article covers the full picture: signature validation, idempotency, async queuing, and retry logic with exponential backoff.&lt;/p&gt;

&lt;p&gt;We'll use Node.js and Express throughout, with no external queue infrastructure required. &lt;strong&gt;One important caveat up front:&lt;/strong&gt; the queuing approach in this article is designed for a single, long-lived Node.js process. If you're running on serverless functions (Lambda, Cloud Run) or horizontally scaled deployments with multiple instances, in-memory queues are not reliable — skip ahead to the When to Upgrade section for the right tool in those cases.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR Summary
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Concern&lt;/th&gt;
&lt;th&gt;Solution&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Fake webhook senders&lt;/td&gt;
&lt;td&gt;HMAC-SHA256 signature verification with &lt;code&gt;timingSafeEqual&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Slow handlers timing out&lt;/td&gt;
&lt;td&gt;Acknowledge &lt;code&gt;200&lt;/code&gt; immediately, process async&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cascading failures&lt;/td&gt;
&lt;td&gt;In-process queue with concurrency limit&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Transient errors&lt;/td&gt;
&lt;td&gt;Exponential backoff with jitter&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Duplicate events&lt;/td&gt;
&lt;td&gt;Idempotency keys via Set or Redis&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  What We're Building
&lt;/h2&gt;

&lt;p&gt;A webhook handler that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Validates&lt;/strong&gt; the request signature (so only legitimate senders get through)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Acknowledges fast&lt;/strong&gt; (returns &lt;code&gt;200&lt;/code&gt; immediately, does the work async)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Queues events&lt;/strong&gt; in-process so the work doesn't block the HTTP layer&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Retries failures&lt;/strong&gt; with exponential backoff&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Handles duplicates&lt;/strong&gt; with idempotency keys&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Step 1: Signature Validation
&lt;/h2&gt;

&lt;p&gt;Never trust an incoming webhook without verifying it came from who you think it came from. Most webhook providers (&lt;a href="https://stripe.com/docs/webhooks/signature-verification" rel="noopener noreferrer"&gt;Stripe&lt;/a&gt;, &lt;a href="https://docs.github.com/en/webhooks/using-webhooks/validating-webhook-deliveries" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;, &lt;a href="https://shopify.dev/docs/apps/build/webhooks/secure/validate-webhooks" rel="noopener noreferrer"&gt;Shopify&lt;/a&gt;) sign their payloads using &lt;a href="https://en.wikipedia.org/wiki/HMAC" rel="noopener noreferrer"&gt;HMAC-SHA256&lt;/a&gt; with a shared secret.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy99q3gf1iy1xx022bw5n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy99q3gf1iy1xx022bw5n.png" alt="Webhook pipeline flow — from incoming request through validation, queuing, handling, retry and dead letter" width="800" height="462"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;crypto&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;crypto&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;verifySignature&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;signature&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;secret&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;expected&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;crypto&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createHmac&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sha256&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;secret&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;utf8&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;digest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hex&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// Use timingSafeEqual to prevent timing attacks&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;expectedBuffer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`sha256=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;expected&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;utf8&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;signatureBuffer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;signature&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;utf8&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;expectedBuffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="nx"&gt;signatureBuffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;crypto&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;timingSafeEqual&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;expectedBuffer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;signatureBuffer&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why &lt;a href="https://nodejs.org/api/crypto.html#cryptotimingsafeequala-b" rel="noopener noreferrer"&gt;&lt;code&gt;timingSafeEqual&lt;/code&gt;&lt;/a&gt;?&lt;/strong&gt; A simple &lt;code&gt;===&lt;/code&gt; check leaks timing information — an attacker can brute-force signatures by measuring how long the comparison takes. &lt;code&gt;timingSafeEqual&lt;/code&gt; always takes the same amount of time regardless of where the strings differ.&lt;/p&gt;

&lt;p&gt;Now wire it into &lt;a href="https://expressjs.com" rel="noopener noreferrer"&gt;Express&lt;/a&gt;. A critical detail: you need the &lt;strong&gt;raw body&lt;/strong&gt; for HMAC validation, not the parsed JSON. Express's &lt;a href="https://expressjs.com/en/api.html#express.json" rel="noopener noreferrer"&gt;&lt;code&gt;json()&lt;/code&gt; middleware&lt;/a&gt; strips the raw body by default — use &lt;a href="https://expressjs.com/en/api.html#express.raw" rel="noopener noreferrer"&gt;&lt;code&gt;express.raw()&lt;/code&gt;&lt;/a&gt; on the webhook route instead.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;express&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;express&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;express&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Store raw body before parsing&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/webhook&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;express&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}));&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/webhook&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;signature&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;x-hub-signature-256&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt; &lt;span class="c1"&gt;// GitHub format&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;rawBody&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Buffer, because of express.raw()&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nf"&gt;verifySignature&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;rawBody&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;signature&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;WEBHOOK_SECRET&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;401&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Invalid signature&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;rawBody&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// Acknowledge immediately — do the work async&lt;/span&gt;
  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;OK&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="nx"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;enqueue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key discipline here: &lt;strong&gt;acknowledge before you process&lt;/strong&gt;. If your business logic takes 2 seconds and the sender times out after 1 second, the sender will mark the delivery as failed, retry it, and you'll receive duplicate events.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 2: An In-Process Job Queue
&lt;/h2&gt;

&lt;p&gt;You don't always need &lt;a href="https://redis.io" rel="noopener noreferrer"&gt;Redis&lt;/a&gt; or &lt;a href="https://bullmq.io" rel="noopener noreferrer"&gt;BullMQ&lt;/a&gt; for a job queue. For a &lt;strong&gt;single, persistent Node.js process&lt;/strong&gt;, an in-process queue with controlled concurrency is enough — and it's simpler to reason about.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;⚠️ &lt;strong&gt;Limitations to understand before using this pattern:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Jobs are lost on restart.&lt;/strong&gt; If your process crashes or is redeployed while events are queued, those jobs disappear silently. There is no persistence.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Not shared across instances.&lt;/strong&gt; If you run multiple server instances (behind a load balancer, in a cluster, or in any horizontally scaled setup), each instance has its own queue. Events are not distributed or deduplicated across them.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If either of those constraints is a problem for your use case, go straight to a real queue like &lt;a href="https://bullmq.io" rel="noopener noreferrer"&gt;BullMQ&lt;/a&gt; or &lt;a href="https://aws.amazon.com/sqs/" rel="noopener noreferrer"&gt;AWS SQS&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;WebhookQueue&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;concurrency&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;maxRetries&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;queue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;running&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;concurrency&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;concurrency&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;maxRetries&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;maxRetries&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nf"&gt;enqueue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;attempts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;drain&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nf"&gt;drain&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;running&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;concurrency&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;job&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;shift&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
      &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;running&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="k"&gt;finally&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;running&lt;/span&gt;&lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;drain&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// pick up the next job&lt;/span&gt;
      &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;handleEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;attempts&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;attempts&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;maxRetries&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;delay&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;backoff&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;attempts&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Retrying event &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; in &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;delay&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;ms (attempt &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;attempts&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;)`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
          &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;drain&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="nx"&gt;delay&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Event &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; failed after &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;maxRetries&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; attempts`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="c1"&gt;// Send to dead-letter store, alert, etc.&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nf"&gt;backoff&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;attempt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Exponential backoff with jitter&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;base&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="nx"&gt;attempt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;30000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;jitter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;base&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;jitter&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;queue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;WebhookQueue&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;concurrency&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;maxRetries&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
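&lt;p&gt;The first limitation — jobs lost on restart — can be softened for &lt;em&gt;orderly&lt;/em&gt; shutdowns by draining in-flight work before exiting. This is only a sketch under assumptions: &lt;code&gt;waitForIdle&lt;/code&gt; is a hypothetical helper that polls the queue's &lt;code&gt;running&lt;/code&gt; and &lt;code&gt;queue&lt;/code&gt; fields, and it does nothing for a hard crash.&lt;/p&gt;

```javascript
// Sketch: wait for in-flight and queued jobs to finish before shutdown.
// Softens (but does not eliminate) the lost-on-restart limitation —
// a hard crash still drops jobs. waitForIdle is a hypothetical helper
// that polls the WebhookQueue's counters until idle or timeout.
function waitForIdle(queue, timeoutMs = 10000) {
  return new Promise((resolve) => {
    const start = Date.now();
    const tick = () => {
      const idle = queue.running === 0 && queue.queue.length === 0;
      if (idle || Date.now() - start > timeoutMs) return resolve(idle);
      setTimeout(tick, 100); // poll again shortly
    };
    tick();
  });
}

// Usage with the WebhookQueue instance:
// process.on('SIGTERM', async () => {
//   const drained = await waitForIdle(queue);
//   if (!drained) console.warn('Exiting with jobs still pending');
//   process.exit(0);
// });
```

&lt;p&gt;Pair this with a deploy strategy that sends &lt;code&gt;SIGTERM&lt;/code&gt; and waits before killing the process, or the drain window never runs.&lt;/p&gt;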



&lt;p&gt;The &lt;code&gt;backoff&lt;/code&gt; method uses &lt;strong&gt;exponential backoff with jitter&lt;/strong&gt;. Without jitter, all retrying jobs fire at the same moment and create a &lt;a href="https://en.wikipedia.org/wiki/Thundering_herd_problem" rel="noopener noreferrer"&gt;thundering herd&lt;/a&gt;. Adding a random jitter spreads the load. See &lt;a href="https://aws.amazon.com/blogs/architecture/exponential-backoff-and-jitter/" rel="noopener noreferrer"&gt;AWS's writeup on backoff and jitter&lt;/a&gt; for a deeper look at why this matters at scale.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5s98kqkanagh7up76s08.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5s98kqkanagh7up76s08.png" alt="Exponential backoff with jitter — delay per retry attempt" width="800" height="409"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 3: The Event Handler
&lt;/h2&gt;

&lt;p&gt;This is where your actual business logic lives. Keep it focused — one function per event type.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;handleEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;switch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;payment.succeeded&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;handlePaymentSucceeded&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user.created&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;handleUserCreated&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;default&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
      &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Unhandled event type: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;handlePaymentSucceeded&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// e.g., upgrade account, send receipt, update DB&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;orderId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;paid&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;emailService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sendReceipt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;customerEmail&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
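&lt;p&gt;As the number of event types grows, the &lt;code&gt;switch&lt;/code&gt; can be swapped for a lookup table, so adding a type becomes a one-line registration. A sketch with stub handlers — wire in the real functions from above:&lt;/p&gt;

```javascript
// Sketch: a handler map as an alternative to the switch statement.
// The handler bodies here are placeholders for the real logic.
const handlers = {
  'payment.succeeded': async (data) => { /* upgrade account, send receipt */ },
  'user.created': async (data) => { /* provision workspace, send welcome */ },
};

async function handleEvent(event) {
  const handler = handlers[event.type];
  if (!handler) {
    // Unknown types are logged and ignored, matching the switch's default case
    console.log(`Unhandled event type: ${event.type}`);
    return;
  }
  await handler(event.data);
}
```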






&lt;h2&gt;
  
  
  Step 4: Idempotency
&lt;/h2&gt;

&lt;p&gt;Webhook senders &lt;em&gt;will&lt;/em&gt; send duplicates. Network timeouts, retries on their end, and at-least-once delivery guarantees mean you'll see the same event ID more than once.&lt;/p&gt;

&lt;p&gt;Your handler needs to be &lt;strong&gt;idempotent&lt;/strong&gt; — processing the same event twice should have the same effect as processing it once.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;processedEvents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// Use Redis in production&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;handleEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;processedEvents&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;has&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Skipping duplicate event: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;processedEvents&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;switch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// ... your handlers&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In production, replace the in-memory &lt;code&gt;Set&lt;/code&gt; with a &lt;a href="https://redis.io" rel="noopener noreferrer"&gt;Redis&lt;/a&gt; &lt;code&gt;SET NX EX&lt;/code&gt; call via &lt;a href="https://github.com/redis/ioredis" rel="noopener noreferrer"&gt;ioredis&lt;/a&gt; so idempotency survives process restarts. Note that this claims the event &lt;em&gt;before&lt;/em&gt; the handler runs, so a crash mid-handler means the sender's retry will be skipped; if that's unacceptable for an event type, record completion after processing instead:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ioredis&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;isAlreadyProcessed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;eventId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// SET key value NX EX seconds&lt;/span&gt;
  &lt;span class="c1"&gt;// NX = only set if not exists; EX = expire after 24h&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`event:&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;eventId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;NX&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;EX&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;86400&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// null means the key already existed&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;handleEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;isAlreadyProcessed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="c1"&gt;// process...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
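&lt;p&gt;If Redis isn't wired up yet, an intermediate step is a TTL-capped in-memory store: unlike the plain &lt;code&gt;Set&lt;/code&gt;, it doesn't grow without bound, though it is still per-process and lost on restart. A sketch — the class name is illustrative:&lt;/p&gt;

```javascript
// Sketch: TTL-capped in-memory dedupe. Avoids the unbounded growth of a
// plain Set, but still per-process and not crash-safe.
class SeenEvents {
  constructor(ttlMs = 24 * 60 * 60 * 1000) {
    this.ttlMs = ttlMs;
    this.seen = new Map(); // eventId -> first-seen timestamp
  }

  // Returns true if eventId was already seen within the TTL window;
  // otherwise records it and returns false.
  check(eventId) {
    const now = Date.now();
    // Lazily evict expired entries. Map iterates in insertion order,
    // so we can stop at the first entry that hasn't expired yet.
    for (const [id, ts] of this.seen) {
      if (now - ts > this.ttlMs) this.seen.delete(id);
      else break;
    }
    if (this.seen.has(eventId)) return true;
    this.seen.set(eventId, now);
    return false;
  }
}
```

&lt;p&gt;Usage mirrors the &lt;code&gt;Set&lt;/code&gt; version: &lt;code&gt;if (seen.check(event.id)) return;&lt;/code&gt; at the top of the handler.&lt;/p&gt;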






&lt;h2&gt;
  
  
  Step 5: Putting It All Together
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;express&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;express&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;crypto&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;crypto&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;express&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/webhook&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;express&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}));&lt;/span&gt;

&lt;span class="c1"&gt;// --- Signature verification ---&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;verifySignature&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;signature&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;secret&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;expected&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;crypto&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createHmac&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sha256&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;secret&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;digest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hex&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;expectedBuffer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`sha256=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;expected&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sigBuffer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;signature&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;expectedBuffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="nx"&gt;sigBuffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;crypto&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;timingSafeEqual&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;expectedBuffer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;sigBuffer&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// --- Queue ---&lt;/span&gt;
&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;WebhookQueue&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;concurrency&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;maxRetries&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;queue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;running&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;concurrency&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;concurrency&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;maxRetries&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;maxRetries&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nf"&gt;enqueue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;attempts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;drain&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nf"&gt;drain&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;running&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;concurrency&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;job&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;shift&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
      &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;running&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="k"&gt;finally&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;running&lt;/span&gt;&lt;span class="o"&gt;--&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;drain&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;handleEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;attempts&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;attempts&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;maxRetries&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;delay&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;attempts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;30000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;drain&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="nx"&gt;delay&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Dead letter: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;queue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;WebhookQueue&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// --- Idempotency ---&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;processed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// --- Handler ---&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;handleEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;processed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;has&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nx"&gt;processed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Processing event: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; (&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;)`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="c1"&gt;// your business logic here&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// --- Route ---&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/webhook&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;x-hub-signature-256&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nf"&gt;verifySignature&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;sig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;WEBHOOK_SECRET&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;401&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Unauthorized&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;OK&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// acknowledge immediately&lt;/span&gt;
  &lt;span class="nx"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;enqueue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;listen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Webhook server listening on :3000&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  When to Upgrade to a Real Queue
&lt;/h2&gt;

&lt;p&gt;The in-process queue above is acceptable for &lt;strong&gt;a single persistent process with moderate throughput&lt;/strong&gt; — think a low-traffic internal tool or a side project where restarts are rare and you run one instance. You'll want to graduate to &lt;a href="https://bullmq.io" rel="noopener noreferrer"&gt;BullMQ&lt;/a&gt; (Redis-backed) or &lt;a href="https://aws.amazon.com/sqs/" rel="noopener noreferrer"&gt;AWS SQS&lt;/a&gt; when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You're running &lt;strong&gt;multiple server instances&lt;/strong&gt; (in-process state won't be shared)&lt;/li&gt;
&lt;li&gt;You need &lt;strong&gt;event history&lt;/strong&gt; and visibility into failed jobs&lt;/li&gt;
&lt;li&gt;Your event volume exceeds a few hundred per minute consistently&lt;/li&gt;
&lt;li&gt;You need &lt;strong&gt;scheduled retries&lt;/strong&gt; that survive process restarts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The good news: the handler logic above (&lt;code&gt;handleEvent&lt;/code&gt;, idempotency, backoff) carries over directly. You're just swapping the queue substrate.&lt;/p&gt;
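&lt;p&gt;As a rough sketch of that swap (assuming the &lt;code&gt;bullmq&lt;/code&gt; package and a Redis instance on localhost; the option names follow BullMQ's documented API, but treat this as a starting point rather than drop-in code):&lt;/p&gt;

```javascript
// Sketch only: the same handleEvent from above, moved onto BullMQ.
// Assumes `npm install bullmq` and Redis running on localhost:6379.
const { Queue, Worker } = require('bullmq');

const connection = { host: 'localhost', port: 6379 };
const webhookQueue = new Queue('webhooks', { connection });

// Producer side: what queue.enqueue(event) becomes in the route handler.
async function enqueue(event) {
  await webhookQueue.add(event.type, event, {
    jobId: event.id,   // duplicate deliveries with the same id are skipped while queued
    attempts: 5,       // replaces maxRetries
    backoff: { type: 'exponential', delay: 1000 }, // replaces the manual setTimeout
  });
}

// Consumer side: concurrency moves into worker options; retries are automatic.
const worker = new Worker(
  'webhooks',
  async function (job) { await handleEvent(job.data); },
  { connection, concurrency: 3 }
);
```

&lt;p&gt;Retries, backoff, and concurrency become queue configuration instead of hand-rolled logic, and the &lt;code&gt;jobId&lt;/code&gt; option adds a layer of de-duplication at the queue level on top of your idempotency check.&lt;/p&gt;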

&lt;p&gt;Webhooks are one of those things that look simple until they aren't. Getting these five concerns right means you can receive events reliably at scale — without losing data, without duplicating side effects, and without taking down your server under a burst of retries.&lt;/p&gt;

&lt;p&gt;If you're building something that relies on real-time event delivery, these patterns are worth getting right from the start.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;What's your webhook setup look like? Drop a comment — especially if you've found a gotcha I haven't covered.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>node</category>
      <category>beginners</category>
    </item>
    <item>
      <title>Your Social Media Content Marketing is Failing. Here's Why</title>
      <dc:creator>Dumebi Okolo</dc:creator>
      <pubDate>Wed, 01 Apr 2026 11:30:00 +0000</pubDate>
      <link>https://dev.to/dumebii/your-launch-post-got-4-likes-your-product-deserved-better-hmb</link>
      <guid>https://dev.to/dumebii/your-launch-post-got-4-likes-your-product-deserved-better-hmb</guid>
<description>&lt;p&gt;I'll open this article with my own experience, retold. &lt;/p&gt;

&lt;p&gt;You've spent six weeks building something real. You merged the final PR at 11pm on a Thursday. You pushed to production. You watched the deployment logs scroll clean. And then you did what every builder does: you opened Twitter, typed something like &lt;em&gt;"Just shipped [thing]. Super excited to share this with everyone 🚀"&lt;/em&gt;, hit post, and went to bed.&lt;/p&gt;

&lt;p&gt;You woke up to four likes. Two of them were your teammates.&lt;/p&gt;

&lt;p&gt;The product was solid. The problem it solved was real. But the post? The post was invisible.&lt;/p&gt;

&lt;p&gt;Here's the thing nobody tells you when you're deep in the build: &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;shipping is only half the work.&lt;/strong&gt; &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The other half is making people care. And most technical founders, developers, and DevRel professionals are running that half on empty.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F08fshagxk47lhwnzn2rx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F08fshagxk47lhwnzn2rx.png" alt="chat vs ozigi" width="800" height="419"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Gap Between Building and Being Seen
&lt;/h2&gt;

&lt;p&gt;There's a particular kind of frustration that lives in technical communities: the frustration of people who are genuinely doing interesting things and can't seem to get traction on any of it.&lt;/p&gt;

&lt;p&gt;It's not imposter syndrome. It's just a distribution problem.&lt;/p&gt;

&lt;p&gt;The builders who get seen aren't always the ones building better things, sadly. They're just the ones better at translating what they build into content that lands. Content that makes someone stop mid-scroll and think "wait, this is exactly my problem," or "this is a pain point I have."&lt;/p&gt;

&lt;p&gt;That translation layer is what most technical people skip, rush, or outsource badly.&lt;/p&gt;

&lt;p&gt;A &lt;a href="https://stateofdeveloperrelations.com" rel="noopener noreferrer"&gt;2024 State of DevRel report&lt;/a&gt; found that content creation consistently ranks as one of the top three time drains for developer advocates. This is not because they don't know what to write, but because the gap between "having something worth saying" and "saying it in a way that resonates" is a lot wider than most people expect.&lt;/p&gt;

&lt;p&gt;For founders, it's worse. You're building, selling, hiring, and doing customer calls, and somewhere in that schedule, you're supposed to be producing thought leadership content that grows your personal brand and drives top-of-funnel awareness. It rarely happens at the level it should.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Your Regular AI Doesn't Work
&lt;/h2&gt;

&lt;p&gt;The obvious answer is AI. You paste your notes into ChatGPT, ask it to write a LinkedIn post, and get something back that technically covers the topic. You post it, and nothing happens. No traction.&lt;/p&gt;

&lt;p&gt;It wasn't that the output was wrong. It was just generic. And generic content in technical communities doesn't just underperform; it actively damages credibility.&lt;/p&gt;

&lt;p&gt;Developers, content folks, and DevRel professionals are some of the most discerning readers on the internet. They can spot templated, buzzword-heavy content in seconds. The moment a post opens with &lt;em&gt;"In today's fast-paced digital landscape"&lt;/em&gt; or promises to &lt;em&gt;"delve into the nuances"&lt;/em&gt; of anything, it's already dead on arrival.&lt;/p&gt;

&lt;p&gt;The problem isn't that AI tools can't write. It's just that most of them default to the statistical mean of their training data, which is saturated with corporate documentation, SEO copy, and marketing fluff. The output sounds like everybody. It sounds like nobody in particular.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkolocxej7ycug3jx36vq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkolocxej7ycug3jx36vq.png" alt="statiscal mean" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;What you need isn't just generated content. You need generated content that sounds like you: content written with your specific technical depth, your actual voice, your real opinion.&lt;/p&gt;

&lt;p&gt;Tools like &lt;a href="https://ozigi.app" rel="noopener noreferrer"&gt;Ozigi&lt;/a&gt; approach this differently. Instead of asking the AI to "write professionally" (a soft suggestion it ignores), Ozigi enforces a hard blocklist of AI-default vocabulary at the API level (words like &lt;em&gt;delve, robust, seamlessly, tapestry&lt;/em&gt;), forcing the model to construct sentences from your actual content rather than padding with filler. The output reads less like a press release and more like a Slack message from someone who actually built the thing. You can read exactly how that system works in the &lt;a href="https://ozigi.app/docs/the-banned-lexicon" rel="noopener noreferrer"&gt;Banned Lexicon deep dive&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;But the tool is only part of the answer. The bigger problem is structural.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Real Reason Your Content Isn't Working
&lt;/h2&gt;

&lt;p&gt;Most builders (myself included, until recently) treat content like a release: something that happens once, at the end, when the thing is done.&lt;/p&gt;

&lt;p&gt;That mental model is the root cause of most distribution failure.&lt;/p&gt;

&lt;p&gt;Content that builds an audience doesn't work like a product launch. It works like compounding interest. A single post doesn't build a following; a consistent body of work does. A steady posting habit, over time, signals to your audience that you're a reliable source of something worth reading.&lt;/p&gt;

&lt;p&gt;The builders who seem to "go viral" on X or LinkedIn aren't getting lucky. They've usually been shipping content consistently for long enough that when one post breaks through, there's a body of work behind it that converts interest into followers, followers into readers, and readers into users.&lt;/p&gt;

&lt;p&gt;So the real question isn't &lt;em&gt;"how do I write a better launch post?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;It's &lt;em&gt;"how do I build a content system I can actually sustain?"&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What a Sustainable Technical Content System Looks Like
&lt;/h2&gt;

&lt;p&gt;Here's the framework. It's not complicated, but it requires treating content like an engineering problem — which, if you're reading this, is probably how you think best anyway.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Raw material is everywhere. Stop waiting for inspiration.
&lt;/h3&gt;

&lt;p&gt;Every week you're producing more content-worthy material than you realize:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PRs you merged and the decisions behind them&lt;/li&gt;
&lt;li&gt;A bug that took you three hours to track down&lt;/li&gt;
&lt;li&gt;A meeting where a customer said something that reframed how you think about the product&lt;/li&gt;
&lt;li&gt;A library you tried that didn't work the way the docs said it would&lt;/li&gt;
&lt;li&gt;An architectural decision you almost made and didn't&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of this requires you to sit down and think of something to write about. It requires you to notice that what's already happening in your work is interesting to other people.&lt;/p&gt;

&lt;p&gt;The shift is from treating content creation as a separate creative task to treating it as a documentation habit. You're already doing the work. You just need a system to capture it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://ozigi.app" rel="noopener noreferrer"&gt;Ozigi&lt;/a&gt; is built around this principle. You drop in a URL, a block of raw notes, even a PDF, an audio, transcript, basically any piece of information you have at your disposal, and the engine extracts the narrative structure without you needing to summarize or clean it first. That's what the &lt;a href="https://ozigi.app/docs/multimodal-pipeline" rel="noopener noreferrer"&gt;multimodal ingestion pipeline&lt;/a&gt; is built to do: collapse the friction between "I have something worth saying" and "I have a draft worth editing" down to seconds.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Platform matters more than most people think.
&lt;/h3&gt;

&lt;p&gt;A LinkedIn post and an X thread about the same topic are not the same content. They're different formats, different reader expectations, different hooks, different lengths.&lt;/p&gt;

&lt;p&gt;LinkedIn readers expect context and narrative. They'll read three paragraphs before deciding if they care. X readers decide in one sentence, often the first one. Discord announcements need to be skimmable. Newsletters can go long, but they need a reason to exist beyond "here's what I built."&lt;/p&gt;

&lt;p&gt;Most people write one thing and paste it across platforms unchanged. The format stays the same but engagement falls because the content doesn't match where it's landing.&lt;/p&gt;

&lt;p&gt;A proper content system produces platform-native output from the same source material. Your one insight (the rate-limiting decision, the architecture tradeoff, the customer-discovery finding) becomes a thread on X, a narrative on LinkedIn, a community update in Discord or Slack, and a newsletter deep-dive. Each piece is formatted for the expectations of its audience, not copy-pasted from the others.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Your voice is the most important part of your content.
&lt;/h3&gt;

&lt;p&gt;Anyone can write about Next.js caching. Anyone can explain what a webhook is. But only you can explain those things with your specific perspective, your specific context, the way you'd describe it to a colleague over lunch.&lt;/p&gt;

&lt;p&gt;That voice — built over hundreds of posts — is what makes people follow &lt;em&gt;you&lt;/em&gt; and not just &lt;em&gt;the topic.&lt;/em&gt; It's what turns a reader into someone who shows up every time you post because they trust it'll be worth their time.&lt;/p&gt;

&lt;p&gt;That voice is also what AI strips out by default. The generic output problem isn't just an aesthetics issue. Every time you publish something that sounds like it came from a template, you're forfeiting the one thing that can't be replicated: the specific way you think about something.&lt;/p&gt;

&lt;p&gt;This is why &lt;a href="https://ozigi.app/docs/system-personas" rel="noopener noreferrer"&gt;Ozigi's System Personas&lt;/a&gt; go beyond setting a "tone." Instead of prompting "write professionally," you define a character: your technical depth, your sentence rhythm, the phrases you actually use, the things you'd never say. That brief gets applied to every piece of content the engine generates, which means every draft is already shaped like you before you touch the edit button.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. The 10% rule: the tool gets you 90, you own the rest.
&lt;/h3&gt;

&lt;p&gt;The honest truth about AI-assisted content is that any decent engine can get you 90% of the way there. The last 10% is yours, and it's the part that actually matters.&lt;/p&gt;

&lt;p&gt;That 90% is structure, platform formatting, tone calibration, cutting the filler. Generative AI can handle that by default.&lt;/p&gt;

&lt;p&gt;The 10% is "the specific number from your metrics dashboard" or that inside joke the AI doesn't know about, the anecdote from your last customer call, or the offhand observation that only makes sense if you know your history with this problem. The exact phrasing you'd use if you were explaining this to a friend at 11pm.&lt;/p&gt;

&lt;p&gt;That 10% is what makes content trustworthy. It's what makes someone share it instead of just scrolling past it. And it's irreplaceable because it comes from actually having done the thing.&lt;/p&gt;

&lt;p&gt;The mistake most people make with AI writing tools is expecting the full 100%. When the output is 90% of the way there, they feel cheated. &lt;/p&gt;

&lt;p&gt;The better mental model is this: &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;you're not outsourcing the writing. You're outsourcing the blank page.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Ozigi's editing layer is built around exactly this split. Every campaign lands in a staging area — nothing goes live until you've reviewed it. &lt;a href="https://ozigi.app/docs/human-in-the-loop" rel="noopener noreferrer"&gt;The human-in-the-loop architecture&lt;/a&gt; keeps generation and publishing strictly separate, so you're always the last step before your content reaches your audience.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs2c8ajezdod67inkwfng.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs2c8ajezdod67inkwfng.png" alt="ozigi's edit area" width="800" height="507"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Compounding Effect Nobody Talks About
&lt;/h2&gt;

&lt;p&gt;Here's what happens when you run a consistent content system for six months:&lt;/p&gt;

&lt;p&gt;Your posts start referencing each other. Your audience starts anticipating what you'll say next. When you ship something new, you have enough readers that the launch post gets signal on day one, which means it gets distributed further, which means more people see it.&lt;/p&gt;

&lt;p&gt;So getting four likes on your launch post isn't a content-quality problem. It's a consistency problem: you posted into a vacuum because you hadn't been posting consistently enough to have an audience ready when it mattered.&lt;/p&gt;

&lt;p&gt;The builders who seem to "have an audience already" when they ship something new didn't get lucky. &lt;br&gt;
I know a founder on X who ran a 100-day posting challenge before his product launch. He hit $500 in sales in the first week. He already had an audience.&lt;br&gt;
He paid the consistency debt early. He posted about the messy in-progress version, the failed experiments, the decisions he made and unmade. By the time he shipped, the audience was already there.&lt;/p&gt;

&lt;p&gt;Content marketing for technical audiences is a long game. The best time to start was six months ago. The second-best time is right now with a system that makes it sustainable enough to actually keep going.&lt;/p&gt;




&lt;h2&gt;
  
  
  Start Small. Ship Consistently.
&lt;/h2&gt;

&lt;p&gt;You don't need to produce ten pieces of content a week. You don't need a content calendar with color-coded categories and quarterly themes.&lt;/p&gt;

&lt;p&gt;You need one piece of content per week that comes from something you actually did, written in a voice that sounds like you, distributed to the platforms where your audience actually is.&lt;/p&gt;

&lt;p&gt;That's the whole system.&lt;/p&gt;

&lt;p&gt;The tools exist to make it easier. The only thing without a shortcut is starting.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;If you're a technical founder, developer, or DevRel professional trying to build a consistent content presence without it eating your calendar — &lt;a href="https://ozigi.app" rel="noopener noreferrer"&gt;Ozigi&lt;/a&gt; is worth trying.&lt;/strong&gt; The free tier gives you 5 campaigns a month. Drop in your raw notes from last week, see what comes out, and decide from there. Get one week of Pro free when you sign up today!&lt;/p&gt;

&lt;p&gt;→ &lt;a href="https://ozigi.app" rel="noopener noreferrer"&gt;Try Ozigi free&lt;/a&gt; · &lt;a href="https://ozigi.app/docs" rel="noopener noreferrer"&gt;Read the platform docs&lt;/a&gt; · &lt;a href="https://ozigi.app/docs/deep-dives" rel="noopener noreferrer"&gt;See the architecture deep dives&lt;/a&gt; · &lt;a href="https://github.com/Ozigi-app/OziGi" rel="noopener noreferrer"&gt;Star on GitHub&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Have a content system that's actually working for you? Or a launch post that flopped spectacularly and taught you something? Drop it in the comments — genuinely curious what patterns people are seeing.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>devrel</category>
      <category>contentwriting</category>
      <category>ai</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Gemini 2.5 Flash vs Claude 3.7 Sonnet: 4 Production Constraints That Made the Decision for Me</title>
      <dc:creator>Dumebi Okolo</dc:creator>
      <pubDate>Tue, 10 Mar 2026 13:00:34 +0000</pubDate>
      <link>https://dev.to/dumebii/gemini-25-flash-vs-claude-37-sonnet-4-production-constraints-that-made-the-decision-for-me-bib</link>
      <guid>https://dev.to/dumebii/gemini-25-flash-vs-claude-37-sonnet-4-production-constraints-that-made-the-decision-for-me-bib</guid>
<description>&lt;p&gt;An evaluation of the Gemini 2.5 Flash and Claude 3.7 Sonnet models for an agentic engine.&lt;/p&gt;

&lt;p&gt;I had a simple rule when choosing an LLM for &lt;a href="https://ozigi.app" rel="noopener noreferrer"&gt;Ozigi&lt;/a&gt;: don't pick based on benchmark leaderboards. While gathering feedback after my v2 launch, a user suggested I use the Claude models, arguing they were better for content generation than Gemini. The suggestion sounded tempting, but I had to pick a model based on the four constraints my production pipeline couldn't negotiate around.&lt;/p&gt;

&lt;p&gt;Most "Gemini vs Claude" comparisons evaluate general-purpose capabilities like coding, reasoning, and creative writing. That's useful if you're building a general-purpose product. &lt;br&gt;
I wasn't. &lt;br&gt;
Ozigi is a content engine. You feed it a URL, a PDF, or raw notes. It returns a structured 3-day social media campaign as a JSON payload that the frontend maps directly into UI cards.&lt;/p&gt;

&lt;p&gt;That specificity made the evaluation easier than I expected: two models, four constraints, one clear winner on three of them.&lt;/p&gt;

&lt;p&gt;This is the third post in the &lt;a href="https://dev.to/dumebii/series/36170"&gt;Ozigi Changelog Series&lt;/a&gt;. If you want the backstory on why Ozigi exists, start with &lt;a href="https://dev.to/dumebii/i-vibe-coded-an-internal-tool-that-slashed-my-content-workflow-by-4-hours-310f"&gt;how I vibe-coded the internal tool&lt;/a&gt; that became it, and the &lt;a href="https://dev.to/dumebii/ozigi-v2-changelog-building-a-modular-agentic-content-engine-with-nextjs-supabase-and-playwright-59mo"&gt;v2 changelog&lt;/a&gt; that introduced the modular architecture this decision was built on.&lt;/p&gt;

&lt;p&gt;Here's the full Architecture Decision Record.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Setup: What the Pipeline Actually Does
&lt;/h2&gt;

&lt;p&gt;The core API route in Ozigi does this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Accepts a &lt;code&gt;multipart/form-data&lt;/code&gt; payload containing a URL, raw text, and/or a file (PDF or image)&lt;/li&gt;
&lt;li&gt;Constructs a prompt with strict editorial constraints injected at the system level&lt;/li&gt;
&lt;li&gt;Sends everything to the LLM via the &lt;a href="https://cloud.google.com/vertex-ai/docs/start/client-libraries" rel="noopener noreferrer"&gt;Vertex AI Node.js SDK&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Returns the raw text response directly to the client&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The frontend then does this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;parsed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;responseText&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nf"&gt;setCampaign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;campaign&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No middleware. No schema validation. No error recovery in the happy path. Raw parse, straight into React state.&lt;/p&gt;

&lt;p&gt;That single line is why model selection mattered.&lt;/p&gt;




&lt;h2&gt;
  
  
  Constraint 1: Comparing Gemini vs Claude Models for JSON Output Stability
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The requirement:&lt;/strong&gt; The model must return a valid JSON object — every time, without wrapping it in markdown code fences, without adding a conversational preamble, and without hallucinating a trailing comma that breaks &lt;code&gt;JSON.parse()&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The target schema looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"campaign"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"day"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"x"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"linkedin"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"discord"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"day"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"x"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"linkedin"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"discord"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"day"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"x"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"linkedin"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"discord"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's nine posts across three platforms over three days, with every field required. &lt;br&gt;
The UI renders each field into a separate card with edit, copy, and publish actions. A missing key doesn't throw a visible error — it silently renders an empty card.&lt;br&gt;
This comparison is specifically between Gemini with &lt;code&gt;responseSchema&lt;/code&gt; enforcement and Claude with prompted JSON, not between each model's structural output ceiling. Claude's tool use with &lt;code&gt;tool_choice: {type: "tool"}&lt;/code&gt; enforces schema at the decoding layer and can reach equivalent reliability. The relevant constraint here was which enforcement mechanism was available and practical within my existing stack. More on that below.&lt;br&gt;
I ran 500 automated test generations against both models targeting this schema, measuring the percentage of responses that &lt;code&gt;JSON.parse()&lt;/code&gt; accepted without exceptions.&lt;/p&gt;
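&lt;p&gt;The per-response check in that harness reduces to a strict parse-and-validate step. Here's a minimal sketch (the function name is illustrative, not Ozigi's code): a response counts as adherent only if &lt;code&gt;JSON.parse()&lt;/code&gt; succeeds and every required key is present on all three days.&lt;/p&gt;

```typescript
// Sketch of the per-generation adherence check from the 500-run harness.
// A response passes only if it parses cleanly and matches the campaign
// schema exactly. (Illustrative; the real harness lives outside the repo.)
type CampaignDay = { day: number; x: string; linkedin: string; discord: string };

function isFormatAdherent(raw: string): boolean {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw); // markdown fences or preambles fail right here
  } catch {
    return false;
  }
  const campaign = (parsed as { campaign?: unknown }).campaign;
  if (!Array.isArray(campaign)) return false;
  if (campaign.length !== 3) return false;
  // Every day must carry all four required fields with the right types.
  return campaign.every((d: CampaignDay, i: number) => {
    if (d?.day !== i + 1) return false;
    if (typeof d.x !== "string") return false;
    if (typeof d.linkedin !== "string") return false;
    return typeof d.discord === "string";
  });
}
```

&lt;p&gt;A response wrapped in a markdown code fence fails at the very first step, which is exactly the failure mode behind the prompted-JSON numbers below.&lt;/p&gt;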

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Format Adherence Rate&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Gemini 2.5 Flash&lt;/td&gt;
&lt;td&gt;99.9%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude 3.7 Sonnet (prompted)&lt;/td&gt;
&lt;td&gt;~88.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5sbgjoan2io2r0usee0f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5sbgjoan2io2r0usee0f.png" alt="Bar chart: Gemini 2.5 Flash 99.9% vs Claude 3.7 Sonnet 88.5% JSON parse success rate across 500 test generations." width="800" height="419"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The 11.5% gap maps directly to broken UI states for real users. That was not acceptable to me for a core feature.&lt;/p&gt;

&lt;p&gt;Using Gemini's &lt;code&gt;responseSchema&lt;/code&gt; closes this entirely. According to &lt;a href="https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/control-generated-output" rel="noopener noreferrer"&gt;Google's controlled generation documentation&lt;/a&gt;, the feature physically prevents the model from returning output that doesn't conform to your schema. It's not prompt-level guidance; it's enforced at the decoding layer. Here's what the production implementation looks like for Ozigi: the schema is defined once at the top of the route and attached directly to the model config:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;distributionSchema&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;OBJECT&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;campaign&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ARRAY&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;A list of 3 daily social media posts.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;OBJECT&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;day&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;INTEGER&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Day number (1, 2, or 3)&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="na"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;STRING&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;  &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Content for X/Twitter.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="na"&gt;linkedin&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;STRING&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;  &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Content for LinkedIn.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="na"&gt;discord&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;STRING&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;  &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Content for Discord.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;day&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;x&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;linkedin&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;discord&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;campaign&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;vertex_ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getGenerativeModel&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;gemini-2.5-flash&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;generationConfig&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;responseMimeType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;responseSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;distributionSchema&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;response.text()&lt;/code&gt; is now structurally guaranteed to be valid JSON. &lt;code&gt;JSON.parse()&lt;/code&gt; cannot fail on a missing field, trailing comma, or conversational preamble — the model is physically prevented from producing them. &lt;br&gt;
Claude's tool use and function calling can achieve similar guarantees, but it requires a meaningfully different integration architecture. With the Vertex SDK, this is one config block.&lt;/p&gt;
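&lt;p&gt;For completeness, here is roughly what that alternative looks like. This is a sketch of an Anthropic Messages API request body with a forced tool call; the tool name and model alias are illustrative, and the &lt;code&gt;input_schema&lt;/code&gt; mirrors &lt;code&gt;distributionSchema&lt;/code&gt; in standard JSON Schema:&lt;/p&gt;

```typescript
// Sketch only: what schema enforcement looks like on the Anthropic side.
// tool_choice: { type: "tool", ... } forces Claude to emit arguments that
// conform to input_schema, enforced at the decoding layer like responseSchema.
// Names ("emit_campaign", the model alias) are illustrative.
const claudeRequest = {
  model: "claude-3-7-sonnet-latest",
  max_tokens: 4096,
  tools: [
    {
      name: "emit_campaign",
      description: "Return a 3-day, 3-platform social campaign.",
      input_schema: {
        type: "object",
        properties: {
          campaign: {
            type: "array",
            items: {
              type: "object",
              properties: {
                day: { type: "integer" },
                x: { type: "string" },
                linkedin: { type: "string" },
                discord: { type: "string" },
              },
              required: ["day", "x", "linkedin", "discord"],
            },
          },
        },
        required: ["campaign"],
      },
    },
  ],
  tool_choice: { type: "tool", name: "emit_campaign" }, // forces the tool call
  messages: [{ role: "user", content: "..." }],
};
// This object would be passed to anthropic.messages.create(claudeRequest),
// and the structured output read back from the tool_use block in the response.
```

&lt;p&gt;Equivalent reliability is reachable this way, but it means a different provider, SDK, and response-parsing path than the single config block Vertex needed.&lt;/p&gt;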

&lt;p&gt;&lt;strong&gt;Winner: Gemini.&lt;/strong&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  Constraint 2: Comparing Gemini vs Claude Latency on a Live Public Sandbox
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The requirement:&lt;/strong&gt; Ozigi has a free, unauthenticated sandbox. Anyone can generate a full 3-day campaign without signing up.&lt;/p&gt;

&lt;p&gt;That changes the economics of model selection completely. A paying user on a premium plan will tolerate a 20-second wait if the output quality justifies it. An anonymous user who found the product via my wacky marketing efforts will not. They'll close the tab at 10 seconds and probably not come back, sadly.&lt;/p&gt;

&lt;p&gt;I benchmarked both models against a standard 10,000-token input payload via Vercel serverless functions (my production environment):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Avg Response Time&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Gemini 2.5 Flash&lt;/td&gt;
&lt;td&gt;~6.2s&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude 3.7 Sonnet&lt;/td&gt;
&lt;td&gt;~21.5s&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd88o8by58f78rzbzxqdl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd88o8by58f78rzbzxqdl.png" alt="Bar chart: Gemini 2.5 Flash 6.2s vs Claude 3.7 Sonnet 21.5s average response latency from Vercel serverless, with 10s tab-close threshold marked" width="800" height="522"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Methodology: N=100 requests per model, measured end-to-end from Vercel function invocation to full response. Results are environment-dependent and intended for directional comparison, not as absolute benchmarks.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The gap holds across payload sizes. Gemini Flash consistently stays within the 10-15 second range or below; Claude 3.7 Sonnet consistently exceeds 20 seconds on the same inputs, in the same environment.&lt;/p&gt;

&lt;p&gt;This gap would narrow significantly with streaming: getting first tokens in front of the user within 2-3 seconds. Streaming changes the perceived wait time for a user entirely. This is, however, a v4 architecture item that is being worked on. For a non-streaming pipeline with a public sandbox, the 3.5x latency difference is a product decision, not just an engineering one.&lt;/p&gt;
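&lt;p&gt;The streaming shape itself is small. A minimal sketch, assuming the SDK's chunked response is exposed as an async iterable of text: wrap it in a web &lt;code&gt;ReadableStream&lt;/code&gt; so the first tokens reach the browser while the rest is still generating.&lt;/p&gt;

```typescript
// Sketch of the planned streaming path (not shipped yet; a v4 item).
// Wraps any async iterable of text chunks into a ReadableStream that a
// Next.js route handler can return directly as the Response body.
function toReadableStream(chunks: AsyncIterable<string>): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  return new ReadableStream({
    async start(controller) {
      for await (const text of chunks) {
        controller.enqueue(encoder.encode(text)); // flush each chunk immediately
      }
      controller.close();
    },
  });
}
// In the route: return new Response(toReadableStream(textChunks));
```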

&lt;p&gt;&lt;strong&gt;Winner: Gemini Flash&lt;/strong&gt; — and it's not close for non-streaming public sandboxes.&lt;/p&gt;


&lt;h2&gt;
  
  
  Constraint 3: Comparing Gemini vs Claude on Native Multimodal Ingestion
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The requirement:&lt;/strong&gt; Users can upload PDFs and images directly as context. The pipeline needs to process them without an external preprocessing step.&lt;/p&gt;

&lt;p&gt;With Gemini via the &lt;a href="https://cloud.google.com/vertex-ai/docs/start/client-libraries" rel="noopener noreferrer"&gt;Vertex AI Node.js SDK&lt;/a&gt;, the entire PDF pipeline is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// /app/api/generate/route.ts&lt;/span&gt;
&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;file&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;size&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;arrayBuffer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;arrayBuffer&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;base64Data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;arrayBuffer&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;base64&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="nx"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;inlineData&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;base64Data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;mimeType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="kd"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// "application/pdf", "image/jpeg", etc.&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generateContent&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;parts&lt;/span&gt; &lt;span class="p"&gt;}],&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can see that the SDK handles the buffer natively. Gemini reads the PDF directly as part of the multipart request alongside the text prompt — no OCR step, no preprocessing, no separate service call. &lt;a href="https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/overview" rel="noopener noreferrer"&gt;Google's multimodal documentation&lt;/a&gt; confirms that Gemini was designed from the ground up to handle PDF and image buffers natively via &lt;code&gt;inlineData&lt;/code&gt;.&lt;/p&gt;




&lt;p&gt;An earlier version of this article claimed that Claude required an external OCR step for PDF ingestion. That was wrong. Claude's Messages API does support native base64 PDF ingestion directly via a document content block — no OCR preprocessing, no external service. The pattern is structurally similar to Vertex AI's inlineData, just different field names.&lt;br&gt;
The real constraint here was ecosystem, not capability. I evaluated Claude 3.7 Sonnet as available in the Google Model Garden within my existing Vertex AI setup. Switching to Claude's native PDF ingestion would have meant moving to the Anthropic Messages API entirely — a different provider, different SDK, different billing. The Vertex AI path was simpler for the stack I was already running.&lt;br&gt;
&lt;strong&gt;Winner: Gemini, for this stack.&lt;/strong&gt; Both models support native multimodal ingestion without external OCR. The advantage here was ecosystem fit, not a fundamental capability difference.&lt;/p&gt;
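&lt;p&gt;For readers weighing the two paths, here is what Claude's native PDF ingestion looks like as a request body. This is a sketch of the Anthropic document content block format, shown for comparison only; the model alias, placeholder bytes, and prompt text are illustrative.&lt;/p&gt;

```typescript
// Sketch: Claude's native base64 PDF path, structurally parallel to the
// Vertex inlineData block above, just with different field names.
// (Comparison only; Ozigi ships the Vertex AI path.)
const pdfBase64 = Buffer.from("%PDF-1.4 (placeholder bytes)").toString("base64");

const claudePdfRequest = {
  model: "claude-3-7-sonnet-latest",
  max_tokens: 4096,
  messages: [
    {
      role: "user",
      content: [
        {
          type: "document",
          source: { type: "base64", media_type: "application/pdf", data: pdfBase64 },
        },
        { type: "text", text: "Turn this PDF into a 3-day campaign." },
      ],
    },
  ],
};
// Passed to anthropic.messages.create(claudePdfRequest) on the Anthropic SDK.
```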


&lt;h2&gt;
  
  
  Constraint 4: Comparing Google Gemini vs Claude on Tone Engineering
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The requirement:&lt;/strong&gt; Generated social media posts must sound like a human wrote them. Specifically, they must pass AI content detection and avoid the predictable cadence patterns that make AI-generated copy immediately identifiable.&lt;/p&gt;

&lt;p&gt;This is the constraint where Claude wins cleanly on base performance. &lt;br&gt;
Our internal blind A/B evaluations of 50 technical posts (scored on pragmatic sentence structure and absence of AI terminology) gave Claude 3.7 Sonnet a "human cadence quality score" of 9.5/10. Gemini Flash's base score was 5.5/10.&lt;/p&gt;

&lt;p&gt;That's a significant gap. And it's for the feature that is Ozigi's core value proposition.&lt;/p&gt;
&lt;h3&gt;
  
  
  Why use Gemini for Tone Engineering?
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Because the gap is engineerable.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We built the Banned Lexicon — a programmatic constraint injected at the system prompt level that explicitly penalizes the vocabulary patterns that make AI copy detectable. You can read the full implementation in the &lt;a href="https://ozigi.app/docs" rel="noopener noreferrer"&gt;Ozigi documentation&lt;/a&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;THE BANNED LEXICON: You are strictly forbidden from using the 
following words or their variations: delve, testament, tapestry, 
crucial, vital, landscape, realm, unlock, supercharge, revolutionize, 
paradigm, seamlessly, navigate, robust, cutting-edge, game-changer.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Combined with explicit cadence engineering:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;BURSTINESS (CADENCE): Write with high burstiness. Do not use 
perfectly balanced, medium-length sentences. Mix extremely short, 
punchy sentences (2-4 words) with longer, detailed explanations.

PERPLEXITY: Avoid predictable adjectives. Use strong, active verbs 
and concrete nouns. Talk like a pragmatic subject matter expert 
explaining a concept to people, not a marketer selling a product.

FORMATTING RESTRAINT: You are limited to a MAXIMUM of 1 emoji per 
post. Use a maximum of 2 highly relevant hashtags per post.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
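&lt;p&gt;Both blocks are injected at the system level rather than appended to the user prompt. A minimal sketch of the wiring, assuming the Vertex AI SDK's &lt;code&gt;systemInstruction&lt;/code&gt; field; the constant names are illustrative and the strings abbreviate the blocks above.&lt;/p&gt;

```typescript
// Sketch: how the tone constraints attach to the model config.
// Constant names are illustrative; the text mirrors the prompt blocks above.
const BANNED_LEXICON =
  "THE BANNED LEXICON: You are strictly forbidden from using the " +
  "following words or their variations: delve, testament, tapestry, ...";
const CADENCE_RULES =
  "BURSTINESS (CADENCE): Write with high burstiness. ... " +
  "PERPLEXITY: Avoid predictable adjectives. ... " +
  "FORMATTING RESTRAINT: Maximum of 1 emoji and 2 hashtags per post.";

const systemInstruction = {
  role: "system",
  parts: [{ text: [BANNED_LEXICON, CADENCE_RULES].join("\n\n") }],
};
// Attached alongside generationConfig in vertex_ai.getGenerativeModel({...}),
// so every generation carries the constraints without inflating the user prompt.
```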



&lt;p&gt;With these constraints active, Gemini's human cadence score jumps from 5.5 to 9.2 — within acceptable range of Claude's base 9.5.&lt;/p&gt;

&lt;p&gt;The key insight: Claude's tone advantage is a &lt;em&gt;default&lt;/em&gt; advantage, not an &lt;em&gt;absolute&lt;/em&gt; one. Gemini's outputs are more malleable under prompt constraints. For a use case where tone control is the entire product, that malleability is worth more than a higher baseline.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Winner: Gemini + engineering constraints.&lt;/strong&gt; The tone gap is closeable. The latency and JSON stability gaps on the other constraints are not.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F15wn2uuvacy1ws7bldib.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F15wn2uuvacy1ws7bldib.png" alt="Horizontal bar chart: Gemini base 5.5/10 vs Gemini with Banned Lexicon 9.2/10 vs Claude base 9.5/10 human cadence score." width="800" height="571"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Gemini vs Claude Models: The Cost Reality
&lt;/h2&gt;

&lt;p&gt;While Ozigi is a public sandbox, every anonymous page load that triggers a generation is a billable API call absorbed by the product. Ozigi is pre-revenue, so this matters a lot.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Input Cost (per 1M tokens)&lt;/th&gt;
&lt;th&gt;Output Cost (per 1M tokens)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Gemini 2.5 Flash&lt;/td&gt;
&lt;td&gt;~$0.075&lt;/td&gt;
&lt;td&gt;~$0.30&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude 3.7 Sonnet&lt;/td&gt;
&lt;td&gt;~$3.00&lt;/td&gt;
&lt;td&gt;~$15.00&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpxydm83hlu2nh0r7t4ny.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpxydm83hlu2nh0r7t4ny.png" alt="Cost comparison: Gemini $0.075 input / $0.30 output vs Claude $3.00 input / $15.00 output per 1M tokens. 40x to 50x difference." width="800" height="671"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Pricing sourced from &lt;a href="https://cloud.google.com/vertex-ai/generative-ai/pricing" rel="noopener noreferrer"&gt;Google Cloud Vertex AI pricing&lt;/a&gt; and &lt;a href="https://platform.claude.com/docs/en/about-claude/pricing" rel="noopener noreferrer"&gt;Anthropic API pricing&lt;/a&gt;. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Pro tip: Verify current rates before making production decisions; both have changed multiple times in the past year.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The input cost difference is 40x. The output cost difference is 50x. For a free-tier product with no revenue, the ability to run a public sandbox sustainably is the difference between having a conversion funnel and not having one.&lt;/p&gt;
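To make the 40x/50x gap concrete, here is a minimal cost sketch using the rates from the table above. The 2,000-input / 1,500-output token counts per generation are illustrative assumptions, not measured Ozigi numbers, and the rates should be re-verified against current pricing.

```typescript
// Per-1M-token rates from the table above; verify current pricing before relying on them.
const RATES = {
  gemini25Flash: { input: 0.075, output: 0.3 },
  claude37Sonnet: { input: 3.0, output: 15.0 },
};

// Cost in USD of a single generation, given token counts and a rate card.
function generationCost(
  rate: { input: number; output: number },
  inputTokens: number,
  outputTokens: number,
): number {
  return (inputTokens / 1e6) * rate.input + (outputTokens / 1e6) * rate.output;
}

// Hypothetical campaign generation: 2,000 input tokens, 1,500 output tokens.
const gemini = generationCost(RATES.gemini25Flash, 2_000, 1_500);
const claude = generationCost(RATES.claude37Sonnet, 2_000, 1_500);

console.log(`Gemini: $${gemini.toFixed(4)} per call`);   // $0.0006
console.log(`Claude: $${claude.toFixed(4)} per call`);   // $0.0285
console.log(`Ratio: ~${(claude / gemini).toFixed(1)}x`); // ~47.5x
```

At 10,000 anonymous sandbox generations a month, that works out to roughly $6 on Gemini versus $285 on Claude, which is exactly the free-tier sustainability gap described above.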




&lt;h2&gt;
  
  
  Where Ozigi Is Going and How It Would Change My Choice of Model Moving Forward
&lt;/h2&gt;

&lt;p&gt;This is an honest &lt;a href="https://adr.github.io/" rel="noopener noreferrer"&gt;ADR&lt;/a&gt;. Here's what would change my answer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When Ozigi finally moves behind a paywall&lt;/strong&gt;, latency and cost become secondary concerns. A signed-in user on a paid plan waiting 20 seconds for premium output is a different UX calculation than an anonymous user on a free demo. In that context, Claude's base tone quality becomes much more compelling. I'd be trading economics for a better output baseline, and the trade might be worth it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When streaming gets implemented&lt;/strong&gt;, the latency argument against Claude weakens significantly. Claude 3.7 Sonnet's time-to-first-token via streaming is competitive. A user seeing the first post appear in 2-3 seconds experiences the product very differently than a user staring at a progress bar for 21 seconds. Streaming is on the roadmap.&lt;/p&gt;

&lt;p&gt;For an in-depth look at how we tested the pipeline that informs these decisions, see &lt;a href="https://dev.to/dumebii/how-to-e2e-test-ai-agents-mocking-api-responses-with-playwright-in-nextjs-nic"&gt;how we E2E test AI agents with Playwright in Next.js&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Decision Matrix
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Constraint&lt;/th&gt;
&lt;th&gt;Gemini 2.5 Flash&lt;/th&gt;
&lt;th&gt;Claude 3.7 Sonnet&lt;/th&gt;
&lt;th&gt;Winner&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;JSON Stability (responseSchema)&lt;/td&gt;
&lt;td&gt;99.9% → guaranteed&lt;/td&gt;
&lt;td&gt;~88.5% (prompted)&lt;/td&gt;
&lt;td&gt;Gemini&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Latency (non-streaming)&lt;/td&gt;
&lt;td&gt;~6.2s&lt;/td&gt;
&lt;td&gt;~21.5s&lt;/td&gt;
&lt;td&gt;Gemini&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Native PDF/Image ingestion&lt;/td&gt;
&lt;td&gt;Native via Vertex SDK&lt;/td&gt;
&lt;td&gt;Native via Messages API&lt;/td&gt;
&lt;td&gt;Gemini (Ecosystem fit)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Base tone quality&lt;/td&gt;
&lt;td&gt;5.5/10&lt;/td&gt;
&lt;td&gt;9.5/10&lt;/td&gt;
&lt;td&gt;Claude&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tone quality (+ constraints)&lt;/td&gt;
&lt;td&gt;9.2/10&lt;/td&gt;
&lt;td&gt;9.5/10&lt;/td&gt;
&lt;td&gt;Near tie&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost per 1M input tokens&lt;/td&gt;
&lt;td&gt;$0.075&lt;/td&gt;
&lt;td&gt;$3.00&lt;/td&gt;
&lt;td&gt;Gemini&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Gemini won outright on four of the six dimensions, with a near tie on constrained tone. Claude won on one, base tone, and that gap was closeable through prompt engineering.&lt;/p&gt;




&lt;h2&gt;
  
  
  Four Questions to Ask Before Choosing an LLM for Your Agentic Project or App
&lt;/h2&gt;

&lt;p&gt;If you're building something similar to Ozigi, these are the constraints worth working through before you pick an API and start building:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Does your UI depend on structured output?&lt;/strong&gt; If your frontend calls &lt;code&gt;JSON.parse()&lt;/code&gt; on a raw model response, you need API-level schema enforcement, not prompt instructions asking nicely. &lt;a href="https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/control-generated-output" rel="noopener noreferrer"&gt;&lt;code&gt;responseSchema&lt;/code&gt; via Vertex AI&lt;/a&gt;, Claude's tool use with forced &lt;code&gt;tool_choice&lt;/code&gt;, or &lt;a href="https://platform.openai.com/docs/guides/structured-outputs" rel="noopener noreferrer"&gt;structured outputs via OpenAI&lt;/a&gt; all enforce at the decoding layer. The question isn't which model supports it (most do); it's which enforcement path fits your existing stack.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Do you have a free tier or public sandbox?&lt;/strong&gt; If yes, latency and cost are product decisions that affect conversion, not just infrastructure decisions that affect margins.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Does your use case require multimodal inputs?&lt;/strong&gt; Most major models now support native PDF and image ingestion without external preprocessing. Map out what the integration looks like within your existing API provider before assuming you need to switch or add infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Where is the base model weakest, and is that gap engineerable?&lt;/strong&gt; Claude's tone advantage is real. It's also not the only path to human-sounding copy. Engineering constraints at the prompt level can close gaps that feel insurmountable when you're just looking at base benchmarks.&lt;/p&gt;

&lt;p&gt;The best model for your product is rarely the one with the highest aggregate score. It's the one that fails least on the constraints you actually can't work around.&lt;/p&gt;
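To make question 1 concrete, here is a sketch of the shape a Vertex AI-style generation config with a &lt;code&gt;responseSchema&lt;/code&gt; might take for a campaign payload. The schema fields (day, x, linkedin, discord) are illustrative and not Ozigi's actual production schema; check the Vertex AI docs linked above for the exact SDK types.

```typescript
// Illustrative generationConfig in the shape Vertex AI's controlled output accepts.
// The schema constrains decoding itself, so the model cannot emit a payload
// that fails to parse as this shape. Field names here are hypothetical.
const generationConfig = {
  responseMimeType: "application/json",
  responseSchema: {
    type: "OBJECT",
    properties: {
      campaign: {
        type: "ARRAY",
        items: {
          type: "OBJECT",
          properties: {
            day: { type: "INTEGER" },
            x: { type: "STRING" },
            linkedin: { type: "STRING" },
            discord: { type: "STRING" },
          },
          required: ["day", "x", "linkedin", "discord"],
        },
      },
    },
    required: ["campaign"],
  },
};

// With enforcement at the decoding layer, the frontend can JSON.parse()
// the response without a defensive retry/repair chain.
console.log(JSON.stringify(generationConfig.responseSchema.required)); // ["campaign"]
```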




&lt;ul&gt;
&lt;li&gt;The full Ozigi architecture — including the generate API route, the Banned Lexicon implementation, and the Vertex AI configuration — is open source on &lt;a href="https://github.com/Dumebii/OziGi" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;. &lt;/li&gt;
&lt;li&gt;The live context engine is at &lt;a href="https://ozigi.app" rel="noopener noreferrer"&gt;ozigi.app&lt;/a&gt;. &lt;/li&gt;
&lt;li&gt;The interactive version of this ADR, &lt;a href="https://ozigi.app/architecture" rel="noopener noreferrer"&gt;with Chart.js visualisations of each benchmark&lt;/a&gt;, is also live.&lt;/li&gt;
&lt;li&gt;Ozigi is currently looking for user experience testers to give honest feedback on their experience using the product and areas for improvement.&lt;/li&gt;
&lt;li&gt;We have some &lt;a href="https://github.com/Dumebii/OziGi/issues" rel="noopener noreferrer"&gt;open issues&lt;/a&gt; on GitHub that are open to contributions from the community. 
&lt;em&gt;PS: this app has been entirely vibe coded so far, so we welcome vibe-coded contributions too!&lt;/em&gt; &lt;/li&gt;
&lt;li&gt;Connect With Me On &lt;a href="//www.linkedin.com/in/dumebi-okolo"&gt;LinkedIn&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Send me an email on &lt;a href="mailto:okolodumebi@gmail.com"&gt;okolodumebi@gmail.com&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Building something cool? Talk about it in the comments!&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>showdev</category>
      <category>nextjs</category>
    </item>
    <item>
      <title>How to End-to-end (E2E) Test AI Agents: Mocking API Responses with Playwright in Next.js</title>
      <dc:creator>Dumebi Okolo</dc:creator>
      <pubDate>Fri, 06 Mar 2026 12:50:33 +0000</pubDate>
      <link>https://dev.to/dumebii/how-to-e2e-test-ai-agents-mocking-api-responses-with-playwright-in-nextjs-nic</link>
      <guid>https://dev.to/dumebii/how-to-e2e-test-ai-agents-mocking-api-responses-with-playwright-in-nextjs-nic</guid>
      <description>&lt;p&gt;Building an AI agent is fun. At least, I have had so much fun building out &lt;a href="//ozigi.app"&gt;Ozigi&lt;/a&gt;, a social media content manager agent (ps, we are in need of user experience testers!).&lt;/p&gt;

&lt;p&gt;But!&lt;br&gt;
Testing it in a CI/CD pipeline is a nightmare.&lt;/p&gt;

&lt;p&gt;If you are building an application that relies on an LLM (like OpenAI, Anthropic, or Google's Vertex AI), you quickly run into these three challenges when writing End-to-End (E2E) tests:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Cost:&lt;/strong&gt; Every time your test suite runs, you are burning API credits.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Speed:&lt;/strong&gt; LLMs are slow. Waiting 10-15 seconds per test will grind your deployment pipeline to a halt.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Non-Determinism:&lt;/strong&gt; LLMs never return the &lt;em&gt;exact&lt;/em&gt; same string twice. If your Playwright test relies on &lt;code&gt;expect(page.getByText('exact phrase')).toBeVisible()&lt;/code&gt;, your tests will randomly fail.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;While building &lt;a href="//ozigi.app"&gt;Ozigi&lt;/a&gt;, an agentic content engine designed to turn raw technical research into structured social campaigns, I needed a way to test the complex UI state transitions (like custom loaders and dynamic grids) without actually hitting the Vertex AI API, especially since I am managing my $300 in credits very conservatively!&lt;/p&gt;
&lt;h2&gt;
  
  
  Playwright Network Interception
&lt;/h2&gt;

&lt;p&gt;Here is how to completely decouple your frontend E2E tests from your LLM backend using Next.js and Playwright.&lt;/p&gt;

&lt;p&gt;In Ozigi, the user flow looks like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The user selects a custom persona and inputs raw context (a URL or text dump).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0kvlkn8yd38avujkcbub.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0kvlkn8yd38avujkcbub.png" alt="create persona" width="800" height="642"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol start="2"&gt;
&lt;li&gt;They click "Generate Campaign."&lt;/li&gt;
&lt;li&gt;The UI swaps to a &lt;code&gt;&amp;lt;DynamicLoader /&amp;gt;&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxm0lgwr41z9esjp8qq7a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxm0lgwr41z9esjp8qq7a.png" alt="dynamic loader" width="800" height="399"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol start="4"&gt;
&lt;li&gt;The Next.js API route (&lt;code&gt;/api/generate&lt;/code&gt;) sends the context to Gemini 2.5 Pro.&lt;/li&gt;
&lt;li&gt;The LLM returns a strictly formatted JSON object.&lt;/li&gt;
&lt;li&gt;The UI renders the multi-platform campaign grid.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft68x94rga0zppqyomm40.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft68x94rga0zppqyomm40.png" alt="distribution grid" width="800" height="406"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Testing this live would introduce latency and flakiness. &lt;br&gt;
Instead, I intercept the API call and instantly return a fake JSON payload.&lt;/p&gt;
&lt;h2&gt;
  
  
  Network Mocking (Interception with &lt;code&gt;page.route&lt;/code&gt;)
&lt;/h2&gt;

&lt;p&gt;Playwright allows us to hijack outbound network requests directly from the browser. When the frontend tries to call our Next.js API route, Playwright intercepts the &lt;code&gt;POST&lt;/code&gt; request, blocks it from ever hitting the server, and fulfills it with our own static data.&lt;/p&gt;

&lt;p&gt;Here is the exact test script I use to validate the Ozigi content engine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;test&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;expect&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@playwright/test&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nx"&gt;test&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;describe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Ozigi Context Engine &amp;amp; AI Mocking&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

  &lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;should generate a campaign by intercepting the LLM response&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// 1. Navigate to the dashboard&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/dashboard&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// 2. Fill out the Context fields&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByPlaceholder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Paste a URL or raw notes&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;fill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://ozigi.app/docs&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByPlaceholder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Additional directives...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;fill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Keep it technical.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// 🚀 THE MAGIC: Intercept the AI generation API route&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;**/api/generate&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;route&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;

      &lt;span class="c1"&gt;// Define the exact JSON structure your frontend expects from the LLM&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;mockedAIResponse&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;output&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
          &lt;span class="na"&gt;campaign&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="na"&gt;day&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="na"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Day 1 Thread: Ozigi is tested and working! 1/2&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;[The content engine is officially alive.]&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="na"&gt;linkedin&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;LinkedIn Post: Ozigi testing complete.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="na"&gt;discord&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Discord Update: Systems green.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
          &lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;
      &lt;span class="p"&gt;};&lt;/span&gt;

      &lt;span class="c1"&gt;// Fulfill the route instantly with the mocked data&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;route&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fulfill&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;contentType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;mockedAIResponse&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="c1"&gt;// 3. Trigger the generation&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByRole&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;button&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sr"&gt;/Generate Campaign/i&lt;/span&gt; &lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="nf"&gt;click&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="c1"&gt;// 4. Assert the UI state transitions correctly&lt;/span&gt;
    &lt;span class="c1"&gt;// Verify the loader appears while the "network" request is happening&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;loaderContainer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;locator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.animate-in.fade-in&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;loaderContainer&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="c1"&gt;// 5. Assert the final UI renders our mocked data perfectly&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Ozigi is tested and working!&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getByText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;[The content engine is officially alive.]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;toBeVisible&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Why You Should Mock LLM/API Responses In Playwright
&lt;/h2&gt;

&lt;p&gt;By using this testing pattern, I achieved three of my engineering goals:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Zero Cost:&lt;/strong&gt; The test suite can run 1,000 times a day on GitHub Actions without costing a single cent in Vertex AI compute.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Lightning Fast:&lt;/strong&gt; The entire E2E test finishes in seconds, as I bypass the LLM's generation latency entirely.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Absolute Determinism:&lt;/strong&gt; Because I injected a static JSON payload, my text assertions (&lt;code&gt;toBeVisible&lt;/code&gt;) will never fail due to an AI hallucination or a slightly altered adjective.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;When building AI wrappers or agentic workflows, your testing strategy must isolate the LLM from the UI. Let the LLM be unpredictable in production, but demand strict predictability in your test suite.&lt;/p&gt;
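That isolation can be factored into a reusable helper so every spec registers the same mock the same way. This is a sketch with a minimal structural page type so it stays self-contained; in a real suite you would import &lt;code&gt;Page&lt;/code&gt; from &lt;code&gt;@playwright/test&lt;/code&gt; and pass the test fixture straight through.

```typescript
// Minimal structural types so the sketch is self-contained; in a real suite
// these come from @playwright/test.
interface MockRoute {
  fulfill(opts: { status: number; contentType: string; body: string }): Promise<void>;
}
interface MockablePage {
  route(url: string, handler: (route: MockRoute) => Promise<void>): Promise<void>;
}

// Register a deterministic payload for the generation endpoint. The envelope
// ({ output: "<stringified JSON>" }) mirrors the mock used in the test above.
async function mockGenerateRoute(page: MockablePage, payload: unknown): Promise<void> {
  await page.route("**/api/generate", async (route) => {
    await route.fulfill({
      status: 200,
      contentType: "application/json",
      body: JSON.stringify({ output: JSON.stringify(payload) }),
    });
  });
}

// Exercise the helper with an in-memory fake page to show the wire format;
// in the real test, Playwright's own page fixture is passed instead.
const fulfilledBodies: string[] = [];
const fakePage: MockablePage = {
  async route(_url, handler) {
    await handler({
      async fulfill(opts) {
        fulfilledBodies.push(opts.body);
      },
    });
  },
};

void mockGenerateRoute(fakePage, { campaign: [{ day: 1, x: "Hello" }] });
console.log(fulfilledBodies.length); // 1
```

Centralising the mock this way means a schema change to the campaign payload is a one-file edit rather than a hunt through every spec.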




&lt;p&gt;&lt;em&gt;I built this network mocking (interception) pattern into &lt;a href="//ozigi.app"&gt;Ozigi&lt;/a&gt;, an agentic content engine that helps pretty much anyone turn their raw notes/ideas into structured, multi-platform campaigns without dealing with cheesy AI buzzwords. You can check it out at &lt;a href="https://ozigi.app" rel="noopener noreferrer"&gt;ozigi.app&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Let's connect on &lt;a href="//www.linkedin.com/in/dumebi-okolo"&gt;LinkedIn&lt;/a&gt;!&lt;br&gt;
You can find my spaghetti code &lt;a href="https://github.com/Dumebii/OziGi" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Consider this the unofficial v3 changelog of Ozigi. As always, we welcome your feedback and can't wait to hear from you!&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>webdev</category>
      <category>playwright</category>
      <category>nextjs</category>
      <category>api</category>
    </item>
    <item>
      <title>Ozigi v2 Changelog: Building a Modular Agentic Content Engine with Next.js, Supabase, and Playwright</title>
      <dc:creator>Dumebi Okolo</dc:creator>
      <pubDate>Mon, 02 Mar 2026 11:37:31 +0000</pubDate>
      <link>https://dev.to/dumebii/ozigi-v2-changelog-building-a-modular-agentic-content-engine-with-nextjs-supabase-and-playwright-59mo</link>
      <guid>https://dev.to/dumebii/ozigi-v2-changelog-building-a-modular-agentic-content-engine-with-nextjs-supabase-and-playwright-59mo</guid>
      <description>&lt;p&gt;When I first built &lt;a href="https://blogger-helper-tau.vercel.app/" rel="noopener noreferrer"&gt;Ozigi&lt;/a&gt; (initially WriterHelper), the goal was simple: give content professionals in my team a way to break down their articles into high-signal social media campaigns.&lt;/p&gt;

&lt;p&gt;Ozigi has now evolved into an open-source SaaS product, open to the public to use and improve.&lt;/p&gt;

&lt;p&gt;Here is the complete technical changelog of how I turned Ozigi from a monolithic v1 MVP into a production-ready v2 SaaS.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Modular Refactoring of app/page.tsx (Separation of Concerns)
&lt;/h2&gt;

&lt;p&gt;In v1, my entire application (auth, API calls, and UI) lived inside one long &lt;code&gt;app/page.tsx&lt;/code&gt; file. The more changes I made, the harder it became to manage.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Modular Component Library:&lt;/strong&gt; I stripped down the monolith and broke the UI into pure, single-responsibility React components (&lt;code&gt;Header&lt;/code&gt;, &lt;code&gt;Hero&lt;/code&gt;, &lt;code&gt;Distillery&lt;/code&gt;, etc.).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F96cypkydgo446zrjcfwt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F96cypkydgo446zrjcfwt.png" alt="modular architecture" width="800" height="902"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Centralized Type Safety:&lt;/strong&gt; I created a global &lt;code&gt;lib/types.ts&lt;/code&gt; file with a strict &lt;code&gt;CampaignDay&lt;/code&gt; interface (complete with index signatures) to finally eliminate the TypeScript "shadow type" build errors I was fighting.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;State Persistence:&lt;/strong&gt; Implemented &lt;code&gt;localStorage&lt;/code&gt; syncing so the app "remembers" if a user is in the dashboard or the landing page, preventing frustrating resets on browser refresh.&lt;/li&gt;
&lt;/ul&gt;
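The state-persistence piece above can be sketched as a small typed wrapper. The key name and the &lt;code&gt;AppView&lt;/code&gt; union here are hypothetical, and a Storage-compatible object is injected so the sketch runs outside a browser; the real app would pass &lt;code&gt;window.localStorage&lt;/code&gt;.

```typescript
// Hypothetical view-state union and storage key; the real app's names may differ.
type AppView = "landing" | "dashboard";
const VIEW_KEY = "ozigi:view";

// Structural type so any Storage-like object works: window.localStorage in the
// browser, an in-memory fake in tests or on the server.
interface StorageLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

function saveView(storage: StorageLike, view: AppView): void {
  storage.setItem(VIEW_KEY, view);
}

// Fall back to the landing page when the stored value is missing or unrecognised.
function loadView(storage: StorageLike): AppView {
  return storage.getItem(VIEW_KEY) === "dashboard" ? "dashboard" : "landing";
}

// In-memory fake standing in for window.localStorage.
const backing = new Map<string, string>();
const fakeStorage: StorageLike = {
  getItem: (k) => backing.get(k) ?? null,
  setItem: (k, v) => {
    backing.set(k, v);
  },
};

saveView(fakeStorage, "dashboard");
console.log(loadView(fakeStorage)); // prints "dashboard"
```

The defensive fallback in &lt;code&gt;loadView&lt;/code&gt; is the important part: a corrupted or stale value degrades to the landing page instead of crashing the render.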

&lt;h2&gt;
  
  
  2. Using Supabase as the Database and Tightening the Backend
&lt;/h2&gt;

&lt;p&gt;A major UX flaw in v1 was that refreshing the page wiped the user's progress.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Relational Database &amp;amp; OAuth:&lt;/strong&gt; I replaced anonymous access with secure GitHub OAuth via Supabase.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automated Context History:&lt;/strong&gt; I engineered a system that auto-saves every generated campaign to a PostgreSQL database. Users can now restore past URLs, notes, and outputs with a single click.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fodnvsgusnc26sy44u8sw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fodnvsgusnc26sy44u8sw.png" alt="strategy history" width="800" height="453"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Identity Storage:&lt;/strong&gt; Built a settings flow to permanently save a user's custom "Persona Voice" and Discord Webhook URLs directly to their profile.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj8b09ue6myvb4a0jxhzx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj8b09ue6myvb4a0jxhzx.png" alt="discord webhook upload and added context" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Core Feature Additions
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Multi-Modal Ingestion:&lt;/strong&gt; Upgraded the input engine to accept both a live URL &lt;em&gt;and&lt;/em&gt; raw custom text simultaneously.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flkdo3urlx7xmhx3pu9th.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flkdo3urlx7xmhx3pu9th.png" alt="context engine dashboard" width="800" height="369"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Native Discord Deployment:&lt;/strong&gt; Built a dedicated API route and UI webhook integration to push generated content directly to Discord servers with one click.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  4. Updated UI/UX &amp;amp; Professional Branding
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Rebrand:&lt;/strong&gt; Pivoted the app's messaging to focus entirely on content professionals, positioning it as an engine to generate social media content with ease and in your own voice.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Open-First Onboarding:&lt;/strong&gt; Designed a "Try Before You Buy" workflow. Unauthenticated users can test the AI generation seamlessly, but are gated from premium features (History, Personas, Discord) via an Upgrade Banner.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foyaxoafb4dsuh3dqtt89.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foyaxoafb4dsuh3dqtt89.png" alt="guest mode" width="800" height="464"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pixel-Perfect Layouts &amp;amp; SEO:&lt;/strong&gt; Eliminated rogue whitespace and &lt;code&gt;z-index&lt;/code&gt; issues using precise CSS Flexbox rules. Upgraded &lt;code&gt;app/layout.tsx&lt;/code&gt; with professional OpenGraph and Twitter Card metadata.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhqbf31p44p1p5e5sjyj4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhqbf31p44p1p5e5sjyj4.png" alt="ozigi homepage" width="800" height="466"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Quality Assurance &amp;amp; DevOps (Automated Playwright E2E Tests)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Automated E2E Testing:&lt;/strong&gt; Completely rewrote the Playwright test suite (&lt;code&gt;engine.spec.ts&lt;/code&gt;) to verify the new landing page copy, test the navigation flow, and confirm security rules apply correctly.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Linux Dependency Fixes:&lt;/strong&gt; Patched my CI/CD pipeline by ensuring underlying Linux browser dependencies (&lt;code&gt;--with-deps&lt;/code&gt;) are installed so headless Chromium tests pass flawlessly.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
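&lt;p&gt;For anyone hitting the same CI failure: the fix is usually a single install step before the test run (shown here for a Chromium-only pipeline; adjust the browser list to match yours).&lt;/p&gt;

```shell
# Installs Chromium plus the Linux system libraries headless runs need
npx playwright install --with-deps chromium
```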

&lt;h2&gt;
  
  
  What's Next? (v3 Roadmap)
&lt;/h2&gt;

&lt;p&gt;With the Context Engine now stable, the foundation is set. &lt;br&gt;
My plan for V3 is to build out the deployment pipeline: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;integrating the native X (Twitter) API&lt;/li&gt;
&lt;li&gt;integrating the LinkedIn API so users can publish directly from the Ozigi dashboard&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;What has been your biggest challenge scaling a Next.js MVP? Let me know in the comments!&lt;/em&gt;&lt;br&gt;
Try out &lt;a href="https://blogger-helper-tau.vercel.app/" rel="noopener noreferrer"&gt;Ozigi&lt;/a&gt;, and if you have any feature suggestions, let me know!&lt;br&gt;
Want to see my poorly written code? Find &lt;a href="https://github.com/Dumebii/OziGi" rel="noopener noreferrer"&gt;OziGi on GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Connect with me on &lt;a href="//www.linkedin.com/in/dumebi-okolo"&gt;LinkedIn&lt;/a&gt;!&lt;/p&gt;

&lt;p&gt;What came next:&lt;br&gt;
After shipping v2, the next hard question was model selection. A reader suggested switching to Claude for better content quality. I ran the benchmarks instead of just taking the advice. The results across JSON stability, latency, multimodal ingestion, and tone were clearer than I expected: &lt;a href="https://dev.to/dumebii/gemini-25-flash-vs-claude-37-sonnet-4-production-constraints-that-made-the-decision-for-me-bib"&gt;Gemini 2.5 Flash vs Claude 3.7 Sonnet: 4 Production Constraints That Made the Decision for Me&lt;/a&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>showdev</category>
      <category>nextjs</category>
      <category>playwright</category>
    </item>
    <item>
      <title>I vibe-coded an internal tool that slashed my content workflow by 4 hours</title>
      <dc:creator>Dumebi Okolo</dc:creator>
      <pubDate>Fri, 27 Feb 2026 14:52:17 +0000</pubDate>
      <link>https://dev.to/dumebii/i-vibe-coded-an-internal-tool-that-slashed-my-content-workflow-by-4-hours-310f</link>
      <guid>https://dev.to/dumebii/i-vibe-coded-an-internal-tool-that-slashed-my-content-workflow-by-4-hours-310f</guid>
      <description>&lt;p&gt;One of the biggest challenges I face as a content expert is repurposing my written blogs for social media. Before now, I had to ask AI for summaries or try to get them myself. I became very busy recently, and I don't have time for that anymore. &lt;br&gt;
The best solution for me was building a tool that helps me generate social media content from my blog and posts on my behalf. &lt;br&gt;
I was in a meeting of content professionals recently. A key point that was hammered on regarding the use of AI in content creation is the need to maintain a strict Human-in-the-Loop (HITL) workflow. &lt;br&gt;
This resonated well with me. &lt;br&gt;
I had initially planned to build an agent to automate and schedule social media posts. This, however, leaves out the HITL factor, so I restrategized. &lt;/p&gt;

&lt;p&gt;Here is the technical breakdown of how I built an Agentic Content Engine using Next.js 15, Gemini 3.1 Pro, and Discord Webhooks.&lt;/p&gt;
&lt;h2&gt;
  
  
  Agentic Human-in-the-Loop (HITL) architecture
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Problem: The "Context Gap"&lt;/strong&gt;&lt;br&gt;
Most AI social media tools are just wrappers for generic prompts. They don't know my research, they don't know my voice, and they definitely don't know the technical nuances of my articles.&lt;br&gt;
So,&lt;br&gt;
I needed a tool that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Reads my actual dev.to articles.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Strategizes a 3-day multi-platform campaign.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Displays it in a way that I can audit, edit, and then—with one click—Deploy.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Even though this app was "vibe coded" (shoutout to the AI for keeping up with my pivots 😂😂), the architecture is solid.&lt;/p&gt;

&lt;p&gt;The core philosophy of this build is Agency over Automation. The agent doesn't just act; it reasons, structures, and then waits for human approval before posting.&lt;/p&gt;
&lt;h3&gt;
  
  
  The AI Stack
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Reasoning Engine:&lt;/strong&gt; Gemini 3.1 Pro (Tier 1 Billing). I opted for Pro over Flash to handle complex instruction following and strict JSON schema enforcement.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Frontend:&lt;/strong&gt; Next.js 15 (App Router) for server-side rendering and SEO efficiency.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Styling:&lt;/strong&gt; Tailwind CSS with &lt;code&gt;@tailwindcss/typography&lt;/code&gt; for professional markdown rendering.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deployment:&lt;/strong&gt; Discord Webhooks for an immediate, zero-auth execution pipeline.&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  Handling AI Hallucinations in Next.js
&lt;/h2&gt;

&lt;p&gt;A common failure in vibe coding, I have found, is the LLM returning "chatty" text when the UI expects structured data. &lt;br&gt;
To solve this, I implemented a Strict JSON Enforcement pattern in the API route.&lt;/p&gt;

&lt;p&gt;Gemini often wraps its JSON output in markdown code blocks. If you pass this directly to &lt;code&gt;JSON.parse()&lt;/code&gt;, the app crashes.&lt;/p&gt;

&lt;p&gt;To solve this, I used &lt;em&gt;Sanitization Middleware.&lt;/em&gt;&lt;br&gt;
I built a regex-based sanitization layer to strip the noise and ensure the frontend receives a clean array.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/api/generate/route.ts&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;rawOutput&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;output&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// The raw string from Gemini&lt;/span&gt;

&lt;span class="c1"&gt;// Regex to extract only the JSON content&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;cleanJson&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;rawOutput&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/``&lt;/span&gt;&lt;span class="err"&gt;`
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="nx"&gt;endraw&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nx"&gt;json&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="nx"&gt;raw&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="s2"&gt;```/g, "").trim();

try {
  const campaignData = JSON.parse(cleanJson);
  return NextResponse.json({ campaign: campaignData.campaign });
} catch (error) {
  console.error("JSON Parsing failed:", rawOutput);
  return NextResponse.json({ error: "Failed to parse Agent strategy" }, { status: 500 });
}

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
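&lt;p&gt;A quick way to sanity-check that sanitisation layer in isolation is to pull the fence-stripping into a pure function and feed it a fenced sample (&lt;code&gt;stripFences&lt;/code&gt; is an illustrative name, not the app's actual export):&lt;/p&gt;

```typescript
// Removes markdown code fences (three backticks, optionally tagged "json")
// from model output so JSON.parse receives clean JSON
function stripFences(raw: string): string {
  // `{3} matches a run of three backticks without embedding a literal fence here
  return raw.replace(/`{3}json|`{3}/g, "").trim();
}
```

&lt;p&gt;Feeding it a typical fenced Gemini response returns just the JSON body, ready for &lt;code&gt;JSON.parse&lt;/code&gt;.&lt;/p&gt;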






&lt;h2&gt;
  
  
  UI/UX Strategy: The Kanban "Board" Approach
&lt;/h2&gt;

&lt;p&gt;The v1 of the UI was so messy. The tool worked, but you'd have to dig through mountains of text to even understand what was going on. &lt;br&gt;
I tried formatting it into a table for some structure. Somehow, that was worse! &lt;br&gt;
Finally, to optimize for a &lt;strong&gt;"Human-in-the-Loop"&lt;/strong&gt; workflow, I moved to a columnar dashboard.&lt;br&gt;
Social posts, especially threads on X, can be long, and that would have made even the boards clumsy and unkempt. &lt;br&gt;
To keep the UI clean, I built a &lt;code&gt;PostCard&lt;/code&gt; component that caps content at &lt;strong&gt;250 characters&lt;/strong&gt; with a state-managed "Read More" toggle.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;isExpanded&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setIsExpanded&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;displayContent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;isExpanded&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;content&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;substring&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;250&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
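&lt;p&gt;One caveat with the snippet above: &lt;code&gt;substring(0, 250) + "..."&lt;/code&gt; appends an ellipsis even when the post is already under the cap. A small guard (a hypothetical helper, not the component's actual code) avoids that:&lt;/p&gt;

```typescript
// Truncate a post for card display, adding "..." only when text was actually cut
function truncatePost(content: string, limit: number): string {
  const clipped = content.slice(0, limit);
  // If slicing changed nothing, the post already fits
  return clipped === content ? content : clipped + "...";
}
```

&lt;p&gt;In the card, &lt;code&gt;displayContent&lt;/code&gt; then becomes &lt;code&gt;isExpanded ? content : truncatePost(content, 250)&lt;/code&gt;.&lt;/p&gt;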



&lt;p&gt;This ensures the user can audit the text without scrolling for "miles."&lt;/p&gt;




&lt;h2&gt;
  
  
  Photo dump: Agentic Content Flow in Action
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;The Starting Point
Here’s the clean, minimal dashboard before the magic happens. I wanted it to feel like a professional "Command Centre," not a messy chatbot window.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvz667kmrpd5nboj21qwt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvz667kmrpd5nboj21qwt.png" alt="homepage" width="800" height="443"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol start="2"&gt;
&lt;li&gt;The 3-Day Campaign Map
Once I paste my URL, the Agent goes to work. It returns a structured 3x3 grid. I added a 250-character truncation with a "Read More" toggle because, let's face it, nobody wants a wall of text when they're trying to strategise.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjjmvcttzzxt193z17459.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjjmvcttzzxt193z17459.png" alt="content generation" width="800" height="445"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol start="3"&gt;
&lt;li&gt;The Deployment
Here is the best part. I hit "Post to Discord," and boom—success. No manual copy-pasting, no switching tabs. It’s live.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3hrm5rdokebrklk6a8lr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3hrm5rdokebrklk6a8lr.png" alt="posted to discord success message" width="800" height="493"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fysle2meczj9ykxpthcgh.png" alt="discord success" width="800" height="403"&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;This is what I have built so far. I am calling it BloggerHelper v1.&lt;br&gt;
My next updates are:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Integrating the X and LinkedIn publishing features. &lt;/li&gt;
&lt;li&gt;Putting more work into the context tank. So far, the agent's context has come from the article and some instructions in the &lt;code&gt;agents_instruction.md&lt;/code&gt; file. I will work more on this.&lt;/li&gt;
&lt;li&gt;Adding an edit feature, so I can edit a post before it goes out.&lt;/li&gt;
&lt;li&gt;Making it take in more context than just my blog posts.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Conclusion: The Engineering of Presence
&lt;/h2&gt;

&lt;p&gt;Even though this tool was designed to help me cut down on work hours, it was also meant to take me from technical writer to content engineer/architect: someone whose primary goal isn't just to create content, but to build solutions that make for easy content flow.&lt;br&gt;
Also, as I position myself as an AI influencer, I want to show myself building more with AI and evangelising its adoption.&lt;/p&gt;

&lt;p&gt;Let's connect on &lt;a href="//www.linkedin.com/in/dumebi-okolo"&gt;LinkedIn&lt;/a&gt;!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What’s your take on Agentic Workflows? Are you building for full automation, or are you keeping the human in the loop?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Let’s discuss below. 👇&lt;/p&gt;

&lt;h3&gt;
  
  
  UPDATE!!!!
&lt;/h3&gt;

&lt;p&gt;I just used my tool to get my social media caption/content for this post. See below.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3sjw1nvyg84389ofynqq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3sjw1nvyg84389ofynqq.png" alt="am content generator" width="800" height="455"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can try it out &lt;a href="https://3nz2kx-3000.csb.app/" rel="noopener noreferrer"&gt;here&lt;/a&gt;, but have mercy on my API credits!! &lt;/p&gt;

&lt;p&gt;UPDATE 2 — March 2026:&lt;br&gt;
Several people in the comments asked about forcing structured JSON output without the regex sanitisation layer. I ended up going deep on this for Ozigi v3. The answer is responseSchema via the Vertex AI SDK — it enforces structure at the decoding layer, not the prompt level. I benchmarked it alongside Claude 3.7 Sonnet across four production constraints. The full write-up, with numbers, is here: &lt;a href="https://dev.to/dumebii/gemini-25-flash-vs-claude-37-sonnet-4-production-constraints-that-made-the-decision-for-me-bib"&gt;Gemini 2.5 Flash vs Claude 3.7 Sonnet: 4 Production Constraints That Made the Decision for Me&lt;/a&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
      <category>nextjs</category>
      <category>tooling</category>
    </item>
    <item>
      <title>Using Perplexity AI and Gemini 3 (Pro) for Academic Research and Writing</title>
      <dc:creator>Dumebi Okolo</dc:creator>
      <pubDate>Thu, 19 Feb 2026 15:07:41 +0000</pubDate>
      <link>https://dev.to/dumebii/using-perplexity-ai-and-gemini-3-pro-for-academic-research-cji</link>
      <guid>https://dev.to/dumebii/using-perplexity-ai-and-gemini-3-pro-for-academic-research-cji</guid>
      <description>&lt;p&gt;I’m currently in the trenches of my Master’s thesis, focusing on &lt;strong&gt;5G Anomaly Detection using TensorFlow Lite at the Edge&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;I wrote a paper on &lt;a href="https://www.globalscientificjournal.com/researchpaper/EDGE_DEPLOYABLE_TENSORFLOW_LITE_AUTOENCODER_FOR_REAL_TIME_5G_ANOMALY_DETECTION_AND_COST_AWARE_OPTIMIZATION.pdf" rel="noopener noreferrer"&gt;EDGE-DEPLOYABLE TENSORFLOW LITE AUTOENCODER FOR REAL-TIME 5G ANOMALY DETECTION AND COST-AWARE OPTIMIZATION&lt;/a&gt; that you can check out. &lt;/p&gt;




&lt;p&gt;&lt;em&gt;This blog post is part of my short-form content series, where I write straight-to-the-point blog posts of less than 1,000 words.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Before building my AI research workflow, I used to spend hours just "pre-reading," trying to build the literature review section of my thesis.&lt;/p&gt;

&lt;p&gt;Not anymore!&lt;br&gt;
I built my own "Research Stack" with already existing AI tools that do all the heavy lifting for me in a matter of minutes. &lt;/p&gt;

&lt;p&gt;I don’t use just one tool. I use an &lt;a href="https://graygrids.com/blog/ai-aggregators-multiple-models-platform" rel="noopener noreferrer"&gt;AI aggregator&lt;/a&gt; and an &lt;a href="//gemini.google.com"&gt;AI-native Pro model&lt;/a&gt; together.&lt;/p&gt;


&lt;h2&gt;
  
  
  Perplexity is the AI Aggregator
&lt;/h2&gt;

&lt;p&gt;Many people, like me before making this discovery, think of &lt;a href="//perplexity.ai"&gt;Perplexity&lt;/a&gt; as just a model; it’s actually more of a "librarian." &lt;br&gt;
It doesn't just rely on its own model; it uses some of the best in the industry—Claude 4, GPT-5, and Gemini 3—to scour the web and find citations.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh74p2vo5d4un556ni8g5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh74p2vo5d4un556ni8g5.png" alt="perplexity models" width="448" height="599"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://www.perplexity.ai/hub/blog/meet-new-sonar" rel="noopener noreferrer"&gt;Sonar&lt;/a&gt; is Perplexity's own model.&lt;/p&gt;

&lt;p&gt;I've come to learn that Perplexity is the "king" of finding where the information is. &lt;/p&gt;

&lt;p&gt;However, when it comes to understanding/making sense of the 20 or so PDFs I just found? That’s where the "Aggregator" model hits a wall.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6x1s394ghc1ebvjfauj2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6x1s394ghc1ebvjfauj2.png" alt="perplexity ai interface" width="800" height="387"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  The Native Pro Advantage (Gemini Advanced)
&lt;/h2&gt;

&lt;p&gt;Because I have a &lt;strong&gt;Gemini Pro&lt;/strong&gt; subscription, I have access to something Perplexity’s implementation can’t match: Gemini's &lt;a href="https://developers.googleblog.com/en/new-features-for-the-gemini-api-and-google-ai-studio/" rel="noopener noreferrer"&gt;2-million-token context window&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;While Perplexity gives me snippets and links, I can feed those entire PDFs or papers it gives me into Gemini Pro. &lt;br&gt;
This way, Gemini doesn't just look up the research papers; it "lives" in them. &lt;br&gt;
That is, it remembers a conflict in data on page 4 and compares it to a conclusion on page 48.&lt;/p&gt;
&lt;h2&gt;
  
  
  My Research Workflow
&lt;/h2&gt;

&lt;p&gt;Here is exactly how I use Perplexity AI and Google Gemini to speed up my thesis research:&lt;/p&gt;
&lt;h3&gt;
  
  
  Phase 1: Using Perplexity to find research papers and material
&lt;/h3&gt;

&lt;p&gt;I ask Perplexity to find the most recent 2026 papers on Federated Learning in 5G. It gives me URLs and citations.&lt;br&gt;
Here's an example of my prompt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Find the top 5 most cited research papers from late 2025 and 2026 regarding 'Anomaly Detection in 5G Core Networks using Federated Learning.' Provide the direct URLs and a 2-sentence summary of their core methodology

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Phase 2: Using Gemini Pro to go through research materials
&lt;/h3&gt;

&lt;p&gt;I download those papers and upload them to Gemini and use it for things like comparing, reasoning, or critiquing. &lt;br&gt;
Here's an example prompt I've used:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;I have these 5 research papers [Paste links/sources]. Using your 2M token context, analyze how these papers address the 'latency vs. accuracy' trade-off in Edge computing. Then, draft a 1,000-word skeleton for my literature review that explains why AI automation is the solution to 5G network failures.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Phase 3: Direct editing in the Google Docs Workspace
&lt;/h3&gt;

&lt;p&gt;Since Gemini is integrated with my Google Workspace, I edit the literature review draft directly into a Google Doc.&lt;/p&gt;




&lt;h2&gt;
  
  
  📊 Comparison: Perplexity AI vs Google Gemini for Research
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Perplexity (The Librarian)&lt;/th&gt;
&lt;th&gt;Gemini Pro (The Architect)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Primary Strength&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Real-time search &amp;amp; citations.&lt;/td&gt;
&lt;td&gt;Massive context &amp;amp; reasoning.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Model Source&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Aggregator (Claude, GPT, Gemini).&lt;/td&gt;
&lt;td&gt;Native (Google's best).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Context Window&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Small (Snippet-based).&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;2M+ Tokens&lt;/strong&gt; (Entire libraries).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Best For...&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Finding "The What" &amp;amp; URLs.&lt;/td&gt;
&lt;td&gt;Analyzing "The How" &amp;amp; Drafting.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Integration&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Web-only.&lt;/td&gt;
&lt;td&gt;Google Workspace (Docs/Gmail).&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;What I have learned in my AI use is that looking for the one tool that does everything leads to failure or inaccuracies. &lt;br&gt;
I prefer a "separation of concerns" type of workflow, which leads to better accuracy.&lt;br&gt;
This only works, though, when you know how to build the right stack for your workflow and how to get around it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Are you still using a single LLM for your research, or have you started "stacking" your tools? Let's discuss in the comments!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You can find me on &lt;a href="//www.linkedin.com/in/dumebi-okolo"&gt;LinkedIn!&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>beginners</category>
      <category>productivity</category>
      <category>tooling</category>
    </item>
    <item>
      <title>Top 5 Headless CMS to Build a Blog in 2026</title>
      <dc:creator>Dumebi Okolo</dc:creator>
      <pubDate>Mon, 26 Jan 2026 10:00:00 +0000</pubDate>
      <link>https://dev.to/dumebii/top-5-headless-cms-to-build-a-blog-in-2026-382f</link>
      <guid>https://dev.to/dumebii/top-5-headless-cms-to-build-a-blog-in-2026-382f</guid>
      <description>&lt;p&gt;After doing a lot of research as a technical writer, I have found the top 5 CMS for blogging. Whether you are trying to build a personalwriting repository or you are a full-fledged publication with more complex subscription needs, this article is for you!&lt;/p&gt;

&lt;p&gt;When I first got into personal blogging, creating The Handy Developer's Guide, I used &lt;a href="//wordpress.com"&gt;WordPress&lt;/a&gt;: I chose a theme, did some customizations as best as my plan let me, and that was it.&lt;br&gt;
Though my experience might be different from someone else's, I didn't have the best time using WordPress. One of the problems I had was the lack of broader customization even as a premium-tier subscriber. If you ever visited my blog back then, you'd know that the frontend was a cry for help. The backend was solid, though.&lt;br&gt;
I am revisiting blogging in 2026, and after a lot of research, I curated this list. It is a product of research and personal preference.  &lt;/p&gt;

&lt;h2&gt;
  
  
  What is a Headless CMS?
&lt;/h2&gt;

&lt;p&gt;A headless Content Management System (CMS) is a backend-only system where the content repository (the "body") is separated from the presentation layer/frontend (the "head").&lt;/p&gt;

&lt;p&gt;Unlike a traditional CMS like WordPress, which dictates how content looks on a website through built-in templates, a headless CMS only stores and manages raw content. This content is then delivered to any device ( a website, mobile app, or smartwatch) via an Application Programming Interface (API).&lt;/p&gt;

&lt;h2&gt;
  
  
  Difference Between a Headed (Monolithic) CMS and a Headless CMS
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Headed (Monolithic) CMS&lt;/th&gt;
&lt;th&gt;Headless CMS&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;User Interface&lt;/td&gt;
&lt;td&gt;Pre-built templates and themes&lt;/td&gt;
&lt;td&gt;Build your own using React, Vue, Swift, etc.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Control&lt;/td&gt;
&lt;td&gt;Marketers can drag-and-drop easily&lt;/td&gt;
&lt;td&gt;Developers have full control over the codebase&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Channels&lt;/td&gt;
&lt;td&gt;Mostly limited to websites&lt;/td&gt;
&lt;td&gt;Omnichannel delivery: web, mobile apps, IoT, digital displays&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scalability&lt;/td&gt;
&lt;td&gt;Harder; frontend and backend scale together&lt;/td&gt;
&lt;td&gt;Easier; frontend and backend scale independently&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Security&lt;/td&gt;
&lt;td&gt;Larger attack surface due to direct database exposure&lt;/td&gt;
&lt;td&gt;Smaller attack surface with API-only access&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F48jzu7axh10xsvtxcsin.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F48jzu7axh10xsvtxcsin.png" alt="comparison between headed and headless CMs" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What Makes a Headless CMS Great for Blogging in 2026?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Flexible Content Modeling:&lt;/strong&gt; You define your own post types, relationships, and reusable blocks. Schema evolves with your needs.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;API Power:&lt;/strong&gt; REST, GraphQL, GROQ, and so on. The API should let you fetch exactly the data you need, without over-fetching or under-fetching. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;TypeScript and SDK Quality:&lt;/strong&gt; SDKs with type safety, codegen, and auto-completion.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Editor Experience:&lt;/strong&gt; Intuitive UI for non-developers. Preview changes, collaborate in real time, and avoid constraints.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Performance and Caching:&lt;/strong&gt; The CMS should play nicely with CDNs, static site generators, and incremental static regeneration, with fast content propagation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Authentication and Access Control:&lt;/strong&gt; Lock down drafts, manage roles, and integrate with SSO or OAuth.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Migration and Portability:&lt;/strong&gt; Import/export data effortlessly.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With that in mind, let’s meet my top five.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Sanity: The Developer-First Content Operating System
&lt;/h2&gt;

&lt;p&gt;I am currently using &lt;a href="//sanity.io"&gt;Sanity&lt;/a&gt; to build out my blog in 2026. &lt;br&gt;
&lt;a href="//sanity.io"&gt;Sanity&lt;/a&gt; is the CMS for teams who treat content as data, not just text. It’s opinionated. If you want to model your content in code, build custom editorial interfaces, and automate workflows with AI, Sanity positions itself as a content operating system, not just a CMS.&lt;/p&gt;

&lt;p&gt;For marketers, Sanity offers a customizable Studio UI, real-time collaboration, and visual editing with live previews. For developers, it’s a schema-as-code playground with TypeScript support, and fast APIs. It has a plugin system that lets you build exactly what your team needs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fueboe301pcgw1er9zpdz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fueboe301pcgw1er9zpdz.png" alt="Sanity studio dashboard with real-time editing and preview" width="800" height="442"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;PS: this isn't my dashboard. Mine is not nearly as full, but I needed something to show the full effect.&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Features of Sanity CMS
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Schema-as-Code and Content Lake&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Sanity’s core innovation is its “Content Lake”: a real-time, globally distributed datastore where content is stored as structured JSON documents. You define your content schema in JavaScript or TypeScript, version it in Git, and deploy changes like any other codebase. This means you can evolve your content model without downtime or risky migrations.&lt;/p&gt;
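&lt;p&gt;As a rough sketch, a document in Sanity’s schema-as-code style looks something like this. The field names here are my own illustration, not pulled from a real project; in an actual Studio you would pass this shape to &lt;code&gt;defineType()&lt;/code&gt; from the &lt;code&gt;sanity&lt;/code&gt; package:&lt;/p&gt;

```typescript
// Illustrative Sanity-style schema for a blog post document.
// Field names are assumptions for the example, not a real project's schema.
const post = {
  name: 'post',
  type: 'document',
  title: 'Post',
  fields: [
    { name: 'title', type: 'string' },
    { name: 'slug', type: 'slug', options: { source: 'title' } },
    { name: 'publishedAt', type: 'datetime' },
    // A reference field links this post to a separate author document
    { name: 'author', type: 'reference', to: [{ type: 'author' }] },
    // Portable Text: rich text stored as an array of structured blocks
    { name: 'body', type: 'array', of: [{ type: 'block' }] },
  ],
};

console.log(post.fields.length); // 5
```

&lt;p&gt;Because this lives in your repo, a schema change is just a pull request: reviewable, versioned, and revertible.&lt;/p&gt;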

&lt;p&gt;&lt;strong&gt;Query Language: GROQ (and GraphQL)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Sanity’s primary query language is &lt;a href="https://www.sanity.io/docs/content-lake/how-queries-work" rel="noopener noreferrer"&gt;GROQ&lt;/a&gt; (Graph-Relational Object Queries), a powerful, declarative language designed for content trees and references. GROQ lets you fetch exactly the shape of data you need, with projections, filters, and joins, all in a single request. For teams who prefer GraphQL, Sanity can auto-generate a GraphQL API from your schema, though some advanced features (like inline objects) may require tweaks.&lt;/p&gt;
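&lt;p&gt;To give a feel for the syntax, here is the kind of GROQ query you might write for a blog index page. The schema field names are assumptions, but &lt;code&gt;_type&lt;/code&gt;, the pipe-to-&lt;code&gt;order()&lt;/code&gt; form, slices, and projections are standard GROQ:&lt;/p&gt;

```typescript
// A GROQ query: filter, order, slice, and project in a single request.
// "post", "publishedAt", and "author" are illustrative schema names.
const query = `
  *[_type == "post"] | order(publishedAt desc)[0...10]{
    title,
    publishedAt,
    "authorName": author->name
  }
`;

// With @sanity/client this string would be passed to client.fetch(query).
console.log(query.includes('author->name')); // true
```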

&lt;p&gt;&lt;strong&gt;API Performance and Caching&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Sanity’s APIs are fast—think 32ms response times and 500+ concurrent queries per project. The &lt;strong&gt;Live CDN&lt;/strong&gt; ensures content updates propagate globally within 60 seconds, and when paired with frameworks like Next.js or Gatsby, you get near-instant cache invalidation and incremental static regeneration (ISR) for static-speed performance with real-time freshness.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TypeScript and SDK Quality&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Sanity’s SDKs are robust, with TypeScript support, codegen for schemas, and CLI tools for migrations and local development. You can generate types from your schema, get auto-completion in your IDE, and even use AI agents to scaffold new content models.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Studio Customization and Plugins&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Sanity Studio is a fully customizable React app. Developers can build custom input components, document views, and tools (like SEO analyzers or campaign trackers) that integrate natively into the editor UI. Visual editing lets marketers preview and edit content directly on the live site, with drag-and-drop layouts and click-to-edit functionality.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Authentication, Roles, and Access Control&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Sanity supports granular roles (Viewer, Contributor, Editor, Developer, Admin) and custom roles on enterprise plans. SSO integration (Okta, Azure AD, Google Workspace) is available for larger teams, and access can be scoped down to datasets or even individual documents.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Migration and Import/Export&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Sanity provides CLI tools and migration guides for moving from legacy CMSs (like WordPress or Drupal) to structured content. You can script migrations, map fields, and transform HTML blobs into Portable Text (Sanity’s rich text format).&lt;/p&gt;
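&lt;p&gt;For a sense of what “Portable Text” means in practice, here is a simplified sketch of the JSON a migrated HTML paragraph becomes (illustrative, not the full spec; real documents also carry &lt;code&gt;_key&lt;/code&gt; values and mark definitions for links):&lt;/p&gt;

```typescript
// A minimal Portable Text block: one paragraph containing a bold span.
const block = {
  _type: 'block',
  style: 'normal',
  markDefs: [],
  children: [
    { _type: 'span', text: 'Hello from ', marks: [] as string[] },
    { _type: 'span', text: 'WordPress', marks: ['strong'] },
  ],
};

// Renderers walk the children and apply each span's marks
const plainText = block.children
  .map(function (span) { return span.text; })
  .join('');
console.log(plainText); // "Hello from WordPress"
```

&lt;p&gt;Storing rich text as structured data like this, rather than as an HTML blob, is what makes the same content renderable on the web, in apps, or anywhere else.&lt;/p&gt;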

&lt;p&gt;&lt;strong&gt;Security and Compliance&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Sanity is SOC2 Type 2 certified, GDPR compliant, and offers EU data residency. Daily backups and audit logs are available on enterprise plans. Note: HIPAA and ISO certifications are inherited from Google Cloud Platform, not held directly.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Contentful: The Enterprise Digital Experience Platform
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.contentful.com/" rel="noopener noreferrer"&gt;Contentful&lt;/a&gt; is the “safe” choice for enterprises. It’s a mature, cloud-based platform with robust APIs, a polished UI, and a massive ecosystem of integrations. You get structured content, localization, analytics, and personalization tools out of the box. &lt;a href="https://www.contentful.com/" rel="noopener noreferrer"&gt;Contentful&lt;/a&gt; is designed for global teams managing complex, multi-language, multi-channel content.&lt;/p&gt;

&lt;p&gt;For marketers, Contentful offers a user-friendly editor, localization workflows, and built-in analytics. For developers, it provides REST and GraphQL APIs, SDKs for every major language, and a stable, scalable infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fis2artgfhlf2e6lnikmn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fis2artgfhlf2e6lnikmn.png" alt="Screenshot of Contentful’s web app showing content modeling and localization features" width="800" height="368"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Features of Contentful
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Content Modeling and API-First Design&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Contentful lets you define custom content types (e.g., BlogPost, Author, Category) via a web UI or API. Each type has fields (text, media, references), and you can model relationships between entries. The API-first approach means every piece of content is accessible via REST or GraphQL endpoints, making it easy to power websites, apps, and more.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Query Languages: REST and GraphQL&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Contentful supports both REST and GraphQL APIs. The REST API is stable, cache-friendly, and widely supported. The GraphQL API allows you to fetch exactly the fields you need, reducing over-fetching and improving performance for complex frontends.&lt;/p&gt;
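&lt;p&gt;A sketch of what that looks like in practice, assuming a hypothetical &lt;code&gt;BlogPost&lt;/code&gt; content type (the collection/items shape is how Contentful’s GraphQL API exposes entries):&lt;/p&gt;

```typescript
// A Contentful GraphQL query requesting only the fields a page needs.
// "blogPostCollection" assumes a content type named BlogPost.
const query = `
  query {
    blogPostCollection(limit: 10) {
      items {
        title
        slug
        author { name }
      }
    }
  }
`;

// This string is POSTed to the GraphQL endpoint for your space,
// authenticated with a Content Delivery API token; the JSON response
// mirrors the query shape, so there is no over-fetching.
console.log(query.includes('blogPostCollection')); // true
```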

&lt;p&gt;&lt;strong&gt;TypeScript and SDK Quality&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Contentful’s JavaScript SDK is now written in TypeScript, providing strong type safety, auto-completion, and codegen for your content models. You can generate TypeScript types from your schema, get autosuggestions for queries, and chain client modifiers for localization and link resolution.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Localization and Personalization&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Contentful excels at localization: you can manage content in dozens of languages, with field-level translations and fallback strategies. The platform also supports personalization experiments, letting you A/B test content variants and track performance with built-in analytics.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Caching, CDN, and Performance&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Contentful uses a global CDN to deliver content with low latency. API responses are cacheable, and you can use webhooks to trigger static site rebuilds or incremental static regeneration (ISR) in frameworks like Next.js. Asset delivery (images, files) is optimized via a dedicated asset API.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Authentication, Roles, and Access Control&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Contentful offers granular roles and permissions, SSO integration, and API tokens for secure access. You can control who can edit, publish, or view content at the space or environment level.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ecosystem and Integrations&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Contentful’s App Framework and Marketplace provide hundreds of integrations: e-commerce, analytics, marketing automation, and more. Webhooks and APIs make it easy to connect to CI/CD pipelines, static site generators, or custom workflows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Migration and Import/Export&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Contentful provides CLI tools for importing/exporting content, migrating schemas, and syncing environments. However, some users report friction when evolving content models at scale, due to rigid entry-reference structures and API rate limits.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security and Compliance&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Contentful is enterprise-ready: SOC2, GDPR, and ISO 27001 certified, with audit logs, SSO, and data residency options. Enterprise plans offer dedicated infrastructure and SLAs.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Strapi: The Open-Source, Self-Hosted Powerhouse
&lt;/h2&gt;

&lt;p&gt;&lt;a href="//strapi.io"&gt;Strapi&lt;/a&gt; is the CMS for developers who want total control. It’s open-source, built on Node.js, and can be self-hosted anywhere: from your laptop to AWS, DigitalOcean, or Strapi Cloud. You get a slick admin UI, auto-generated REST and GraphQL APIs, and a thriving plugin ecosystem.&lt;/p&gt;

&lt;p&gt;For marketers, &lt;a href="//strapi.io"&gt;Strapi&lt;/a&gt; offers a user-friendly editor and customizable workflows. For developers, it’s a playground for custom APIs, plugins, and integrations, with no vendor lock-in.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffg9q7zeqi8v1uyzea004.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffg9q7zeqi8v1uyzea004.png" alt="Screenshot of Strapi’s admin panel showing content types and API endpoints" width="800" height="507"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Features of Strapi
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Content Modeling and API Generation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Strapi lets you define content types (e.g., Article, Author, Tag) via a visual builder or code. Each type becomes an auto-generated REST and/or GraphQL endpoint, complete with CRUD operations. You can customize controllers, services, and routes as needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Query Languages: REST and GraphQL&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;By default, Strapi exposes REST APIs for every content type. The optional GraphQL plugin adds a powerful Apollo-based endpoint, with schema auto-generation, custom resolvers, and a built-in playground for testing queries and mutations.&lt;/p&gt;
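&lt;p&gt;For example, a typical Strapi REST request for a blog listing might be assembled like this. The host and content type names are placeholders; the bracketed &lt;code&gt;populate&lt;/code&gt;/&lt;code&gt;filters&lt;/code&gt;/&lt;code&gt;pagination&lt;/code&gt; parameters follow Strapi’s REST conventions:&lt;/p&gt;

```typescript
// Building a Strapi REST query string: populate a relation, filter,
// and paginate. "articles" and "author" are illustrative names.
const params = new URLSearchParams({
  'populate': 'author',
  'filters[title][$containsi]': 'cms',
  'pagination[pageSize]': '10',
});

// The host is a placeholder for wherever you deployed Strapi.
const url = 'https://my-strapi-host/api/articles?' + params.toString();
console.log(url.includes('populate=author')); // true
```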

&lt;p&gt;&lt;strong&gt;TypeScript and SDK Quality&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Strapi’s core is now TypeScript-friendly, with type definitions, codegen, and strong typing for plugins and customizations. The community maintains SDKs for JavaScript, TypeScript, and popular frontend frameworks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Authentication, Roles, and Access Control&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Strapi ships with robust role-based access control (RBAC), JWT authentication, and support for social logins (Google, Facebook, etc.). You can define granular permissions for public, authenticated, and custom roles, and integrate with external identity providers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Plugin Ecosystem and Extensibility&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Strapi’s plugin system is a major strength: over 350 plugins cover everything from SEO and image optimization to custom fields, webhooks, and integrations. You can build your own plugins or extend the admin UI with React components.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Deployment and Hosting&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Strapi can be self-hosted on any Node.js-compatible environment, or deployed to Strapi Cloud for managed hosting. You control the database (PostgreSQL, MySQL, MongoDB, SQLite), asset storage, and infrastructure. This means you’re responsible for scaling, backups, and security, but also free from SaaS constraints.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Caching, CDN, and Performance&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Performance depends on your hosting setup. Strapi supports CDN integration for assets, reverse proxies (NGINX, Traefik), and Redis or in-memory caching for APIs. Strapi Cloud includes a global CDN and DDoS protection on paid plans.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Migration and Import/Export&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Strapi provides CLI tools for migrating content, syncing environments, and exporting/importing data. You can script migrations, seed databases, and automate schema changes as part of your CI/CD pipeline.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security and Compliance&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Strapi is SOC2 and GDPR compliant, with regular security audits and a transparent open-source codebase. You’re responsible for patching, SSL, and compliance when self-hosting.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Ghost: The Writer’s Blogging Platform, Now Headless
&lt;/h2&gt;

&lt;p&gt;I was unable to get far with exploring &lt;a href="//ghost.org"&gt;Ghost&lt;/a&gt;. &lt;br&gt;
To "Get Started" on Ghost, you have to create an account and verify you are human by adding your bank details.&lt;br&gt;
In this &lt;a href="https://ghost.org/vs/substack/" rel="noopener noreferrer"&gt;Comparison article with substack&lt;/a&gt;, the Ghost said it supported 135 global currencies and accepted all international payment methods. Unfortunately (?), the Nigerian Naira isn't on the list. (we meuveee).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqaaba1rftosdqjdwtwfh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqaaba1rftosdqjdwtwfh.png" alt="ghost.org rejecting naira cards" width="800" height="426"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is a purely research-based piece on Ghost.&lt;/p&gt;

&lt;p&gt;&lt;a href="//ghost.org"&gt;Ghost&lt;/a&gt; started as a minimalist, open-source blogging platform, and in 2026, it’s evolved into a modern, headless CMS with a focus on publishing, newsletters, and paid memberships. Ghost is beloved by writers, journalists, and indie publishers who want speed, SEO, and full control without the plugin sprawl of WordPress or the lock-in of Substack (there's an entire article about this!).&lt;/p&gt;

&lt;p&gt;For marketers, &lt;a href="//ghost.org"&gt;Ghost&lt;/a&gt; offers a clean editor, built-in SEO, and audience growth tools. For developers, it provides a RESTful Content API, webhooks, and the option to self-host or use Ghost(Pro) for managed hosting.&lt;/p&gt;

&lt;h3&gt;
  
  
  Features of Ghost CMS
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Content Modeling and API&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Ghost’s content model is opinionated: you get Posts, Pages, Tags, Authors, and Members (for subscriptions). It’s not as flexible as Sanity or Strapi, but it’s perfect for blogs, newsletters, and publications. The RESTful Content API delivers published content in JSON, with endpoints for posts, pages, tags, authors, and settings.&lt;/p&gt;
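&lt;p&gt;A quick sketch of what fetching posts looks like. The site URL and key are placeholders, while the &lt;code&gt;/ghost/api/content/&lt;/code&gt; path and key-based authentication are how the Content API works:&lt;/p&gt;

```typescript
// The Ghost Content API is a read-only, key-authenticated REST API.
// Host and key below are placeholders.
const params = new URLSearchParams({
  key: 'CONTENT_API_KEY',
  include: 'tags,authors',
  limit: '5',
});

const url = 'https://demo.ghost.io/ghost/api/content/posts/?' + params.toString();
// A GET to this URL returns published posts as JSON under a "posts" key.
console.log(url.includes('/ghost/api/content/posts/')); // true
```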

&lt;p&gt;&lt;strong&gt;Editor Experience&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Ghost’s Koenig editor is Markdown-first, with support for rich embeds, images, and custom cards. The UI is distraction-free, fast, and optimized for writing flow. Memberships, comments, and newsletters are built in, no plugins required.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TypeScript and SDK Quality&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Ghost’s JavaScript SDK wraps the REST API, making it easy to fetch content from any frontend (Next.js, SvelteKit, etc.). TypeScript support is solid, with type definitions and codegen for API responses.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Authentication, Roles, and Access Control&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Ghost supports staff roles (Author, Editor, Admin, Owner) and member roles (free, paid, custom tiers). Access control is simple: staff manage content, members access gated posts and newsletters. SSO and advanced RBAC are available on higher plans.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Caching, CDN, and Performance&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Ghost(Pro) includes a global CDN, DDoS protection, and automatic caching for assets and API responses. Self-hosted Ghost can be paired with NGINX, Cloudflare, or any CDN for optimal performance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Deployment and Hosting&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Ghost can be self-hosted (Node.js, MySQL, NGINX) or run on Ghost(Pro) for managed hosting. Ghost(Pro) handles updates, backups, SSL, and scaling.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Migration and Import/Export&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Ghost provides import/export tools for posts, members, and settings. You can migrate from WordPress, Substack, or other platforms with minimal friction.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security and Compliance&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Ghost(Pro) is GDPR compliant, with enterprise-grade security, two-factor authentication, and SSO on business plans. Self-hosted users are responsible for patching and compliance.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. Hygraph: The GraphQL-Native, API-First CMS
&lt;/h2&gt;

&lt;p&gt;Hygraph (formerly GraphCMS) is the CMS for developers who love GraphQL. It’s API-first, SaaS-based, and designed for omnichannel content delivery.  Hygraph is popular with teams building complex apps, e-commerce sites, and multi-platform experiences.&lt;/p&gt;

&lt;p&gt;For marketers, Hygraph offers a visual editor, localization, and workflow tools. For developers, it’s a GraphQL playground with flexible schema modeling, content federation, and strong TypeScript support.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffjrs4qe9ah07j44vzlnh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffjrs4qe9ah07j44vzlnh.png" alt="Screenshot of hygraph's content dashboard" width="800" height="416"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Features of Hygraph
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Content Modeling and GraphQL APIs&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Hygraph lets you define content models (types, fields, relationships) via a visual builder or API. Every model is exposed as a GraphQL endpoint, with auto-generated queries, mutations, and filtering. You can federate content from remote sources, enabling unified querying across multiple backends.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Query Language: GraphQL (Only)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Hygraph is GraphQL-native—there’s no REST API. This means you get precise, efficient queries, strong typing, and compatibility with modern frontend frameworks (Next.js, SvelteKit, etc.). The Management SDK allows programmatic schema changes and migrations, with full TypeScript support.&lt;/p&gt;
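&lt;p&gt;As a small illustration, everything, including pagination, happens inside the query itself. The &lt;code&gt;Post&lt;/code&gt; model and its fields here are assumptions for the example:&lt;/p&gt;

```typescript
// A Hygraph-style GraphQL query; "posts" and its fields assume an
// illustrative "Post" model defined in the project's schema.
const query = `
  query RecentPosts {
    posts(first: 10) {
      title
      slug
      coverImage { url }
    }
  }
`;

// POSTed to the project's content endpoint; because the schema is
// typed, tooling can validate this query before it ever runs.
console.log(query.includes('query RecentPosts')); // true
```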

&lt;p&gt;&lt;strong&gt;TypeScript and SDK Quality&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Hygraph provides developer-friendly SDKs, codegen for TypeScript types, and CLI tools for migrations and environment management. The API playground makes it easy to test queries and mutations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Authentication, Roles, and Access Control&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Hygraph supports role-based access control (RBAC), OAuth authentication, SSO, audit logs, and custom roles. You can manage permissions at the model, field, or environment level.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Caching, CDN, and Performance&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Hygraph delivers content via a global CDN, with middle-layer caching and predictable payloads. Performance is strong, with low latency and high concurrency limits on enterprise plans. You can use static site generators or ISR for optimal speed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ecosystem and Integrations&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Hygraph integrates with CRM, analytics, personalization, commerce, and marketing automation tools. The Marketplace offers ready-made apps and UI extensions, and webhooks enable custom workflows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Migration and Import/Export&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Hygraph supports GraphQL mutations for bulk imports, content federation for remote data, and CLI tools for schema migrations. Migration guides and support are available for onboarding.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security and Compliance&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Hygraph is GDPR, CCPA, SOC2, and ISO 27001 compliant, with enterprise-grade security, data encryption, and manual/off-site backups.&lt;/p&gt;




&lt;h2&gt;
  
  
  Comparison Table: The Top 5 Headless CMS for Blogging in 2026
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;CMS&lt;/th&gt;
&lt;th&gt;Free Tier Limits&lt;/th&gt;
&lt;th&gt;Query Language(s)&lt;/th&gt;
&lt;th&gt;Best Use-Case&lt;/th&gt;
&lt;th&gt;Developer Experience Score (1–10)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Sanity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Unlimited admin users, free content updates, pay-as-you-go for API overages&lt;/td&gt;
&lt;td&gt;GROQ, GraphQL, REST&lt;/td&gt;
&lt;td&gt;Structured content, automation, multi-channel&lt;/td&gt;
&lt;td&gt;9.5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Contentful&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;5 users, 100,000 API calls/mo, 50GB asset bandwidth&lt;/td&gt;
&lt;td&gt;REST, GraphQL&lt;/td&gt;
&lt;td&gt;Enterprise DXP, localization, integrations&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Strapi&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Free (self-hosted); Cloud: 2,500 API req/mo, 10GB storage&lt;/td&gt;
&lt;td&gt;REST, GraphQL&lt;/td&gt;
&lt;td&gt;Open-source, self-hosting, plugins&lt;/td&gt;
&lt;td&gt;8.5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Ghost&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No free tier; from $15/mo for 1,000 members, unlimited posts/emails&lt;/td&gt;
&lt;td&gt;REST&lt;/td&gt;
&lt;td&gt;Blogging, newsletters, memberships&lt;/td&gt;
&lt;td&gt;7.5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Hygraph&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;2 locales, 3 users, unlimited assets, 10 KB query size limit&lt;/td&gt;
&lt;td&gt;GraphQL&lt;/td&gt;
&lt;td&gt;GraphQL-native, content federation, omnichannel&lt;/td&gt;
&lt;td&gt;8.5&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Developer Experience Score&lt;/strong&gt; is a subjective rating based on TypeScript support, SDK quality, local dev workflows, and extensibility.&lt;/p&gt;




&lt;h2&gt;
  
  
  Final Thoughts: Which Headless CMS Should You Choose in 2026?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Choose Sanity&lt;/strong&gt; if you want a developer-first, future-proof CMS with real-time collaboration, schema-as-code, and deep customization. It’s the best choice for teams who treat content as data and want to automate, scale, and innovate.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Choose Contentful&lt;/strong&gt; if you’re an enterprise with global teams, complex localization, and a need for stability, integrations, and analytics. It’s the “safe” choice, but be prepared for higher costs and some rigidity.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Choose Strapi&lt;/strong&gt; if you want open-source freedom, self-hosting, and total control over your backend. It’s ideal for developers who want to own their stack and avoid SaaS lock-in.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Choose Ghost&lt;/strong&gt; if you’re a writer, blogger, or indie publisher who wants a fast, SEO-friendly, and distraction-free platform with built-in memberships and newsletters.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Choose Hygraph&lt;/strong&gt; if you’re building complex, API-driven apps with heavy GraphQL usage, content federation, and global scale. It’s the top pick for GraphQL-first teams.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flrgyopm32t5t68g7gz9w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flrgyopm32t5t68g7gz9w.png" alt="Flowchart guiding users to the right CMS based on team size, technical skills, and content needs" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let's connect on &lt;a href="//www.linkedin.com/in/dumebi-okolo"&gt;LinkedIn&lt;/a&gt;!&lt;/p&gt;

</description>
      <category>typescript</category>
      <category>beginners</category>
      <category>webdev</category>
      <category>writing</category>
    </item>
    <item>
      <title>From Vibe Coding to Engineering: Building a Production-Ready Next.js 15 Blog with AI</title>
      <dc:creator>Dumebi Okolo</dc:creator>
      <pubDate>Wed, 21 Jan 2026 10:00:00 +0000</pubDate>
      <link>https://dev.to/dumebii/the-ultimate-prompt-strategy-how-to-vibe-code-production-ready-websites-4e9</link>
      <guid>https://dev.to/dumebii/the-ultimate-prompt-strategy-how-to-vibe-code-production-ready-websites-4e9</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkqcp9m770lpfyas2dy6y.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkqcp9m770lpfyas2dy6y.gif" alt="Snow white dusting a cupboard" width="498" height="284"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Dusting off the room because it has &lt;em&gt;been a minute or two since I was in here last!&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Throughout last year, I ran a lot of vibe-coded projects. Most were for writing demos; others were simply for the fun of it. &lt;br&gt;
However, with each new vibe-coded project, I kept getting super frustrated and super stuck with debugging AI's badly written (spaghetti) code.&lt;/p&gt;

&lt;p&gt;"Vibe Coding" has been the trend of the moment. The idea to me was basically, "Describe your app in plain English, and the AI handles the syntax." This was the approach I kept using that kept failing until now.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Vibe Coding is Ineffective
&lt;/h2&gt;

&lt;p&gt;Vibe coding is ineffective because most people treat AI like it's magic. They ask it for a feature, paste the code, and hope for the best. Usually, they get a messy file structure, insecure code, and a maintenance nightmare. The application might work on &lt;code&gt;localhost&lt;/code&gt;, but it lacks the rigor required for the real world.&lt;/p&gt;

&lt;h2&gt;
  
  
  How I Vibe-coded My Blog
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The Goal&lt;/strong&gt;&lt;br&gt;
I wanted a technical blog that was "up to standard and safe." Coming from WordPress, where I built my blog (The Handy Developer's Guide) and lived for the better part of a year and a half, I wanted a platform I could own completely, built with modern engineering standards.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Solution&lt;/strong&gt;&lt;br&gt;
I didn't just ask the AI for code; I managed it. I adopted the mindset of a &lt;strong&gt;Senior Architect&lt;/strong&gt; and treated the AI as my junior developer. &lt;br&gt;
By enforcing strict constraints and architectural patterns, I used vibe coding to build a secure, production-ready application.&lt;br&gt;
The image below is where I started with Gemini. But it gets better down the line. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7ldruwt9no5g41kwg5i2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7ldruwt9no5g41kwg5i2.png" alt="good AI prompt for vibe coding" width="800" height="394"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Steps to Vibe Code a Production-Ready App
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Defining the Architecture of Your Project
&lt;/h3&gt;

&lt;p&gt;Before writing a single line of code, I had to define the stack. A standard AI prompt might suggest a generic React app or a rigid site builder. That was not enough.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Decision&lt;/strong&gt;&lt;br&gt;
I chose a &lt;strong&gt;Headless Architecture&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Frontend:&lt;/strong&gt; Next.js 15 (App Router)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo9zpnyax179asqhufovi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo9zpnyax179asqhufovi.png" alt="frontend choice" width="800" height="388"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Backend:&lt;/strong&gt; &lt;a href="//sanity.io"&gt;Sanity&lt;/a&gt; (Headless CMS)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Styling:&lt;/strong&gt; Tailwind CSS (v4)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Why I Used Sanity as a Headless CMS to Build My Blog
&lt;/h3&gt;

&lt;p&gt;Separation of concerns is critical for long-term survival. With this architecture, I own the code, and I own the content.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Portability:&lt;/strong&gt; If I want to change the design next year, I don't lose my posts. They live safely in Sanity's database.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security:&lt;/strong&gt; There is no exposed database or admin panel for hackers to target on the frontend.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance:&lt;/strong&gt; Paired with Sanity, Next.js allows for Static Site Generation (SSG), meaning my pages load instantly.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Takeaway&lt;/strong&gt;&lt;br&gt;
I did not let the AI pick the stack; I picked the stack, then told the AI how to build it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Fixing AI Hallucinations in Next.js 15
&lt;/h3&gt;

&lt;p&gt;The quality of the output depends entirely on the constraints of the input. I didn't just say, "Make a blog." I assigned a role and a standard.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Trick&lt;/strong&gt;&lt;br&gt;
I used a "System Prompt" strategy to set the ground rules before any code was written.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Prompt&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqggqgaje6t2ycx2xoj62.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqggqgaje6t2ycx2xoj62.png" alt="good prompt engineering for vibe coding" width="800" height="392"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The idea was to have one tab of Gemini 3 acting as the senior developer/project manager, while another tab acted as the engineer/dev on the ground. &lt;br&gt;
So, I had tab A give me the high-level prompts after I had already explained its role to it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Result&lt;/strong&gt;&lt;br&gt;
The AI didn't dump files in the root directory. It set up a professional folder structure (&lt;code&gt;lib/&lt;/code&gt;, &lt;code&gt;components/&lt;/code&gt;, &lt;code&gt;types/&lt;/code&gt;) and automatically created a &lt;code&gt;.env.local&lt;/code&gt; file for credentials. By explicitly banning &lt;code&gt;any&lt;/code&gt; types, the AI was forced to write interface definitions for my Post and Author schemas, preventing runtime crashes later.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F586m48zjgcqs10e1oz19.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F586m48zjgcqs10e1oz19.png" alt="AI acting like a project manager" width="800" height="532"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: How to Stop AI From Hardcoding API Keys
&lt;/h3&gt;

&lt;p&gt;Initially, I spun up a standalone Sanity Studio. I quickly realized this created redundancy—I didn't want to manage two separate projects. I directed the AI to refactor the architecture, merging the CMS directly into the Next.js application using an &lt;strong&gt;Embedded Studio&lt;/strong&gt;. &lt;br&gt;
This is how we managed it. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fva8rshbqrnnx31fz6joe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fva8rshbqrnnx31fz6joe.png" alt="I tell AI my mistake" width="800" height="613"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5zp89jg5h8n7sc99zfex.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5zp89jg5h8n7sc99zfex.png" alt="AI fixes coding error" width="800" height="689"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Result&lt;/strong&gt;&lt;br&gt;
I had a working CMS living independently at &lt;code&gt;/studio&lt;/code&gt; before I even had a homepage. This allowed me to write and structure content immediately, giving the frontend real data to fetch during development.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Using AI to Fix the Errors it Generated
&lt;/h3&gt;

&lt;p&gt;AI is not perfect. Even with a great prompt (I'd know), "hallucinations" happen. I had to do my fair share of debugging, but the errors were more minor than I remember vibe-coded errors being.&lt;br&gt;
We hit two major roadblocks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bug 1: The Route Group Conflict&lt;/strong&gt;&lt;br&gt;
I moved my layout files into a &lt;code&gt;(blog)&lt;/code&gt; route group to organize the code (this was totally my choice, by the way; even though the Project Manager tab suggested it, it said it was optional). Suddenly, "the internet broke." In my terminal, I got error messages about missing tags.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Issue:&lt;/strong&gt; The AI had created a layout hierarchy where the root &lt;code&gt;layout.tsx&lt;/code&gt; was missing the essential &lt;code&gt;&amp;lt;html&amp;gt;&lt;/code&gt; and &lt;code&gt;&amp;lt;body&amp;gt;&lt;/code&gt; tags because I had moved them into the child group.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Fix:&lt;/strong&gt; We refactored the hierarchy. I established a "Root Layout" for the HTML shell and a "Blog Layout" for the Navbar and Footer.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbap876jdk1x9qb9o9oxi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbap876jdk1x9qb9o9oxi.png" alt="AI fixes" width="800" height="521"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bug 2: The "Broken Image" Saga&lt;/strong&gt;&lt;br&gt;
The homepage rendered, but every image was a broken icon. The URL looked correct, but the browser refused to load it.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Issue:&lt;/strong&gt; I already knew this was a security feature, not a bug. Next.js blocks external images by default to prevent malicious injection.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Fix:&lt;/strong&gt; I didn't panic. I just checked the configuration. I prompted the project manager tab to update &lt;code&gt;next.config.ts&lt;/code&gt; to explicitly whitelist &lt;code&gt;cdn.sanity.io&lt;/code&gt;. One server restart later, the images appeared.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The Lesson&lt;/strong&gt;&lt;br&gt;
AI writes the code, but you have to check the config. And sometimes, you just have to turn it off and on again.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 5: Refining the UI in a Vibe Coding Project (current phase)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Design&lt;/strong&gt;&lt;br&gt;
We moved from a sort of skeleton UI to a professional UI. We implemented a "Glassmorphism" navbar with a blur effect and switched to a high-quality typography pairing (Inter for UI, Playfair Display for headings).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzezgyha1e1z1qwvn12mq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzezgyha1e1z1qwvn12mq.png" alt="AI UI change prompt" width="800" height="630"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How to Check If Your Blog is Up To Standard
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;SEO&lt;/strong&gt;&lt;br&gt;
"A blog that doesn't rank is a diary," said someone really famous. &lt;br&gt;
I had the AI implement &lt;strong&gt;Dynamic Metadata&lt;/strong&gt;. &lt;br&gt;
We used the &lt;code&gt;generateMetadata&lt;/code&gt; function to automatically pull the SEO title, description, and OpenGraph images from Sanity. Now, every link shared on social media looks professional.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Analytics&lt;/strong&gt;&lt;br&gt;
I wanted to know if people were reading, but I didn't want to invade their privacy, so we integrated Vercel Analytics, a privacy-friendly tracker that gives me the data I need without the cookie banners users hate.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Proof&lt;/strong&gt;&lt;br&gt;
I ran a Google Lighthouse audit on the production build to verify our "Senior Architect" standards. The results spoke for themselves:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Accessibility:&lt;/strong&gt; 100&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Best Practices:&lt;/strong&gt; 96&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SEO:&lt;/strong&gt; 100&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F979bhtsf3o85d9cassp2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F979bhtsf3o85d9cassp2.png" alt="Google lighthouse score" width="800" height="249"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;My project manager assured me that this was a good score, especially seeing as my blog is not yet live. Getting it live will increase the score.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;I haven't launched the blog yet because I still have some work to do on it, and I haven't properly tested it. &lt;br&gt;
Having been writing articles on Playwright recently, I have learnt how to run extensive tests, simulating different browser and network conditions. &lt;br&gt;
In due time, though, the blog will be launched. &lt;br&gt;
I wrote this article because I wanted to share an update on one of the things I have been working on so far and how AI has helped me.&lt;/p&gt;

&lt;p&gt;Let me know what you think of my journey so far. &lt;br&gt;
Do you have any Vibe coding best practices? &lt;br&gt;
Do you think I am wasting my time and should learn actual programming skills?&lt;/p&gt;

&lt;p&gt;Whatever your opinions, I want to hear them!&lt;/p&gt;

&lt;p&gt;Find me on &lt;a href="//www.linkedin.com/in/dumebi-okolo"&gt;LinkedIn&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
      <category>vibecoding</category>
      <category>beginners</category>
    </item>
    <item>
      <title>Introduction to AI Agents: A Technical Overview for Beginners</title>
      <dc:creator>Dumebi Okolo</dc:creator>
      <pubDate>Thu, 04 Dec 2025 10:53:29 +0000</pubDate>
      <link>https://dev.to/dumebii/introduction-to-ai-agents-a-technical-overview-for-developers-4ih4</link>
      <guid>https://dev.to/dumebii/introduction-to-ai-agents-a-technical-overview-for-developers-4ih4</guid>
      <description>&lt;p&gt;Artificial intelligence has shifted from static prompt–response patterns to systems capable of taking structured actions. These systems are known as &lt;a href="https://cloud.google.com/discover/what-are-ai-agents" rel="noopener noreferrer"&gt;&lt;strong&gt;AI agents&lt;/strong&gt;&lt;/a&gt;. Although the term is often stretched in marketing, the underlying architecture is practical and grounded in well-understood software principles.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;I took the 5-day AI agents intensive course with Google and Kaggle, and I promised myself to document what I learned each day. &lt;br&gt;
It's been a while since I took the course, but I have been putting off writing this article.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This article is part of a five-part series in which I go through each day of the course and share what I learned with you.&lt;/p&gt;

&lt;p&gt;Now, this article outlines the foundational concepts needed to build an AI agent. It also sets the stage for subsequent posts that will explore implementation details, tool integration, orchestration, governance, and evaluation. This is Day One of a multi-part technical series.&lt;/p&gt;


&lt;h2&gt;
  
  
  What Is an AI Agent?
&lt;/h2&gt;

&lt;p&gt;Technically, an AI agent is a &lt;strong&gt;software system that uses a language model, tools, and state management to complete a defined objective&lt;/strong&gt;.&lt;br&gt;
It operates through a controlled cycle of reasoning and action, instead of remaining a passive text generator.&lt;/p&gt;
&lt;h4&gt;
  
  
  A typical AI agent includes:
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;strong&gt;model&lt;/strong&gt; for reasoning&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;set of tools&lt;/strong&gt; for retrieving information or executing operations&lt;/li&gt;
&lt;li&gt;An &lt;strong&gt;orchestrator&lt;/strong&gt; that manages the interaction between the model and those tools&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;deployment layer&lt;/strong&gt; for running the system at scale&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This structure turns a model from a text interface into an operational component that can support business processes or technical workflows.&lt;/p&gt;
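&lt;p&gt;As a rough mental model, the four components above can be wired together as plain data. This is a hedged sketch, not any real framework's API; the &lt;code&gt;AgentSpec&lt;/code&gt; name and its fields are illustrative assumptions.&lt;/p&gt;

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Illustrative sketch only: the class name and fields are assumptions,
# not an actual agent-framework API.
@dataclass
class AgentSpec:
    model: str                                  # reasoning model identifier
    tools: Dict[str, Callable] = field(default_factory=dict)  # action surface
    orchestrator: str = "think-act-observe"     # control-loop strategy
    deployment: str = "local"                   # where the system runs

spec = AgentSpec(
    model="gemini-2.5-flash-lite",
    tools={"search": lambda query: f"results for {query}"},
)
print(spec.model, list(spec.tools))
```

&lt;p&gt;Each of the sections below maps onto one of these fields.&lt;/p&gt;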


&lt;h2&gt;
  
  
  The AI Agent Workflow: The Think–Act–Observe Cycle
&lt;/h2&gt;

&lt;p&gt;All agent systems follow a predictable control loop.&lt;br&gt;
This loop is essential because it governs correctness, safety, and resource usage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Mission Acquisition&lt;/strong&gt;&lt;br&gt;
The system receives a task, either from a user request or an automated trigger.&lt;br&gt;
Example: “Retrieve the status of order #12345.”&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Context Assessment&lt;/strong&gt;&lt;br&gt;
The agent evaluates available information:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prior messages&lt;/li&gt;
&lt;li&gt;Stored state&lt;/li&gt;
&lt;li&gt;Tool definitions&lt;/li&gt;
&lt;li&gt;Policy rules&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;3. Reasoning Step&lt;/strong&gt;&lt;br&gt;
The model generates a plan.&lt;br&gt;
Example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Identify the correct tool for order lookup&lt;/li&gt;
&lt;li&gt;Identify the tool for shipping data retrieval&lt;/li&gt;
&lt;li&gt;Determine response structure&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;4. Action Execution&lt;/strong&gt;&lt;br&gt;
The orchestrator calls the selected tool with validated parameters.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Observation and Iteration&lt;/strong&gt;&lt;br&gt;
The agent incorporates tool output back into its context, reassesses the task, and continues until completion or termination.&lt;/p&gt;

&lt;p&gt;This controlled loop prevents uncontrolled behavior and supports predictable outcomes in production systems.&lt;/p&gt;
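&lt;p&gt;The five steps above can be compressed into a small, framework-free Python sketch. The toy tool, the keyword-based "reasoning", and the &lt;code&gt;max_steps&lt;/code&gt; cap are all illustrative assumptions standing in for a real model and real tools.&lt;/p&gt;

```python
# Framework-free sketch of the Think–Act–Observe loop. The toy tool,
# the keyword-based "reasoning", and max_steps are illustrative
# assumptions standing in for a real model and real tools.

def lookup_order(order_id: str) -> str:
    """Toy data-retrieval tool with a narrow, predictable contract."""
    return f"Order {order_id}: shipped"

TOOLS = {"lookup_order": lookup_order}

def run_agent(task: str, max_steps: int = 5) -> str:
    context = [f"task: {task}"]                 # 1. mission acquisition
    for _ in range(max_steps):                  # step limit regulates cost
        # 2-3. context assessment + reasoning (a stand-in for the model):
        needs_lookup = "order" in task and not any(
            c.startswith("Order") for c in context
        )
        if needs_lookup:
            observation = TOOLS["lookup_order"]("12345")  # 4. action execution
            context.append(observation)                   # 5. observation
        else:
            return context[-1]                  # objective met: final answer
    return "step limit reached"

print(run_agent("Retrieve the status of order #12345"))
```

&lt;p&gt;A production orchestrator replaces the keyword check with a model call, but the control flow, including the hard step limit, keeps this same shape.&lt;/p&gt;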


&lt;h2&gt;
  
  
  Core Architecture of an AI Agent System
&lt;/h2&gt;
&lt;h3&gt;
  
  
  1. Model Layer
&lt;/h3&gt;

&lt;p&gt;The model performs all reasoning.&lt;br&gt;
Selection depends on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Latency requirements&lt;/li&gt;
&lt;li&gt;Cost boundaries&lt;/li&gt;
&lt;li&gt;Task complexity&lt;/li&gt;
&lt;li&gt;Input/output formats&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Multiple models may be used for routing, classification, or staging tasks.&lt;br&gt;
However, initial implementations usually rely on a single model for simplicity.&lt;/p&gt;
&lt;h3&gt;
  
  
  2. Tool Layer
&lt;/h3&gt;

&lt;p&gt;Tools provide operational capability.&lt;br&gt;
A tool is a &lt;strong&gt;function with strict input/output schemas&lt;/strong&gt; and clear documentation.&lt;br&gt;
They fall into categories such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data retrieval (APIs, search functions, database operations)&lt;/li&gt;
&lt;li&gt;Data manipulation (formatting, filtering, transformation)&lt;/li&gt;
&lt;li&gt;Operational actions (ticket creation, notifications, calculations)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Effective tool design keeps actions narrow, predictable, and well-documented.&lt;br&gt;
Tools form the “action surface” of the agent and determine how reliably the system can complete assigned objectives.&lt;/p&gt;
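&lt;p&gt;To make "strict input/output schemas" concrete, here is a hedged sketch of a single tool. The function name, its parameters, and the validation rules are assumptions for illustration; the point is the narrow contract and the docstring a model would rely on when choosing the tool.&lt;/p&gt;

```python
# Hedged sketch of one tool on the agent's "action surface". The name,
# parameters, and validation rules are illustrative assumptions.

def create_ticket(title: str, priority: str) -> dict:
    """Create a support ticket.

    Args:
        title: Short, non-empty summary of the issue.
        priority: One of "low", "medium", "high".

    Returns:
        dict with keys "id", "title", "priority".
    """
    if not title:
        raise ValueError("title must be non-empty")
    if priority not in {"low", "medium", "high"}:
        raise ValueError('priority must be "low", "medium", or "high"')
    return {"id": 1, "title": title, "priority": priority}

print(create_ticket("Printer offline", "high"))
```

&lt;p&gt;The strict signature and docstring are what the model actually "sees" when deciding whether and how to invoke the tool, which is why narrow, well-documented contracts determine reliability.&lt;/p&gt;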
&lt;h3&gt;
  
  
  3. Orchestration Layer
&lt;/h3&gt;

&lt;p&gt;This layer supervises the system. It is responsible for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Running the reasoning loop&lt;/li&gt;
&lt;li&gt;Applying system rules&lt;/li&gt;
&lt;li&gt;Tracking state&lt;/li&gt;
&lt;li&gt;Managing tool invocation&lt;/li&gt;
&lt;li&gt;Handling errors&lt;/li&gt;
&lt;li&gt;Regulating cost and step limits&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It is also the layer where developers define the agent’s operational scope and boundaries.&lt;/p&gt;
&lt;h3&gt;
  
  
  4. Deployment Layer
&lt;/h3&gt;

&lt;p&gt;An agent becomes useful only when deployed as a service.&lt;br&gt;
A typical deployment includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An API interface&lt;/li&gt;
&lt;li&gt;Logging and observability&lt;/li&gt;
&lt;li&gt;Access controls&lt;/li&gt;
&lt;li&gt;Storage for session data or long-term records&lt;/li&gt;
&lt;li&gt;Continuous integration workflows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This layer ensures the agent behaves as a reliable software component rather than a prototype.&lt;/p&gt;


&lt;h2&gt;
  
  
  Capability Levels in AI Agents
&lt;/h2&gt;

&lt;p&gt;Understanding agent capability levels helps to set realistic expectations.&lt;/p&gt;
&lt;h3&gt;
  
  
  Level 0: Model-Only Systems
&lt;/h3&gt;

&lt;p&gt;The model answers queries without tools or memory.&lt;br&gt;
Suitable for text generation or explanation tasks.&lt;/p&gt;
&lt;h3&gt;
  
  
  Level 1: Tool-Connected Systems
&lt;/h3&gt;

&lt;p&gt;The model uses a small set of tools to complete direct tasks.&lt;br&gt;
Example: Querying external APIs for factual information.&lt;/p&gt;
&lt;h3&gt;
  
  
  Level 2: Multi-Step Systems
&lt;/h3&gt;

&lt;p&gt;The agent performs planning and executes sequences of tool calls.&lt;br&gt;
This level supports tasks that require intermediate decisions.&lt;/p&gt;
&lt;h3&gt;
  
  
  Level 3: Multi-Agent Systems
&lt;/h3&gt;

&lt;p&gt;Two or more agents collaborate.&lt;br&gt;
A coordinator routes tasks to specialized agents based on capability or domain.&lt;/p&gt;
&lt;h3&gt;
  
  
  Level 4: Self-Improving Systems
&lt;/h3&gt;

&lt;p&gt;Agents that can create new tools or reconfigure workflows based on observed gaps.&lt;br&gt;
Primarily research-grade today.&lt;/p&gt;


&lt;h2&gt;
  
  
  Building Your First Practical Agent
&lt;/h2&gt;

&lt;p&gt;Developers do not need a complex system to get a simple agent running.&lt;br&gt;
A small, well-defined project is enough to understand the architecture.&lt;/p&gt;

&lt;p&gt;Keep in mind that I ran all this code in &lt;a href="https://www.kaggle.com/code" rel="noopener noreferrer"&gt;Kaggle's Notebook&lt;/a&gt;, and we used Google's Gemini for the project. The screenshots accompanying the code blocks are from my own runs.&lt;/p&gt;
&lt;h3&gt;
  
  
  Step 1. Configure Your Gemini API Key
&lt;/h3&gt;

&lt;p&gt;Every ADK project must expose your Gemini API key to the runtime. This block sets the key as an environment variable, which the ADK automatically detects.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;

&lt;span class="c1"&gt;# Replace with your actual key or load it from your environment manager
&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GOOGLE_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_API_KEY_HERE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;API key configured.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgm47vwl6398gpdhfkxfk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgm47vwl6398gpdhfkxfk.png" alt="step one" width="800" height="256"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Step 2. Import ADK Core Components
&lt;/h3&gt;

&lt;p&gt;These are the foundational ADK modules we'll interact with: agent definitions, model bindings, runtimes, and built-in tools. This is the minimum import set required to stand up a functional agent.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.adk.agents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Agent&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.adk.models.google_llm&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Gemini&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.adk.runners&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;InMemoryRunner&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.adk.tools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;google_search&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;google.genai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Step 3. Optional: Retry Settings
&lt;/h3&gt;

&lt;p&gt;LLM APIs occasionally return transient errors under heavy load. The retry configuration defines a standard exponential backoff strategy so your agent can recover automatically without failing user tasks.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;retry_config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;HttpRetryOptions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;attempts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;exp_base&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;initial_delay&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;http_status_codes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;429&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;503&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;504&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
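&lt;p&gt;To see what these numbers imply, the &lt;em&gt;n&lt;/em&gt;th retry under plain exponential backoff waits roughly &lt;code&gt;initial_delay * exp_base ** n&lt;/code&gt; seconds. The formula below is an assumption about how these options combine (real HTTP clients typically add jitter and caps); it is only meant to show how quickly a base of 7 grows.&lt;/p&gt;

```python
# Hedged sketch: assumes delay_n = initial_delay * exp_base ** n, without
# the jitter or caps a real HTTP client may add on top.
def backoff_delays(attempts: int, exp_base: int, initial_delay: int) -> list:
    return [initial_delay * exp_base ** n for n in range(attempts)]

print(backoff_delays(5, 7, 1))  # [1, 7, 49, 343, 2401] seconds
```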






&lt;h3&gt;
  
  
  Step 4. Define Your First Agent
&lt;/h3&gt;

&lt;p&gt;This is the most important construct. An agent is defined by its behavior (instruction), identity (name/description), model, and available tools. The structure below is portable across any environment.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;root_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;helpful_assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A simple agent that can answer general questions.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;Gemini&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-2.5-flash-lite&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;retry_options&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;retry_config&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;instruction&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a helpful assistant. Use web search for current information.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;google_search&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F47qdawy5yhgvid4wlg9n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F47qdawy5yhgvid4wlg9n.png" alt="Step 4" width="800" height="266"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Step 5. Create a Runner
&lt;/h3&gt;

&lt;p&gt;The Runner orchestrates conversations, tool calls, and message history. For prototyping, &lt;code&gt;InMemoryRunner&lt;/code&gt; is the simplest option because it requires no infrastructure or persistent storage.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;runner&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;InMemoryRunner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;root_agent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Step 6. Run Your Agent
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;run_debug()&lt;/code&gt; executes a complete agent cycle—thought generation, tool selection, action execution, and final synthesis. This is the quickest way to validate that your agent is correctly wired.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;runner&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run_debug&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What is Google&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s Agent Development Kit? What languages are supported?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwuhk6nrn6244pf8ytfcd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwuhk6nrn6244pf8ytfcd.png" alt="Step 6" width="800" height="429"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Step 7. Try a Query That Requires Live Information
&lt;/h3&gt;

&lt;p&gt;This example demonstrates that the agent will automatically invoke the Google Search tool when the prompt requires real-time information not contained in the model’s training data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;runner&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run_debug&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s the weather in London right now?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F19je34k9tzxulgl26hfj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F19je34k9tzxulgl26hfj.png" alt="Step 7" width="800" height="277"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  Step 8. Scaffold an ADK Project Folder (Optional)
&lt;/h3&gt;

&lt;p&gt;ADK includes a CLI for generating full project scaffolds. This is useful when you're ready to move from experimentation into an actual multi-file agent application.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;adk create sample-agent &lt;span class="nt"&gt;--model&lt;/span&gt; gemini-2.5-flash-lite &lt;span class="nt"&gt;--api_key&lt;/span&gt; &lt;span class="nv"&gt;$GOOGLE_API_KEY&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Step 9. Launch the ADK Web UI (Optional)
&lt;/h3&gt;

&lt;p&gt;The ADK Web UI is a local development interface for inspecting agent traces, debugging tool calls, and testing messages. Start it from any terminal—no Kaggle or notebook integration required.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;adk web
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After launching, the UI becomes available at:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;http://localhost:8000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faov44s9s8m0ss02mm0d0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faov44s9s8m0ss02mm0d0.png" alt="agent UI" width="800" height="490"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;Moving forward, my subsequent articles will cover:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Designing reliable tool schemas&lt;/li&gt;
&lt;li&gt;Structuring agent instructions&lt;/li&gt;
&lt;li&gt;Using Model Context Protocol (MCP) in real applications&lt;/li&gt;
&lt;li&gt;Implementing human-in-the-loop workflows&lt;/li&gt;
&lt;li&gt;Tracking performance and diagnosing failures&lt;/li&gt;
&lt;li&gt;Hardening agents against incorrect tool usage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's all for Day 1! Can't wait to get back here for Day 2!&lt;br&gt;
Did you know that the 5-Day AI Agents Intensive Course is now publicly available to learn from? Head over &lt;a href="https://www.kaggle.com/learn-guide/5-day-agents" rel="noopener noreferrer"&gt;here&lt;/a&gt;!&lt;/p&gt;

&lt;p&gt;Let's connect:&lt;br&gt;
&lt;a href="//www.linkedin.com/in/dumebi-okolo"&gt;LinkedIn&lt;/a&gt;&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>ai</category>
      <category>agents</category>
      <category>beginners</category>
    </item>
    <item>
      <title>Migraine Awareness Week 2025: Living With New Daily Persistent Headache (NDPH)</title>
      <dc:creator>Dumebi Okolo</dc:creator>
      <pubDate>Wed, 24 Sep 2025 15:23:36 +0000</pubDate>
      <link>https://dev.to/dumebii/migraine-awareness-week-2025-living-with-new-daily-persistent-headache-ndph-2jip</link>
      <guid>https://dev.to/dumebii/migraine-awareness-week-2025-living-with-new-daily-persistent-headache-ndph-2jip</guid>
      <description>&lt;p&gt;September 22nd to September 28th marks Migraine Awareness Week 2025, a week dedicated to raising awareness about chronic headache conditions. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Dear Dev Community, I know that this isn't developer-focused content, but as the name of this app implies, this is a community. And one of the benefits of a community is sharing your highs and lows. &lt;br&gt;
Today, I'm sharing a very big low of mine, hoping it reaches the right people and brings comfort to those who need it.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;While migraines are well-known, there's another, lesser-discussed disorder that deserves attention: New Daily Persistent Headache (NDPH).&lt;br&gt;
I've lived with NDPH since 2018 (meaning my head hasn't stopped aching for 7+ years), and this is my story.&lt;/p&gt;


&lt;h2&gt;
  
  
  How The Headaches Began
&lt;/h2&gt;

&lt;p&gt;In 2018, my life changed suddenly. One day, my head began to ache - and the pain never went away. It wasn't just the constant headache that puzzled me. I was also overwhelmed by intense hunger pangs that seemed impossible to satisfy.&lt;br&gt;
I thought I had an endocrine problem, so I kept visiting specialists. Instead of answers, I got brushed aside. Some doctors accused me of exaggerating, others said I was just seeking attention. Those words hurt deeply, and the lack of support made the condition even harder to bear.&lt;br&gt;
Meanwhile, the hunger took a toll. I gained weight year after year, carrying not only the burden of constant pain but also the visible effects of a body I couldn't control.&lt;br&gt;
The hunger is still there, but I feel that I have grown a lot since 2018, and I am better able to handle and control myself. By my estimate, I have gained about 50–60kg (100–120lbs) since the onset of NDPH and the hunger pangs in 2018.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq76sum3fpq31dvorlyqx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq76sum3fpq31dvorlyqx.png" alt="talking about NDPH"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  The Long Journey to Diagnosis
&lt;/h2&gt;

&lt;p&gt;Over the years, I tried almost everything - painkillers, anti-seizure drugs, blood pressure medication, endless consultations. Nothing helped.&lt;br&gt;
Two years ago, I finally met &lt;strong&gt;Professor Enoch&lt;/strong&gt;, a neurosurgeon who diagnosed me with New Daily Persistent Headache (NDPH). His verdict was blunt: there's no cure, only management.&lt;br&gt;
It was both crushing and liberating. Crushing because there was no solution, liberating because I finally had a name for my condition. I wasn't imagining it.&lt;/p&gt;


&lt;h2&gt;
  
  
  What Is New Daily Persistent Headache (NDPH)?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff8vb8a58rjaz6tve4i0d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff8vb8a58rjaz6tve4i0d.png" alt="What is NDPH?"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;NDPH is a rare and stubborn primary headache disorder. Unlike migraines or tension headaches, it starts suddenly and without warning - many people remember the exact day it began - and then it becomes constant.&lt;/p&gt;

&lt;p&gt;Key features of NDPH include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Sudden onset:&lt;/strong&gt; A headache that begins one day and never goes away.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Duration:&lt;/strong&gt; Lasts for more than three months with no break.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Symptoms:&lt;/strong&gt; Can resemble migraines (throbbing pain, sensitivity to light and sound, nausea) or tension headaches (tight, pressing pain).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Causes:&lt;/strong&gt; Still unclear. It may follow infections, stressful life events, or appear spontaneously.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Unfortunately, NDPH is known for being difficult to treat, and many patients - myself included - go through years of trial and error with little relief.&lt;/p&gt;


&lt;h2&gt;
  
  
  Misdiagnosis and the Emotional Toll
&lt;/h2&gt;

&lt;p&gt;One of the toughest parts of this journey has been not being believed. Because NDPH is rare, most doctors don't recognise it right away. Instead, patients are bounced from one clinic to another, sometimes treated as if they're making it all up.&lt;br&gt;
This experience leaves scars. The pain itself is exhausting, but the dismissal from medical professionals adds another layer of suffering. For me, that rejection was almost as heavy as the headaches themselves.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft6q3ed73bgrnh6w18k5l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft6q3ed73bgrnh6w18k5l.png" alt="mis-diagnosing NDPH"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  Learning to Manage Life With NDPH
&lt;/h2&gt;

&lt;p&gt;Today, I've stopped chasing miracle cures. I'm learning how to live with NDPH, and that shift in mindset has given me some peace.&lt;/p&gt;

&lt;p&gt;Here's what's helped me:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Acceptance:&lt;/strong&gt; Realizing the headache may never fully go away.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Managing hunger:&lt;/strong&gt; Finding strategies to control the relentless food cravings that come with my condition. I often go on fasting periods to prove to my body that I, too, can win!&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Lifestyle adjustments:&lt;/strong&gt; Pacing myself, managing stress, and prioritizing rest.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Education and advocacy:&lt;/strong&gt; Speaking openly about NDPH to raise awareness.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It's not a perfect solution, but it's how I take back some control.&lt;/p&gt;


&lt;h2&gt;
  
  
  Why Awareness Matters During Migraine Awareness Week
&lt;/h2&gt;

&lt;p&gt;NDPH isn't the same as migraine, but both are life-altering headache disorders that deserve compassion, understanding, and research. By sharing my story during Migraine Awareness Week 2025, I want to remind people:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Chronic headaches are real, and they change lives.&lt;/li&gt;
&lt;li&gt;Patients deserve to be listened to, not dismissed.&lt;/li&gt;
&lt;li&gt;More research is urgently needed for conditions like NDPH.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkdtb4ju1bq3a6fmi8n6n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkdtb4ju1bq3a6fmi8n6n.png" alt="symptoms of NDPH"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Awareness won't cure me, but it can help shift how society responds to people living with invisible pain.&lt;/p&gt;


&lt;h2&gt;
  
  
  Does NDPH Count as a Disability?
&lt;/h2&gt;

&lt;p&gt;This is a question I often ask myself. On one hand, NDPH doesn't always show up on the outside, so people assume you're fine. But living with daily pain absolutely affects work, social life, and mental health in ways that can be disabling.&lt;br&gt;
Some organisations and countries recognise chronic headache disorders as disabilities, while others don't. For me, the label matters less than the recognition that this condition does make daily living harder. Still, it's an important question for society to wrestle with.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What do you think? Should NDPH be recognised as a disability?&lt;/strong&gt;&lt;/p&gt;



&lt;p&gt;✍️ Written in honour of Migraine Awareness Week 2025, and in hope of sparking conversation around New Daily Persistent Headache.&lt;/p&gt;



&lt;p&gt;  &lt;iframe src="https://www.youtube.com/embed/x41yiKp7c1E"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;Watching this explainer video will help. You might also spot my comment among the others. 😂😂 Ignore it, please!&lt;/p&gt;




&lt;p&gt;All the screenshots shared in this article are from &lt;a href="https://my.clevelandclinic.org/health/diseases/24098-new-daily-persistent-headache-ndph" rel="noopener noreferrer"&gt;Cleveland Clinic's article on New Daily Persistent Headache (NDPH)&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This is a one-off article. &lt;/p&gt;

</description>
      <category>watercooler</category>
      <category>mentalhealth</category>
      <category>wellbeing</category>
      <category>devhealth</category>
    </item>
  </channel>
</rss>
