{"id":59765,"date":"2026-03-11T14:33:32","date_gmt":"2026-03-11T18:33:32","guid":{"rendered":"https:\/\/codesamplez.com\/?p=59765"},"modified":"2026-03-11T14:33:34","modified_gmt":"2026-03-11T18:33:34","slug":"local-ai-coding-agent","status":"publish","type":"post","link":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent","title":{"rendered":"Local LLM for Coding: Free AI Coding Agent With Ollama + Claude"},"content":{"rendered":"\n<p>If you&#8217;ve been curious about running a <strong>local LLM for coding<\/strong> but felt overwhelmed by the sheer number of models, runtimes, and configuration options, this guide is for you. Not the &#8220;tab-complete a for loop&#8221; kind of local. I&#8217;m talking full agentic coding \u2014 an AI agent that reads your repo, plans changes across multiple files, runs shell commands, executes tests, and iterates on errors. All on your machine. Zero API costs. Zero code leaving your network.<\/p>\n\n\n\n<p>The stack we&#8217;re building: <strong>Ollama + Claude Code + an open-weight model.<\/strong> It takes about 15 minutes to get running.<\/p>\n\n\n\n<div class=\"schema-faq wp-block-yoast-faq-block\"><div class=\"schema-faq-section\" id=\"faq-question-1773052601299\"><strong class=\"schema-faq-question\">What is a local LLM?<\/strong> <p class=\"schema-faq-answer\">A local LLM for coding is a large language model running entirely on your own hardware \u2014 laptop, workstation, or home server \u2014 that provides AI-powered code generation, refactoring, and agentic multi-file editing without sending a single line of code to the cloud. It matters because it gives you complete privacy, zero recurring costs, and AI coding assistance that works even without internet.<\/p> <\/div> <\/div>\n\n\n\n<p>\ud83d\udca1Pro Tip: New to agentic coding? 
Consider going through the <a href=\"https:\/\/codesamplez.com\/productivity\/ai-coding-agent\" type=\"post\" id=\"57985\" target=\"_blank\" rel=\"noreferrer noopener\">beginner&#8217;s guide to AI coding assistant setup<\/a> first.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-why-local-llm-changes-everything-for-agentic-coding\">Why &#8220;Local LLM&#8221; Changes Everything for Agentic Coding<\/h2>\n\n\n\n<p>Before we set anything up, let me be specific about what changes when you move agentic coding off the cloud.<\/p>\n\n\n\n<p><strong>Privacy becomes real, not theoretical.<\/strong> Agentic coding tools read <em>everything<\/em> \u2014 your entire repo structure, environment files, config files, test output, shell history. With a cloud provider, all of that context ships to someone else&#8217;s servers. After Samsung&#8217;s engineers accidentally uploaded confidential source code to ChatGPT back in 2023, this stopped being a paranoia issue and became a policy issue at many companies. Running locally means your proprietary business logic stays on your machine. Period.<\/p>\n\n\n\n<p><strong>Cost drops to zero.<\/strong> Claude Code on Anthropic&#8217;s API is powerful but expensive \u2014 heavy agentic sessions with Opus can burn through $5-15\/hour easily. Locally? Your electricity bill goes up by a few dollars &#8230; for the entire month! The math is absurdly good once you have the hardware.<\/p>\n\n\n\n<p><strong>Availability becomes unconditional.<\/strong> Flights, coffee shops with garbage Wi-Fi, AWS outages that take down half the internet \u2014 none of it matters. Your AI coding agent is always on.<\/p>\n\n\n\n<p><strong>The quality tradeoff is shrinking fast.<\/strong> In 2023, local models were a joke for agentic work. 
In 2026, models like <a href=\"https:\/\/huggingface.co\/Qwen\/Qwen3.5-35B-A3B\">Qwen 3.5<\/a> and <a href=\"https:\/\/ollama.com\/library\/glm-4.7-flash\">GLM-4.7-Flash<\/a> genuinely handle multi-file edits, tool calling, and long-context planning. They won&#8217;t match Opus 4.6 on the hardest tasks, but for building features, fixing bugs, and scaffolding projects? They&#8217;re shockingly competent.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-setting-expectations\">Setting Expectations<\/h3>\n\n\n\n<p>Here&#8217;s what nobody warns you about: <strong>a local model won&#8217;t feel exactly like Claude Opus or GPT-5 on day one.<\/strong> And that&#8217;s okay.<\/p>\n\n\n\n<p><strong>Local models are best at:<\/strong> scaffolding new features, generating boilerplate, writing tests, fixing bugs with clear error messages, and iterating on code with feedback loops. <\/p>\n\n\n\n<p><strong>They&#8217;re weaker at:<\/strong> massive architectural decisions across 20+ files, highly nuanced refactoring of complex legacy code, and tasks requiring enormous context windows. <\/p>\n\n\n\n<p>The sweet spot for local agentic coding is tasks where you can describe what you want, let the agent take a first pass, review, and iterate. That covers probably 80% of daily development work.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-choose-the-right-model\">Choose the Right Model<\/h2>\n\n\n\n<p>This is where most people get stuck. Hundreds of models on <a href=\"https:\/\/huggingface.co\" target=\"_blank\" rel=\"noreferrer noopener\">Hugging Face<\/a>, new ones every week, every Reddit thread recommending something different. 
Let me cut through the noise.<\/p>\n\n\n\n<p>For agentic coding, your model needs three things:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li><strong>long context<\/strong> (~64K tokens minimum),<\/li><li><strong>tool calling support<\/strong> (so the agent can execute commands, read files, run tests), and<\/li><li><strong>strong instruction following<\/strong> (so it doesn&#8217;t go off the rails mid-task).<\/li><\/ul>\n\n\n\n<p>Not every model delivers on all three.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-the-models-worth-running-right-now\">The Models Worth Running Right Now<\/h3>\n\n\n\n<p><strong><a href=\"https:\/\/huggingface.co\/Qwen\/Qwen3.5-35B-A3B\" target=\"_blank\" rel=\"noreferrer noopener\">Qwen 3.5 35B-A3B<\/a><\/strong> \u2014 My current daily driver. Released February 2026, this is a 35B parameter MoE model that only activates 3B parameters per token. That means it&#8217;s fast AND smart. It supports 256K context natively, has strong agentic capabilities, and its tool calling works reliably with Claude Code. Benchmarks back this up \u2014 the 35B-A3B model surpasses much larger predecessors like Qwen3-235B, as well as proprietary models like GPT-5 mini and Claude Sonnet 4.5 in categories including knowledge and visual reasoning. Runs comfortably on 32GB unified memory or a 24GB GPU. Apache 2.0 license, fully open for commercial use.<\/p>\n\n\n\n<p><strong><a href=\"https:\/\/ollama.com\/library\/glm-4.7-flash\" target=\"_blank\" rel=\"noreferrer noopener\">GLM-4.7-Flash<\/a><\/strong> \u2014 The best all-around model for 24GB VRAM setups. GLM-4.7-Flash dominates with a 30.1 Intelligence Index and won agentic coding challenges in recent independent testing. It handles planning, multi-step tool use, and code generation across multiple files with real consistency. 
If Qwen 3.5 35B doesn&#8217;t click for you, GLM-4.7-Flash is a rock-solid alternative.<\/p>\n\n\n\n<p><strong><a href=\"https:\/\/ollama.com\/library\/qwen3-coder:30b\" target=\"_blank\" rel=\"noreferrer noopener\">Qwen3-Coder 30B-A3B<\/a><\/strong> \u2014 Purpose-built for coding agents. It offers 30B total parameters with only 3.3B activated, with exceptional agentic capabilities for real-world software engineering tasks and native support for 256K tokens. Trained specifically on agentic coding workflows through reinforcement learning on SWE-Bench. If your work is purely code (no general reasoning, no docs), this specialist might outperform the generalists above.<\/p>\n\n\n\n<p><strong><a href=\"https:\/\/ollama.com\/library\/gpt-oss:20b\" target=\"_blank\" rel=\"noreferrer noopener\">GPT-OSS 20B<\/a><\/strong> \u2014 OpenAI&#8217;s open-weight model. Strong reasoning and tool calling capabilities. A solid option at ~13GB, it fits on more modest hardware while still handling agentic workflows.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-match-your-hardware-to-a-local-llm-model\">Match Your Hardware to a Local LLM Model<\/h3>\n\n\n\n<p>No CPU-only options here \u2014 agentic coding needs responsive inference, and CPU-only speeds are too slow for the multi-turn, tool-calling loops that Claude Code runs. You need a GPU or Apple Silicon unified memory.<\/p>\n\n\n\n<p>Here&#8217;s the straightforward mapping:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Your Hardware<\/th><th>Available Memory<\/th><th>Model to Run<\/th><th>Download Size<\/th><th>What to Expect<\/th><\/tr><\/thead><tbody><tr><td>Mac M1\/M2\/M3 with 16GB unified<\/td><td>~14GB usable<\/td><td>GPT-OSS 20B (Q4)<\/td><td>~13GB<\/td><td>Workable but tight. Close other apps.<\/td><\/tr><tr><td>Mac M2\/M3\/M4 with 32GB unified<\/td><td>~28GB usable<\/td><td>Qwen 3.5 35B-A3B (Q4)<\/td><td>~22GB<\/td><td>Smooth daily driver. 
My recommended starting point.<\/td><\/tr><tr><td>Mac M2\/M3\/M4 with 48-64GB unified<\/td><td>~44-58GB usable<\/td><td>Qwen 3.5 35B-A3B (Q8) or 122B-A10B (Q4)<\/td><td>~35GB \/ ~70GB<\/td><td>Premium experience, larger context windows.<\/td><\/tr><tr><td>NVIDIA GPU with 16GB VRAM<\/td><td>16GB<\/td><td>GPT-OSS 20B (Q4)<\/td><td>~13GB<\/td><td>Solid agentic coding for most tasks.<\/td><\/tr><tr><td>NVIDIA GPU with 24GB VRAM<\/td><td>24GB<\/td><td>GLM-4.7-Flash (Q4) or Qwen 3.5 35B-A3B (Q4)<\/td><td>~18-22GB<\/td><td>Excellent. This is the NVIDIA sweet spot.<\/td><\/tr><tr><td>NVIDIA GPU with 48GB+ VRAM<\/td><td>48GB+<\/td><td>Qwen 3.5 122B-A10B (Q4)<\/td><td>~70GB<\/td><td>Near-cloud quality, serious agentic power.<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p><strong>The quick rule of thumb:<\/strong> Take your available memory, subtract 2-4GB for overhead, and that&#8217;s your model size budget at Q4 quantization.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p><strong>What&#8217;s Q4 quantization?<\/strong> Quantization compresses model weights from 16-bit floats to smaller integers, dramatically reducing memory usage. Q4_K_M is the sweet spot \u2014 it cuts memory by ~75% with minimal quality loss. Below Q3, quality degrades noticeably. Ollama handles quantization automatically when you pull a model, so you don&#8217;t need to worry about the details.<\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-setup-guide-ollama-claude-code\">Setup Guide: Ollama + Claude Code<\/h2>\n\n\n\n<p>One path. No forks. 
Let&#8217;s get a working local AI coding agent in 15 minutes.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-step-1-install-ollama\">Step 1: Install Ollama<\/h3>\n\n\n\n<p><a href=\"https:\/\/ollama.com\" target=\"_blank\" rel=\"noreferrer noopener\">Ollama<\/a> handles model downloads, GPU detection, quantization, and serves an API that Claude Code talks to. It&#8217;s the foundation.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro padding-bottom-disabled cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(1 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly># macOS (Homebrew)\nbrew install ollama\n\n# Linux\ncurl -fsSL https:\/\/ollama.ai\/install.sh | sh\n\n# Windows \u2014 download installer from https:\/\/ollama.com<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M4.5 12.75l6 6 9-13.5\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M16.5 8.25V6a2.25 2.25 0 00-2.25-2.25H6A2.25 2.25 0 003.75 6v8.25A2.25 2.25 0 006 16.5h2.25m8.25-8.25H18a2.25 2.25 0 012.25 2.25V18A2.25 2.25 0 0118 20.25h-7.5A2.25 2.25 0 018.25 18v-1.5m8.25-8.25h-6a2.25 2.25 0 00-2.25 2.25v6\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" 
tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #616E88\"># macOS (Homebrew)<\/span><\/span>\n<span class=\"line\"><span style=\"color: #88C0D0\">brew<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">install<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">ollama<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #616E88\"># Linux<\/span><\/span>\n<span class=\"line\"><span style=\"color: #88C0D0\">curl<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">-fsSL<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">https:\/\/ollama.ai\/install.sh<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #81A1C1\">|<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">sh<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #616E88\"># Windows \u2014 download installer from https:\/\/ollama.com<\/span><\/span><\/code><\/pre><span style=\"display:flex;align-items:flex-end;padding:10px;width:100%;justify-content:flex-end;background-color:#2e3440ff;color:#c8d0e0;font-size:12px;line-height:1;position:relative\">Bash<\/span><\/div>\n\n\n\n<p>Verify it&#8217;s working:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro padding-bottom-disabled cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(1 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea 
class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>ollama --version\nollama serve    # starts the background server (may already be running)<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M4.5 12.75l6 6 9-13.5\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M16.5 8.25V6a2.25 2.25 0 00-2.25-2.25H6A2.25 2.25 0 003.75 6v8.25A2.25 2.25 0 006 16.5h2.25m8.25-8.25H18a2.25 2.25 0 012.25 2.25V18A2.25 2.25 0 0118 20.25h-7.5A2.25 2.25 0 018.25 18v-1.5m8.25-8.25h-6a2.25 2.25 0 00-2.25 2.25v6\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #88C0D0\">ollama<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">--version<\/span><\/span>\n<span class=\"line\"><span style=\"color: #88C0D0\">ollama<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">serve<\/span><span style=\"color: #D8DEE9FF\">    <\/span><span style=\"color: #616E88\"># starts the background server (may already be running)<\/span><\/span><\/code><\/pre><span style=\"display:flex;align-items:flex-end;padding:10px;width:100%;justify-content:flex-end;background-color:#2e3440ff;color:#c8d0e0;font-size:12px;line-height:1;position:relative\">Bash<\/span><\/div>\n\n\n\n<p><em>Alt text: Terminal commands showing Ollama installation on macOS, Linux, and Windows with version verification<\/em><\/p>\n\n\n\n<p><strong>Important:<\/strong> Make sure you&#8217;re on Ollama v0.14.0 or later. In January 2026, Ollama added support for the Anthropic Messages API, enabling Claude Code to connect directly to any Ollama model. 
Older versions won&#8217;t work with Claude Code.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-step-2-pull-a-coding-model\">Step 2: Pull a Coding Model<\/h3>\n\n\n\n<p>Pick a model from the recommendations above based on your hardware. Here are the pull commands:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro padding-bottom-disabled cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(2 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly># For 32GB+ unified memory or 24GB VRAM (recommended starting point):\nollama pull qwen3.5:35b-a3b\n\n# Alternative for 24GB VRAM:\nollama pull glm-4.7-flash\n\n# For coding-specific workloads:\nollama pull qwen3-coder:30b\n\n# For 16GB setups:\nollama pull gpt-oss:20b<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M4.5 12.75l6 6 9-13.5\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M16.5 8.25V6a2.25 2.25 0 00-2.25-2.25H6A2.25 2.25 0 003.75 6v8.25A2.25 2.25 0 006 16.5h2.25m8.25-8.25H18a2.25 2.25 0 012.25 2.25V18A2.25 2.25 0 0118 20.25h-7.5A2.25 2.25 0 018.25 18v-1.5m8.25-8.25h-6a2.25 2.25 0 00-2.25 2.25v6\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" 
tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #616E88\"># For 32GB+ unified memory or 24GB VRAM (recommended starting point):<\/span><\/span>\n<span class=\"line\"><span style=\"color: #88C0D0\">ollama<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">pull<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">qwen3.5:35b-a3b<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #616E88\"># Alternative for 24GB VRAM:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #88C0D0\">ollama<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">pull<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">glm-4.7-flash<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #616E88\"># For coding-specific workloads:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #88C0D0\">ollama<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">pull<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">qwen3-coder:30b<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #616E88\"># For 16GB setups:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #88C0D0\">ollama<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">pull<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">gpt-oss:20b<\/span><\/span><\/code><\/pre><span style=\"display:flex;align-items:flex-end;padding:10px;width:100%;justify-content:flex-end;background-color:#2e3440ff;color:#c8d0e0;font-size:12px;line-height:1;position:relative\">Bash<\/span><\/div>\n\n\n\n<p>Quick sanity check \u2014 run it interactively to confirm it works:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro padding-bottom-disabled cbp-has-line-numbers\" 
data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(1 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>ollama run qwen3.5:35b-a3b \"Write a Python function that finds the longest palindromic substring\"<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M4.5 12.75l6 6 9-13.5\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M16.5 8.25V6a2.25 2.25 0 00-2.25-2.25H6A2.25 2.25 0 003.75 6v8.25A2.25 2.25 0 006 16.5h2.25m8.25-8.25H18a2.25 2.25 0 012.25 2.25V18A2.25 2.25 0 0118 20.25h-7.5A2.25 2.25 0 018.25 18v-1.5m8.25-8.25h-6a2.25 2.25 0 00-2.25 2.25v6\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #88C0D0\">ollama<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">run<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">qwen3.5:35b-a3b<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">Write a Python function that finds the longest palindromic substring<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><\/span><\/code><\/pre><span 
style=\"display:flex;align-items:flex-end;padding:10px;width:100%;justify-content:flex-end;background-color:#2e3440ff;color:#c8d0e0;font-size:12px;line-height:1;position:relative\">Bash<\/span><\/div>\n\n\n\n<p>If you get a sensible response, you&#8217;re golden. If it&#8217;s painfully slow or crashes, you need a smaller model for your hardware. Type <code>\/bye<\/code> to exit the interactive session.<\/p>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-step-3-install-claude-code\">Step 3: Install Claude Code<\/h3>\n\n\n\n<p>Claude Code is Anthropic&#8217;s terminal-based agentic coding tool. It can read your repo, plan changes, edit files, run commands, and iterate \u2014 all from your terminal. And thanks to Ollama&#8217;s Anthropic API compatibility, it works with local models.<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro padding-bottom-disabled cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(1 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly># macOS \/ Linux \/ WSL\ncurl -fsSL https:\/\/claude.ai\/install.sh | bash\n\n# Windows CMD\ncurl -fsSL https:\/\/claude.ai\/install.cmd -o install.cmd &amp;&amp; install.cmd &amp;&amp; del install.cmd<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" 
stroke-linejoin=\"round\" d=\"M4.5 12.75l6 6 9-13.5\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M16.5 8.25V6a2.25 2.25 0 00-2.25-2.25H6A2.25 2.25 0 003.75 6v8.25A2.25 2.25 0 006 16.5h2.25m8.25-8.25H18a2.25 2.25 0 012.25 2.25V18A2.25 2.25 0 0118 20.25h-7.5A2.25 2.25 0 018.25 18v-1.5m8.25-8.25h-6a2.25 2.25 0 00-2.25 2.25v6\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #616E88\"># macOS \/ Linux \/ WSL<\/span><\/span>\n<span class=\"line\"><span style=\"color: #88C0D0\">curl<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">-fsSL<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">https:\/\/claude.ai\/install.sh<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #81A1C1\">|<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">bash<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #616E88\"># Windows CMD<\/span><\/span>\n<span class=\"line\"><span style=\"color: #88C0D0\">curl<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">-fsSL<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">https:\/\/claude.ai\/install.cmd<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">-o<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">install.cmd<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #ECEFF4\">&amp;&amp;<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">install.cmd<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #ECEFF4\">&amp;&amp;<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #88C0D0\">del<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: 
#A3BE8C\">install.cmd<\/span><\/span><\/code><\/pre><span style=\"display:flex;align-items:flex-end;padding:10px;width:100%;justify-content:flex-end;background-color:#2e3440ff;color:#c8d0e0;font-size:12px;line-height:1;position:relative\">Bash<\/span><\/div>\n\n\n\n<p>Verify the installation:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro padding-bottom-disabled cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(1 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>claude --version<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M4.5 12.75l6 6 9-13.5\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M16.5 8.25V6a2.25 2.25 0 00-2.25-2.25H6A2.25 2.25 0 003.75 6v8.25A2.25 2.25 0 006 16.5h2.25m8.25-8.25H18a2.25 2.25 0 012.25 2.25V18A2.25 2.25 0 0118 20.25h-7.5A2.25 2.25 0 018.25 18v-1.5m8.25-8.25h-6a2.25 2.25 0 00-2.25 2.25v6\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #88C0D0\">claude<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">--version<\/span><\/span><\/code><\/pre><span 
style=\"display:flex;align-items:flex-end;padding:10px;width:100%;justify-content:flex-end;background-color:#2e3440ff;color:#c8d0e0;font-size:12px;line-height:1;position:relative\">Bash<\/span><\/div>\n\n\n\n<p><\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-step-4-connect-claude-code-to-ollama\">Step 4: Connect Claude Code to Ollama<\/h3>\n\n\n\n<p>This is the key step. You need to tell Claude Code to talk to your local Ollama server instead of Anthropic&#8217;s cloud API.<\/p>\n\n\n\n<p><strong>Option A: The one-liner (easiest)<\/strong><\/p>\n\n\n\n<p>If your Ollama is up to date, this single command handles everything:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro padding-bottom-disabled cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(1 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>ollama launch claude --model qwen3.5:35b-a3b<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M4.5 12.75l6 6 9-13.5\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M16.5 8.25V6a2.25 2.25 0 00-2.25-2.25H6A2.25 2.25 0 003.75 6v8.25A2.25 2.25 0 006 16.5h2.25m8.25-8.25H18a2.25 2.25 0 012.25 2.25V18A2.25 2.25 0 0118 20.25h-7.5A2.25 2.25 0 018.25 
18v-1.5m8.25-8.25h-6a2.25 2.25 0 00-2.25 2.25v6\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #88C0D0\">ollama<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">launch<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">claude<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">--model<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">qwen3.5:35b-a3b<\/span><\/span><\/code><\/pre><span style=\"display:flex;align-items:flex-end;padding:10px;width:100%;justify-content:flex-end;background-color:#2e3440ff;color:#c8d0e0;font-size:12px;line-height:1;position:relative\">Bash<\/span><\/div>\n\n\n\n<p>That&#8217;s it. Ollama sets the environment variables and launches Claude Code pointed at your local model. \ud83c\udf89<\/p>\n\n\n\n<p><strong>Option B: Manual environment variables (if <code>ollama launch<\/code> isn&#8217;t available)<\/strong><\/p>\n\n\n\n<p>Add these to your <code>~\/.bashrc<\/code>, <code>~\/.zshrc<\/code>, or run them before launching Claude Code:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro padding-bottom-disabled cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(1 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>export 
ANTHROPIC_BASE_URL=\"http:\/\/localhost:11434\"\nexport ANTHROPIC_AUTH_TOKEN=\"ollama\"\nexport ANTHROPIC_API_KEY=\"\"<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M4.5 12.75l6 6 9-13.5\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M16.5 8.25V6a2.25 2.25 0 00-2.25-2.25H6A2.25 2.25 0 003.75 6v8.25A2.25 2.25 0 006 16.5h2.25m8.25-8.25H18a2.25 2.25 0 012.25 2.25V18A2.25 2.25 0 0118 20.25h-7.5A2.25 2.25 0 018.25 18v-1.5m8.25-8.25h-6a2.25 2.25 0 00-2.25 2.25v6\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #81A1C1\">export<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">ANTHROPIC_BASE_URL<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">http:\/\/localhost:11434<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #81A1C1\">export<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">ANTHROPIC_AUTH_TOKEN<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">ollama<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #81A1C1\">export<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #D8DEE9\">ANTHROPIC_API_KEY<\/span><span style=\"color: #81A1C1\">=<\/span><span style=\"color: #ECEFF4\">&quot;&quot;<\/span><\/span><\/code><\/pre><span 
style=\"display:flex;align-items:flex-end;padding:10px;width:100%;justify-content:flex-end;background-color:#2e3440ff;color:#c8d0e0;font-size:12px;line-height:1;position:relative\">Bash<\/span><\/div>\n\n\n\n<p>Then launch Claude Code in your project directory:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro padding-bottom-disabled cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(1 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>cd \/path\/to\/your\/project\nclaude<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M4.5 12.75l6 6 9-13.5\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M16.5 8.25V6a2.25 2.25 0 00-2.25-2.25H6A2.25 2.25 0 003.75 6v8.25A2.25 2.25 0 006 16.5h2.25m8.25-8.25H18a2.25 2.25 0 012.25 2.25V18A2.25 2.25 0 0118 20.25h-7.5A2.25 2.25 0 018.25 18v-1.5m8.25-8.25h-6a2.25 2.25 0 00-2.25 2.25v6\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #88C0D0\">cd<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">\/path\/to\/your\/project<\/span><\/span>\n<span class=\"line\"><span style=\"color: 
#88C0D0\">claude<\/span><\/span><\/code><\/pre><span style=\"display:flex;align-items:flex-end;padding:10px;width:100%;justify-content:flex-end;background-color:#2e3440ff;color:#c8d0e0;font-size:12px;line-height:1;position:relative\">Bash<\/span><\/div>\n\n\n\n<p><strong>Option C: Persistent config (recommended for daily use)<\/strong><\/p>\n\n\n\n<p>Add the settings to Claude Code&#8217;s config file at <code>~\/.claude\/settings.json<\/code>:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro padding-bottom-disabled cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(1 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>{\n  \"env\": {\n    \"ANTHROPIC_BASE_URL\": \"http:\/\/localhost:11434\",\n    \"ANTHROPIC_AUTH_TOKEN\": \"ollama\",\n    \"ANTHROPIC_API_KEY\": \"\"\n  }\n}<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M4.5 12.75l6 6 9-13.5\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M16.5 8.25V6a2.25 2.25 0 00-2.25-2.25H6A2.25 2.25 0 003.75 6v8.25A2.25 2.25 0 006 16.5h2.25m8.25-8.25H18a2.25 2.25 0 012.25 2.25V18A2.25 2.25 0 0118 20.25h-7.5A2.25 2.25 0 018.25 18v-1.5m8.25-8.25h-6a2.25 2.25 0 00-2.25 2.25v6\"><\/path><\/svg><\/span><pre 
class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #ECEFF4\">{<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">  <\/span><span style=\"color: #88C0D0\">&quot;env&quot;<\/span><span style=\"color: #88C0D0\">:<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">{<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">    <\/span><span style=\"color: #88C0D0\">&quot;ANTHROPIC_BASE_URL&quot;<\/span><span style=\"color: #88C0D0\">:<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">http:\/\/localhost:11434<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">    <\/span><span style=\"color: #88C0D0\">&quot;ANTHROPIC_AUTH_TOKEN&quot;<\/span><span style=\"color: #88C0D0\">:<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">ollama<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">,<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">    <\/span><span style=\"color: #88C0D0\">&quot;ANTHROPIC_API_KEY&quot;<\/span><span style=\"color: #88C0D0\">:<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #ECEFF4\">&quot;&quot;<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">  <\/span><span style=\"color: #ECEFF4\">}<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">}<\/span><\/span><\/code><\/pre><span style=\"display:flex;align-items:flex-end;padding:10px;width:100%;justify-content:flex-end;background-color:#2e3440ff;color:#c8d0e0;font-size:12px;line-height:1;position:relative\">JSON<\/span><\/div>\n\n\n\n<p>Now every time you run <code>claude<\/code>, it&#8217;ll automatically connect 
to Ollama. Switch models anytime with:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro padding-bottom-disabled cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(1 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>claude --model qwen3.5:35b-a3b\nclaude --model glm-4.7-flash\nclaude --model qwen3-coder:30b<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M4.5 12.75l6 6 9-13.5\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M16.5 8.25V6a2.25 2.25 0 00-2.25-2.25H6A2.25 2.25 0 003.75 6v8.25A2.25 2.25 0 006 16.5h2.25m8.25-8.25H18a2.25 2.25 0 012.25 2.25V18A2.25 2.25 0 0118 20.25h-7.5A2.25 2.25 0 018.25 18v-1.5m8.25-8.25h-6a2.25 2.25 0 00-2.25 2.25v6\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #88C0D0\">claude<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">--model<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">qwen3.5:35b-a3b<\/span><\/span>\n<span class=\"line\"><span style=\"color: #88C0D0\">claude<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: 
#A3BE8C\">--model<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">glm-4.7-flash<\/span><\/span>\n<span class=\"line\"><span style=\"color: #88C0D0\">claude<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">--model<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">qwen3-coder:30b<\/span><\/span><\/code><\/pre><span style=\"display:flex;align-items:flex-end;padding:10px;width:100%;justify-content:flex-end;background-color:#2e3440ff;color:#c8d0e0;font-size:12px;line-height:1;position:relative\">Bash<\/span><\/div>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-step-5-test-it-on-a-real-task\">Step 5: Test It On a Real Task<\/h3>\n\n\n\n<p>Navigate to a project directory and give Claude Code a real job:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro padding-bottom-disabled cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(1 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>cd ~\/projects\/my-app\nclaude\n\n# Inside Claude Code, try:\n> Add input validation to the user registration endpoint. \n  Write tests for the validation logic. 
Run the tests and fix any failures.<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M4.5 12.75l6 6 9-13.5\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M16.5 8.25V6a2.25 2.25 0 00-2.25-2.25H6A2.25 2.25 0 003.75 6v8.25A2.25 2.25 0 006 16.5h2.25m8.25-8.25H18a2.25 2.25 0 012.25 2.25V18A2.25 2.25 0 0118 20.25h-7.5A2.25 2.25 0 018.25 18v-1.5m8.25-8.25h-6a2.25 2.25 0 00-2.25 2.25v6\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #88C0D0\">cd<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">~\/projects\/my-app<\/span><\/span>\n<span class=\"line\"><span style=\"color: #88C0D0\">claude<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #616E88\"># Inside Claude Code, try:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #81A1C1\">&gt;<\/span><span style=\"color: #D8DEE9FF\"> Add input validation to the user registration endpoint. 
<\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">  <\/span><span style=\"color: #88C0D0\">Write<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">tests<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">for<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">the<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">validation<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">logic.<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">Run<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">the<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">tests<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">and<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">fix<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">any<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #A3BE8C\">failures.<\/span><\/span><\/code><\/pre><span style=\"display:flex;align-items:flex-end;padding:10px;width:100%;justify-content:flex-end;background-color:#2e3440ff;color:#c8d0e0;font-size:12px;line-height:1;position:relative\">Bash<\/span><\/div>\n\n\n\n<p>Watch it work. Claude Code will read your codebase, plan the changes, edit files, run your test suite, and fix issues \u2014 all powered by your local model. No tokens leave your machine. No API bill at the end. \ud83d\ude80<\/p>\n\n\n\n<p><strong>If things feel slow:<\/strong> The first prompt after loading a model takes longer (cold start). Subsequent prompts are much faster. 
You can keep models loaded longer by setting <code>OLLAMA_KEEP_ALIVE=30m<\/code> (or <code>-1<\/code> for indefinitely).<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-make-local-llm-based-agentic-coding-feel-good\">Make Local LLM-Based Agentic Coding Feel Good<\/h2>\n\n\n\n<p>Getting the agent running is step one. Getting it to produce code you actually want to use? That&#8217;s the craft. Here&#8217;s what I&#8217;ve learned through hundreds of hours of local agentic coding.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-prompt-patterns-that-work\">Prompt Patterns That Work<\/h3>\n\n\n\n<p>Local models have smaller effective context windows than cloud models. Every wasted token costs you speed. These patterns make the most of what you have.<\/p>\n\n\n\n<p><strong>&#8220;Plan first, code second.&#8221;<\/strong> Instead of asking the agent to immediately start editing files, ask it to outline its approach. This catches wrong assumptions before they waste 200 lines of generation:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro padding-bottom-disabled cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(1 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>\u274c \"Add Redis caching to the user service\"\n\n\u2705 \"I need Redis caching for the user service.\n   Before writing code:\n   1. 
List your assumptions about the existing architecture\n   2. Describe the approach in 3-4 bullets\n   3. Then implement it\"<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M4.5 12.75l6 6 9-13.5\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M16.5 8.25V6a2.25 2.25 0 00-2.25-2.25H6A2.25 2.25 0 003.75 6v8.25A2.25 2.25 0 006 16.5h2.25m8.25-8.25H18a2.25 2.25 0 012.25 2.25V18A2.25 2.25 0 0118 20.25h-7.5A2.25 2.25 0 018.25 18v-1.5m8.25-8.25h-6a2.25 2.25 0 00-2.25 2.25v6\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #88C0D0\">\u274c<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">Add Redis caching to the user service<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #88C0D0\">\u2705<\/span><span style=\"color: #D8DEE9FF\"> <\/span><span style=\"color: #ECEFF4\">&quot;<\/span><span style=\"color: #A3BE8C\">I need Redis caching for the user service.<\/span><\/span>\n<span class=\"line\"><span style=\"color: #A3BE8C\">   Before writing code:<\/span><\/span>\n<span class=\"line\"><span style=\"color: #A3BE8C\">   1. List your assumptions about the existing architecture<\/span><\/span>\n<span class=\"line\"><span style=\"color: #A3BE8C\">   2. Describe the approach in 3-4 bullets<\/span><\/span>\n<span class=\"line\"><span style=\"color: #A3BE8C\">   3. 
Then implement it<\/span><span style=\"color: #ECEFF4\">&quot;<\/span><\/span><\/code><\/pre><span style=\"display:flex;align-items:flex-end;padding:10px;width:100%;justify-content:flex-end;background-color:#2e3440ff;color:#c8d0e0;font-size:12px;line-height:1;position:relative\">Markdown<\/span><\/div>\n\n\n\n<p><strong>Use a CLAUDE.md file.<\/strong> Drop a <code>CLAUDE.md<\/code> in your project root with context about your stack, conventions, and testing commands. Claude Code reads this automatically. This single file replaces a ton of repeated prompt context:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro padding-bottom-disabled cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(2 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly># CLAUDE.md\n## Stack\n- Python 3.12, FastAPI, SQLAlchemy 2.0, PostgreSQL\n- Tests: pytest, run with `pytest tests\/ -v`\n- Linting: `ruff check .`\n\n## Conventions\n- Type hints on all function signatures\n- Docstrings on public functions\n- Tests go in tests\/ mirroring src\/ structure\n\n## Important\n- Never modify alembic migration files\n- The auth middleware in src\/auth\/middleware.py is security-critical \u2014 ask before changing<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" 
stroke-linejoin=\"round\" d=\"M4.5 12.75l6 6 9-13.5\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M16.5 8.25V6a2.25 2.25 0 00-2.25-2.25H6A2.25 2.25 0 003.75 6v8.25A2.25 2.25 0 006 16.5h2.25m8.25-8.25H18a2.25 2.25 0 012.25 2.25V18A2.25 2.25 0 0118 20.25h-7.5A2.25 2.25 0 018.25 18v-1.5m8.25-8.25h-6a2.25 2.25 0 00-2.25 2.25v6\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #81A1C1\">#<\/span><span style=\"color: #88C0D0\"> CLAUDE.md<\/span><\/span>\n<span class=\"line\"><span style=\"color: #81A1C1\">##<\/span><span style=\"color: #88C0D0\"> Stack<\/span><\/span>\n<span class=\"line\"><span style=\"color: #ECEFF4\">-<\/span><span style=\"color: #D8DEE9FF\"> Python 3.12, FastAPI, SQLAlchemy 2.0, PostgreSQL<\/span><\/span>\n<span class=\"line\"><span style=\"color: #ECEFF4\">-<\/span><span style=\"color: #D8DEE9FF\"> Tests: pytest, run with <\/span><span style=\"color: #8FBCBB\">`pytest tests\/ -v`<\/span><\/span>\n<span class=\"line\"><span style=\"color: #ECEFF4\">-<\/span><span style=\"color: #D8DEE9FF\"> Linting: <\/span><span style=\"color: #8FBCBB\">`ruff check .`<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #81A1C1\">##<\/span><span style=\"color: #88C0D0\"> Conventions<\/span><\/span>\n<span class=\"line\"><span style=\"color: #ECEFF4\">-<\/span><span style=\"color: #D8DEE9FF\"> Type hints on all function signatures<\/span><\/span>\n<span class=\"line\"><span style=\"color: #ECEFF4\">-<\/span><span style=\"color: #D8DEE9FF\"> Docstrings on public functions<\/span><\/span>\n<span class=\"line\"><span style=\"color: #ECEFF4\">-<\/span><span style=\"color: #D8DEE9FF\"> Tests go in tests\/ mirroring src\/ structure<\/span><\/span>\n<span class=\"line\"><\/span>\n<span class=\"line\"><span style=\"color: #81A1C1\">##<\/span><span style=\"color: #88C0D0\"> 
Important<\/span><\/span>\n<span class=\"line\"><span style=\"color: #ECEFF4\">-<\/span><span style=\"color: #D8DEE9FF\"> Never modify alembic migration files<\/span><\/span>\n<span class=\"line\"><span style=\"color: #ECEFF4\">-<\/span><span style=\"color: #D8DEE9FF\"> The auth middleware in src\/auth\/middleware.py is security-critical \u2014 ask before changing<\/span><\/span><\/code><\/pre><span style=\"display:flex;align-items:flex-end;padding:10px;width:100%;justify-content:flex-end;background-color:#2e3440ff;color:#c8d0e0;font-size:12px;line-height:1;position:relative\">Markdown<\/span><\/div>\n\n\n\n<p><strong>Constrain the scope.<\/strong> Local models can wander. Be explicit about boundaries:<\/p>\n\n\n\n<div class=\"wp-block-kevinbatdorf-code-block-pro padding-bottom-disabled cbp-has-line-numbers\" data-code-block-pro-font-family=\"Code-Pro-JetBrains-Mono\" style=\"font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#d8dee9ff;--cbp-line-number-width:calc(1 * 0.6 * .875rem);line-height:1.25rem;--cbp-tab-width:2;tab-size:var(--cbp-tab-width, 2)\"><span role=\"button\" tabindex=\"0\" style=\"color:#d8dee9ff;display:none\" aria-label=\"Copy\" class=\"code-block-pro-copy-button\"><pre class=\"code-block-pro-copy-button-pre\" aria-hidden=\"true\"><textarea class=\"code-block-pro-copy-button-textarea\" tabindex=\"-1\" aria-hidden=\"true\" readonly>\u2705 \"Fix the failing test in tests\/test_users.py. \n   Only modify src\/users\/service.py and the test file itself. 
\n   Don't touch any other files.\"<\/textarea><\/pre><svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" style=\"width:24px;height:24px\" fill=\"none\" viewBox=\"0 0 24 24\" stroke=\"currentColor\" stroke-width=\"2\"><path class=\"with-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M4.5 12.75l6 6 9-13.5\"><\/path><path class=\"without-check\" stroke-linecap=\"round\" stroke-linejoin=\"round\" d=\"M16.5 8.25V6a2.25 2.25 0 00-2.25-2.25H6A2.25 2.25 0 003.75 6v8.25A2.25 2.25 0 006 16.5h2.25m8.25-8.25H18a2.25 2.25 0 012.25 2.25V18A2.25 2.25 0 0118 20.25h-7.5A2.25 2.25 0 018.25 18v-1.5m8.25-8.25h-6a2.25 2.25 0 00-2.25 2.25v6\"><\/path><\/svg><\/span><pre class=\"shiki nord\" style=\"background-color: #2e3440ff\" tabindex=\"0\"><code><span class=\"line\"><span style=\"color: #D8DEE9FF\">\u2705 &quot;Fix the failing test in tests\/test_users.py. <\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">   Only modify src\/users\/service.py and the test file itself. <\/span><\/span>\n<span class=\"line\"><span style=\"color: #D8DEE9FF\">   Don&#39;t touch any other files.&quot;<\/span><\/span><\/code><\/pre><span style=\"display:flex;align-items:flex-end;padding:10px;width:100%;justify-content:flex-end;background-color:#2e3440ff;color:#c8d0e0;font-size:12px;line-height:1;position:relative\">Markdown<\/span><\/div>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-quality-controls-your-safety-net\">Quality Controls (Your Safety Net)<\/h3>\n\n\n\n<p>Local models hallucinate. Smaller models hallucinate more than larger ones. These guardrails have saved me countless hours.<\/p>\n\n\n\n<p><strong>Always require tests.<\/strong> If the agent generates a function, it should also generate the test. If the test fails, it should fix the code. This feedback loop catches the majority of hallucinated APIs and wrong assumptions. 
Claude Code does this naturally when you ask \u2014 lean into it.<\/p>\n\n\n\n<p><strong>Keep your CLAUDE.md&#8217;s test command current.<\/strong> If Claude Code knows how to run your tests (<code>pytest<\/code>, <code>npm test<\/code>, <code>cargo test<\/code>), it&#8217;ll run them automatically after making changes and self-correct failures. This is the single most impactful quality improvement you can make.<\/p>\n\n\n\n<p><strong>Don&#8217;t trust, verify.<\/strong> Review every change before committing. I treat local LLM output the same way I treat a PR from a new team member \u2014 review it, test it, then merge it.<\/p>\n\n\n\n<p><strong>Watch for hallucinated imports.<\/strong> The most common failure mode: the model imports a function or library that doesn&#8217;t exist. Including your <code>package.json<\/code> or <code>requirements.txt<\/code> context (via CLAUDE.md or direct mention) reduces this significantly.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-troubleshooting\">Troubleshooting<\/h2>\n\n\n\n<p>I&#8217;ve hit every one of these problems. Here are the fixes.<\/p>\n\n\n\n<p><strong>Claude Code says &#8220;connection refused&#8221;:<\/strong> Ollama isn&#8217;t running. Start it with <code>ollama serve<\/code> or check that the Ollama app is open. Verify with <code>curl http:\/\/localhost:11434<\/code> \u2014 you should see &#8220;Ollama is running.&#8221;<\/p>\n\n\n\n<p><strong>Model is painfully slow or hangs:<\/strong> Your model is too large for your hardware. Run <code>ollama ps<\/code> to check memory usage. If the model is spilling to CPU (you&#8217;ll see partial GPU offload), try a smaller model. Also check that no other heavy processes are eating your VRAM (<code>nvidia-smi<\/code> on NVIDIA, Activity Monitor on Mac).<\/p>\n\n\n\n<p><strong>Claude Code errors with &#8220;model not found&#8221;:<\/strong> The model name must exactly match what Ollama has. 
Run <code>ollama list<\/code> and use the exact name shown. Common mistake: pulling <code>qwen3.5<\/code> but specifying <code>qwen3.5:35b-a3b<\/code> (or vice versa).<\/p>\n\n\n\n<p><strong>First prompt takes forever, then it&#8217;s fine:<\/strong> This is the cold start \u2014 model weights loading into memory. Set <code>OLLAMA_KEEP_ALIVE=30m<\/code> to keep the model loaded for 30 minutes between requests. For all-day coding, use <code>OLLAMA_KEEP_ALIVE=-1<\/code>.<\/p>\n\n\n\n<p><strong>Agent makes changes to wrong files or goes off-script:<\/strong> Your context window might be overflowing. Reduce the scope of your prompts, use a CLAUDE.md to establish boundaries, and break large tasks into smaller steps. Also try a model with better agentic training \u2014 Qwen3-Coder 30B was specifically RL-trained for multi-step coding tasks.<\/p>\n\n\n\n<p><strong>Hallucinated APIs \u2014 model suggests functions that don&#8217;t exist:<\/strong> Include your dependency files in context. Add to your CLAUDE.md: &#8220;Only use APIs from dependencies listed in package.json \/ requirements.txt.&#8221; Run type-checking (<code>tsc --noEmit<\/code>, <code>mypy<\/code>, <code>pyright<\/code>) on generated code as a filter.<\/p>\n\n\n\n<p><strong><code>ollama launch claude<\/code> returns &#8220;unknown command&#8221;:<\/strong> Your Ollama version is too old. Update Ollama \u2014 the <code>launch<\/code> command requires v0.14.0+. After updating, restart Ollama.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-privacy-and-security-checklist\">Privacy and Security Checklist<\/h2>\n\n\n\n<p>Running locally doesn&#8217;t automatically make you bulletproof. Here&#8217;s what to lock down.<\/p>\n\n\n\n<p><strong>Disable VS Code \/ editor telemetry.<\/strong> If you&#8217;re also using an IDE alongside Claude Code, many extensions phone home with usage data. 
Set <code>\"telemetry.telemetryLevel\": \"off\"<\/code> in VS Code settings.<\/p>\n\n\n\n<p><strong>Keep Ollama on localhost.<\/strong> Ollama binds to <code>127.0.0.1:11434<\/code> by default \u2014 that&#8217;s safe. If you change it to <code>0.0.0.0:11434<\/code> for remote access, anyone on your network can hit the API. Use a firewall or bind to a specific interface.<\/p>\n\n\n\n<p><strong>Disable nonessential traffic.<\/strong> Add <code>CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1<\/code> to your environment variables. This prevents Claude Code from sending any analytics or telemetry.<\/p>\n\n\n\n<p><strong>Enable disk encryption.<\/strong> Conversation history and model weights live on your disk. FileVault (Mac), LUKS (Linux), or BitLocker (Windows) protect them if your machine is lost or stolen.<\/p>\n\n\n\n<p><strong>Check your model license.<\/strong> Most models covered here (Qwen 3.5, GLM-4.7, Qwen3-Coder) use Apache 2.0 or similarly permissive licenses \u2014 free for commercial use, no strings attached. DeepSeek models have a revenue threshold. Always check the model card on Hugging Face before deploying in a commercial context.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-faq\">FAQ<\/h2>\n\n\n\n<div class=\"schema-faq wp-block-yoast-faq-block\"><div class=\"schema-faq-section\" id=\"faq-question-1773252309058\"><strong class=\"schema-faq-question\"><strong>What is the best local LLM for agentic coding in 2026?<\/strong><\/strong> <p class=\"schema-faq-answer\">For most developers: Qwen 3.5 35B-A3B. It balances strong agentic capabilities, long context (256K), tool calling support, and hardware efficiency (only 3B active parameters). On a 24GB NVIDIA GPU specifically, GLM-4.7-Flash is equally strong. 
Both run on Claude Code via Ollama without issues.<\/p> <\/div> <div class=\"schema-faq-section\" id=\"faq-question-1773252321980\"><strong class=\"schema-faq-question\"><strong>How much VRAM\/RAM do I actually need?<\/strong><\/strong> <p class=\"schema-faq-answer\">16GB is the minimum for a usable experience. 24GB VRAM or 32GB unified memory is the sweet spot where agentic coding starts to feel genuinely productive. Below 16GB, the models that support proper tool calling and long context won&#8217;t fit.<\/p> <\/div> <div class=\"schema-faq-section\" id=\"faq-question-1773252345429\"><strong class=\"schema-faq-question\"><strong>Can I use this without a GPU at all?<\/strong><\/strong> <p class=\"schema-faq-answer\">Agentic coding requires responsive multi-turn inference \u2014 the agent makes many sequential calls to the model during a single task. CPU-only inference is too slow for this workflow. You need a dedicated GPU or Apple Silicon unified memory.<\/p> <\/div> <div class=\"schema-faq-section\" id=\"faq-question-1773252354960\"><strong class=\"schema-faq-question\"><strong>Are local LLMs actually free?<\/strong><\/strong> <p class=\"schema-faq-answer\">Yes. Ollama is free, Claude Code is free to run against non-Anthropic endpoints, and the open-weight models are free to download and use. Your only costs are hardware (which you likely already own) and electricity.<\/p> <\/div> <div class=\"schema-faq-section\" id=\"faq-question-1773252367755\"><strong class=\"schema-faq-question\"><strong>How does this compare to using Claude Code with Anthropic&#8217;s API?<\/strong><\/strong> <p class=\"schema-faq-answer\">Anthropic&#8217;s cloud models (Opus 4.6, Sonnet 4.6) are still more capable on the hardest tasks. But for 80% of daily coding work \u2014 building features, fixing bugs, writing tests, scaffolding projects \u2014 a good local model is more than sufficient. 
And it&#8217;s free, private, and always available.<\/p> <\/div> <div class=\"schema-faq-section\" id=\"faq-question-1773252380854\"><strong class=\"schema-faq-question\"><strong>Can I switch between local and cloud models?<\/strong><\/strong> <p class=\"schema-faq-answer\">Yes. Remove the environment variables (or use a separate terminal profile) and Claude Code goes back to Anthropic&#8217;s API. Many developers use local for routine work and cloud for complex tasks. It&#8217;s a pragmatic setup.<\/p> <\/div> <\/div>\n","protected":false},"excerpt":{"rendered":"<p>This guide walks you through setting up a fully private, local LLM for coding on your own hardware. From model selection and hardware planning to agent setup with Ollama and Claude Code, you&#8217;ll build an AI coding stack where no code ever leaves your machine.<\/p>\n","protected":false},"author":1,"featured_media":59776,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_genesis_hide_title":false,"_genesis_hide_breadcrumbs":false,"_genesis_hide_singular_image":false,"_genesis_hide_footer_widgets":false,"_genesis_custom_body_class":"","_genesis_custom_post_class":"","_genesis_layout":"","jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[24],"tags":[3497,3472],"class_list":{"0":"post-59765","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-productivity","8":"tag-agent","9":"tag-ai","10":"entry"},"yoast_head":"<!-- 
This site is optimized with the Yoast SEO Premium plugin v27.4 (Yoast SEO v27.4) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>Local LLM for Coding: Free AI Coding Agent With Ollama + Claude - CodeSamplez.com<\/title>\n<meta name=\"description\" content=\"Learn how to set up a local LLM for coding with Ollama, Continue.dev, and VS Code. Covers model selection, hardware planning + IDE integration\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Local LLM for Coding: Free AI Coding Agent With Ollama + Claude\" \/>\n<meta property=\"og:description\" content=\"Learn how to set up a local LLM for coding with Ollama, Continue.dev, and VS Code. Covers model selection, hardware planning + IDE integration\" \/>\n<meta property=\"og:url\" content=\"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent\" \/>\n<meta property=\"og:site_name\" content=\"CodeSamplez.com\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/codesamplez\" \/>\n<meta property=\"article:author\" content=\"https:\/\/www.facebook.com\/ranacseruet\" \/>\n<meta property=\"article:published_time\" content=\"2026-03-11T18:33:32+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2026-03-11T18:33:34+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/codesamplez.com\/wp-content\/uploads\/2026\/03\/local-coding-agent.webp\" \/>\n\t<meta property=\"og:image:width\" content=\"1536\" \/>\n\t<meta property=\"og:image:height\" content=\"1024\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/webp\" \/>\n<meta name=\"author\" content=\"Rana Ahsan\" \/>\n<meta name=\"twitter:card\" 
content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@ranacseruet\" \/>\n<meta name=\"twitter:site\" content=\"@codesamplez\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Rana Ahsan\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"12 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":[\"Article\",\"BlogPosting\"],\"@id\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent\"},\"author\":{\"name\":\"Rana Ahsan\",\"@id\":\"https:\\\/\\\/codesamplez.com\\\/#\\\/schema\\\/person\\\/a82c3c07205f4bb73d6b3b0906bc328b\"},\"headline\":\"Local LLM for Coding: Free AI Coding Agent With Ollama + Claude\",\"datePublished\":\"2026-03-11T18:33:32+00:00\",\"dateModified\":\"2026-03-11T18:33:34+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent\"},\"wordCount\":2624,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\\\/\\\/codesamplez.com\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/codesamplez.com\\\/wp-content\\\/uploads\\\/2026\\\/03\\\/local-coding-agent.webp\",\"keywords\":[\"Agent\",\"AI\"],\"articleSection\":[\"Productivity\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#respond\"]}]},{\"@type\":[\"WebPage\",\"FAQPage\"],\"@id\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent\",\"url\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent\",\"name\":\"Local LLM 
for Coding: Free AI Coding Agent With Ollama + Claude - CodeSamplez.com\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/codesamplez.com\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/codesamplez.com\\\/wp-content\\\/uploads\\\/2026\\\/03\\\/local-coding-agent.webp\",\"datePublished\":\"2026-03-11T18:33:32+00:00\",\"dateModified\":\"2026-03-11T18:33:34+00:00\",\"description\":\"Learn how to set up a local LLM for coding with Ollama, Continue.dev, and VS Code. Covers model selection, hardware planning + IDE integration\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#breadcrumb\"},\"mainEntity\":[{\"@id\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#faq-question-1773052601299\"},{\"@id\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#faq-question-1773252309058\"},{\"@id\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#faq-question-1773252321980\"},{\"@id\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#faq-question-1773252345429\"},{\"@id\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#faq-question-1773252354960\"},{\"@id\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#faq-question-1773252367755\"},{\"@id\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#faq-question-1773252380854\"}],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#primaryimage\",\"url\":\"https:\\\/\\\/codesa
mplez.com\\\/wp-content\\\/uploads\\\/2026\\\/03\\\/local-coding-agent.webp\",\"contentUrl\":\"https:\\\/\\\/codesamplez.com\\\/wp-content\\\/uploads\\\/2026\\\/03\\\/local-coding-agent.webp\",\"width\":1536,\"height\":1024,\"caption\":\"Local Coding Agent\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/codesamplez.com\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Local LLM for Coding: Free AI Coding Agent With Ollama + Claude\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/codesamplez.com\\\/#website\",\"url\":\"https:\\\/\\\/codesamplez.com\\\/\",\"name\":\"CODESAMPLEZ.COM\",\"description\":\"Programming And Development Resources\",\"publisher\":{\"@id\":\"https:\\\/\\\/codesamplez.com\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/codesamplez.com\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/codesamplez.com\\\/#organization\",\"name\":\"codesamplez.com\",\"url\":\"https:\\\/\\\/codesamplez.com\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/codesamplez.com\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/codesamplez.com\\\/wp-content\\\/uploads\\\/2024\\\/10\\\/cropped-favicon.webp\",\"contentUrl\":\"https:\\\/\\\/codesamplez.com\\\/wp-content\\\/uploads\\\/2024\\\/10\\\/cropped-favicon.webp\",\"width\":512,\"height\":512,\"caption\":\"codesamplez.com\"},\"image\":{\"@id\":\"https:\\\/\\\/codesamplez.com\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/www.facebook.com\\\/codesamplez\",\"https:\\\/\\\/x.com\\\/codesamplez\"]},{\
"@type\":\"Person\",\"@id\":\"https:\\\/\\\/codesamplez.com\\\/#\\\/schema\\\/person\\\/a82c3c07205f4bb73d6b3b0906bc328b\",\"name\":\"Rana Ahsan\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/5c7a4f88bcf4a55cd1483386318ebecf27359154275a0b355b0ea186676f9f7f?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/5c7a4f88bcf4a55cd1483386318ebecf27359154275a0b355b0ea186676f9f7f?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/5c7a4f88bcf4a55cd1483386318ebecf27359154275a0b355b0ea186676f9f7f?s=96&d=mm&r=g\",\"caption\":\"Rana Ahsan\"},\"description\":\"Rana Ahsan is a seasoned software engineer and technology leader specialized in distributed systems and software architecture. With a Master\u2019s in Software Engineering from Concordia University, his experience spans leading scalable architecture at Coursera and TopHat, contributing to open-source projects. This blog, CodeSamplez.com, showcases his passion for sharing practical insights on programming and distributed systems concepts and help educate others. 
Github | X | LinkedIn\",\"sameAs\":[\"https:\\\/\\\/github.com\\\/ranacseruet\",\"https:\\\/\\\/www.facebook.com\\\/ranacseruet\",\"https:\\\/\\\/www.linkedin.com\\\/in\\\/ranacseruet\\\/\",\"https:\\\/\\\/x.com\\\/ranacseruet\"],\"url\":\"https:\\\/\\\/codesamplez.com\\\/author\\\/admin\"},{\"@type\":\"Question\",\"@id\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#faq-question-1773052601299\",\"position\":1,\"url\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#faq-question-1773052601299\",\"name\":\"What is a local LLM?\",\"answerCount\":1,\"acceptedAnswer\":{\"@type\":\"Answer\",\"text\":\"A local LLM for coding is a large language model running entirely on your own hardware \u2014 laptop, workstation, or home server \u2014 that provides AI-powered code generation, refactoring, and agentic multi-file editing without sending a single line of code to the cloud. It matters because it gives you complete privacy, zero recurring costs, and AI coding assistance that works even without internet.\",\"inLanguage\":\"en-US\"},\"inLanguage\":\"en-US\"},{\"@type\":\"Question\",\"@id\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#faq-question-1773252309058\",\"position\":2,\"url\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#faq-question-1773252309058\",\"name\":\"What is the best local LLM for agentic coding in 2026?\",\"answerCount\":1,\"acceptedAnswer\":{\"@type\":\"Answer\",\"text\":\"For most developers: Qwen 3.5 35B-A3B. It balances strong agentic capabilities, long context (256K), tool calling support, and hardware efficiency (only 3B active parameters). On a 24GB NVIDIA GPU specifically, GLM-4.7-Flash is equally strong. 
Both run on Claude Code via Ollama without issues.\",\"inLanguage\":\"en-US\"},\"inLanguage\":\"en-US\"},{\"@type\":\"Question\",\"@id\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#faq-question-1773252321980\",\"position\":3,\"url\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#faq-question-1773252321980\",\"name\":\"How much VRAM\\\/RAM do I actually need?\",\"answerCount\":1,\"acceptedAnswer\":{\"@type\":\"Answer\",\"text\":\"16GB is the minimum for a usable experience. 24GB VRAM or 32GB unified memory is the sweet spot where agentic coding starts to feel genuinely productive. Below 16GB, the models that support proper tool calling and long context won't fit.\",\"inLanguage\":\"en-US\"},\"inLanguage\":\"en-US\"},{\"@type\":\"Question\",\"@id\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#faq-question-1773252345429\",\"position\":4,\"url\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#faq-question-1773252345429\",\"name\":\"Can I use this without a GPU at all?\",\"answerCount\":1,\"acceptedAnswer\":{\"@type\":\"Answer\",\"text\":\"Agentic coding requires responsive multi-turn inference \u2014 the agent makes many sequential calls to the model during a single task. CPU-only inference is too slow for this workflow. You need a dedicated GPU or Apple Silicon unified memory.\",\"inLanguage\":\"en-US\"},\"inLanguage\":\"en-US\"},{\"@type\":\"Question\",\"@id\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#faq-question-1773252354960\",\"position\":5,\"url\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#faq-question-1773252354960\",\"name\":\"Are local LLMs actually free?\",\"answerCount\":1,\"acceptedAnswer\":{\"@type\":\"Answer\",\"text\":\"Yes. Ollama is free, Claude Code is free to run against non-Anthropic endpoints, and the open-weight models are free to download and use. 
Your only costs are hardware (which you likely already own) and electricity.\",\"inLanguage\":\"en-US\"},\"inLanguage\":\"en-US\"},{\"@type\":\"Question\",\"@id\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#faq-question-1773252367755\",\"position\":6,\"url\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#faq-question-1773252367755\",\"name\":\"How does this compare to using Claude Code with Anthropic's API?\",\"answerCount\":1,\"acceptedAnswer\":{\"@type\":\"Answer\",\"text\":\"Anthropic's cloud models (Opus 4.6, Sonnet 4.6) are still more capable on the hardest tasks. But for 80% of daily coding work \u2014 building features, fixing bugs, writing tests, scaffolding projects \u2014 a good local model is more than sufficient. And it's free, private, and always available.\",\"inLanguage\":\"en-US\"},\"inLanguage\":\"en-US\"},{\"@type\":\"Question\",\"@id\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#faq-question-1773252380854\",\"position\":7,\"url\":\"https:\\\/\\\/codesamplez.com\\\/productivity\\\/local-ai-coding-agent#faq-question-1773252380854\",\"name\":\"Can I switch between local and cloud models?\",\"answerCount\":1,\"acceptedAnswer\":{\"@type\":\"Answer\",\"text\":\"Yes. Remove the environment variables (or use a separate terminal profile) and Claude Code goes back to Anthropic's API. Many developers use local for routine work and cloud for complex tasks. It's a pragmatic setup.\",\"inLanguage\":\"en-US\"},\"inLanguage\":\"en-US\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Local LLM for Coding: Free AI Coding Agent With Ollama + Claude - CodeSamplez.com","description":"Learn how to set up a local LLM for coding with Ollama, Continue.dev, and VS Code. 
Covers model selection, hardware planning + IDE integration","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent","og_locale":"en_US","og_type":"article","og_title":"Local LLM for Coding: Free AI Coding Agent With Ollama + Claude","og_description":"Learn how to set up a local LLM for coding with Ollama, Continue.dev, and VS Code. Covers model selection, hardware planning + IDE integration","og_url":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent","og_site_name":"CodeSamplez.com","article_publisher":"https:\/\/www.facebook.com\/codesamplez","article_author":"https:\/\/www.facebook.com\/ranacseruet","article_published_time":"2026-03-11T18:33:32+00:00","article_modified_time":"2026-03-11T18:33:34+00:00","og_image":[{"width":1536,"height":1024,"url":"https:\/\/codesamplez.com\/wp-content\/uploads\/2026\/03\/local-coding-agent.webp","type":"image\/webp"}],"author":"Rana Ahsan","twitter_card":"summary_large_image","twitter_creator":"@ranacseruet","twitter_site":"@codesamplez","twitter_misc":{"Written by":"Rana Ahsan","Est. 
reading time":"12 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":["Article","BlogPosting"],"@id":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#article","isPartOf":{"@id":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent"},"author":{"name":"Rana Ahsan","@id":"https:\/\/codesamplez.com\/#\/schema\/person\/a82c3c07205f4bb73d6b3b0906bc328b"},"headline":"Local LLM for Coding: Free AI Coding Agent With Ollama + Claude","datePublished":"2026-03-11T18:33:32+00:00","dateModified":"2026-03-11T18:33:34+00:00","mainEntityOfPage":{"@id":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent"},"wordCount":2624,"commentCount":0,"publisher":{"@id":"https:\/\/codesamplez.com\/#organization"},"image":{"@id":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#primaryimage"},"thumbnailUrl":"https:\/\/codesamplez.com\/wp-content\/uploads\/2026\/03\/local-coding-agent.webp","keywords":["Agent","AI"],"articleSection":["Productivity"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#respond"]}]},{"@type":["WebPage","FAQPage"],"@id":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent","url":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent","name":"Local LLM for Coding: Free AI Coding Agent With Ollama + Claude - CodeSamplez.com","isPartOf":{"@id":"https:\/\/codesamplez.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#primaryimage"},"image":{"@id":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#primaryimage"},"thumbnailUrl":"https:\/\/codesamplez.com\/wp-content\/uploads\/2026\/03\/local-coding-agent.webp","datePublished":"2026-03-11T18:33:32+00:00","dateModified":"2026-03-11T18:33:34+00:00","description":"Learn how to set up a local LLM for coding with Ollama, Continue.dev, and VS Code. 
Covers model selection, hardware planning + IDE integration","breadcrumb":{"@id":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#breadcrumb"},"mainEntity":[{"@id":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#faq-question-1773052601299"},{"@id":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#faq-question-1773252309058"},{"@id":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#faq-question-1773252321980"},{"@id":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#faq-question-1773252345429"},{"@id":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#faq-question-1773252354960"},{"@id":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#faq-question-1773252367755"},{"@id":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#faq-question-1773252380854"}],"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#primaryimage","url":"https:\/\/codesamplez.com\/wp-content\/uploads\/2026\/03\/local-coding-agent.webp","contentUrl":"https:\/\/codesamplez.com\/wp-content\/uploads\/2026\/03\/local-coding-agent.webp","width":1536,"height":1024,"caption":"Local Coding Agent"},{"@type":"BreadcrumbList","@id":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/codesamplez.com\/"},{"@type":"ListItem","position":2,"name":"Local LLM for Coding: Free AI Coding Agent With Ollama + Claude"}]},{"@type":"WebSite","@id":"https:\/\/codesamplez.com\/#website","url":"https:\/\/codesamplez.com\/","name":"CODESAMPLEZ.COM","description":"Programming And Development 
Resources","publisher":{"@id":"https:\/\/codesamplez.com\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/codesamplez.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/codesamplez.com\/#organization","name":"codesamplez.com","url":"https:\/\/codesamplez.com\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/codesamplez.com\/#\/schema\/logo\/image\/","url":"https:\/\/codesamplez.com\/wp-content\/uploads\/2024\/10\/cropped-favicon.webp","contentUrl":"https:\/\/codesamplez.com\/wp-content\/uploads\/2024\/10\/cropped-favicon.webp","width":512,"height":512,"caption":"codesamplez.com"},"image":{"@id":"https:\/\/codesamplez.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/codesamplez","https:\/\/x.com\/codesamplez"]},{"@type":"Person","@id":"https:\/\/codesamplez.com\/#\/schema\/person\/a82c3c07205f4bb73d6b3b0906bc328b","name":"Rana Ahsan","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/5c7a4f88bcf4a55cd1483386318ebecf27359154275a0b355b0ea186676f9f7f?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/5c7a4f88bcf4a55cd1483386318ebecf27359154275a0b355b0ea186676f9f7f?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/5c7a4f88bcf4a55cd1483386318ebecf27359154275a0b355b0ea186676f9f7f?s=96&d=mm&r=g","caption":"Rana Ahsan"},"description":"Rana Ahsan is a seasoned software engineer and technology leader specialized in distributed systems and software architecture. With a Master\u2019s in Software Engineering from Concordia University, his experience spans leading scalable architecture at Coursera and TopHat, contributing to open-source projects. 
This blog, CodeSamplez.com, showcases his passion for sharing practical insights on programming and distributed systems concepts and help educate others. Github | X | LinkedIn","sameAs":["https:\/\/github.com\/ranacseruet","https:\/\/www.facebook.com\/ranacseruet","https:\/\/www.linkedin.com\/in\/ranacseruet\/","https:\/\/x.com\/ranacseruet"],"url":"https:\/\/codesamplez.com\/author\/admin"},{"@type":"Question","@id":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#faq-question-1773052601299","position":1,"url":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#faq-question-1773052601299","name":"What is a local LLM?","answerCount":1,"acceptedAnswer":{"@type":"Answer","text":"A local LLM for coding is a large language model running entirely on your own hardware \u2014 laptop, workstation, or home server \u2014 that provides AI-powered code generation, refactoring, and agentic multi-file editing without sending a single line of code to the cloud. It matters because it gives you complete privacy, zero recurring costs, and AI coding assistance that works even without internet.","inLanguage":"en-US"},"inLanguage":"en-US"},{"@type":"Question","@id":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#faq-question-1773252309058","position":2,"url":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#faq-question-1773252309058","name":"What is the best local LLM for agentic coding in 2026?","answerCount":1,"acceptedAnswer":{"@type":"Answer","text":"For most developers: Qwen 3.5 35B-A3B. It balances strong agentic capabilities, long context (256K), tool calling support, and hardware efficiency (only 3B active parameters). On a 24GB NVIDIA GPU specifically, GLM-4.7-Flash is equally strong. 
Both run on Claude Code via Ollama without issues.","inLanguage":"en-US"},"inLanguage":"en-US"},{"@type":"Question","@id":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#faq-question-1773252321980","position":3,"url":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#faq-question-1773252321980","name":"How much VRAM\/RAM do I actually need?","answerCount":1,"acceptedAnswer":{"@type":"Answer","text":"16GB is the minimum for a usable experience. 24GB VRAM or 32GB unified memory is the sweet spot where agentic coding starts to feel genuinely productive. Below 16GB, the models that support proper tool calling and long context won't fit.","inLanguage":"en-US"},"inLanguage":"en-US"},{"@type":"Question","@id":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#faq-question-1773252345429","position":4,"url":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#faq-question-1773252345429","name":"Can I use this without a GPU at all?","answerCount":1,"acceptedAnswer":{"@type":"Answer","text":"Agentic coding requires responsive multi-turn inference \u2014 the agent makes many sequential calls to the model during a single task. CPU-only inference is too slow for this workflow. You need a dedicated GPU or Apple Silicon unified memory.","inLanguage":"en-US"},"inLanguage":"en-US"},{"@type":"Question","@id":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#faq-question-1773252354960","position":5,"url":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#faq-question-1773252354960","name":"Are local LLMs actually free?","answerCount":1,"acceptedAnswer":{"@type":"Answer","text":"Yes. Ollama is free, Claude Code is free to run against non-Anthropic endpoints, and the open-weight models are free to download and use. 
Your only costs are hardware (which you likely already own) and electricity.","inLanguage":"en-US"},"inLanguage":"en-US"},{"@type":"Question","@id":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#faq-question-1773252367755","position":6,"url":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#faq-question-1773252367755","name":"How does this compare to using Claude Code with Anthropic's API?","answerCount":1,"acceptedAnswer":{"@type":"Answer","text":"Anthropic's cloud models (Opus 4.6, Sonnet 4.6) are still more capable on the hardest tasks. But for 80% of daily coding work \u2014 building features, fixing bugs, writing tests, scaffolding projects \u2014 a good local model is more than sufficient. And it's free, private, and always available.","inLanguage":"en-US"},"inLanguage":"en-US"},{"@type":"Question","@id":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#faq-question-1773252380854","position":7,"url":"https:\/\/codesamplez.com\/productivity\/local-ai-coding-agent#faq-question-1773252380854","name":"Can I switch between local and cloud models?","answerCount":1,"acceptedAnswer":{"@type":"Answer","text":"Yes. Remove the environment variables (or use a separate terminal profile) and Claude Code goes back to Anthropic's API. Many developers use local for routine work and cloud for complex tasks. 