{"id":309,"date":"2025-08-19T16:22:32","date_gmt":"2025-08-19T16:22:32","guid":{"rendered":"https:\/\/trydecoding.com\/?page_id=309"},"modified":"2026-02-26T17:22:31","modified_gmt":"2026-02-26T17:22:31","slug":"llmstxt","status":"publish","type":"page","link":"https:\/\/trydecoding.com\/llmstxt\/","title":{"rendered":"Free llms.txt generator"},"content":{"rendered":"\n<p>Optimize how Gen AI and Language Models (LLMs) understand the content of your website with Decoding&#8217;s new free tool, the LLMs.txt Generator, which allows you to add the most important content of your website and download a structured text file. <\/p>\n\n\n\n<p>The llms.txt file is a proposed web standard, not official or required, but might help LLMs process websites&#8217; content with more context.<\/p>\n\n\n\n<section id=\"llms-txt-tool\">\n  <div>\n    <label for=\"defaultPolicy\"><strong>Default policy<\/strong><\/label>\n    <select id=\"defaultPolicy\">\n      <option value=\"allow\">Allow all LLM usage<\/option>\n      <option value=\"disallow\" selected>Disallow all LLM usage<\/option>\n      <option value=\"custom\">Custom (use rules below)<\/option>\n    <\/select>\n  <\/div>\n\n  <div>\n    <label for=\"allowPaths\"><strong>Allow paths<\/strong> (one per line)<\/label><br>\n    <textarea id=\"allowPaths\" rows=\"3\" placeholder=\"\/public\n\/press-kit\"><\/textarea>\n  <\/div>\n\n  <div>\n    <label for=\"disallowPaths\"><strong>Disallow paths<\/strong> (one per line)<\/label><br>\n    <textarea id=\"disallowPaths\" rows=\"5\" placeholder=\"\/private\n\/drafts\n\/api\/\"><\/textarea>\n  <\/div>\n\n  <div>\n    <label for=\"crawlDelay\"><strong>Crawl-Delay<\/strong> (seconds, optional)<\/label>\n    <input id=\"crawlDelay\" type=\"number\" min=\"0\" step=\"1\" placeholder=\"10\" \/>\n  <\/div>\n\n  <fieldset>\n    <legend><strong>Block specific AI crawlers (via dedicated sections)<\/strong><\/legend>\n    <label><input type=\"checkbox\" class=\"bot\" value=\"GPTBot\" checked> OpenAI \u2014 GPTBot<\/label><br>\n    <label><input type=\"checkbox\" class=\"bot\" value=\"Google-Extended\" checked> Google \u2014 Google-Extended<\/label><br>\n    <label><input type=\"checkbox\" class=\"bot\" value=\"ClaudeBot\"> Anthropic \u2014 ClaudeBot<\/label><br>\n    <label><input type=\"checkbox\" class=\"bot\" value=\"PerplexityBot\"> Perplexity \u2014 PerplexityBot<\/label><br>\n    <label><input type=\"checkbox\" class=\"bot\" value=\"CCBot\"> Common Crawl \u2014 CCBot<\/label><br>\n    <label><input type=\"checkbox\" class=\"bot\" value=\"Amazonbot\"> Amazon \u2014 Amazonbot<\/label><br>\n    <label><input type=\"checkbox\" class=\"bot\" value=\"Applebot-Extended\"> Apple \u2014 Applebot-Extended<\/label><br>\n    <label><input type=\"checkbox\" class=\"bot\" value=\"Bytespider\"> ByteDance \u2014 Bytespider<\/label><br>\n    <label><input type=\"checkbox\" class=\"bot\" value=\"DataForSeoBot\"> DataForSEO \u2014 DataForSeoBot<\/label>\n  <\/fieldset>\n\n  <div>\n    <label for=\"sitemaps\"><strong>Sitemaps<\/strong> (full URLs, one per line)<\/label><br>\n    <textarea id=\"sitemaps\" rows=\"2\" placeholder=\"https:\/\/example.com\/sitemap.xml\"><\/textarea>\n  <\/div>\n\n  <div>\n    <label for=\"contact\"><strong>Contact<\/strong> (email or URL, optional)<\/label>\n    <input id=\"contact\" type=\"text\" placeholder=\"legal@example.com or https:\/\/example.com\/contact\" \/>\n  <\/div>\n\n  <div>\n    <label for=\"license\"><strong>License (optional)<\/strong> \u2014 a URL stating your usage terms<\/label>\n    <input id=\"license\" type=\"text\" placeholder=\"https:\/\/example.com\/content-license\" \/>\n  <\/div>\n\n  <div>\n    <label for=\"notes\"><strong>Notes (comment only, optional)<\/strong><\/label><br>\n    <textarea id=\"notes\" rows=\"2\" placeholder=\"Any additional info you want to include as comments.\"><\/textarea>\n  <\/div>\n\n  <div>\n    <button id=\"copyBtn\">Copy<\/button>\n    <button id=\"downloadBtn\">Download llms.txt<\/button>\n    <button id=\"resetBtn\" type=\"button\">Reset<\/button>\n  <\/div>\n\n  <div>\n    <label for=\"output\"><strong>Generated llms.txt<\/strong><\/label><br>\n    <textarea id=\"output\" rows=\"18\" style=\"width: 100%;height: 229px;\" readonly><\/textarea>\n  <\/div>\n<\/section>\n\n<script>\n\/*\n  Simple llms.txt Generator\n  - This is an *experimental* helper for publishing your AI\/LLM usage preferences.\n  - It emits a plain-text, robots.txt-like policy in a file named \"llms.txt\".\n  - Because the ecosystem is evolving, also consider mirroring critical rules in robots.txt\n    under the appropriate User-agent blocks for broader crawler compliance.\n*\/\n\n(function () {\n  const els = {\n    policy: document.getElementById('defaultPolicy'),\n    allow: document.getElementById('allowPaths'),\n    disallow: document.getElementById('disallowPaths'),\n    delay: document.getElementById('crawlDelay'),\n    bots: Array.from(document.querySelectorAll('.bot')),\n    sitemaps: document.getElementById('sitemaps'),\n    contact: document.getElementById('contact'),\n    license: document.getElementById('license'),\n    notes: document.getElementById('notes'),\n    output: document.getElementById('output'),\n    copy: document.getElementById('copyBtn'),\n    download: document.getElementById('downloadBtn'),\n    reset: document.getElementById('resetBtn')\n  };\n\n  function nowISODate() {\n    const d = new Date();\n    const y = d.getFullYear();\n    const m = String(d.getMonth() + 1).padStart(2, '0');\n    const day = String(d.getDate()).padStart(2, '0');\n    return `${y}-${m}-${day}`;\n  }\n\n  function cleanLines(text) {\n    return (text || '')\n      .split('\\n')\n      .map(s => s.trim())\n      .filter(Boolean);\n  }\n\n  function normalizePath(p) {\n    if (!p) return '';\n    \/\/ Ensure starts with a single leading slash\n    if (!p.startsWith('\/')) p = '\/' + p;\n    \/\/ Collapse multiple slashes\n    p = p.replace(\/\\\/{2,}\/g, '\/');\n    return p;\n  }\n\n  function buildLlmsTxt() {\n    const lines = [];\n    lines.push(`# llms.txt \u2014 generated ${nowISODate()}`);\n    lines.push(`# This file expresses your preferences for AI\/LLM crawling & training.`);\n    lines.push(`# NOTE: Adoption varies across providers. Consider mirroring key rules in robots.txt.`);\n    lines.push('');\n\n    const notes = els.notes.value.trim();\n    if (notes) {\n      lines.push(`# Notes: ${notes}`);\n      lines.push('');\n    }\n\n    const contact = els.contact.value.trim();\n    if (contact) lines.push(`Contact: ${contact}`);\n\n    const license = els.license.value.trim();\n    if (license) lines.push(`License: ${license}`);\n\n    const sitemapLines = cleanLines(els.sitemaps.value).filter(u => \/^https?:\\\/\\\/\/i.test(u));\n    sitemapLines.forEach(u => lines.push(`Sitemap: ${u}`));\n    if (sitemapLines.length) lines.push('');\n\n    \/\/ Default policy\n    const policy = els.policy.value; \/\/ allow | disallow | custom\n    if (policy === 'allow') {\n      lines.push(`Policy: allow`);\n    } else if (policy === 'disallow') {\n      lines.push(`Policy: disallow`);\n    } else {\n      lines.push(`Policy: custom`);\n    }\n\n    \/\/ Global section (User-agent: *)\n    lines.push('');\n    lines.push(`User-agent: *`);\n\n    const allowPaths = cleanLines(els.allow.value).map(normalizePath);\n    const disallowPaths = cleanLines(els.disallow.value).map(normalizePath);\n\n    \/\/ If default policy is \"allow\", make disallows explicit; if \"disallow\", make allows explicit.\n    if (policy === 'allow') {\n      if (disallowPaths.length) {\n        disallowPaths.forEach(p => lines.push(`Disallow: ${p}`));\n      } else {\n        lines.push(`Disallow:`);\n      }\n      if (allowPaths.length) allowPaths.forEach(p => lines.push(`Allow: ${p}`));\n    } else if (policy === 'disallow') {\n      if (allowPaths.length) {\n        allowPaths.forEach(p => lines.push(`Allow: ${p}`));\n      }\n      \/\/ Explicit full-site disallow to be clear:\n      if (!disallowPaths.length) lines.push(`Disallow: \/`);\n      else disallowPaths.forEach(p => lines.push(`Disallow: ${p}`));\n    } else {\n      \/\/ custom: emit both as provided\n      if (allowPaths.length) allowPaths.forEach(p => lines.push(`Allow: ${p}`));\n      if (disallowPaths.length) disallowPaths.forEach(p => lines.push(`Disallow: ${p}`));\n      if (!allowPaths.length && !disallowPaths.length) lines.push(`Disallow:`); \/\/ empty directive is allowed\n    }\n\n    const delay = els.delay.value.trim();\n    if (delay !== '' && Number(delay) >= 0) {\n      lines.push(`Crawl-Delay: ${Number(delay)}`);\n    }\n\n    \/\/ Dedicated blocks for selected known AI crawlers\n    const selectedBots = els.bots.filter(b => b.checked).map(b => b.value);\n    if (selectedBots.length) {\n      lines.push('');\n      lines.push(`# Dedicated sections for specific AI crawlers`);\n      selectedBots.forEach(ua => {\n        lines.push('');\n        lines.push(`User-agent: ${ua}`);\n        \/\/ By default, align with site-wide policy. If disallow, block entirely; else inherit but still show intent.\n        if (policy === 'disallow') {\n          lines.push(`Disallow: \/`);\n        } else if (policy === 'allow') {\n          lines.push(`Allow: \/`);\n        } else {\n          \/\/ custom: echo both lists for clarity\n          if (allowPaths.length) allowPaths.forEach(p => lines.push(`Allow: ${p}`));\n          if (disallowPaths.length) disallowPaths.forEach(p => lines.push(`Disallow: ${p}`));\n          if (!allowPaths.length && !disallowPaths.length) lines.push(`Allow: \/`);\n        }\n        if (delay !== '' && Number(delay) >= 0) {\n          lines.push(`Crawl-Delay: ${Number(delay)}`);\n        }\n      });\n    }\n\n    \/\/ Final helpful comment\n    lines.push('');\n    lines.push(`# End of file`);\n\n    return lines.join('\\n');\n  }\n\n  function render() {\n    els.output.value = buildLlmsTxt();\n  }\n\n  \/\/ Event wiring\n  [\n    els.policy, els.allow, els.disallow, els.delay,\n    els.sitemaps, els.contact, els.license, els.notes\n  ].forEach(el => el.addEventListener('input', render));\n\n  els.bots.forEach(cb => cb.addEventListener('change', render));\n\n  els.copy.addEventListener('click', async function () {\n    const text = els.output.value;\n    try {\n      await navigator.clipboard.writeText(text);\n      this.textContent = 'Copied!';\n      setTimeout(() => (this.textContent = 'Copy'), 1200);\n    } catch {\n      \/\/ Fallback\n      els.output.select();\n      document.execCommand('copy');\n    }\n  });\n\n  els.download.addEventListener('click', function () {\n    const blob = new Blob([els.output.value], { type: 'text\/plain;charset=utf-8' });\n    const url = URL.createObjectURL(blob);\n    const a = document.createElement('a');\n    a.href = url;\n    a.download = 'llms.txt';\n    document.body.appendChild(a);\n    a.click();\n    URL.revokeObjectURL(url);\n    a.remove();\n  });\n\n  els.reset.addEventListener('click', function () {\n    if (!confirm('Clear all inputs and reset to defaults?')) return;\n    els.policy.value = 'disallow';\n    els.allow.value = '';\n    els.disallow.value = '\/private\\n\/drafts\\n\/api\/';\n    els.delay.value = '';\n    els.sitemaps.value = '';\n    els.contact.value = '';\n    els.license.value = '';\n    els.notes.value = '';\n    els.bots.forEach((cb) => {\n      cb.checked = (cb.value === 'GPTBot' || cb.value === 'Google-Extended');\n    });\n    render();\n  });\n\n  \/\/ Initial seed\n  els.disallow.value = '\/private\\n\/drafts\\n\/api\/';\n  render();\n})();\n<\/script>\n\n\n\n<h2 class=\"wp-block-heading\">Instructions<\/h2>\n\n\n\n<p>Here\u2019s how to use the <strong>llms.txt Generator<\/strong> on the page:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Pick a default policy<\/strong>\n<ul class=\"wp-block-list\">\n<li><strong>Allow all<\/strong>: You\u2019re fine with AI\/LLM crawlers using your content.<\/li>\n\n\n\n<li><strong>Disallow all<\/strong> (default): You don\u2019t want AI\/LLM crawlers to use your content.<\/li>\n\n\n\n<li><strong>Custom<\/strong>: You\u2019ll specify exactly which parts are allowed\/disallowed below.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Set path rules (one per line)<\/strong>\n<ul class=\"wp-block-list\">\n<li><strong>Allow paths<\/strong>: Enter folders or pages you want to allow (e.g., <code>\/public<\/code>, <code>\/press-kit<\/code>).<\/li>\n\n\n\n<li><strong>Disallow paths<\/strong>: Enter folders or pages you want to block (e.g., <code>\/private<\/code>, <code>\/drafts<\/code>, <code>\/api\/<\/code>).<\/li>\n\n\n\n<li>Tips:\n<ul class=\"wp-block-list\">\n<li>Paths should start with a <code>\/<\/code> (root-relative).<\/li>\n\n\n\n<li>A trailing <code>\/<\/code> usually means \u201cthis folder and everything inside it.\u201d<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>(Optional) Set Crawl-Delay<\/strong><br>Enter a number of seconds to slow compliant crawlers (e.g., <code>10<\/code>). Leave blank for no delay.<\/li>\n\n\n\n<li><strong>Choose specific AI crawlers to target<\/strong><br>Tick any bots you want dedicated sections for (e.g., <strong>GPTBot<\/strong>, <strong>Google-Extended<\/strong>, <strong>ClaudeBot<\/strong>, <strong>PerplexityBot<\/strong>).\n<ul class=\"wp-block-list\">\n<li>If your <strong>default policy<\/strong> is <strong>Disallow<\/strong>, selected bots get <code>Disallow: \/<\/code>.<\/li>\n\n\n\n<li>If <strong>Allow<\/strong>, selected bots get <code>Allow: \/<\/code>.<\/li>\n\n\n\n<li>If <strong>Custom<\/strong>, they mirror your path rules.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>(Optional) Add Sitemaps<\/strong><br>Paste full URLs, one per line (e.g., <code>https:\/\/example.com\/sitemap.xml<\/code>).<\/li>\n\n\n\n<li><strong>(Optional) Add Contact &amp; License<\/strong>\n<ul class=\"wp-block-list\">\n<li><strong>Contact<\/strong>: Email or URL where people can reach you.<\/li>\n\n\n\n<li><strong>License<\/strong>: A URL describing your content usage terms.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>(Optional) Notes<\/strong><br>Add any comments (they\u2019ll appear as <code>#<\/code> comments at the top of the file).<\/li>\n\n\n\n<li><strong>Generate the file<\/strong><br>The \u201cGenerated llms.txt\u201d box updates automatically as you type.<\/li>\n\n\n\n<li><strong>Export your file<\/strong>\n<ul class=\"wp-block-list\">\n<li><strong>Copy<\/strong>: Click <strong>Copy<\/strong>, then paste into a new file named <code>llms.txt<\/code>.<\/li>\n\n\n\n<li><strong>Download<\/strong>: Click <strong>Download llms.txt<\/strong> to save it directly.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Publish it on your site<\/strong>\n<ul class=\"wp-block-list\">\n<li>Upload <code>llms.txt<\/code> to the <strong>root<\/strong> of your site (the top level).\n<ul class=\"wp-block-list\">\n<li>Example: it should be reachable at <code>https:\/\/yourdomain.com\/llms.txt<\/code>.<\/li>\n\n\n\n<li>For a subdomain, use that host\u2019s root (e.g., <code>https:\/\/blog.yourdomain.com\/llms.txt<\/code>).<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Verify it\u2019s live<\/strong>\n<ul class=\"wp-block-list\">\n<li>Visit the URL in your browser to confirm it loads.<\/li>\n\n\n\n<li>If you use a CDN or cache, purge\/clear it.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>(Recommended) Mirror key rules in robots.txt<\/strong><br>Some crawlers mainly honor <code>robots.txt<\/code>. Consider adding matching user-agent blocks there (e.g., a <code>User-agent: GPTBot<\/code> section) so your intent is clear.<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">Quick tips &amp; gotchas<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Keep paths simple (no wildcards needed).<\/li>\n\n\n\n<li>If you choose <strong>Disallow all<\/strong>, but want some public areas open, list them in <strong>Allow paths<\/strong> (e.g., <code>\/public<\/code>).<\/li>\n\n\n\n<li>Don\u2019t accidentally block your sitemap if you want crawlers to find it.<\/li>\n\n\n\n<li>Adoption of <code>llms.txt<\/code> varies; pairing with <code>robots.txt<\/code> increases coverage.<\/li>\n\n\n\n<li>You can return anytime, adjust settings, and re-download a fresh <code>llms.txt<\/code>.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">What is LLMs.txt?<\/h2>\n\n\n\n<p>llms.txt is a new proposed web standard, just like robots.txt, but designed for for AI consumption, so models like ChatGPT, Perplexity, or Claude can quickly summarize or generate content based on your website.<\/p>\n\n\n\n<p>This file will help AI systems understand and give more context and to attribute correctly your content in LLM and Gen-AI search answers.<\/p>\n\n\n\n<p>LLMs.txt is a concise, markdown-formatted, summary of your website&#8217;s most important content, including internal links.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Why is this file important?<\/h3>\n\n\n\n<p>LLMs and GenAI-powered search have a small window of context to handle the complexity of large websites, the llms.txt helps give quickly that context with the most important and relevant content.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Can Google or other search engines use my LLMs.txt?<\/h3>\n\n\n\n<p>Currently, search engines don&#8217;t use LLMs.txt. However, as AI integration grows, it&#8217;s possible that search engines might leverage this LLMs.txt files in the future.<\/p>\n\n\n\n<div style=\"height:100px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n","protected":false},"excerpt":{"rendered":"<p>Optimize how Gen AI and Language Models (LLMs) understand the content of your website with Decoding&#8217;s new free tool, the LLMs.txt Generator, which allows you to add the most important content of your website and download a structured text file. The llms.txt file is a proposed web standard, not official or required, but might help [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-309","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/trydecoding.com\/wp-json\/wp\/v2\/pages\/309","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/trydecoding.com\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/trydecoding.com\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/trydecoding.com\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/trydecoding.com\/wp-json\/wp\/v2\/comments?post=309"}],"version-history":[{"count":9,"href":"https:\/\/trydecoding.com\/wp-json\/wp\/v2\/pages\/309\/revisions"}],"predecessor-version":[{"id":1742,"href":"https:\/\/trydecoding.com\/wp-json\/wp\/v2\/pages\/309\/revisions\/1742"}],"wp:attachment":[{"href":"https:\/\/trydecoding.com\/wp-json\/wp\/v2\/media?parent=309"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}