Odyssey LLMS

Description

Odyssey LLMS is the definitive control panel for your website’s AI presence. It generates the critical files used by Large Language Models (like ChatGPT, Claude, and Gemini) to understand and cite your content.

For Beginners: Just activate it. A comprehensive and optimised llms.txt file is instantly generated. No configuration is needed.

For Power Users: Manage every aspect of your AI strategy. Track bot traffic with built-in analytics, generate JSONL datasets for fine-tuning, and clean up your content with CSS selectors.

Concepts Explained: Why do you need this?

1. What is llms.txt?

Think of llms.txt as a “Sitemap for AI”. While humans use HTML pages and Search Engines use XML sitemaps, AI agents look for an llms.txt file in your root directory. This file gives them a clean, prioritised list of links to crawl, ensuring they train on your best content and ignore the junk.
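
As a rough illustration of what such a file can contain, the sketch below builds a small llms.txt in the style of the public llms.txt proposal: a title, a short summary, and sections of Markdown links. The function name build_llms_txt, the section name, and the example.com URLs are assumptions made for this example; the file the plugin generates may be laid out differently.

```python
def build_llms_txt(title, summary, sections):
    """Return llms.txt text: an H1 title, a blockquote summary, then link sections."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            entry = f"- [{name}]({url})"
            if note:  # the note after the colon is optional
                entry += f": {note}"
            lines.append(entry)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site data, listed with the most important pages first.
example = build_llms_txt(
    "Example Site",
    "Guides and product documentation for Example Site.",
    {
        "Pages": [
            ("About", "https://example.com/about/", "Who we are"),
            ("Pricing", "https://example.com/pricing/", ""),
        ]
    },
)
print(example)
```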

2. What is llms-full.txt (Markdown)?

This is an optional advanced feature (RAG-Ready). Instead of just providing links, llms-full.txt provides your actual website content converted into clean, lightweight Markdown format.
  • Why it’s useful: It allows AI agents to ingest your website’s knowledge immediately without needing to visit and scrape every single HTML page. This reduces server load and ensures the AI gets accurate data for “Retrieval Augmented Generation” (RAG).
  • ⚠️ WARNING regarding Virtual Mode Limits: When using Virtual Mode to generate this file, the item limit for the llms-full.txt file is capped at 50 by default. Manually increasing this limit beyond 50 in the ‘Tools’ settings will drastically increase server load and risk causing immediate 500/503 server errors. Use this feature at your own risk. If you require more than 50 items in your llms-full.txt file, we recommend using Physical Mode instead.

3. What is llms.jsonl (Fine-Tuning)?

This file formats your content into prompt-completion pairs in JSON Lines format, with one JSON object per line. This is a common format for fine-tuning datasets, although the exact schema expected can differ between providers and models (such as GPT-4 or Llama 3).

New Features in 6.0:

  • JSONL Generator: Create a dataset ready for fine-tuning custom AI models.
  • Content Cleaning: Use CSS selectors (e.g., .sidebar, .comments) to strip unwanted elements from your Markdown and JSONL files.
  • Visual Analytics: Visualise exactly which AI bots (ChatGPT, Claude, Google AI, etc.) are accessing your files with a built-in dashboard widget and charts.
  • WooCommerce Integration: Automatically generates a structured “Products” section including Price, Stock, and SKU.

Key Features:

  • Clean Tabbed Interface: Organised into General Rules, Content Sourcing, Analytics, Robots.txt, Security, and Tools.
  • Granular Bot-Specific Rules: Set detailed Allow or Disallow rules for individual AI crawlers (GPTBot, Google-Extended, etc.).
  • “Block All by Default” Mode: Create a secure “whitelist” by blocking all crawlers by default and only allowing the bots you explicitly enable.
  • Settings Import & Export: Perfect for agencies. Easily back up, restore, and migrate your settings between sites with JSON import/export.
  • Advanced Scheduling: Regenerate your file on save, manually, or on a recurring schedule (Hourly, Daily, Weekly, Monthly).
  • Safety Validator: Prevents accidental blocking of all traffic in robots.txt.
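
To make the Safety Validator idea concrete, here is a minimal sketch of a check that flags a robots.txt whose "User-agent: *" group contains a bare "Disallow: /". It normalises whitespace on each line before comparing. The function name blocks_all_traffic is hypothetical and the logic is a simplification (for instance, it ignores Allow rules that might override the Disallow); it is not the plugin's own implementation.

```python
import re


def blocks_all_traffic(robots_txt):
    """Return True if a 'User-agent: *' group contains a bare 'Disallow: /'."""
    agents = []        # user agents named by the group currently being read
    in_rules = False   # True once the current group has seen a rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0]                # drop comments
        line = re.sub(r"\s+", " ", line).strip()   # normalise whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:                           # a new group starts here
                agents, in_rules = [], False
            agents.append(value)
        elif field in ("allow", "disallow"):
            in_rules = True
            if field == "disallow" and value == "/" and "*" in agents:
                return True
    return False
```

A save handler could call this on the submitted text and show a warning instead of writing the file when it returns True.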

Installation

  1. Upload the odyssey-llms folder to the /wp-content/plugins/ directory.
  2. Activate the plugin through the ‘Plugins’ menu in your WordPress dashboard.
  3. That’s it! An llms.txt file has been generated. A new “Odyssey LLMS” menu item will appear in your admin sidebar.
  4. (Optional) Navigate to the Odyssey LLMS page to view Analytics or configure advanced settings.

FAQ

Where do I find the settings page?

The settings page is located in its own top-level menu in your WordPress admin sidebar, labelled Odyssey LLMS.

How do I clean up the Markdown content?

In the “Content Intelligence” tab, look for the “Content Cleaning (CSS)” field. Enter CSS selectors for elements you want to remove, separated by commas (e.g., .footer, .nav, #sidebar).
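
The sketch below illustrates the general idea of selector-based cleaning using only the Python standard library. It handles just three simple selector forms (.class, #id, and a bare tag name) and assumes well-formed HTML; the names Cleaner and clean_html are hypothetical, and the plugin's own selector support and implementation may differ.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


def matches(tag, attrs, selector):
    """Match one simple selector: .class, #id, or a bare tag name."""
    if selector.startswith("."):
        return selector[1:] in (attrs.get("class") or "").split()
    if selector.startswith("#"):
        return attrs.get("id") == selector[1:]
    return tag == selector.lower()


class Cleaner(HTMLParser):
    """Re-emit HTML while skipping elements that match any selector."""

    def __init__(self, selectors):
        super().__init__(convert_charrefs=False)
        self.selectors = selectors
        self.out = []
        self.skip_depth = 0  # > 0 while inside a removed element

    def _unwanted(self, tag, attrs):
        return any(matches(tag, dict(attrs), s) for s in self.selectors)

    def handle_starttag(self, tag, attrs):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth += 1
        elif self._unwanted(tag, attrs):
            if tag not in VOID_TAGS:
                self.skip_depth = 1
        else:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        if not self.skip_depth and not self._unwanted(tag, attrs):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth -= 1
        else:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")


def clean_html(html, selector_field):
    """Strip elements matching a comma-separated selector list."""
    selectors = [s.strip() for s in selector_field.split(",") if s.strip()]
    cleaner = Cleaner(selectors)
    cleaner.feed(html)
    cleaner.close()
    return "".join(cleaner.out)
```

For example, calling clean_html(page_html, ".footer, .nav, #sidebar") returns the markup with those elements and everything nested inside them removed. HTML comments are dropped as a side effect of this simple approach.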

How do I enable Analytics?

Go to the main settings tab and select “Virtual File” as your serving method. This allows WordPress to intercept bot requests and log them to your dashboard.
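
Conceptually, logging a bot visit means matching the request's User-Agent header against known crawler tokens and counting the hit. The sketch below shows one way that could work. The token table, labels, and function names are illustrative assumptions, not the plugin's actual bot list or storage format.

```python
from collections import Counter

# Illustrative User-Agent tokens mapped to display labels (an assumption,
# not the plugin's list).
BOT_SIGNATURES = {
    "GPTBot": "ChatGPT (OpenAI)",
    "ChatGPT-User": "ChatGPT (OpenAI)",
    "ClaudeBot": "Claude (Anthropic)",
    "PerplexityBot": "Perplexity",
}


def identify_bot(user_agent):
    """Return a label for a known AI crawler, or None for other traffic."""
    ua = user_agent.lower()
    for token, label in BOT_SIGNATURES.items():
        if token.lower() in ua:
            return label
    return None


def count_hits(requests):
    """Count hits per (bot label, requested file) from (path, user_agent) pairs."""
    hits = Counter()
    for path, user_agent in requests:
        bot = identify_bot(user_agent)
        if bot is not None:
            hits[(bot, path)] += 1
    return hits


sample = [
    ("/llms.txt", "Mozilla/5.0 (compatible; GPTBot/1.0)"),
    ("/llms.txt", "Mozilla/5.0 (compatible; ClaudeBot/1.0)"),
    ("/llms-full.txt", "Mozilla/5.0 (compatible; GPTBot/1.0)"),
    ("/llms.txt", "Mozilla/5.0 (Windows NT 10.0) Firefox/120.0"),
]
hits = count_hits(sample)
```

This only works when the request actually reaches application code, which is why a statically served file cannot be tracked this way.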

Does this conflict with Yoast SEO or RankMath?

No. The Robots.txt editor includes intelligent conflict detection. It will automatically fetch and import any virtual rules created by other SEO plugins so you don’t lose them.

Can I increase the limit of pages in llms-full.txt?

⚠️ WARNING: While you can manually increase the post/page limit for the llms-full.txt file in the settings, we strongly warn against setting it too high (especially in Virtual Mode). Doing so will drastically increase server load and risk causing immediate 500/503 server errors due to the heavy processing required. If you require a large number of items, we recommend using Physical Mode.

What is the JSONL Prompt Template and how do I use it?

The JSONL Prompt Template controls the “question” side of each entry in your llms.jsonl fine-tuning dataset. When you want to train a custom AI model (such as GPT-4, Llama 3, or any fine-tuneable LLM) on your website’s content, that process requires structured “Question and Answer” pairs — known as prompt and completion.

How it works:

The plugin generates llms.jsonl where every line is a JSON object in this format:

{"prompt": "What is My Post Title?", "completion": "The full text of that post..."}

The template field lets you define the prompt structure for every post on your site. Use {{title}} as a dynamic placeholder — the plugin will automatically replace it with each post’s actual title at generation time.

Examples by use case:

  • Default / General: What is {{title}}?
  • Service business: Tell me about the {{title}} service provided by our company.
  • E-commerce / Products: What are the features and specifications of {{title}}?
  • Knowledge base / Support: How do I {{title}}?

By customising this template, you are priming your fine-tuning dataset so the resulting model learns to respond to the exact style of queries your users will ask.
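
The template substitution described above can be sketched in a few lines. This is a minimal illustration, assuming each post is a dictionary with "title" and "content" keys; the function names render_prompt and build_jsonl are hypothetical and do not describe the plugin's internals.

```python
import json


def render_prompt(template, title):
    """Replace the {{title}} placeholder with the post's title."""
    return template.replace("{{title}}", title)


def build_jsonl(posts, template="What is {{title}}?"):
    """Return JSON Lines text with one prompt/completion object per post."""
    lines = []
    for post in posts:
        record = {
            "prompt": render_prompt(template, post["title"]),
            "completion": post["content"],
        }
        # ensure_ascii=False keeps non-ASCII titles readable in the file.
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines) + "\n"


posts = [
    {"title": "Boiler Servicing", "content": "We service all major boiler brands."},
    {"title": "reset my password", "content": "Open Settings and choose Reset."},
]
print(build_jsonl(posts, "Tell me about the {{title}} service provided by our company."))
```

Because json.dumps escapes quotes and newlines inside the content, each record stays on a single line, which is what the JSON Lines format requires.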

What is the difference between Physical File and Virtual File mode?

Physical File (Recommended for most sites): The plugin writes a real static llms.txt file to your server’s root directory. This is served directly by your web server (Apache/Nginx) with maximum speed, and does not require WordPress to load for every bot request. The downside is that bot visits cannot be tracked for Analytics.

Virtual File (Enable Analytics): The file is served dynamically by WordPress via a rewrite rule. This allows the plugin to intercept every request and log it to the Analytics dashboard. It also enables Rate Limiting to throttle abusive bots. The trade-off is a small performance overhead since WordPress must load for each request.

Both modes support llms.txt, llms-full.txt (Markdown), and llms.jsonl generation.

What does “Include Taxonomy Archives” do?

When you select specific taxonomies (e.g. Categories, Tags, WooCommerce Product Categories) in the Content Source tab, the plugin will fetch all non-empty terms for those taxonomies and include their archive page URLs in the generated llms.txt. This is useful for giving AI crawlers a complete picture of your site’s topic structure, not just individual posts.

What does “Include Author Archives” do?

When this option is enabled, the plugin fetches all WordPress users who have at least one published post and appends their author archive URLs (e.g. yoursite.com/author/name/) to the generated llms.txt. This is useful for sites where author credibility and profiles are part of the content strategy.

How does the WooCommerce integration work?

When WooCommerce Products is enabled in the Content Source settings and WooCommerce is active, the plugin automatically enriches product entries with structured metadata pulled directly from WooCommerce:

  • Price — the product’s formatted sale/regular price.
  • SKU — the product’s stock-keeping unit identifier (if set).
  • Stock Status — whether the product is In Stock or Out of Stock.

This data is prepended to the product content in both llms-full.txt (Markdown) and llms.jsonl, giving AI models accurate, up-to-date product information without needing to scrape individual product pages.
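
As a small sketch of what "prepended" means here, the function below places a metadata header above a product's Markdown content. The exact labels and layout are assumptions for illustration, and the function name enrich_product is hypothetical; the plugin's real output format may differ.

```python
def enrich_product(content, price, stock_status, sku=None):
    """Return product content with a metadata header placed above it."""
    header = [f"Price: {price}"]
    if sku:  # SKU is optional and is omitted when not set
        header.append(f"SKU: {sku}")
    header.append(f"Stock Status: {stock_status}")
    return "\n".join(header) + "\n\n" + content


print(enrich_product("A sturdy oak desk.", "£249.00", "In Stock", sku="DESK-01"))
```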

What is the Content Order setting?

The Content Order setting in the Content Source tab controls the order in which posts appear across all generated files (llms.txt, llms-full.txt, llms.jsonl). Since AI agents may give more weight to content they encounter earlier in a file, or may stop reading before the end of a long one, ordering by your most recent or most engaged-with content puts your best material first.

Available options:

  • Date: Newest First (Default) — Most recently published content appears first.
  • Date: Oldest First — Chronological order from the beginning.
  • Title: A to Z — Alphabetical by post title.
  • Most Commented First — Prioritises your most engaged content.
  • Menu/Page Order — Uses the WordPress page order field, ideal for structured sites.
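
The options above amount to choosing a sort key and direction. The sketch below shows that idea with a lookup table. The option keys (date_desc and so on) and the post fields are hypothetical names chosen for this example, not the plugin's stored values.

```python
from datetime import date

# Hypothetical option keys mapped to (sort key, reverse) pairs.
SORT_KEYS = {
    "date_desc": (lambda p: p["date"], True),          # Date: Newest First
    "date_asc": (lambda p: p["date"], False),          # Date: Oldest First
    "title_asc": (lambda p: p["title"].lower(), False),
    "comments_desc": (lambda p: p["comments"], True),  # Most Commented First
    "menu_order": (lambda p: p["menu_order"], False),  # Menu/Page Order
}


def order_posts(posts, mode="date_desc"):
    """Return a new list of posts sorted according to the chosen mode."""
    key, reverse = SORT_KEYS[mode]
    return sorted(posts, key=key, reverse=reverse)


posts = [
    {"title": "Pricing", "date": date(2024, 3, 1), "comments": 2, "menu_order": 2},
    {"title": "about us", "date": date(2023, 1, 15), "comments": 9, "menu_order": 1},
    {"title": "Contact", "date": date(2025, 6, 30), "comments": 0, "menu_order": 3},
]
```

Applying the same ordering function before writing each file is what keeps llms.txt, llms-full.txt, and llms.jsonl consistent with one another.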

How does rate limiting work?

When Virtual File mode is active, the plugin tracks the number of requests made by each IP address within a rolling time window. If a bot or visitor exceeds the configured request limit, they are temporarily blocked and receive a 429 Too Many Requests response with a Retry-After header. The block duration is configurable in the Security tab and enforces a minimum of 1 hour to prevent configuration errors from causing permanent lockouts.
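
The behaviour described above can be sketched as an in-memory rolling-window limiter. This is an illustration only: the class name, parameters, and in-memory storage are assumptions, and the plugin itself stores its state in WordPress rather than in a Python object.

```python
import math
import time
from collections import defaultdict, deque

MIN_BLOCK_SECONDS = 3600  # blocks never last less than 1 hour


class RateLimiter:
    def __init__(self, limit, window_seconds, block_seconds):
        self.limit = limit
        self.window = window_seconds
        # Clamp the duration so a zero or tiny value cannot misconfigure the block.
        self.block = max(block_seconds, MIN_BLOCK_SECONDS)
        self.hits = defaultdict(deque)   # ip -> timestamps inside the window
        self.blocked_until = {}          # ip -> time at which the block ends

    def check(self, ip, now=None):
        """Return (status_code, headers) for one request from this IP."""
        now = time.time() if now is None else now
        until = self.blocked_until.get(ip)
        if until is not None:
            if now < until:
                return 429, {"Retry-After": str(math.ceil(until - now))}
            del self.blocked_until[ip]   # block has expired
        recent = self.hits[ip]
        while recent and recent[0] <= now - self.window:
            recent.popleft()             # forget hits outside the window
        recent.append(now)
        if len(recent) > self.limit:
            self.blocked_until[ip] = now + self.block
            recent.clear()
            return 429, {"Retry-After": str(self.block)}
        return 200, {}
```

Passing an explicit now value, as in limiter.check("203.0.113.7", now=0), makes the behaviour easy to test without waiting for real time to pass.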

Reviews

January 26, 2026
I’ve been using Odyssey LLMS on a few of my client sites and it’s quickly become an essential part of our WordPress toolkit. Setup is easy and right out of the box it generates an llms.txt file that helps AI models like ChatGPT and Gemini understand our content much better. Installing and activating the plugin does the job of generating the llms.txt with no fuss. The interface is clean and the bot-specific rules mean we can tailor exactly how different crawlers interact with our content. Overall, this feels easy to use and a good way to future-proof my client websites for AI.

Contributors & Developers

“Odyssey LLMS” is open source software.

Changelog

6.2.0

  • SECURITY: Several internal security hardening improvements.
  • BUGFIX: Fixed an issue where certain valid characters were incorrectly stripped when saving robots.txt content.
  • BUGFIX: Fixed an issue where the sitemap tester could reference stale data instead of the live input value.
  • FEATURE: New Site Title and Site Description fields to populate standard header directives in generated files.
  • FEATURE: New Content Ordering option to control how posts are sorted across all generated files.
  • FEATURE: New JSONL Prompt Template field for customising fine-tuning dataset prompts.
  • FEATURE: New Markdown Grouping toggle to organise llms-full.txt output by post type.
  • FEATURE: New Test Sitemap button to validate a configured sitemap URL directly from the settings page.
  • FEATURE: Added Copy to Clipboard buttons throughout the plugin UI.
  • FEATURE: New Custom Bot Rules repeater UI for managing additional AI crawler entries without editing files.
  • FEATURE: REST API endpoints added for integration and developer access.
  • FEATURE: Analytics tab now includes a date range selector and a per-file breakdown table.
  • FEATURE: WP-CLI status command enhanced with scheduling details and a 30-day analytics summary.
  • FEATURE: Automatic analytics log pruning via a scheduled background task.
  • FEATURE: Clean uninstall — all plugin data is fully removed when the plugin is deleted.
  • PERFORMANCE: Various internal refactoring and optimisation improvements.
  • COMPATIBILITY: Improved compatibility with multisite and a broader range of hosting environments.
  • SECURITY FIX: Rate limiter block duration now enforces a minimum of 1 hour, preventing a zero-value option from creating a permanent, unrecoverable IP lockout via a never-expiring transient.
  • BUGFIX: Pruning cron job now correctly uses the plugin’s registered odyssey_weekly schedule instead of the unregistered built-in weekly string, ensuring consistent background pruning on all environments.
  • BUGFIX: Robots.txt safety validator now normalises whitespace before checking for dangerous Disallow: / directives, closing a bypass where multiline input could slip through the check.
  • FEATURE: Taxonomy archive URLs are now correctly included in llms.txt when taxonomies are selected in the Content Source settings (previously saved but never written to output).
  • FEATURE: Author archive URLs are now correctly included in llms.txt when the Author Archives option is enabled (previously saved but never written to output).
  • FEATURE: WooCommerce product metadata (Price, SKU, Stock Status) is now correctly appended to llms-full.txt and llms.jsonl when the WooCommerce integration is enabled (previously saved but never applied to output).

6.1.14

  • SECURITY: Implemented security enhancements and additional safety checks.
  • FEATURE: Enhanced content parsing to natively compile layout builder blocks and formats prior to extraction.
  • FEATURE: Added Virtual Mode Item Limit setting for advanced users adjusting Markdown/JSONL generation limits.
  • FEATURE: Enabled “View llms-full.txt” and “View llms.jsonl” buttons in the Physical Mode UI dashboard.
  • FIX: Addressed an issue where password-protected and empty layout pages were erroneously included in the generated output.
  • FIX: Addressed various bugs to improve overall stability and reliability.
  • TWEAK: Improved Markdown conversion to remove excessive whitespace and line breaks from generated files.
  • TWEAK: Removed the Estimated Tokens display from the File Status dashboard widget.

6.1.13

  • FIX: Resolved file permission issue by ensuring previous files are deleted before regeneration.
  • REFACTOR: Split the single monolithic codebase into separate files.
  • FEATURE: Introduced CLI support.

6.1.12

  • FIX: Minor bug fixes and improvements.
  • TWEAK: Updated analytics tracking for better accuracy.

6.1.11

  • SECURITY: Security audit fixes and hardening.
  • SECURITY: Improved input sanitisation and XML parsing safety.
  • FIX: Improved IP address detection for rate limiting behind proxies.
  • TWEAK: Enhanced error logging for administrative actions.

6.1.8

  • FIX: Resolved an issue where Custom URLs were not being included in the generated file.
  • FEATURE: Added “Exclude Media/Attachments” option to filter out images, PDFs, and other media files from the content list.

6.1.7

  • TWEAK: Updated admin interface.

6.0.8

  • FIX: Resolved Chart.js syntax errors (“Cannot use import statement”) that prevented Analytics graphs from loading.
  • FIX: Fixed a persistence bug where the “Next Scheduled Run” date would disappear after page reload.
  • TWEAK: Chat Preview now prioritises llms-full.txt (Markdown) for richer context if available.
  • TWEAK: Increased Chat API timeout to 60 seconds to handle larger content contexts.

6.0.0

  • FEATURE: Chat Preview: Interactive chat interface to test your content’s AI readiness.
  • FEATURE: JSONL Generator: Generate fine-tuning datasets (llms.jsonl).
  • FEATURE: Content Cleaning: Remove specific HTML elements (divs, classes, IDs) from generated content using CSS selectors.

5.8.3

  • FIX: Resolved a scheduling persistence issue where the ‘Next Scheduled’ date would incorrectly reset to ‘Not Scheduled’ on page reload.
  • FIX: Corrected the Analytics dashboard UI to properly hide the chart container when no data is available, displaying the empty state correctly.

5.8.2

  • FEATURE: Simulate Bot Visit: Added a button to the Analytics tab to simulate a bot hit, allowing users to verify logging functionality immediately.
  • UI: Improved “Empty State” design for the Analytics tab when no data is available.
  • FIX: Resolved a scheduling conflict where “Monthly” schedules were not persisting due to WordPress cron key conflicts.
  • TWEAK: Added logic to hide the “Scheduling” dropdown when Virtual Mode is active (as virtual files are always live).

5.8.0

  • FEATURE: Markdown Generator: Added option to generate llms-full.txt with full content converted to Markdown.
  • FEATURE: Visual Analytics: Added Chart.js integration to visualise bot traffic distribution.
  • FEATURE: Advanced Filtering: Added Date Range and Category Exclusion filters to Content Sourcing.
  • FEATURE: WooCommerce Support: Added option to include a dedicated Products section with Price, Stock, and SKU data.
  • FEATURE: Smart Summaries: Added option to append SEO Meta Descriptions to URLs.
  • FEATURE: Safe Robots.txt Editor: Added a dedicated editor that fetches existing virtual rules to prevent data loss.
  • FEATURE: Virtual Serving Mode: Added option to serve files virtually for analytics tracking.

5.7.1

  • FEATURE: AI Analytics Dashboard: Added a lightweight logger and dashboard widget to track hits from major AI bots (ChatGPT, Claude, Gemini, etc.) over the last 30 days.
  • FEATURE: Robots.txt Editor with Fetcher: Added a dedicated tab to edit robots.txt. Includes logic to automatically fetch existing virtual rules (from Yoast/RankMath) to prevent data loss.
  • FEATURE: Safety Validator: The Robots editor now intercepts saves to warn users if they are about to block all traffic (User-agent: * Disallow: /).
  • FEATURE: Smart Summaries: Added option to append SEO Meta Descriptions or Excerpts to URLs in llms.txt for better AI context.
  • FEATURE: WooCommerce Support: Added option to include a dedicated Products section with Price, Stock, and SKU data.
  • FEATURE: Virtual Serving Mode: Added option to serve llms.txt virtually (required for Analytics) or physically (for performance).
  • FIX: Resolved cron schedule scoping issue where ‘Weekly’ and ‘Monthly’ schedules were not being registered globally.
  • UI: Fixed CSS layout issue where the Live Preview box would expand horizontally off the screen; added text wrapping.

5.4.6

  • FIX: Corrected a critical settings validation bug where unchecking a checkbox was not being registered as a change.
  • ENHANCEMENT: The ‘Save Changes’ button is now correctly positioned at the bottom of the ‘Tools & Scheduling’ tab.

5.3.3

  • FIX: Resolved a critical HTML structure issue that caused the ‘Save Changes’ button to fail and AJAX controls to be unresponsive.
  • FIX: Corrected form submission logic to ensure the user always stays on the currently active tab after saving settings.
  • FIX: The ‘Exclude Posts/Pages’ field now correctly expands to the full width of its container.
  • FIX: Removed a stray border line that was incorrectly overlapping content on several tabs.
  • ENHANCEMENT: The ‘Save Changes’ button is now logically positioned at the bottom of the ‘Tools & Scheduling’ tab.
  • ENHANCEMENT: The plugin now provides a “No changes were made” notice instead of an unnecessary redirect when saving without modifications.

5.2.1

  • FIX: Resolved a JavaScript bug that caused settings tabs to appear empty or become unresponsive after being clicked.
  • FIX: Ensured the llms.txt file is now correctly generated asynchronously immediately upon plugin activation.
  • FIX: Corrected a CSS alignment issue by adding proper padding to the “File Status” sidebar widget.
  • ENHANCEMENT: Default settings on a fresh installation are now more robust. The plugin defaults to using the XML sitemap as the primary URL source.
  • ENHANCEMENT: The “Attribution & Use Policy” field is now pre-filled with a helpful default template.
  • TWEAK: All AI crawlers are now enabled by default on new installations for better out-of-the-box functionality.
  • TWEAK: The admin contact email is now automatically populated in the “Contact Information” field on initial setup.
  • TWEAK: Removed the “(recommended)” text from the “Block all unlisted bots by default” label for a cleaner user interface.

5.0.0

  • FEATURE: Major UI Overhaul: Settings are now organised into a clean, tabbed interface (General Rules, Content Sourcing, Tools, Live Preview).
  • FEATURE: Granular Crawler Rules: You can now add specific Disallow rules for each individual AI crawler, offering fine-tuned control.
  • FEATURE: Settings Import & Export: Easily back up, restore, and migrate your complete plugin settings using a JSON file from the new Tools tab.
  • FEATURE: Advanced Scheduling: Choose to regenerate the file on save, manually, or on a recurring schedule (e.g., daily, weekly) via a new setting.
  • FEATURE: AJAX Post Exclusion Search: Replaced the manual ID input with a user-friendly AJAX search box to find and exclude posts/pages by title.
  • FEATURE: “Block All by Default” Mode: Added an option to create a secure “whitelist” by blocking all unlisted crawlers.

6.1.14

  • FEATURE: Manual Regeneration Button: A new “Regenerate Now” button on the Tools tab triggers file generation instantly via AJAX.
  • UI: The Live Preview now features syntax highlighting for improved readability of rules, comments, and URLs.
  • TWEAK: The generation summary notice now provides a detailed breakdown of URLs sourced and duplicates removed.
  • TWEAK: The admin sidebar status box now shows the next scheduled regeneration time.

4.0.0

  • FEATURE: AJAX-powered Live Preview to see how settings changes will affect the file output without needing to save.
  • FEATURE: Expanded Content Sourcing to include Taxonomies (e.g., categories, tags) and Author Archives.
  • FEATURE: Added a Sitemap: directive to the generated file for better crawler discovery if a sitemap URL is provided.
  • FEATURE: Added a filter odyssey_llms_default_crawlers to allow developers to extend the default crawler list.
  • UI: Implemented conditional JavaScript logic to show/hide settings based on the selected URL source method for a cleaner interface.
  • TWEAK: Heavily refactored the file generation logic into smaller, more maintainable functions for improved reliability.
  • TWEAK: Improved the XML Sitemap parser with better error handling and an increased timeout for large sites.
  • TWEAK: Enhanced admin notices to provide more specific feedback on the success or failure of file generation.

3.0.0

  • FEATURE: Major UI/UX overhaul with a modern, polished dashboard design.
  • FEATURE: The plugin now has its own top-level menu item in the admin sidebar.
  • FEATURE: Settings are now organised into a card-based layout with a main content area and a sidebar.
  • UI: The sidebar now includes an “at-a-glance” status box showing file status and last update time.

2.2.1

  • FEATURE: Added a custom “View Details” link on the plugins page.

2.2.0

  • FEATURE: Added a “Reset to Defaults” button.
  • FEATURE: Added a “View Live File” link.
  • FEATURE: Implemented asynchronous (background) file generation using WP Cron to prevent server timeouts.

2.1.0

  • FEATURE: Plugin now works “out-of-the-box” by setting smart, comprehensive defaults on activation.
  • FEATURE: Added a server permission check to warn users if the root directory is not writable.

2.0.0

  • FEATURE: Revamped llms.txt format with structured sections and added a checklist of common AI crawlers.
  • FEATURE: Added Sitemap Integration as a URL source option.

1.0.0

  • Initial release.