Outline: Add to_llms_txt API method and --format=llms-txt CLI option #23

Merged
amotl merged 4 commits into main from format-llms-txt on May 15, 2025

Conversation

@amotl (Member) commented on May 11, 2025

About

Provide a new output option --format=llms-txt for the cratedb-about outline subcommand to directly convert/expand the source outline file into an llms.txt file. It is equivalent to invoking the llms_txt2ctx program manually.
Along the way, also provide the same functionality through a compact Python API.

Synopsis

CLI

cratedb-about outline --format=llms-txt [--optional]

API

from cratedb_about import CrateDbKnowledgeOutline

# Load information from the built-in YAML file.
outline = CrateDbKnowledgeOutline.load()

# Convert outline into llms-txt format (full).
outline.to_llms_txt(optional=True)

@coderabbitai (bot) commented on May 11, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

The changes introduce a new "llms-txt" output format for the Outline component, adding corresponding API and CLI options. Documentation and usage examples are updated to reflect the new format. Minor link and formatting corrections are made, and a new backlog item is added regarding future CLI subcommand naming. The build process for llms.txt files is refactored to generate files directly without subprocess calls.

Changes

| File(s) | Change Summary |
|---|---|
| CHANGES.md | Updated changelog to document the addition of the to_llms_txt API and --format=llms-txt CLI. |
| README.md | Updated documentation to use "llms-txt" consistently, added usage examples, and fixed links. |
| docs/backlog.md | Added backlog item to rename the build subcommand for llms-txt context. |
| src/cratedb_about/cli.py | Added "llms-txt" as an output format, new --optional flag, and updated function signature. |
| src/cratedb_about/outline/cratedb-outline.yaml | Updated an example URL to point to a new README location. |
| src/cratedb_about/outline/model.py | Added to_llms_txt method to OutlineDocument for generating "llms-txt" output. |
| src/cratedb_about/build/llmstxt.py | Refactored builder to generate llms.txt files directly from the outline object, removing subprocess calls. |
| tests/test_cli.py | Removed assertion on a specific log message; reworded comments for clarity. |
| tests/test_outline.py | Updated test outline file path constant; added CLI tests for JSON, YAML, and llms-txt formats. |
| tests/assets/outline.yaml | Added new "Optional" section with an example domain reference. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI
    participant OutlineDocument
    participant llms_txt

    User->>CLI: cratedb-about outline --format=llms-txt [--optional]
    CLI->>OutlineDocument: to_llms_txt(optional)
    OutlineDocument->>OutlineDocument: to_markdown()
    OutlineDocument->>llms_txt: create_ctx(markdown, optional)
    llms_txt-->>OutlineDocument: context object
    OutlineDocument-->>CLI: context as string
    CLI-->>User: Output llms-txt format
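
The delegation shown in the diagram can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: OutlineSketch and fake_create_ctx are hypothetical stand-ins, and the expander is injected so the example runs without the llms_txt package (whose create_ctx is assumed to take the Markdown text plus an optional flag, as the diagram suggests) and without network access.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class OutlineSketch:
    """Hypothetical stand-in for OutlineDocument, not the project's model."""

    title: str
    links: List[str] = field(default_factory=list)
    optional_links: List[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        # Render a minimal llms.txt-style Markdown document.
        lines = [f"# {self.title}", "", "## Docs", ""]
        lines += [f"- {link}" for link in self.links]
        if self.optional_links:
            lines += ["", "## Optional", ""]
            lines += [f"- {link}" for link in self.optional_links]
        return "\n".join(lines)

    def to_llms_txt(
        self, create_ctx: Callable[[str, bool], object], optional: bool = False
    ) -> str:
        # Render Markdown, hand it to the expander, return the result as a string.
        return str(create_ctx(self.to_markdown(), optional))


def fake_create_ctx(markdown: str, optional: bool = False) -> str:
    """Stand-in expander: keeps the "Optional" section only when requested."""
    if optional:
        return markdown
    head, _, _ = markdown.partition("\n## Optional")
    return head.rstrip()


doc = OutlineSketch(
    "CrateDB", ["https://example.org/docs"], ["https://example.org/extra"]
)
print(doc.to_llms_txt(fake_create_ctx))
print(doc.to_llms_txt(fake_create_ctx, optional=True))
```

Injecting the expander is only a convenience for this sketch; per the review notes, the actual method imports create_ctx from llms_txt directly.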

Suggested reviewers

  • bmunkholm

Poem

In the garden of code, a new path unfurled,
"llms-txt" now dances, its banners unfurled.
CLI and docs in harmony sing,
Outlines transformed with a magical spring.
With every new format, the warren grows bright—
🐇 Hopping through features, coding delight!


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 939b770 and a663ef2.

📒 Files selected for processing (10)
  • CHANGES.md (1 hunks)
  • README.md (3 hunks)
  • docs/backlog.md (1 hunks)
  • src/cratedb_about/build/llmstxt.py (2 hunks)
  • src/cratedb_about/cli.py (2 hunks)
  • src/cratedb_about/outline/cratedb-outline.yaml (1 hunks)
  • src/cratedb_about/outline/model.py (1 hunks)
  • tests/assets/outline.yaml (1 hunks)
  • tests/test_cli.py (1 hunks)
  • tests/test_outline.py (2 hunks)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5537e99 and 5d39315.

📒 Files selected for processing (6)
  • CHANGES.md (1 hunks)
  • README.md (5 hunks)
  • docs/backlog.md (1 hunks)
  • src/cratedb_about/cli.py (2 hunks)
  • src/cratedb_about/outline/cratedb-outline.yaml (1 hunks)
  • src/cratedb_about/outline/model.py (2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/cratedb_about/cli.py (1)
src/cratedb_about/outline/model.py (1)
  • to_llms_txt (73-76)
🪛 LanguageTool
CHANGES.md

[uncategorized] ~7-~7: You might be missing the article “the” here.
Context: ... environment variable. - Outline: Added to_llms_txt API method and `--format=ll...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

🪛 GitHub Check: codecov/patch
src/cratedb_about/outline/model.py

[warning] 74-76: src/cratedb_about/outline/model.py#L74-L76
Added lines #L74 - L76 were not covered by tests

src/cratedb_about/cli.py

[warning] 64-65: src/cratedb_about/cli.py#L64-L65
Added lines #L64 - L65 were not covered by tests

🔇 Additional comments (16)
src/cratedb_about/outline/model.py (1)

6-6: LGTM! Import added for the new functionality.

The new import of create_ctx from llms_txt is necessary for the added conversion functionality.

src/cratedb_about/cli.py (5)

34-40: LGTM! Format option updated to include the new format.

The format option has been correctly updated to include the new "llms-txt" format.


41-46: LGTM! New option added for controlling optional sections.

The new --optional flag has been added with a clear description of its purpose for the llms-txt format.


47-51: LGTM! Function signature updated correctly.

The function signature has been properly updated to include the new optional parameter with a sensible default value of False.


55-55: LGTM! Documentation updated to include the new format.

The docstring has been updated to mention the new llms-txt format.


64-65: Add test coverage for the new format handling.

The implementation for handling the "llms-txt" format is correct, but static analysis indicates it lacks test coverage.

Please add tests to verify that the CLI correctly handles the new format and optional flag. Consider testing both with and without the optional flag to ensure both code paths are covered.

🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 64-65: src/cratedb_about/cli.py#L64-L65
Added lines #L64 - L65 were not covered by tests
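
One way to cover both code paths is sketched below. It is an illustration under stated assumptions, not the project's test suite: the real CLI is built with click, while this stand-in uses argparse from the standard library so it runs without extra dependencies. The run helper and the "compact"/"full" labels are hypothetical.

```python
import argparse
from typing import List

FORMATS = ["markdown", "yaml", "json", "llms-txt"]


def build_parser() -> argparse.ArgumentParser:
    # Stand-in for the click-based `outline` subcommand options.
    parser = argparse.ArgumentParser(prog="outline-sketch")
    parser.add_argument(
        "--format", dest="format_", choices=FORMATS, default="markdown"
    )
    parser.add_argument("--optional", action="store_true", default=False)
    return parser


def run(argv: List[str]) -> str:
    # Dispatch on the requested format; only llms-txt looks at --optional.
    args = build_parser().parse_args(argv)
    if args.format_ == "llms-txt":
        variant = "full" if args.optional else "compact"
        return f"llms-txt ({variant})"
    return args.format_


def test_llms_txt_without_optional() -> None:
    assert run(["--format=llms-txt"]) == "llms-txt (compact)"


def test_llms_txt_with_optional() -> None:
    assert run(["--format=llms-txt", "--optional"]) == "llms-txt (full)"


def test_default_format_is_markdown() -> None:
    assert run([]) == "markdown"
```

With click, the equivalent tests would typically invoke the command through click.testing.CliRunner and assert on the exit code and captured output.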

docs/backlog.md (1)

14-14: LGTM! Good addition to the backlog.

The new backlog item to rename the build subcommand to llms-txt build is a sensible addition to improve naming consistency, especially now that llms-txt is a formal format option.

src/cratedb_about/outline/cratedb-outline.yaml (1)

219-219: LGTM! URL updated to point to the correct README.

The URL has been updated to point to the README inside the app subdirectory, which is likely more relevant given the context.

README.md (8)

36-38: Docs updated for new llms-txt output
The example list now correctly references [llms-txt] alongside other formats, reflecting the new output option.


46-46: CLI description includes llms-txt context
The ask subcommand documentation now references the new [llms-txt] context files. This aligns with the added format support.


95-96: Quoted format argument for consistency
The CLI example for Markdown output now wraps the format in quotes, matching other examples.


97-98: Added example for llms-txt format
The new prose and flag example correctly illustrate how to use --format="llms-txt".


99-100: Shell example for llms-txt output
The code block demonstrates generating llms.txt in the new format.


101-104: Optional flag usage documented
The snippet shows how to include the optional content (--optional), producing llms-full.txt.


137-143: Python API updated with to_llms_txt calls
Both medium and full usage examples for the new to_llms_txt method are correctly shown.


204-204: ✅ Verification successful

Verify the [llms-txt] link target
Please confirm that https://llmstxt.org/ is the correct and reachable URL for the llms-txt specification.


🏁 Script executed:

#!/bin/bash
# Check that the llms-txt spec URL returns HTTP 200
if ! curl -sSf -o /dev/null https://llmstxt.org/; then
  echo "ERROR: https://llmstxt.org/ is unreachable or returned a non-2xx status"
  exit 1
fi
echo "Link https://llmstxt.org/ is valid"

Length of output: 124


Link [llms-txt] Verified
The URL https://llmstxt.org/ returns HTTP 200 and is reachable—no changes needed.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5d39315 and 038033c.

📒 Files selected for processing (9)
  • CHANGES.md (1 hunks)
  • README.md (3 hunks)
  • docs/backlog.md (1 hunks)
  • src/cratedb_about/build/llmstxt.py (2 hunks)
  • src/cratedb_about/cli.py (2 hunks)
  • src/cratedb_about/outline/cratedb-outline.yaml (1 hunks)
  • src/cratedb_about/outline/model.py (1 hunks)
  • tests/test_cli.py (1 hunks)
  • tests/test_outline.py (1 hunks)
✅ Files skipped from review due to trivial changes (3)
  • tests/test_outline.py
  • tests/test_cli.py
  • src/cratedb_about/outline/cratedb-outline.yaml
🚧 Files skipped from review as they are similar to previous changes (4)
  • docs/backlog.md
  • src/cratedb_about/outline/model.py
  • CHANGES.md
  • README.md
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/cratedb_about/cli.py (1)
src/cratedb_about/outline/model.py (1)
  • to_llms_txt (72-88)
🪛 GitHub Check: codecov/patch
src/cratedb_about/cli.py

[warning] 64-65: src/cratedb_about/cli.py#L64-L65
Added lines #L64 - L65 were not covered by tests

🔇 Additional comments (8)
src/cratedb_about/build/llmstxt.py (3)

8-9: Import change looks good

The import of CrateDbKnowledgeOutline is necessary for the new approach to directly build llms.txt files without subprocess calls.


35-37: Good TODO comment for future optimization

The comment correctly identifies potential redundancy in the data processing for the two steps. This is a valid concern for future optimization.


38-40: Clean refactoring to use the API directly

This is a good refactoring that:

  1. Eliminates subprocess calls (more efficient and less error-prone)
  2. Uses the new to_llms_txt() API method directly
  3. Properly handles both regular and full output variants

The code is now more maintainable and aligns well with the PR objectives.
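
A rough sketch of what such a subprocess-free builder could look like, assuming only that the outline object exposes to_llms_txt(optional=...) as described in this PR. The names build_llms_txt, SupportsLlmsTxt, and StubOutline are hypothetical; the output file names follow the README examples (llms.txt and llms-full.txt).

```python
import tempfile
from pathlib import Path
from typing import Protocol


class SupportsLlmsTxt(Protocol):
    """Anything that can render itself in llms-txt format."""

    def to_llms_txt(self, optional: bool = False) -> str: ...


def build_llms_txt(outline: SupportsLlmsTxt, outdir: Path) -> None:
    # Write both variants by calling the API directly, with no subprocess.
    outdir.mkdir(parents=True, exist_ok=True)
    (outdir / "llms.txt").write_text(
        outline.to_llms_txt(optional=False), encoding="utf-8"
    )
    (outdir / "llms-full.txt").write_text(
        outline.to_llms_txt(optional=True), encoding="utf-8"
    )


class StubOutline:
    """Hypothetical stand-in for the real outline object."""

    def to_llms_txt(self, optional: bool = False) -> str:
        return "core + optional" if optional else "core"


with tempfile.TemporaryDirectory() as tmp:
    build_llms_txt(StubOutline(), Path(tmp))
    print(sorted(path.name for path in Path(tmp).iterdir()))
```

The sketch renders the outline twice, once per variant, which mirrors the redundancy that the TODO comment in the reviewed file points out.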

src/cratedb_about/cli.py (5)

31-32: Fixed help text typo

Good correction from "builtin" to "built-in" in the help text.


34-40: Good implementation of the new format option

The addition of "llms-txt" to the format choices is well-implemented and follows the existing pattern.


41-46: Well-documented optional flag

The new --optional flag is properly implemented with clear help text explaining its purpose.


47-51: Function signature updated appropriately

The outline function signature has been correctly updated to include the new format option and optional parameter.


55-56: Updated docstring for consistency

The docstring correctly lists all available output formats, including the new llms-txt format.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
CHANGES.md (1)

7-9: Nitpick: Improve clarity and consistency
Consider adding “the” before “CLI options” and including parentheses to denote the method. You may also merge into a single line for readability.

- - Outline: Added `to_llms_txt` API method and CLI options `--format=llms-txt`
-   and `--optional`, to directly convert/expand the source outline file into
-   an `llms.txt` file.
+ - Outline: Added the `to_llms_txt()` API method and the CLI options `--format=llms-txt` and `--optional` to directly convert or expand the source outline file into an `llms.txt` file.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~7-~7: You might be missing the article “the” here.
Context: ... environment variable. - Outline: Added to_llms_txt API method and CLI options ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 038033c and 8883b5e.

📒 Files selected for processing (8)
  • CHANGES.md (1 hunks)
  • README.md (3 hunks)
  • src/cratedb_about/build/llmstxt.py (2 hunks)
  • src/cratedb_about/cli.py (2 hunks)
  • src/cratedb_about/outline/model.py (1 hunks)
  • tests/assets/outline.yaml (1 hunks)
  • tests/test_cli.py (1 hunks)
  • tests/test_outline.py (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • tests/assets/outline.yaml
🚧 Files skipped from review as they are similar to previous changes (6)
  • src/cratedb_about/outline/model.py
  • README.md
  • tests/test_outline.py
  • tests/test_cli.py
  • src/cratedb_about/build/llmstxt.py
  • src/cratedb_about/cli.py
🧰 Additional context used
🪛 LanguageTool
CHANGES.md

[uncategorized] ~7-~7: You might be missing the article “the” here.
Context: ... environment variable. - Outline: Added to_llms_txt API method and CLI options ...

(AI_EN_LECTOR_MISSING_DETERMINER_THE)

🔇 Additional comments (1)
CHANGES.md (1)

7-9: Changelog entry accurately documents new API and flags
The entry now correctly includes both the to_llms_txt API method and the --format=llms-txt and --optional CLI options, matching the PR objectives.


@amotl amotl requested review from kneth and surister May 11, 2025 19:10
@amotl amotl marked this pull request as ready for review May 11, 2025 19:10
def outline(url: str, format_: t.Literal["markdown", "yaml", "json"] = "markdown") -> None:
@click.option(
    "--optional",
    "-o",
Member

I wonder if -o could be read as an alias for --output. Maybe -O is a better alias for --optional?

Member Author

Good point, thanks. I've removed the alias completely per 939b770.

Base automatically changed from improve-readme to main May 15, 2025 11:13
amotl added 4 commits May 15, 2025 13:13
Use this output format to directly convert/expand the source outline
file into an `llms.txt` file. It is the same like invoking the
`llms_txt2ctx` program manually.