
Build llms.txt files from source cratedb-overview.md#3

Merged
amotl merged 5 commits into main from build
Apr 16, 2025

Conversation

@amotl
Member

@amotl amotl commented Apr 16, 2025

About

This patch follows the llms.txt proposal, mirroring 1:1 how the FastHTML project applies it to its documentation.

Details

The FastHTML project follows these two proposals for its documentation. For instance, here is the FastHTML docs llms.txt. And here is an example of a regular HTML docs page, along with the exact same URL but with a .md extension.

The FastHTML project opted to automatically expand the llms.txt to two markdown files with the contents of the linked URLs, using an XML-based structure suitable for use in LLMs such as Claude. The two files are: llms-ctx.txt, which does not include the optional URLs, and llms-ctx-full.txt, which does include them. They are created using the llms_txt2ctx command line application.
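The difference between the two context files can be illustrated with a small sketch. It is not the llms_txt2ctx implementation: it only parses a made-up llms.txt sample (the `LLMS_TXT` string, its URLs, and the `collect_links` helper are all hypothetical) and shows how links under the "Optional" section could be left out or included.

```python
import re

# Hypothetical llms.txt sample; the titles and URLs are placeholders.
LLMS_TXT = """\
# CrateDB

> Example summary line.

## Docs

- [Overview](https://example.org/overview.md): Core concepts.

## Optional

- [Extras](https://example.org/extras.md): Supplementary material.
"""

# Matches a markdown list item of the form "- [title](url)".
LINK = re.compile(r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)")


def collect_links(text, include_optional):
    """Return (section, title, url) tuples, skipping 'Optional' unless requested."""
    section = None
    links = []
    for line in text.splitlines():
        if line.startswith("## "):
            section = line[3:].strip()
            continue
        match = LINK.match(line)
        if match and section is not None:
            if section == "Optional" and not include_optional:
                continue
            links.append((section, match["title"], match["url"]))
    return links


# Roughly what would feed llms-ctx.txt versus llms-ctx-full.txt.
ctx_links = collect_links(LLMS_TXT, include_optional=False)
full_links = collect_links(LLMS_TXT, include_optional=True)
print(len(ctx_links), len(full_links))
```

The real tool additionally fetches each linked document and wraps the contents in an XML-style structure; that part is omitted here to keep the sketch offline and self-contained.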

@coderabbitai

coderabbitai bot commented Apr 16, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

The updates introduce new documentation and configuration files, expand usage instructions, and refine ignore patterns for development artifacts. A comprehensive overview document for CrateDB is added, serving as a centralized knowledge base with links to official resources, examples, and advanced topics. The project now includes a pyproject.toml for configuration, dependency management, and task automation. The .gitignore is updated to better match common Python project conventions. The README gains a usage section detailing build instructions, while the changelog records the initial project setup and build process. Minor improvements and link corrections are made in the CrateDB overview markdown.

Changes

| File(s) | Change Summary |
|---|---|
| .gitignore | Changed .idea ignore pattern from absolute to relative; added patterns to ignore .venv*, *.egg-info, and *.lock files. |
| CHANGES.md | Added a changelog entry documenting the initial project layout, source files, build artifacts, and generator program wrapper. |
| README.md | Added a "Usage" section with instructions for rebuilding llms.txt files using uv run poe build. |
| build/llm/llms.txt | Added a new documentation file providing an overview of CrateDB, including architecture, deployment options, resource links, API references, and example projects. |
| pyproject.toml | Introduced project configuration: build system, dependencies, tool settings (ruff, mypy, versioningit), and task automation with poethepoet. |
| src/index/cratedb-overview.md | Consolidated introductory text, updated documentation URLs, moved Java note to "Things to remember," and made minor formatting adjustments. |
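To get a feel for what the ignore patterns listed for .gitignore would cover, here is a rough approximation using Python's `fnmatch` module. The pattern list is an assumption reconstructed from the summary above (the actual file is not shown here), the candidate names are invented, and gitignore matching has more rules than plain glob matching, so treat this as an illustration only.

```python
from fnmatch import fnmatchcase

# Assumed patterns, taken from the change summary rather than the real file.
PATTERNS = [".idea", ".venv*", "*.egg-info", "*.lock"]


def is_ignored(name):
    """Approximate check: does a bare file or directory name match any pattern?"""
    return any(fnmatchcase(name, pattern) for pattern in PATTERNS)


# Hypothetical names that might appear in a Python project checkout.
candidates = [".idea", ".venv", ".venv311", "project.egg-info", "uv.lock", "README.md"]
ignored = [name for name in candidates if is_ignored(name)]
print(ignored)
```

Only bare names are checked because patterns without a slash apply at any directory level in gitignore; path-anchored patterns would need different handling.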

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant ProjectRepo
    participant BuildSystem
    participant Docs

    User->>ProjectRepo: Clone repository
    User->>BuildSystem: Run 'uv run poe build'
    BuildSystem->>Docs: Copy and generate documentation files (llms.txt, llms-ctx.txt)
    Docs-->>User: Provide CrateDB overview and resources
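The "copy" half of the build step in the diagram might look roughly like the sketch below. It assumes the build simply copies src/index/cratedb-overview.md to build/llm/llms.txt; the `build` function is hypothetical and is not the task that `uv run poe build` actually runs. The demo uses a temporary directory so it never touches a real checkout.

```python
import shutil
import tempfile
from pathlib import Path


def build(root: Path) -> Path:
    """Copy the overview source file to the llms.txt build artifact (assumed behavior)."""
    source = root / "src" / "index" / "cratedb-overview.md"
    target = root / "build" / "llm" / "llms.txt"
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, target)
    return target


# Exercise the sketch inside a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    source = root / "src" / "index" / "cratedb-overview.md"
    source.parent.mkdir(parents=True)
    source.write_text("# CrateDB\n", encoding="utf-8")
    target = build(root)
    copied = target.read_text(encoding="utf-8")
    relative_target = target.relative_to(root).as_posix()

print(relative_target)
```

Generating llms-ctx.txt from the copied file is a separate step, handled by llms_txt2ctx according to the PR description, and is left out here.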

Poem

In the warren, docs grow bright,
With tasks and configs set just right.
New guides for CrateDB appear,
While .gitignore keeps things clear.
Build commands hop into view,
And changelogs mark what’s new.
🐇 The project leaps ahead—cheer anew!


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6876fed and 8d33532.

📒 Files selected for processing (6)
  • .gitignore (1 hunks)
  • CHANGES.md (1 hunks)
  • README.md (1 hunks)
  • build/llm/llms.txt (1 hunks)
  • pyproject.toml (1 hunks)
  • src/index/cratedb-overview.md (4 hunks)


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
build/llm/llms.txt (1)

1-65: Comprehensive and well-structured documentation—approved.

The file accurately summarizes CrateDB's architecture, features, and resources, following the llms.txt proposal. The organization and clarity are excellent for both LLM and human readers.

Optional: Static analysis flagged repeated sentence beginnings in some sections (e.g., "Three successive sentences begin with the same word"). This is stylistic and common in list-based docs, so no action is required unless you want to further polish for human readers.

🧰 Tools
🪛 LanguageTool

[style] ~22-~22: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...All features of CrateDB at a glance. - [CrateDB reference documentation](https://crated...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~23-~23: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... reference documentation of CrateDB. - [CrateDB Cloud](https://cratedb.com/docs/cloud/e...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~33-~33: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... all SQL functions CrateDB provides. - [CrateDB drivers](https://cratedb.com/docs/crate...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~34-~34: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...using traditional database drivers. - [CrateDB cluster-wide settings](https://cratedb....

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~41-~41: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...r storage and analysis with CrateDB. - [CrateDB Offshore Wind Farms Demo Application](h...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~42-~42: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...a from the UK's offshore wind farms. - [CrateDB RAG / Hybrid Search PDF Chatbot](https:...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~43-~43: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... techniques and data from PDF files. - [CrateDB Geospatial Data Demo](https://github.co...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
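For readers curious why list-style docs trigger this hint so often, the idea behind the rule can be sketched in a few lines. This is a simplified stand-in, not LanguageTool's implementation; the `repeated_beginnings` function and the sample sentences are hypothetical.

```python
def repeated_beginnings(sentences, run=3):
    """Return indices of sentences that complete a run of `run` identical first words."""
    first_words = [s.split()[0].lower() if s.split() else "" for s in sentences]
    hits = []
    for i in range(run - 1, len(first_words)):
        window = first_words[i - run + 1 : i + 1]
        # Flag the position when every sentence in the window starts the same way.
        if window[0] and len(set(window)) == 1:
            hits.append(i)
    return hits


# Invented link descriptions in the style of an llms.txt list.
items = [
    "CrateDB features overview.",
    "CrateDB reference documentation.",
    "CrateDB Cloud documentation.",
    "Drivers and integrations.",
]
hits = repeated_beginnings(items)
print(hits)
```

Because every list entry in the overview starts with the product name, any three neighbouring entries form such a run, which matches the review's conclusion that the hint is stylistic.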

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ae84729 and 6876fed.

📒 Files selected for processing (6)
  • .gitignore (1 hunks)
  • CHANGES.md (1 hunks)
  • README.md (1 hunks)
  • build/llm/llms.txt (1 hunks)
  • pyproject.toml (1 hunks)
  • src/index/cratedb-overview.md (4 hunks)
🧰 Additional context used
🪛 LanguageTool
src/index/cratedb-overview.md

[style] ~22-~22: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...All features of CrateDB at a glance. - [CrateDB reference documentation](https://crated...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~33-~33: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... all SQL functions CrateDB provides. - [CrateDB drivers](https://cratedb.com/docs/crate...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~34-~34: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...using traditional database drivers. - [CrateDB cluster-wide settings](https://cratedb....

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


🔇 Additional comments (5)
.gitignore (1)

1-4: Standard Python ignore patterns added.

These ignore patterns are appropriate for a Python project and help keep development artifacts out of version control.

pyproject.toml (1)

1-164: Comprehensive and modern Python project configuration.

The pyproject.toml is well-structured, covers all necessary tools, and automates the build process as described in the PR objectives.

README.md (1)

20-29: Clear usage instructions added.

The new usage section provides straightforward steps for rebuilding documentation artifacts, improving developer experience.

CHANGES.md (1)

1-6: Changelog accurately documents project setup.

The changelog provides a clear record of the foundational changes introduced in this PR.

src/index/cratedb-overview.md (1)

3-65: Documentation improved for clarity and accuracy.

The changes consolidate information, update links, and enhance the overall quality of the overview. The repeated sentence beginnings flagged by static analysis are minor and do not require immediate action.


@amotl amotl marked this pull request as ready for review April 16, 2025 19:11
Base automatically changed from real to main April 16, 2025 23:30