Skip to content

CLI: Zap working tree building by establishing cratedb-about build#16

Merged
amotl merged 1 commit intomainfrom
build-builtin
May 9, 2025
Merged

CLI: Zap working tree building by establishing cratedb-about build#16
amotl merged 1 commit intomainfrom
build-builtin

Conversation

@amotl
Copy link
Member

@amotl amotl commented May 9, 2025

About

Build the llms.txt self-contained so it doesn't need the repository.

@coderabbitai
Copy link

coderabbitai bot commented May 9, 2025

Walkthrough

This update introduces a new build system for generating llms.txt files, including a Python builder class, CLI command, and related documentation and test changes. Dependency management is revised with new optional groups, and legacy build steps and tasks are removed from configuration files and workflow scripts. Documentation is updated to reflect these changes.

Changes

Files/Paths Change Summary
.github/workflows/tests.yml Modified pip install command to include [all] extras; removed the "Run build" (poe build) step.
pyproject.toml Updated dependencies, added optional groups (all, llm), removed old build task, updated package data.
CHANGES.md, README.md, docs/sandbox.md Updated changelog, installation, and usage instructions; documented new build process and removed references to old build commands.
src/cratedb_about/build/llmstxt.py Added new module with LllmsTxtBuilder dataclass for orchestrating the build of llms.txt and related files.
src/cratedb_about/cli.py Added build CLI command, improved type annotations, reorganized imports, and integrated new builder logic.
tests/test_cli.py Added test for the new build CLI command, checking for successful execution and expected log output.
tests/conftest.py Added pytest fixture to remove OUTDIR environment variable before tests run.
.gitignore Broadened coverage ignore pattern from .coverage to .coverage*.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI (cratedb-about)
    participant LllmsTxtBuilder
    participant FileSystem
    participant Shell

    User->>CLI (cratedb-about): cratedb-about build --outdir <path>
    CLI (cratedb-about)->>LllmsTxtBuilder: run(outdir)
    LllmsTxtBuilder->>FileSystem: Ensure output directory exists
    LllmsTxtBuilder->>FileSystem: Copy resource files (md, yaml)
    LllmsTxtBuilder->>Shell: Run "cratedb-about outline" to generate outline.md
    LllmsTxtBuilder->>Shell: Run "llms_txt2ctx" to generate llms.txt and llms-full.txt
    LllmsTxtBuilder->>CLI (cratedb-about): Log build progress
    CLI (cratedb-about)->>User: Print "Ready."
Loading

Possibly related PRs

Poem

🐇
A builder hops in, so spry and so neat,
With llms.txt magic, the build is complete!
New commands in the CLI, dependencies in tow,
Out with the old—let the new workflow flow!
In the warren of code, we’re ready to go!


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 46b2951 and 738f525.

📒 Files selected for processing (10)
  • .github/workflows/tests.yml (1 hunks)
  • .gitignore (1 hunks)
  • CHANGES.md (1 hunks)
  • README.md (2 hunks)
  • docs/sandbox.md (1 hunks)
  • pyproject.toml (2 hunks)
  • src/cratedb_about/build/llmstxt.py (1 hunks)
  • src/cratedb_about/cli.py (3 hunks)
  • tests/conftest.py (1 hunks)
  • tests/test_cli.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • .gitignore
🚧 Files skipped from review as they are similar to previous changes (8)
  • docs/sandbox.md
  • CHANGES.md
  • README.md
  • .github/workflows/tests.yml
  • pyproject.toml
  • tests/test_cli.py
  • src/cratedb_about/cli.py
  • src/cratedb_about/build/llmstxt.py
🔇 Additional comments (1)
tests/conftest.py (1)

1-10: Clean implementation of test environment isolation.

This pytest fixture correctly prevents the "OUTDIR" environment variable from affecting tests by removing it at the session level. The implementation follows best practices with:

  • Appropriate session scope to efficiently handle the cleanup once per test session
  • Autouse flag to ensure consistent application across all tests without explicit calls
  • Clear docstring that explains the purpose
  • Safe use of pop with a default value to prevent KeyError exceptions

This supports the new build system functionality while maintaining test isolation.

✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (1)
.github/workflows/tests.yml (1)

39-40: 💡 Verification agent

🧩 Analysis chain

Verify if 'sponge' is still needed in the workflow.

Since you've moved from the old build process to a new CLI command, is the sponge utility still required? If it was only used in the removed poe build task, consider removing this installation step.


🏁 Script executed:

#!/bin/bash
# Check if sponge is still used in the codebase

# Search for sponge usage in Python files
echo "Looking for sponge usage in Python files:"
rg "sponge" --type py

# Search for sponge usage in shell scripts or other files
echo "Looking for sponge usage in other files:"
rg "sponge" --type-not py

Length of output: 231


Remove unused sponge installation

No references to sponge were found anywhere in the codebase, so you can safely delete the installation step in your CI workflow.

• File: .github/workflows/tests.yml
Lines 39–40 – remove the following:

-      - name: Install `sponge`
-        run: sudo apt-get --yes install moreutils
🧹 Nitpick comments (4)
CHANGES.md (1)

9-10: Changelog entry needs a grammatical correction.

There's a minor grammar issue in the changelog entry.

- CLI: Zapped working tree building by establishing new `cratedb-about build`
+ CLI: Zapped working tree building by establishing a new `cratedb-about build`
🧰 Tools
🪛 LanguageTool

[uncategorized] ~9-~9: You might be missing the article “a” here.
Context: ...d working tree building by establishing new cratedb-about build subcommand ## ...

(AI_EN_LECTOR_MISSING_DETERMINER_A)

tests/test_cli.py (2)

62-76: New test added for the 'build' command.

The test covers basic functionality by verifying exit code and expected log messages. Consider enhancing it to verify the actual output files are created correctly.

def test_cli_build(caplog, tmp_path):
    runner = CliRunner()

    result = runner.invoke(
        cli,
        args=["build"],
        env={"OUTDIR": str(tmp_path)},
        catch_exceptions=False,
    )
    assert result.exit_code == 0, result.output

    assert "Building llms-txt" in caplog.text
    assert "Dumping outline source file" in caplog.text
    assert "Generating llms-txt files" in caplog.text
    assert "Ready." in caplog.text
+    
+    # Verify that the expected output files are created
+    assert (tmp_path / "llms.txt").exists()
+    assert (tmp_path / "llms-full.txt").exists()

62-76: Consider adding tests for error scenarios.

The current test only covers the happy path. Consider adding tests for error cases, such as when OUTDIR is not set or points to a non-existent directory.

def test_cli_build_without_outdir():
    runner = CliRunner()

    # Test without OUTDIR environment variable
    result = runner.invoke(
        cli,
        args=["build"],
        env={},  # No OUTDIR set
        catch_exceptions=False,
    )
    # Verify appropriate error handling
    assert result.exit_code != 0
    assert "Error" in result.output or "OUTDIR" in result.output
src/cratedb_about/build/llmstxt.py (1)

20-43: Add error handling for build operations.

The current implementation doesn't handle potential errors during file operations or command execution. Adding try-except blocks would make the build process more robust.

def run(self):
    logger.info(f"Building llms-txt. Output directory: {self.outdir}")
-   self.outdir.mkdir(parents=True, exist_ok=True)
+   try:
+       self.outdir.mkdir(parents=True, exist_ok=True)
+   except Exception as e:
+       logger.error(f"Failed to create output directory: {e}")
+       raise

    logger.info("Copying source and documentation files")
-   shutil.copy(
-       str(resources.files("cratedb_about.build") / "llmstxt-about.md"),
-       self.outdir / "readme.md",
-   )
-   shutil.copy(
-       str(resources.files("cratedb_about.outline") / "cratedb-outline.yaml"),
-       self.outdir / "outline.yaml",
-   )
+   try:
+       shutil.copy(
+           str(resources.files("cratedb_about.build") / "llmstxt-about.md"),
+           self.outdir / "readme.md",
+       )
+       shutil.copy(
+           str(resources.files("cratedb_about.outline") / "cratedb-outline.yaml"),
+           self.outdir / "outline.yaml",
+       )
+   except Exception as e:
+       logger.error(f"Failed to copy resource files: {e}")
+       raise

    # ... similar error handling for command execution
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d0cac21 and 46b2951.

📒 Files selected for processing (8)
  • .github/workflows/tests.yml (1 hunks)
  • CHANGES.md (1 hunks)
  • README.md (2 hunks)
  • docs/sandbox.md (1 hunks)
  • pyproject.toml (2 hunks)
  • src/cratedb_about/build/llmstxt.py (1 hunks)
  • src/cratedb_about/cli.py (3 hunks)
  • tests/test_cli.py (1 hunks)
🧰 Additional context used
🪛 LanguageTool
CHANGES.md

[uncategorized] ~9-~9: You might be missing the article “a” here.
Context: ...d working tree building by establishing new cratedb-about build subcommand ## ...

(AI_EN_LECTOR_MISSING_DETERMINER_A)

🔇 Additional comments (14)
docs/sandbox.md (1)

9-13: Documentation updated to reflect new build process correctly.

The instructions now clearly show the two-step process required to rebuild llms.txt files using the new CLI command, which aligns with the PR objective of making the build process self-contained.

.github/workflows/tests.yml (1)

58-58: Successfully updated dependencies to include 'all' group.

Adding the 'all' optional dependency group ensures that all required dependencies are installed for testing the new build command.

pyproject.toml (3)

74-77: Good dependency management update!

Adding colorlog for enhanced logging capabilities and specifying pueblo version will help maintain consistent behavior.


79-81: Well-structured dependency organization!

Moving LLM-related dependencies to optional groups makes the package more modular, allowing users to install only what they need. This aligns perfectly with the PR objective of making the llms.txt generation more self-contained.

Also applies to: 89-92


108-111: Good addition to package data!

Including *.md files alongside *.yaml ensures that all necessary resources are properly packaged, which supports the new self-contained build system.

README.md (2)

26-26: Good update to installation instructions!

The updated installation command correctly includes the [all] extra, which aligns with the dependency changes in pyproject.toml.


60-68: Clear documentation for new build functionality!

The new "llms-txt" subsection with build instructions clearly communicates how to use the new CLI command, making the documentation consistent with the code changes.

src/cratedb_about/build/llmstxt.py (2)

1-19: Good implementation of builder class structure!

The use of dataclasses and proper logging setup is good practice. The noqa comment for S605 is noted, as it's related to the shell commands used later.


20-33: Well-implemented resource management!

The creation of the output directory and copying of resource files is handled properly using appropriate methods from standard libraries.

src/cratedb_about/cli.py (5)

1-8: Good import organization!

The added imports for logging, Path, boot_click, and LllmsTxtBuilder are properly organized and necessary for the new functionality.


13-14: Good logger setup!

Adding a logger instance follows best practices for Python applications.


19-20: Good CLI initialization!

Replacing the no-op pass with boot_click provides better CLI initialization.


44-53: Well-implemented build command!

The new build command is well-structured with:

  • A clear docstring
  • Proper environment variable support via envvar="OUTDIR"
  • Required output directory parameter
  • Simple integration with the LllmsTxtBuilder class
  • Appropriate logging

This implementation achieves the PR's objective of making the llms.txt building process a self-contained CLI command.


27-27: Good type annotation additions!

Adding return type annotations to all command functions improves code quality and maintainability.

Also applies to: 58-58, 82-82

@amotl amotl force-pushed the build-builtin branch from 46b2951 to 0a7d49d Compare May 9, 2025 22:44
@amotl amotl force-pushed the build-builtin branch from 0a7d49d to 738f525 Compare May 9, 2025 22:46
@amotl amotl marked this pull request as ready for review May 9, 2025 22:56
@amotl amotl merged commit ff01ba8 into main May 9, 2025
3 checks passed
@amotl amotl deleted the build-builtin branch May 9, 2025 22:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant