Skip to content

Query: Permit loading context file from local filesystem#29

Merged
amotl merged 6 commits intomainfrom
load-context-fs
May 15, 2025
Merged

Query: Permit loading context file from local filesystem#29
amotl merged 6 commits intomainfrom
load-context-fs

Conversation

@amotl
Copy link
Member

@amotl amotl commented May 15, 2025

About

ABOUT_CONTEXT_URL was limited to HTTP URLs before. Now, it also accepts a filesystem path.

@coderabbitai
Copy link

coderabbitai bot commented May 15, 2025

Warning

Rate limit exceeded

@amotl has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 9 minutes and 25 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 1dc9199 and 8693830.

📒 Files selected for processing (8)
  • CHANGES.md (1 hunks)
  • README.md (1 hunks)
  • codecov.yml (1 hunks)
  • docs/backlog.md (1 hunks)
  • pyproject.toml (1 hunks)
  • src/cratedb_about/query/model.py (3 hunks)
  • tests/conftest.py (1 hunks)
  • tests/test_query.py (1 hunks)

Walkthrough

The updates introduce support for loading a context file from the local filesystem via the ABOUT_CONTEXT_URL configuration, update documentation to reflect this, add new example questions, and refine environment variable handling in tests. Additionally, a new backlog item is added for comparing context file sizes across vendors. A new Codecov configuration file is also added to set coverage thresholds and reporting behavior.

Changes

File(s) Change Summary
CHANGES.md, README.md Updated changelog and documentation to clarify that ABOUT_CONTEXT_URL can reference a local file or remote URL.
docs/backlog.md Added a backlog item to compare CrateDB's llms.txt file size with other vendors and adjust as needed.
src/cratedb_about/query/model.py Added new example questions; introduced llms_txt_payload property to load context from local file or HTTP; updated related logic/docs.
tests/conftest.py Enhanced prune_environ fixture to remove additional environment variables before tests.
tests/test_query.py Added tests for llms_txt_payload property covering local file, HTTP fetch, and invalid source scenarios; tested fallback in get_prompt
codecov.yml Added Codecov configuration file setting default coverage target and patch coverage reporting as informational.
pyproject.toml Added requests-mock<2 to optional test dependencies.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Settings
    participant LocalFS as Local Filesystem
    participant HTTP

    User->>Settings: get_prompt()
    Settings->>Settings: llms_txt_payload
    alt ABOUT_CONTEXT_URL points to local file
        Settings->>LocalFS: Read file content
        LocalFS-->>Settings: Return content
    else ABOUT_CONTEXT_URL is HTTP URL
        Settings->>HTTP: Fetch content via GET
        HTTP-->>Settings: Return content
    else Neither
        Settings->>Settings: Raise NotImplementedError
    end
    Settings-->>User: Return prompt with context
Loading

Possibly related PRs

Suggested reviewers

  • surister
  • bmunkholm

Poem

In burrows deep, a change takes root,
With context loaded, local or remote to boot!
Docs now clear, and tests are neat,
New questions hop in, making answers sweet.
The backlog grows, but so does cheer—
A rabbit’s code, improved this year!
🐇✨

✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (4)
docs/backlog.md (1)

9-11: Appropriate addition to backlog.

This backlog item for comparing the sizes of CrateDB's llms.txt against other vendors aligns well with the feature of loading context files from the local filesystem.

Consider using proper Markdown link syntax for the GitHub issue URL:

-  https://github.com/crate/about/issues/20
+  [#20](https://github.com/crate/about/issues/20)

This would address the markdownlint MD034 warning about bare URLs.

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

11-11: Bare URL used
null

(MD034, no-bare-urls)

README.md (1)

181-183: Documentation update properly reflects the new capability.

The documentation has been updated to clearly indicate that ABOUT_CONTEXT_URL can now accept both remote URLs and local filesystem paths, which aligns with the PR objective of enhancing flexibility in specifying context files.

Consider wrapping the default URL in backticks for consistency with the environment variable formatting:

-The default value is https://cdn.crate.io/about/v1/llms-full.txt.
+The default value is `https://cdn.crate.io/about/v1/llms-full.txt`.

This would also address the markdownlint MD034 warning about bare URLs.

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

183-183: Bare URL used
null

(MD034, no-bare-urls)

src/cratedb_about/query/model.py (2)

75-76: Consider path normalization for cross-platform compatibility.

The current implementation might have issues with path separators across different operating systems. Consider normalizing the path or using platform-specific path handling.

        url = cls.llms_txt_url
        path = Path(url)
+       # Normalize path for cross-platform compatibility
+       path = path.expanduser().resolve()
        if path.exists():
            return path.read_text()

78-78: Consider adding HTTP response validation.

When fetching content via HTTP, it's good practice to check the response status code to ensure the request was successful.

        if url.startswith("http"):
-           return requests.get(url, timeout=10).text
+           response = requests.get(url, timeout=10)
+           response.raise_for_status()  # Raises HTTPError for bad responses
+           return response.text
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e482cac and 8d7622f.

📒 Files selected for processing (5)
  • CHANGES.md (1 hunks)
  • README.md (1 hunks)
  • docs/backlog.md (1 hunks)
  • src/cratedb_about/query/model.py (3 hunks)
  • tests/conftest.py (1 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.17.2)
README.md

183-183: Bare URL used
null

(MD034, no-bare-urls)

docs/backlog.md

11-11: Bare URL used
null

(MD034, no-bare-urls)

🔇 Additional comments (6)
CHANGES.md (1)

17-18: Appropriate changelog entry.

The changelog entry clearly communicates the new feature allowing loading context files from the local filesystem via the ABOUT_CONTEXT_URL configuration, matching the PR objectives.

tests/conftest.py (1)

9-16: Good test environment cleanup.

The prune_environ fixture has been appropriately updated to remove the ABOUT_CONTEXT_URL environment variable during tests, which prevents test results from being affected by external configuration.

The change to use a list of environment variables with a loop makes the code more maintainable for future additions.

src/cratedb_about/query/model.py (4)

3-3: Good addition of the Path import.

The import of Path from pathlib is necessary for the new file system functionality and is a good choice over using string manipulation for path handling.


37-40: Great additions to example questions.

The new examples related to integrations with SQLAlchemy, pandas, and MongoDB provide good coverage of common integration scenarios users might ask about.


60-62: Good documentation practice.

Adding a docstring to the existing property enhances code readability and maintainability.


85-85: Good refactoring of context loading.

Using the new llms_txt_payload property to abstract the loading mechanism is a clean approach that separates concerns and makes the code more maintainable.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
src/cratedb_about/query/model.py (1)

65-79: 🛠️ Refactor suggestion

Consider adding path validation for security.

Loading files from the filesystem without path validation could pose a security risk. Consider validating the path to prevent access to sensitive files.

    @classproperty
    def llms_txt_payload(cls) -> str:
        """
        Retrieve payload of context file.

        TODO: Add third option `pueblo.to_io`, to load resources from anywhere.
              See `cratedb_about.outline.core`.
        """
        url = cls.llms_txt_url
        path = Path(url)
+       # Validate path is within allowed directories to prevent path traversal attacks
+       allowed_dirs = [Path.cwd(), Path.home() / "cratedb_context"]
+       if path.exists() and path.is_file() and any(
+           path.resolve().is_relative_to(allowed_dir) for allowed_dir in allowed_dirs
+       ):
-       if path.exists():
            return path.read_text()
        if url.startswith("http"):
            return requests.get(url, timeout=10).text
-       raise NotImplementedError(f"Unable to load context file. Source: {url}")
+       raise NotImplementedError(f"Unable to load context file or path not allowed. Source: {url}")
🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 76-76: src/cratedb_about/query/model.py#L76
Added line #L76 was not covered by tests

🧹 Nitpick comments (1)
README.md (1)

183-183: Fix bare URL according to markdown linting rules.

The URL on line 183 is flagged by markdownlint as a bare URL (MD034). Consider wrapping it in angle brackets.

-The default value is https://cdn.crate.io/about/v1/llms-full.txt.
+The default value is <https://cdn.crate.io/about/v1/llms-full.txt>.
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

183-183: Bare URL used
null

(MD034, no-bare-urls)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8d7622f and ebd8ee5.

📒 Files selected for processing (6)
  • CHANGES.md (1 hunks)
  • README.md (1 hunks)
  • codecov.yml (1 hunks)
  • docs/backlog.md (1 hunks)
  • src/cratedb_about/query/model.py (3 hunks)
  • tests/conftest.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • codecov.yml
🚧 Files skipped from review as they are similar to previous changes (3)
  • CHANGES.md
  • docs/backlog.md
  • tests/conftest.py
🧰 Additional context used
🪛 markdownlint-cli2 (0.17.2)
README.md

183-183: Bare URL used
null

(MD034, no-bare-urls)

🪛 GitHub Check: codecov/patch
src/cratedb_about/query/model.py

[warning] 76-76: src/cratedb_about/query/model.py#L76
Added line #L76 was not covered by tests


[warning] 89-89: src/cratedb_about/query/model.py#L89
Added line #L89 was not covered by tests

🔇 Additional comments (5)
README.md (1)

182-183: Documentation accurately reflects the new flexibility.

The documentation now clearly states that ABOUT_CONTEXT_URL can point to either a remote URL or a local filesystem path, which aligns with the implementation changes in the code.

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

183-183: Bare URL used
null

(MD034, no-bare-urls)

src/cratedb_about/query/model.py (4)

3-3: Appropriate import for new functionality.

The Path import from pathlib is necessary for handling local filesystem paths.


37-40: Useful example questions about integrations.

These new example questions about integrations and specific libraries (SQLAlchemy, pandas, MongoDB) are valuable additions that cover important use cases for users.


60-62: Clear docstring addition.

Adding a docstring to the llms_txt_url method improves code documentation.


85-85: Good implementation of the new functionality.

Using the new llms_txt_payload property simplifies the code and properly separates concerns between fetching the content and using it.

url = cls.llms_txt_url
path = Path(url)
if path.exists():
return path.read_text()
Copy link

@coderabbitai coderabbitai bot May 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Add test coverage for file reading.

This line is flagged as not covered by tests. Consider adding tests that verify the functionality of loading context from a local file.


🏁 Script executed:

#!/bin/bash
# Check for tests that cover the local file loading functionality

echo "Searching for test files that might test local file loading:"
fd "test.*model.*\.py" --exec grep -l "ABOUT_CONTEXT_URL" {} \;

echo -e "\nSearching for specific test cases for file loading:"
fd "test.*model.*\.py" --exec grep -l "Path.*exists" {} \;

Length of output: 380


Add test coverage for local file reading in src/cratedb_about/query/model.py

There are currently no tests exercising the return path.read_text() call on line 76. Please add a test case that:

  • Uses pytest’s tmp_path (or similar) to create a temporary file with known contents
  • Points the code’s file-loading logic at that file (e.g. via ABOUT_CONTEXT_URL)
  • Asserts that path.read_text() returns the expected string

Example outline in tests/test_model_file_loading.py:

import pytest
from pathlib import Path
from src.cratedb_about.query.model import load_local_context  # or the actual function name

def test_load_local_context(tmp_path, monkeypatch):
    sample_file = tmp_path / "sample.txt"
    sample_content = "hello world"
    sample_file.write_text(sample_content)

    # Monkey-patch the URL or path resolver to return our temp file
    monkeypatch.setenv("ABOUT_CONTEXT_URL", str(sample_file))

    result = load_local_context()
    assert result == sample_content
🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 76-76: src/cratedb_about/query/model.py#L76
Added line #L76 was not covered by tests

🤖 Prompt for AI Agents
In src/cratedb_about/query/model.py at line 76, the file reading functionality
using path.read_text() lacks test coverage. Add a pytest test in a new or
existing test file (e.g., tests/test_model_file_loading.py) that uses the
tmp_path fixture to create a temporary file with known content, monkeypatch the
ABOUT_CONTEXT_URL environment variable to point to this file, call the function
that reads the file, and assert that the returned content matches the known
content. This will ensure the file reading logic is properly tested.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai: A software test for this new feature would be nice. However, load_local_context() does not exist.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amotl You're absolutely right, my apologies for the confusion. There's no load_local_context() function - the file reading functionality is implemented in the llms_txt_payload classproperty of the Settings class.

Here's a corrected test suggestion:

import pytest
from pathlib import Path
from src.cratedb_about.query.model import Settings

def test_llms_txt_payload_from_file(tmp_path, monkeypatch):
    # Create a temporary file with known content
    test_file = tmp_path / "test_context.txt"
    test_content = "Test CrateDB context content"
    test_file.write_text(test_content)
    
    # Point ABOUT_CONTEXT_URL to the temporary file
    monkeypatch.setenv("ABOUT_CONTEXT_URL", str(test_file))
    
    # Verify the classproperty correctly reads the file
    assert Settings.llms_txt_payload == test_content

For complete coverage, you might also want to add tests for the HTTP path and error case:

def test_llms_txt_payload_from_http(monkeypatch, requests_mock):
    # Mock HTTP URL and response
    test_url = "http://example.com/context.txt"
    test_content = "Test CrateDB HTTP content"
    requests_mock.get(test_url, text=test_content)
    
    # Set the environment variable to the HTTP URL
    monkeypatch.setenv("ABOUT_CONTEXT_URL", test_url)
    
    # Verify HTTP request is made and content is returned
    assert Settings.llms_txt_payload == test_content

def test_llms_txt_payload_invalid_source(monkeypatch):
    # Set the environment variable to an invalid path
    monkeypatch.setenv("ABOUT_CONTEXT_URL", "/non/existent/path/that/is/not/http")
    
    # Verify appropriate error is raised
    with pytest.raises(NotImplementedError):
        _ = Settings.llms_txt_payload

✏️ Learnings added
Learnt from: amotl
PR: crate/about#29
File: src/cratedb_about/query/model.py:76-76
Timestamp: 2025-05-15T11:27:22.782Z
Learning: In the cratedb_about package, the `Settings.llms_txt_payload` classproperty in `src/cratedb_about/query/model.py` handles loading context files from either a local filesystem path or an HTTP URL based on the `ABOUT_CONTEXT_URL` environment variable.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai: Thanks. Software tests have been added per 102a8f0.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Get details of the commit that added the tests
git show 102a8f0a6f2 --name-only
echo "---"
git show 102a8f0a6f2

Length of output: 2663


@amotl Thank you for implementing the tests! I've reviewed the commit (102a8f0) and can see that you've added comprehensive tests for all three scenarios:

  1. Loading context from a local file
  2. Loading context from an HTTP URL
  3. Handling invalid sources with appropriate error messages

The implementation follows best practices with proper use of pytest fixtures (tmp_path, monkeypatch, and requests-mock). This ensures the llms_txt_payload classproperty is thoroughly tested for all code paths.

Great work on improving the test coverage!

(ʘᴥʘ)ノ♥


🧠 Learnings used
Learnt from: amotl
PR: crate/about#29
File: src/cratedb_about/query/model.py:76-76
Timestamp: 2025-05-15T11:27:22.782Z
Learning: In the cratedb_about package, the `Settings.llms_txt_payload` classproperty in `src/cratedb_about/query/model.py` handles loading context files from either a local filesystem path or an HTTP URL based on the `ABOUT_CONTEXT_URL` environment variable.

If you found this review helpful, would you consider giving us a shout-out on X?

Thank you for using CodeRabbit!

Comment on lines +75 to +78
if path.exists():
return path.read_text()
Copy link

@coderabbitai coderabbitai bot May 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add error handling for file operations.

The current implementation doesn't handle file-specific exceptions like PermissionError or issues reading the file.

        path = Path(url)
        if path.exists():
-           return path.read_text()
+           try:
+               return path.read_text()
+           except (PermissionError, UnicodeDecodeError) as e:
+               raise OSError(f"Error reading file {path}: {e}")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if path.exists():
return path.read_text()
url = cls.llms_txt_url
path = Path(url)
if path.exists():
try:
return path.read_text()
except (PermissionError, UnicodeDecodeError) as e:
raise OSError(f"Error reading file {path}: {e}")
if url.startswith("http"):
return requests.get(url, timeout=10).text
raise NotImplementedError(f"Unable to load context file. Source: {url}")
🧰 Tools
🪛 GitHub Check: codecov/patch

[warning] 76-76: src/cratedb_about/query/model.py#L76
Added line #L76 was not covered by tests

🤖 Prompt for AI Agents
In src/cratedb_about/query/model.py around lines 75 to 76, the file reading
operation lacks error handling for exceptions such as PermissionError or other
IO issues. Wrap the file existence check and read_text call in a try-except
block to catch and handle these exceptions gracefully, possibly logging the
error or returning a default value to prevent the program from crashing.

Copy link
Member Author

@amotl amotl May 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai: We will adjust this spot so it will handle all kinds of exceptions, without specialising.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amotl Understood! Using a generic exception handler is a good approach too - it will still make the code more robust while keeping it simpler. Something like a general try/except block would catch all potential file operation issues without needing to enumerate specific error types.


✏️ Learnings added
Learnt from: amotl
PR: crate/about#29
File: src/cratedb_about/query/model.py:75-76
Timestamp: 2025-05-15T11:27:34.514Z
Learning: For the `cratedb_about` repository, the maintainer prefers generic exception handling rather than catching specific exception types.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai: Thanks. b41f6c1 includes relevant improvements.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Fetch the commit details
git show b41f6c1117

Length of output: 1969


@amotl Thank you for implementing those improvements in b41f6c1! The changes look excellent - I particularly like the path normalization for cross-platform compatibility and the improved HTTP response handling with raise_for_status(). The switch to generic exception handling also aligns well with your preferred approach. These enhancements will make the code more robust when loading context files from either local or remote sources.


🧠 Learnings used
Learnt from: amotl
PR: crate/about#29
File: src/cratedb_about/query/model.py:76-76
Timestamp: 2025-05-15T11:27:22.782Z
Learning: In the cratedb_about package, the `Settings.llms_txt_payload` classproperty in `src/cratedb_about/query/model.py` handles loading context files from either a local filesystem path or an HTTP URL based on the `ABOUT_CONTEXT_URL` environment variable.

If you found this review helpful, would you consider giving us a shout-out on X?

Thank you for using CodeRabbit!

@amotl amotl force-pushed the expand-outline branch from 07a7597 to bdbf894 Compare May 15, 2025 11:16
Base automatically changed from expand-outline to main May 15, 2025 11:20
@amotl amotl force-pushed the load-context-fs branch from ebd8ee5 to 29a7145 Compare May 15, 2025 11:21
@amotl amotl marked this pull request as ready for review May 15, 2025 13:05
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
src/cratedb_about/query/model.py (2)

70-71: Consider tracking the TODO separately.

The TODO comment about adding a third option with pueblo.to_io might be better tracked in an issue system rather than in code comments, especially if it represents planned future work.


76-78: Add a simple try-except for file reading.

While your preference for generic exception handling is noted (and respected in the get_prompt method), consider adding a try-except block specifically around the path.read_text() call to catch file reading errors at their source.

if path.exists():
-    return path.read_text()
+    try:
+        return path.read_text()
+    except Exception as e:
+        raise Exception(f"Error reading file {path}: {e}")
tests/test_query.py (1)

104-106: Remove redundant comment.

There are two comments explaining the same thing - one at line 105 is redundant with line 104. Consider removing one of them for clarity.

    # Set the environment variable to the HTTP URL.
-   # Point `ABOUT_CONTEXT_URL` to the temporary URL.
    monkeypatch.setenv("ABOUT_CONTEXT_URL", test_url)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ebd8ee5 and b41f6c1.

📒 Files selected for processing (8)
  • CHANGES.md (1 hunks)
  • README.md (1 hunks)
  • codecov.yml (1 hunks)
  • docs/backlog.md (1 hunks)
  • pyproject.toml (1 hunks)
  • src/cratedb_about/query/model.py (3 hunks)
  • tests/conftest.py (1 hunks)
  • tests/test_query.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • pyproject.toml
🚧 Files skipped from review as they are similar to previous changes (5)
  • CHANGES.md
  • docs/backlog.md
  • tests/conftest.py
  • README.md
  • codecov.yml
🧰 Additional context used
🧠 Learnings (1)
src/cratedb_about/query/model.py (1)
Learnt from: amotl
PR: crate/about#29
File: src/cratedb_about/query/model.py:75-76
Timestamp: 2025-05-15T11:27:34.514Z
Learning: For the `cratedb_about` repository, the maintainer prefers generic exception handling rather than catching specific exception types.
🪛 GitHub Check: codecov/patch
src/cratedb_about/query/model.py

[warning] 94-94: src/cratedb_about/query/model.py#L94
Added line #L94 was not covered by tests

🔇 Additional comments (6)
src/cratedb_about/query/model.py (3)

37-40: Good addition of new example questions!

These new integration-related examples improve the user experience by expanding the range of available questions, particularly around SQLAlchemy, pandas, and MongoDB integration topics.


60-62: Documentation improvement noted.

Nice addition of docstring to the existing method, improving code readability.


65-84: Local file loading implementation looks good.

The implementation correctly supports both local file paths and HTTP URLs. I notice you're normalizing paths for cross-platform compatibility with expanduser().resolve(), which is a good practice.

The HTTP implementation appropriately includes raise_for_status() to handle HTTP errors properly.

tests/test_query.py (3)

85-95: Great test for file loading functionality!

This test effectively validates the new functionality to load context from a local file. The use of tmp_path fixture is appropriate for creating temporary test files.


98-110: Well-structured HTTP test.

Good use of requests_mock to test the HTTP loading functionality without making actual external requests. The test clearly validates that the HTTP content is correctly fetched and returned.


112-119: Good error case testing.

This test properly verifies that attempting to load from an invalid source raises the expected NotImplementedError. It's important to test error paths as well as successful paths.

@amotl amotl force-pushed the load-context-fs branch from 1dc9199 to 8693830 Compare May 15, 2025 19:33
@amotl amotl merged commit 2c25aba into main May 15, 2025
6 checks passed
@amotl amotl deleted the load-context-fs branch May 15, 2025 19:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant