Skip to content

Query: Introduce caching for context payloads on HTTP remote URLs#31

Merged
amotl merged 2 commits intomainfrom
query-cached-context
May 15, 2025
Merged

Query: Introduce caching for context payloads on HTTP remote URLs#31
amotl merged 2 commits intomainfrom
query-cached-context

Conversation

@amotl
Copy link
Member

@amotl amotl commented May 15, 2025

About

Within the "Query" subsystem, cache remote resources for one hour (3600 seconds) to strike the balance between freshness and resource saving. Use the ABOUT_CACHE_TTL environment variable to reconfigure that value in seconds.

@coderabbitai
Copy link

coderabbitai bot commented May 15, 2025

Warning

Rate limit exceeded

@amotl has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 7 minutes and 3 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between f937b4c and 919362e.

📒 Files selected for processing (8)
  • .gitignore (1 hunks)
  • CHANGES.md (1 hunks)
  • README.md (2 hunks)
  • docs/backlog.md (1 hunks)
  • pyproject.toml (1 hunks)
  • src/cratedb_about/query/core.py (5 hunks)
  • src/cratedb_about/query/model.py (2 hunks)
  • tests/test_query.py (2 hunks)

"""

Walkthrough

The changes refactor the context loading mechanism for CrateDB's query system by replacing the static Settings class with a configurable, instance-based KnowledgeContextLoader and its subclass CrateDbKnowledgeContextLoader. HTTP fetching now uses the hishel library for caching. Documentation, changelog, and tests are updated to reflect these changes.

Changes

File(s) Change Summary
.gitignore Added *.sqlite pattern to ignore SQLite database files.
CHANGES.md Added changelog entry for context payload caching from HTTP remote URLs.
README.md Reorganized CLI and Python API usage instructions; clarified and grouped configuration notes for context URL and cache TTL under a new "Notes" subsection. Removed redundant command snippet.
docs/backlog.md Added backlog note about heavy test_cli_bundle test and possible timeouts.
pyproject.toml Replaced dependency jaraco-classes<4 with hishel<0.2.
src/cratedb_about/query/core.py Replaced Settings usage with KnowledgeContextLoader/CrateDbKnowledgeContextLoader instance; added context field to CrateDbKnowledgeConversation; updated prompt and instructions access; removed conditional prompt appending in OpenAI backend.
src/cratedb_about/query/model.py Refactored Settings class to KnowledgeContextLoader and added CrateDbKnowledgeContextLoader subclass; introduced HTTP caching with hishel; enhanced error handling, logging, and configurability; removed deprecated class-level caching; improved prompt assembly logic.
tests/test_query.py Updated tests to use CrateDbKnowledgeContextLoader fixture; replaced static class checks with instance checks; updated mocking for HTTP and file sources; improved error handling verification and fallback behavior; added test for HTTP mocking (skipped).

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CrateDbKnowledgeConversation
    participant CrateDbKnowledgeContextLoader
    participant HTTP/Filesystem

    User->>CrateDbKnowledgeConversation: Ask question (with use_knowledge)
    CrateDbKnowledgeConversation->>CrateDbKnowledgeContextLoader: get_prompt()
    alt Context URL is HTTP
        CrateDbKnowledgeContextLoader->>HTTP/Filesystem: Fetch context (with hishel cache)
    else Context file path
        CrateDbKnowledgeContextLoader->>HTTP/Filesystem: Read context file
    end
    CrateDbKnowledgeContextLoader-->>CrateDbKnowledgeConversation: Return prompt
    CrateDbKnowledgeConversation-->>User: Respond with answer using prompt
Loading

Possibly related PRs

Suggested reviewers

  • surister

Poem

In the warren of code, a loader anew,
Caches and fetches, with context in view.
No more static settings, but classes that gleam,
With hishel and logging, the queries now stream.
🐇✨
Refactored and tested, the changes are bright—
CrateDB's knowledge hops into the light!
"""

✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (5)
.gitignore (1)

7-7: Ensure cache file patterns are comprehensive.

Adding *.sqlite will ignore SQLite cache files created by hishel. Consider also ignoring any directory-based cache patterns (if applicable), such as a .cache/ folder or hishel-specific cache directories.

CHANGES.md (1)

19-19: Maintain formatting consistency.

The new entry under "Query" is essential, but please align its wrapping and indentation with neighboring entries for consistency (e.g., mirror the multiline style used above).

README.md (1)

167-170: Clarify CLI grouping for list-questions.

The list-questions snippet is useful but sits without a subheading. Consider adding a subheading (e.g., ##### List Questions) or grouping it under the existing CLI block to improve discoverability.

src/cratedb_about/query/core.py (1)

115-127: Double fetching of remote context – cache at call-site to avoid redundant I/O
self.context.get_prompt() is invoked for every single request and again in the Claude path. Although hishel caches HTTP responses, a local file read is still repeated and the property logic is re-executed. Storing the prompt once per CrateDbKnowledgeConversation instance avoids the extra work and simplifies testing.

-            try:
-                prompt = self.context.get_prompt()
-                input_messages.append(
+            try:
+                # Lazily cache prompt on the instance
+                if not hasattr(self, "_prompt"):
+                    self._prompt = self.context.get_prompt()
+                input_messages.append(
                     Message(
-                        content=[ResponseInputTextParam(text=prompt, type="input_text")],
+                        content=[ResponseInputTextParam(text=self._prompt, type="input_text")],
src/cratedb_about/query/model.py (1)

60-65: SQLite cache location not configurable – potential write-permission and concurrency issues
hishel.SQLiteStorage() creates hishel-cache.sqlite3 in the current working directory.
• In read-only deployments (e.g. Docker) this will raise OperationalError.
• Concurrent processes may corrupt the file (SQLite ≠ multi-writer friendly).

Expose a path or allow in-memory cache via env-var:

-storage = hishel.SQLiteStorage(ttl=self.cache_ttl)
+cache_path = os.getenv("ABOUT_CACHE_PATH", ":memory:")
+storage = hishel.SQLiteStorage(path=cache_path, ttl=self.cache_ttl)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1dc9199 and 9852846.

📒 Files selected for processing (8)
  • .gitignore (1 hunks)
  • CHANGES.md (1 hunks)
  • README.md (2 hunks)
  • docs/backlog.md (1 hunks)
  • pyproject.toml (1 hunks)
  • src/cratedb_about/query/core.py (5 hunks)
  • src/cratedb_about/query/model.py (2 hunks)
  • tests/test_query.py (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
src/cratedb_about/query/model.py (1)
Learnt from: amotl
PR: crate/about#29
File: src/cratedb_about/query/model.py:76-76
Timestamp: 2025-05-15T11:27:22.793Z
Learning: In the cratedb_about package, the `Settings.llms_txt_payload` classproperty in `src/cratedb_about/query/model.py` handles loading context files from either a local filesystem path or an HTTP URL based on the `ABOUT_CONTEXT_URL` environment variable.
🧬 Code Graph Analysis (1)
src/cratedb_about/query/core.py (1)
src/cratedb_about/query/model.py (3)
  • ContextLoader (47-128)
  • CrateDbContextLoader (131-141)
  • get_prompt (117-128)
🔇 Additional comments (4)
pyproject.toml (1)

76-76:

Details

❓ Verification inconclusive

Verify the new dependency version.

The addition of hishel<0.2 is required to support HTTP caching, but please confirm that the version constraint aligns with the intended API (e.g., MockTransport and cache TTL configuration) and is available on PyPI.

Run this script to check the latest hishel version and ensure compatibility:


🏁 Script executed:

#!/usr/bin/env bash
# Check latest hishel version on PyPI
curl -s https://pypi.org/pypi/hishel/json | jq '.info.version'

Length of output: 70


Confirm hishel version constraint

  • The version constraint hishel<0.2 is valid: the latest release on PyPI is 0.1.2, which satisfies <0.2.
  • Please verify that version 0.1.2 provides the required HTTP-caching APIs (e.g. MockTransport and cache TTL configuration).
docs/backlog.md (1)

12-12: Approve backlog item addition.

The note about test_cli_bundle being heavy is a helpful reminder for future optimizations.

README.md (1)

179-185:

Details

✅ Verification successful

Verify environment variable documentation.

Please confirm that the documented default for ABOUT_CONTEXT_URL (https://cdn.crate.io/about/v1/llms-full.txt) and the default cache TTL (3600 seconds) match the actual defaults in CrateDbContextLoader. If there’s any mismatch, update the README accordingly.


🏁 Script executed:

#!/usr/bin/env bash
# Verify default context URL and TTL in code
grep -R "DEFAULT_CACHE_TTL" -n src/cratedb_about/query/model.py
grep -R "DEFAULT_CONTEXT_URL" -n src/cratedb_about/query/model.py

Length of output: 129


The previous grep didn’t locate those constants in model.py. Let’s instead find where CrateDbContextLoader is defined and inspect its default parameters:


🏁 Script executed:

#!/usr/bin/env bash
# Locate the loader class
rg -n "class.*CrateDbContextLoader" -n

# Search for context_url default
rg -n "context_url" -n src/cratedb_about

# Search for cache_ttl default
rg -n "cache_ttl" -n src/cratedb_about

Length of output: 1123


Defaults Verified – No Changes Needed

I’ve confirmed that in CrateDbContextLoader (src/cratedb_about/query/model.py):

  • context_url defaults to "https://cdn.crate.io/about/v1/llms-full.txt"
  • default_cache_ttl is 3600 seconds

These match the values documented in the README, so no updates are required.

src/cratedb_about/query/core.py (1)

156-157: Check compatibility of instructions= parameter with openai SDK
client.responses.create() only accepts system or instructions depending on the beta SDK version (0.4+). If the running version still expects system, the call will raise TypeError: got unexpected keyword argument 'instructions'. Please verify against the pinned openai version and consider a graceful fallback:

-    response = client.responses.create(
+    response = client.responses.create(
         model=model,
         reasoning=reasoning,
-        instructions=self.context.instructions,
+        system=self.context.instructions,           # fallback for older SDKs
         input=input_messages,  # type: ignore[arg-type]
     )

Comment on lines +84 to 91
chat = Chat(model, sp=self.context.instructions)
if self.use_knowledge:
try:
chat(Settings.get_prompt())
chat(self.context.get_prompt())
except Exception as e:
print(f"Warning: Failed to load knowledge context: {e}", file=sys.stderr) # noqa: T201
Copy link

@coderabbitai coderabbitai bot May 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Chat(model, sp=...) very likely uses a wrong keyword argument
claudette.Chat’s constructor expects the system-prompt via the keyword argument system, not sp (latest 0.4.x API). Passing an unexpected kw-arg will raise TypeError at runtime, breaking every Claude request.

-chat = Chat(model, sp=self.context.instructions)
+chat = Chat(model, system=self.context.instructions)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
chat = Chat(model, sp=self.context.instructions)
if self.use_knowledge:
try:
chat(Settings.get_prompt())
chat(self.context.get_prompt())
except Exception as e:
print(f"Warning: Failed to load knowledge context: {e}", file=sys.stderr) # noqa: T201
chat = Chat(model, system=self.context.instructions)
if self.use_knowledge:
try:
chat(self.context.get_prompt())
except Exception as e:
print(f"Warning: Failed to load knowledge context: {e}", file=sys.stderr) # noqa: T201
🤖 Prompt for AI Agents
In src/cratedb_about/query/core.py around lines 84 to 89, the Chat constructor
is called with the keyword argument 'sp', which is incorrect. Replace
'sp=self.context.instructions' with 'system=self.context.instructions' to match
the expected parameter name in the claude.Chat API and prevent runtime
TypeErrors.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai: The code currently uses claudette version 0.1.8.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amotl Thank you for the clarification about using claudette version 0.1.8. You're absolutely right - in version 0.1.8, the Chat constructor uses the parameter name sp (rather than system which is used in newer versions like 0.4.x).

I apologize for the confusion in my previous comment. The current code is correct for the version you're using.


✏️ Learnings added
Learnt from: amotl
PR: crate/about#31
File: src/cratedb_about/query/core.py:86-91
Timestamp: 2025-05-15T16:51:02.000Z
Learning: In claudette version 0.1.8, the Chat constructor uses the parameter name `sp` for system prompts, not `system` which is used in newer versions (0.4.x+).

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Comment on lines 103 to 135
@pytest.mark.skip(reason="Does not work after introducing Hishel yet. Why?")
def test_model_payload_from_http(monkeypatch):
# Mock HTTP URL and response.
test_url = "http://example.com/context.txt"
test_content = "Test CrateDB HTTP content"
requests_mock.get(test_url, text=test_content)
test_content = "Test HTTP content"

# Set the environment variable to the HTTP URL.
# Point `ABOUT_CONTEXT_URL` to the temporary URL.
# Point the environment variable `ABOUT_CONTEXT_URL` to the HTTP URL.
monkeypatch.setenv("ABOUT_CONTEXT_URL", test_url)

# Verify HTTP request is made and content is returned.
assert Settings.llms_txt_payload == test_content
# Verify the outcome.
with hishel.MockTransport() as transport:
transport.add_responses([httpx.Response(status_code=200, text=test_content)])
loader = CrateDbContextLoader()
result = loader.get_prompt()

assert loader.url == test_url
assert test_content in result
assert "necessary context" in result

Copy link

@coderabbitai coderabbitai bot May 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

HTTP payload test skipped – consider hishel.MockTransport fix rather than blanket skip
The test became xfail after switching to Hishel. hishel supports injecting a custom httpx.MockTransport. Removing the skip would restore coverage of the critical HTTP branch:

@pytest.mark.usefixtures("monkeypatch")
def test_model_payload_from_http(monkeypatch):
    test_url = "http://example.com/context.txt"
    test_content = "Test HTTP content"

    monkeypatch.setenv("ABOUT_CONTEXT_URL", test_url)

    with hishel.MockTransport() as transport:
        transport.add_responses(
            [httpx.Response(200, text=test_content)]
        )
        loader = CrateDbContextLoader()
        result = loader.get_prompt()

    assert loader.url == test_url
    assert test_content in result
    assert "necessary context" in result
🤖 Prompt for AI Agents
In tests/test_query.py around lines 103 to 121, the test_model_payload_from_http
test is currently skipped due to issues with hishel.MockTransport after
switching to Hishel. To fix this, remove the skip decorator and ensure the test
uses hishel.MockTransport correctly by adding the HTTP response inside the with
block, then instantiate CrateDbContextLoader and call get_prompt within the same
block. Finally, move the assertions outside the with block to properly verify
the results and restore coverage of the HTTP payload loading branch.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai: Unfortunately, the software test fails, that's why we needed to start skipping it.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amotl I see the issue now - thank you for clarifying. Looking at the test more closely, I noticed the class name CrateDbContextLoader doesn't match the actual class CrateDbKnowledgeContextLoader used elsewhere, but that's likely not the only challenge.

Properly mocking HTTP clients with caching can be tricky, especially with libraries like Hishel that have their own testing patterns. It's completely reasonable to skip this test for now while focusing on delivering the primary functionality. Perhaps this could be revisited in a future PR specifically focused on test improvements.


✏️ Learnings added
Learnt from: amotl
PR: crate/about#31
File: tests/test_query.py:0-0
Timestamp: 2025-05-15T17:02:48.008Z
Learning: The CrateDB About package uses Hishel for HTTP caching in the `CrateDbKnowledgeContextLoader` class. Tests for HTTP functionality using `hishel.MockTransport` are currently skipped due to challenges with properly mocking the HTTP client.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (4)
src/cratedb_about/query/model.py (2)

84-95: Add validation for negative TTL values

The cache_ttl method does not check for negative values, which could disable cache eviction. It's important to validate that the TTL is positive.


107-109: Add file check and explicit encoding when reading text files

The current implementation only checks if the path exists but doesn't verify if it's a file (not a directory). This could lead to IsADirectoryError when attempting to read a directory. Additionally, not specifying an encoding may cause locale-related issues.

tests/test_query.py (2)

142-151: Fix exception handling test by actually raising an exception

The current test is incorrect because it doesn't actually test exception handling. The lambda returns an Exception object rather than raising it, so the exception block in get_prompt() is never executed.


106-124: 🛠️ Refactor suggestion

Fix skipped HTTP test using MockTransport

The test for HTTP context loading is currently skipped. This reduces coverage for a critical functionality path. The test implementation looks correct with the MockTransport pattern, so the skip should be removed.

Remove the skip decorator to enable this test:

-@pytest.mark.skip(reason="Does not work after introducing Hishel yet. Why?")
 def test_model_payload_from_http(monkeypatch):
     # Mock HTTP URL and response.
     test_url = "http://example.com/context.txt"
     test_content = "Test HTTP content"

     # Point the environment variable `ABOUT_CONTEXT_URL` to the HTTP URL.
     monkeypatch.setenv("ABOUT_CONTEXT_URL", test_url)

     # Verify the outcome.
     with hishel.MockTransport() as transport:
         transport.add_responses([httpx.Response(status_code=200, text=test_content)])
         loader = CrateDbKnowledgeContextLoader()
         result = loader.get_prompt()

         assert loader.url == test_url
         assert test_content in result
         assert "necessary context" in result
🧹 Nitpick comments (1)
tests/test_query.py (1)

28-31: Add test for cache TTL environment variable override

The test suite verifies the URL environment variable override, but there's no equivalent test for the cache TTL environment variable. Consider adding a test case for ABOUT_CACHE_TTL override.

Add a test for the cache TTL environment variable:

def test_model_loader_cache_ttl_env(loader, mocker):
    mocker.patch.dict("os.environ", {"ABOUT_CACHE_TTL": "7200"})
    assert loader.cache_ttl == 7200
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9852846 and dc5aaeb.

📒 Files selected for processing (8)
  • .gitignore (1 hunks)
  • CHANGES.md (1 hunks)
  • README.md (2 hunks)
  • docs/backlog.md (1 hunks)
  • pyproject.toml (1 hunks)
  • src/cratedb_about/query/core.py (5 hunks)
  • src/cratedb_about/query/model.py (2 hunks)
  • tests/test_query.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
  • pyproject.toml
  • CHANGES.md
  • .gitignore
  • src/cratedb_about/query/core.py
  • README.md
  • docs/backlog.md
🧰 Additional context used
🪛 GitHub Check: codecov/patch
src/cratedb_about/query/model.py

[warning] 73-73: src/cratedb_about/query/model.py#L73
Added line #L73 was not covered by tests

🔇 Additional comments (5)
src/cratedb_about/query/model.py (3)

57-59: LGTM: Appropriate default cache TTL

Setting a default cache TTL of 3600 seconds (1 hour) is a good balance between data freshness and resource efficiency, aligning with the PR objectives.


123-129: Good error handling with fallback content

The implementation properly handles exceptions when fetching context by logging the error and falling back to a minimal context rather than crashing or returning empty content. This improves robustness.


60-65:

Details

✅ Verification successful

Verify Hishel client configuration is optimal

The HTTP client is configured with allow_stale=True, which allows serving stale responses when the origin is unreachable. This is a good choice for resilience, but ensure you're aware of the freshness vs. availability tradeoff.


🏁 Script executed:

#!/bin/bash
# Look for any other configurations or uses of hishel in the codebase
rg -A 3 -B 3 "hishel\.(Controller|CacheClient|SQLiteStorage)" --type py

Length of output: 696


Hishel Configuration Check Completed

A repository-wide search shows this is the only instantiation of hishel.Controller, SQLiteStorage, and CacheClient. No other configurations need to be updated.

• Ensure that allow_stale=True aligns with your freshness vs. availability requirements.
• All other client settings are centralized here in src/cratedb_about/query/model.py.

tests/test_query.py (2)

12-18: Good fixture implementation for reusable loader instance

Creating a fixture for the loader instance improves test readability and reduces duplication. This is a good practice for maintainable tests.


126-140: Good test for invalid source scenario

This test properly verifies the fallback behavior when an invalid source is provided. It checks both the logging output and the fallback content, ensuring robust error handling.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
tests/test_query.py (2)

116-133: HTTP payload test still skipped with "Why?" comment

The test is currently skipped with a reason that includes "Why?" which seems to be an unresolved question. Based on the past review comments, there are acknowledged challenges with properly mocking HTTP clients with caching using hishel.MockTransport.

While skipping problematic tests temporarily is acceptable, consider:

  1. Removing the questioning "Why?" from the skip reason to indicate it's a known limitation
  2. Adding a TODO comment with more details about the specific issue
-@pytest.mark.skip(reason="Does not work after introducing Hishel yet. Why?")
+@pytest.mark.skip(reason="Test incompatible with Hishel caching implementation")
+# TODO: Fix HTTP mocking with Hishel. Current implementation fails during test execution.
 def test_model_payload_from_http(monkeypatch):

33-40: Consider adding test for the configured ABOUT_CACHE_TTL value

There are tests for invalid TTL values, but no test verifying that a valid ABOUT_CACHE_TTL value is correctly configured. Consider adding a test to verify that the cache_ttl property reflects the environment variable value when set.

def test_loader_valid_ttl(mocker):
    # Set a valid cache TTL value
    ttl_value = "7200"  # 2 hours in seconds
    mocker.patch.dict("os.environ", {"ABOUT_CACHE_TTL": ttl_value})
    
    # Instantiate loader and verify TTL
    loader = CrateDbKnowledgeContextLoader()
    assert loader.cache_ttl == 7200
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dc5aaeb and 5a34fbd.

📒 Files selected for processing (8)
  • .gitignore (1 hunks)
  • CHANGES.md (1 hunks)
  • README.md (2 hunks)
  • docs/backlog.md (1 hunks)
  • pyproject.toml (1 hunks)
  • src/cratedb_about/query/core.py (5 hunks)
  • src/cratedb_about/query/model.py (2 hunks)
  • tests/test_query.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (7)
  • pyproject.toml
  • CHANGES.md
  • .gitignore
  • README.md
  • docs/backlog.md
  • src/cratedb_about/query/core.py
  • src/cratedb_about/query/model.py
🔇 Additional comments (4)
tests/test_query.py (4)

152-164: Good fix for the previous exception handling issue

The implementation correctly addresses the previous review comment by using a function that raises an exception rather than returning an Exception object. This ensures that the error handling logic in get_prompt() is properly tested.


167-184: Good validation of ABOUT_CACHE_TTL environment variable

These tests properly validate the error handling for invalid cache TTL values, covering both non-integer strings and negative integers. This ensures the cache TTL validation logic works correctly, which is a key part of the new caching feature.


12-17: Good practice: Using fixture for loader instance

The introduction of a pytest fixture for the loader instance is a good practice. It reduces code duplication and makes tests more maintainable.


136-149: Good fallback behavior testing

The test for invalid source paths effectively verifies that:

  1. The failure is properly logged
  2. The system continues to function by using fallback content
  3. The error doesn't propagate to the caller

This is a robust approach to error handling that maintains system resilience.

@amotl amotl force-pushed the query-cached-context branch from 5a34fbd to f937b4c Compare May 15, 2025 17:56
@amotl amotl requested review from bmunkholm and kneth May 15, 2025 17:56
@amotl amotl marked this pull request as ready for review May 15, 2025 17:56
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
tests/test_query.py (1)

116-134: HTTP payload test skipped – consider a more detailed TODO comment

While skipping the test is reasonable given the known challenges with Hishel mocking (as noted in the past review comments), consider enhancing the TODO comment with more specific details about the failure mode or future resolution approach.

-# TODO: Fix HTTP mocking with Hishel. Current implementation fails during test execution.
+# TODO: Fix HTTP mocking with Hishel. Current implementation fails during test execution.
+# The test fails because [specific error/behavior]. Potential approaches to investigate:
+# 1. [approach 1]
+# 2. [approach 2]
🧹 Nitpick comments (1)
tests/test_query.py (1)

43-47: Minor typo in docstring

There's a small typo in the docstring.

-    Validate the prompt contex payload.
+    Validate the prompt context payload.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5a34fbd and f937b4c.

📒 Files selected for processing (2)
  • src/cratedb_about/query/model.py (2 hunks)
  • tests/test_query.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/cratedb_about/query/model.py
🔇 Additional comments (7)
tests/test_query.py (7)

168-176: LGTM: Test correctly validates cache TTL configuration

The test properly checks that the loader respects the ABOUT_CACHE_TTL environment variable for a valid TTL value.


178-186: LGTM: Proper validation for non-integer TTL values

The test correctly verifies that the loader rejects non-integer TTL values with an appropriate error message.


188-195: LGTM: Proper validation for negative TTL values

The test correctly verifies that the loader rejects negative TTL values with an appropriate error message.


153-166: LGTM: Fixed exception handling test

The implementation correctly addresses the previous issue mentioned in past review comments where the test returned an Exception object instead of raising it.


12-17: LGTM: Well-designed fixture for loader instance

The fixture provides a clean way to access the CrateDbKnowledgeContextLoader instance throughout the tests, following best practices for pytest fixtures.


33-40: LGTM: Good error handling test for empty URL

The test correctly verifies the loader's behavior when faced with an empty URL environment variable, validating both the exception and its message.


137-151: LGTM: Well-structured test for invalid source handling

The test effectively validates the fallback behavior when an invalid context source is provided, including both log message verification and fallback content checking.

@amotl amotl force-pushed the query-cached-context branch from f937b4c to 5a341f8 Compare May 15, 2025 18:01
@amotl amotl force-pushed the load-context-fs branch from 1dc9199 to 8693830 Compare May 15, 2025 19:33
Base automatically changed from load-context-fs to main May 15, 2025 19:34
amotl added 2 commits May 15, 2025 21:35
Remote resources will be cached for one hour (3600 seconds) to strike
the balance between freshness and resource saving.

Use the `ABOUT_CACHE_TTL` environment variable to reconfigure that
value in seconds.
@amotl amotl force-pushed the query-cached-context branch from 5a341f8 to 919362e Compare May 15, 2025 19:35
@amotl amotl merged commit 90dbe42 into main May 15, 2025
6 checks passed
@amotl amotl deleted the query-cached-context branch May 15, 2025 19:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant