chore(package)!: Set results_cache.retention_period to 60min so that old search results aren't retained indefinitely by default.#1231
Conversation
WalkthroughDefault for ResultsCache.retention_period changed from None to 60 in Python config. Corresponding commented default in the package template YAML updated from null to 60. Validation logic and other control flow remain unchanged. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
components/clp-py-utils/clp_py_utils/clp_config.py(1 hunks)components/package-template/src/etc/clp-config.yml(1 hunks)
🔇 Additional comments (1)
components/clp-py-utils/clp_py_utils/clp_config.py (1)
365-365: LGTM: sensible default and validator are consistentDefaulting ResultsCache.retention_period to 60 minutes while allowing None is reasonable, and the validator (> 0 when not None) is correct.
| db_name: str = "clp-query-results" | ||
| stream_collection_name: str = "stream-files" | ||
| retention_period: Optional[int] = None | ||
| retention_period: Optional[int] = 60 |
There was a problem hiding this comment.
💡 Verification agent
❓ Verification inconclusive
Behaviour change: enable deletion by default; document + add config guard and avoid magic number
Changing the default from None to 60 means deployments that previously omitted this setting will now have search results deleted after 60 minutes. Recommend:
- Call out this breaking default change in release notes and package docs.
- Add a cross-field validation ensuring retention_period (if set) is not less than garbage_collector.sweep_interval.search_result.
- Replace the magic number 60 with a named constant to reduce drift with the template YAML.
Diff within this hunk to use a constant:
- retention_period: Optional[int] = 60
+ retention_period: Optional[int] = DEFAULT_SEARCH_RESULT_RETENTION_MINUTESAdd this constant near the other “Constants” in this module:
# Default retention settings
DEFAULT_SEARCH_RESULT_RETENTION_MINUTES = 60Optionally, enforce the relationship at the aggregate config level:
# In class CLPConfig
from pydantic import root_validator
class CLPConfig(BaseModel):
# ... existing fields ...
@root_validator
def validate_retention_vs_sweep(cls, values):
rc = values.get("results_cache")
gc = values.get("garbage_collector")
if rc and gc:
rp = rc.retention_period
sweep = gc.sweep_interval.search_result
if rp is not None and rp < sweep:
raise ValueError(
f"results_cache.retention_period ({rp} min) must be >= "
f"garbage_collector.sweep_interval.search_result ({sweep} min)."
)
return valuesVerification suggestion:
- Ensure release notes explicitly mention the default enablement of result retention and the 60-minute default.
- If you want, I can add a unit test asserting the default retention and the cross-field validation behaviour.
Use a named constant for retention_period and add cross-field validation
• components/clp-py-utils/clp_py_utils/clp_config.py:365
Change
retention_period: Optional[int] = 60
to
retention_period: Optional[int] = DEFAULT_SEARCH_RESULT_RETENTION_MINUTES
• In the same module (near the other constants), add:
# Default retention settings
DEFAULT_SEARCH_RESULT_RETENTION_MINUTES = 60• In the CLPConfig model, enforce that results_cache.retention_period (if set) is ≥ garbage_collector.sweep_interval.search_result:
from pydantic import root_validator
class CLPConfig(BaseModel):
# … existing fields …
@root_validator
def validate_retention_vs_sweep(cls, values):
rc = values.get("results_cache")
gc = values.get("garbage_collector")
if rc and gc:
rp = rc.retention_period
sweep = gc.sweep_interval.search_result
if rp is not None and rp < sweep:
raise ValueError(
f"results_cache.retention_period ({rp} min) must be ≥ "
f"garbage_collector.sweep_interval.search_result ({sweep} min)"
)
return values• Document the breaking change in your release notes and package docs: deployments that omit this setting will now delete search results after 60 minutes by default.
(Optional) Add unit tests to assert the default value and validation logic.
🤖 Prompt for AI Agents
In components/clp-py-utils/clp_py_utils/clp_config.py around line 365, replace
the literal default retention_period value with a named constant and add
cross-field validation: add a new constant
DEFAULT_SEARCH_RESULT_RETENTION_MINUTES = 60 near the other module constants,
change retention_period: Optional[int] = 60 to retention_period: Optional[int] =
DEFAULT_SEARCH_RESULT_RETENTION_MINUTES, and add a pydantic root_validator on
CLPConfig that checks if results_cache.retention_period (when not None) is >=
garbage_collector.sweep_interval.search_result and raises a ValueError with a
clear message if not; also note this breaking change in release notes/docs
(default retention now 60 minutes) and consider adding unit tests for the
default and validation logic.
| # | ||
| # # Retention period for search results, in minutes. Set to null to disable automatic deletion. | ||
| # retention_period: null | ||
| # retention_period: 60 |
There was a problem hiding this comment.
🧹 Nitpick (assertive)
Commented default moved to 60 — consider adding an inline hint about sweep interval
Runtime behaviour isn’t affected since this is commented, but adding a short hint helps users keep it consistent with the GC sweep interval.
Apply this diff to make the guidance explicit:
-# retention_period: 60
+# retention_period: 60 # minutes; should be >= garbage_collector.sweep_interval.search_result🤖 Prompt for AI Agents
In components/package-template/src/etc/clp-config.yml around line 57, the
commented default "retention_period: 60" lacks an inline hint about matching the
GC sweep interval; update the commented line to include a short inline note
(e.g., "# retention_period: 60 # should match GC sweep interval; adjust if your
sweeper runs at different interval") so users know to keep retention_period
consistent with the garbage-collection sweep interval.
There was a problem hiding this comment.
For the PR title, how about:
chore(package)!: Set `results_cache.retention_period` to 60min so that old search results aren't retained indefinitely by default.
I marked this as breaking since it is a change to the existing behaviour since the last release.
results_cache.retention_period to 60min so that old search results aren't retained indefinitely by default.
results_cache.retention_period to 60min so that old search results aren't retained indefinitely by default.results_cache.retention_period to 60min so that old search results aren't retained indefinitely by default.
… old search results aren't retained indefinitely by default. (y-scope#1231)
Description
As discussed offline with @kirkrodrigues, it would make more sense to enable retention control for search results by default, since user only have the access to the search results generated by the latest query job.
We choose 60 minutes to be the retention period since
Checklist
breaking change.
Validation performed
Manually verified that garbage collector starts with the package and the following log is printed:
Summary by CodeRabbit
New Features
Documentation