fix(clp-package): Connect directly to the MongoDB replica (results cache) to avoid failed host resolution of in-container hostname (fixes #1519). #1518
…reading query aggregation results from the result cache.
Walkthrough
The MongoDB client instantiation in the search script is modified to enable direct connection mode by passing directConnection=True.
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~3 minutes
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
components/clp-package-utils/clp_package_utils/scripts/native/search.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
components/clp-package-utils/clp_package_utils/scripts/native/search.py (1)
components/clp-py-utils/clp_py_utils/clp_config.py (1)
get_uri(336-337)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: package-image
- GitHub Check: lint-check (macos-15)
- GitHub Check: lint-check (ubuntu-24.04)
      if do_count_aggregation is None and count_by_time_bucket_size is None:
          return
-     with pymongo.MongoClient(results_cache.get_uri()) as client:
+     with pymongo.MongoClient(results_cache.get_uri(), directConnection=True) as client:
🧩 Analysis chain
LGTM! Correct parameter for single-node replica set.
The addition of directConnection=True is the appropriate solution for connecting to a single-node MongoDB replica set with an in-container hostname. This parameter instructs the MongoDB driver to bypass automatic topology discovery and connect directly to the specified host, preventing hostname resolution issues during client initialization.
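As a minimal sketch of what this looks like at a call site: the helper names below are hypothetical and not taken from the CLP codebase; only pymongo.MongoClient and its directConnection keyword argument are real. pymongo is imported lazily so the keyword-argument helper can be exercised without the driver installed.

```python
from typing import Any, Dict


def results_cache_client_kwargs(direct: bool = True) -> Dict[str, Any]:
    """Build keyword arguments for pymongo.MongoClient (hypothetical helper)."""
    # directConnection=True tells the driver to talk only to the host named in
    # the URI instead of discovering replica-set members and then connecting
    # to the hostnames those members advertise.
    return {"directConnection": True} if direct else {}


def open_results_cache_client(uri: str):
    """Open a client for the results cache (hypothetical helper)."""
    # Imported lazily: constructing the kwargs above does not need pymongo.
    import pymongo

    return pymongo.MongoClient(uri, **results_cache_client_kwargs())
```

A caller would use the returned client as a context manager, as the patched line does, for example with a URI obtained from results_cache.get_uri().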
To ensure consistency across the codebase, verify that all other MongoClient instantiations for the results cache also use directConnection=True:
🏁 Script executed:
#!/bin/bash
# Description: Find all MongoClient instantiations that use results_cache to ensure consistent directConnection usage
# Search for MongoClient calls with results_cache
rg -nP --type=py -C3 'MongoClient\([^)]*results_cache'
Length of output: 3029
Apply directConnection=True consistently across all results_cache MongoClient instantiations.
The change at line 80 is correct, but verification reveals two other MongoClient instantiations for results_cache that are missing directConnection=True:
- components/clp-py-utils/clp_py_utils/initialize-results-cache.py:133
- components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py:877
These locations connect to the same single-node replica set and should use the same directConnection=True parameter to avoid hostname resolution issues during initialization.
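One possible way to keep several call sites consistent, which is not what this PR does, is to set the option once in the connection string, since directConnection is also a standard MongoDB URI option. The sketch below uses only the Python standard library; the function name and the example URIs in the usage note are assumptions for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_direct_connection(uri: str) -> str:
    """Return `uri` with directConnection=true set (hypothetical helper)."""
    parts = urlsplit(uri)
    # Drop any existing directConnection option so it is not specified twice;
    # the comparison is case-insensitive to be tolerant of spelling variants.
    options = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() != "directconnection"
    ]
    options.append(("directConnection", "true"))
    return urlunsplit(parts._replace(query=urlencode(options)))
```

For a URI such as mongodb://localhost:27017/clp-query-results (an assumed example), this appends ?directConnection=true, and every client built from the resulting string connects directly without each call site passing the keyword argument.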
🤖 Prompt for AI Agents
In components/clp-package-utils/clp_package_utils/scripts/native/search.py
around line 80 and in
components/clp-py-utils/clp_py_utils/initialize-results-cache.py around line 133
and
components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py
around line 877, the MongoClient instantiations that use results_cache.get_uri()
must include directConnection=True; update those calls to pass
directConnection=True to pymongo.MongoClient (preserving any existing kwargs) so
all connections to the single-node replica set use the same directConnection
behavior and avoid hostname resolution issues.
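The consistency check described above can also be sketched in Python, reusing the regular expression from the rg command. This is a line-based heuristic under stated assumptions: the function name is hypothetical, and calls that spread their arguments over several lines are not detected.

```python
import re
from typing import List, Tuple

# Same pattern as the rg search: a MongoClient( call whose arguments mention
# results_cache before the first closing parenthesis.
_RESULTS_CACHE_CLIENT = re.compile(r"MongoClient\([^)]*results_cache")


def find_calls_missing_direct_connection(source: str) -> List[Tuple[int, str]]:
    """Return (line number, line) for matching calls lacking directConnection."""
    missing = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        # Only inspects the matching line itself, so multi-line calls are missed.
        if _RESULTS_CACHE_CLIENT.search(line) and "directConnection" not in line:
            missing.append((line_no, line.strip()))
    return missing
```

Running it over the contents of each Python file in the repository would list the call sites that still need the parameter.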
@coderabbitai, can you open an issue about the problem that this PR fixes?
🧩 Analysis chain
@kirkrodrigues, I'll create an issue documenting the problem that this PR fixes.
🏁 Script executed:
Length of output: 42
@kirkrodrigues, I've created issue #1519 to document the MongoDB connection problem that this PR fixes. The issue describes the root cause (MongoDB server acting as a proxy and attempting to resolve the in-container hostname without directConnection=True).
You can view it here: #1519
kirkrodrigues left a comment
For the PR title, how about:
fix(clp-package): Connect directly to the MongoDB replica (results cache) to avoid failed host resolution of in-container hostname (fixes #1519).
directConnection=True to MongoClient() for reading query aggregation results from the result cache.
…solve dev-env startup issue. (#1598)
Description
This change updates the results-cache MongoDB client initialization in components/clp-package-utils/clp_package_utils/scripts/native/search.py to pass directConnection=True when reading aggregation outputs. This is needed because the results cache is provisioned as a single-node replica with the in-container hostname; without directConnection=True, the Mongo server would serve as a proxy and try to resolve the given host in the initialization before making connections.
Checklist
breaking change.
Validation performed
Summary by CodeRabbit
Note: This release contains primarily internal infrastructure improvements with no direct user-facing changes.