fix(clp-mcp-server): Include log viewer links in formatted log events returned from KQL queries.#1491
Conversation
Walkthroughformat_query_results now requires each log event to include a Changes
Sequence Diagram(s)sequenceDiagram
participant Caller
participant Formatter as format_query_results
Caller->>Formatter: format_query_results(query_results)
loop per event
Formatter->>Formatter: parse timestamp, message, other kv-pairs
Formatter->>Formatter: read obj["link"] (required)
alt link present
Formatter-->>Caller: formatted string with ", link: <link>"
else link missing
Formatter-->>Caller: raises KeyError (no fallback)
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
components/clp-mcp-server/tests/server/test_utils.py (1)
210-215: Consider adding test coverage for warning behaviour.While the functional output is tested correctly, there's no explicit verification that warnings are logged when the link field is missing. Consider adding a test with
caplogfixture to verify the warning behaviour, especially since 3 out of 4 entries in the test data lack links.Example:
def test_missing_link_warning(self, caplog): """Validates that missing link fields generate warnings.""" entries_without_links = [entry for entry in self.RAW_LOG_ENTRIES if "link" not in entry] with caplog.at_level(logging.WARNING): format_query_results(entries_without_links) # Verify warnings were logged for each entry without a link assert len([record for record in caplog.records if "Empty link" in record.message]) == len(entries_without_links)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
components/clp-mcp-server/clp_mcp_server/server/utils.py(2 hunks)components/clp-mcp-server/tests/server/test_utils.py(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: package-image
🔇 Additional comments (2)
components/clp-mcp-server/tests/server/test_utils.py (2)
116-119: LGTM!The test data correctly includes a link field with a properly formatted URL. The link parameters (streamId and logEventIdx) appropriately match the corresponding archive_id and log_event_ix fields in the same entry.
127-129: Expected result updated correctly.The expected output format matches the implementation and correctly includes the link field.
| - "link": (Optional) A string representing the link to open the log viewer displaying the | ||
| message. | ||
|
|
||
| The message will be formatted as `timestamp: <date string>, message: <message>`: | ||
| The message will be formatted as `timestamp: <date string>, message: <message>, link: <link>`. |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Consider documenting the warning behaviour for missing links.
The docstring indicates that the link field is optional, but doesn't mention that a warning will be logged when it's missing (see lines 111-112). For completeness, consider adding a note about this behaviour, especially since it might be unexpected for callers who view the field as truly optional.
🤖 Prompt for AI Agents
In components/clp-mcp-server/clp_mcp_server/server/utils.py around lines 86 to
89, the docstring describes "link" as optional but omits that a warning is
emitted when it is missing (see code at ~lines 111-112); update the docstring to
state that while the link field may be None, the function will log a warning if
the link is not provided so callers are aware of the runtime behavior and can
avoid or handle the warning.
| 'http://localhost:4000/streamFile?dataset=default&' | ||
| 'type=json&streamId=abc125&logEventIdx=102' |
There was a problem hiding this comment.
| 'http://localhost:4000/streamFile?dataset=default&' | |
| 'type=json&streamId=abc125&logEventIdx=102' | |
| "http://localhost:4000/streamFile?dataset=default&" | |
| "type=json&streamId=abc125&logEventIdx=102" |
Technically the line wrapping here is also wrong...
There was a problem hiding this comment.
the line wrapping here is also wrong
do you mean?
| 'http://localhost:4000/streamFile?dataset=default&' | |
| 'type=json&streamId=abc125&logEventIdx=102' | |
| "http://localhost:4000/streamFile" | |
| "?dataset=default" | |
| '&type=json" | |
| "&streamId=abc125" | |
| "&logEventIdx=102" |
i agree
There was a problem hiding this comment.
I thought single quotation and double quotation were the same, its just that single quotation is used for escaption the actual double quotation in the string?
I see, I will make a note on the line wrap
| continue | ||
|
|
||
| formatted_log_events.append(f"timestamp: {timestamp_str}, message: {message}") | ||
| link = obj.get("link", "") |
There was a problem hiding this comment.
Just a sanity check: shouldn't this link always exist, assuming the rusult is from clp-s? @junhaoliao Can u confirm?
There was a problem hiding this comment.
In fact, the webui link is derived using the following logic in clp_connector:
async for doc in collection.find({}, limit=SEARCH_MAX_NUM_RESULTS):
doc["link"] = (
f"{self._webui_addr}/streamFile?type=json"
f'&streamId={doc["archive_id"]}'
f"&dataset={CLP_DEFAULT_DATASET_NAME}"
f'&logEventIdx={doc["log_event_ix"]}'
)
doc["_id"] = None
results.append(doc)
where collection is the result read from MongoDB and results is the response we will return back to AI after formatting.
To answer the question, this link should always exist in the current implementation. But shouldn't we still preserve this if checking (whether the link is there or not) in case we change the implementation of derving the link in the future?
Co-authored-by: Lin Zhihao <59785146+LinZhihao-723@users.noreply.github.com>
Co-authored-by: Lin Zhihao <59785146+LinZhihao-723@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
components/clp-mcp-server/clp_mcp_server/server/utils.py (1)
110-110: Consider adding type validation for the link field.The code doesn't validate that
linkis actually a string. If the field is present but has an unexpected type (e.g.,None,int,list), it will be coerced to a string in the f-string on line 116, potentially producing unexpected output like"link: None"or"link: [1, 2, 3]".Add a type check after extracting the link:
link = obj.get("link", "") +if link and not isinstance(link, str): + logger.warning("Non-string link field in log event: %s.", obj) + link = "" if not link: logger.warning("No valid link provided in log event: %s.", obj)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
components/clp-mcp-server/clp_mcp_server/server/utils.py(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: package-image
🔇 Additional comments (2)
components/clp-mcp-server/clp_mcp_server/server/utils.py (2)
86-87: LGTM!The docstring clearly documents the optional "link" field and its purpose.
89-89: LGTM!The formatted output example clearly shows the optional link field using appropriate notation.
| link = obj.get("link", "") | ||
| if not link: | ||
| logger.warning("No valid link provided in log event: %s.", obj) | ||
| formatted_log_events.append(f"timestamp: {timestamp_str}, message: {message}") | ||
| continue | ||
|
|
||
| formatted_log_events.append(f"timestamp: {timestamp_str}, message: {message}, link: {link}") |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Eliminate string duplication by conditionally building the formatted entry.
The formatting string is duplicated on lines 113 and 116, differing only in the link portion. This violates DRY and makes the code harder to maintain—future format changes must be applied in two places.
Refactor to build the formatted string conditionally:
link = obj.get("link", "")
if not link:
logger.warning("No valid link provided in log event: %s.", obj)
- formatted_log_events.append(f"timestamp: {timestamp_str}, message: {message}")
- continue
-formatted_log_events.append(f"timestamp: {timestamp_str}, message: {message}, link: {link}")
+formatted_entry = f"timestamp: {timestamp_str}, message: {message}"
+if link:
+ formatted_entry += f", link: {link}"
+formatted_log_events.append(formatted_entry)🤖 Prompt for AI Agents
In components/clp-mcp-server/clp_mcp_server/server/utils.py around lines
110-116, the formatted log string is duplicated; instead, build a single base
formatted string (timestamp and message), if link is present append the ", link:
{link}" fragment, then append the result to formatted_log_events; keep the
existing logger.warning when link is missing but remove the duplicated
formatted_log_events.append call so the final append happens once.
There was a problem hiding this comment.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
components/clp-mcp-server/tests/server/test_utils.py(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: package-image
- GitHub Check: lint-check (ubuntu-24.04)
- GitHub Check: lint-check (macos-15)
| "link": ( | ||
| "http://localhost:4000/streamFile" | ||
| "?dataset=default" | ||
| '&type=json' | ||
| "&streamId=abc125" | ||
| "&logEventIdx=102" | ||
| ), |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Use consistent quote style.
Line 119 uses single quotes ('&type=json') while the surrounding lines use double quotes. For consistency and adherence to Python style conventions, use double quotes throughout the string concatenation.
Apply this diff to fix the quote consistency:
"link": (
"http://localhost:4000/streamFile"
"?dataset=default"
- '&type=json'
+ "&type=json"
"&streamId=abc125"
"&logEventIdx=102"
),📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| "link": ( | |
| "http://localhost:4000/streamFile" | |
| "?dataset=default" | |
| '&type=json' | |
| "&streamId=abc125" | |
| "&logEventIdx=102" | |
| ), | |
| "link": ( | |
| "http://localhost:4000/streamFile" | |
| "?dataset=default" | |
| "&type=json" | |
| "&streamId=abc125" | |
| "&logEventIdx=102" | |
| ), |
🤖 Prompt for AI Agents
components/clp-mcp-server/tests/server/test_utils.py around lines 116 to 122:
the URL string concatenation uses a single-quoted segment ('&type=json') while
all other parts use double quotes; replace the single-quoted '&type=json' with a
double-quoted "&type=json" so the entire concatenation uses consistent double
quotes.
| '"message":"Log with millisecond precision"}\n, ' | ||
| "link: http://localhost:4000/streamFile" | ||
| "?dataset=default" | ||
| '&type=json' | ||
| "&streamId=abc125" | ||
| "&logEventIdx=102" | ||
|
|
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Use consistent quote style and remove trailing whitespace.
Line 133 uses single quotes ('&type=json') while the surrounding lines use double quotes. Additionally, line 136 appears to contain trailing whitespace.
Apply this diff to fix both issues:
'timestamp: 2024-10-18T16:00:00.123Z, message: '
'{"ts":1729267200123,"pid":1234,"tid":5678,'
'"message":"Log with millisecond precision"}\n, '
"link: http://localhost:4000/streamFile"
"?dataset=default"
- '&type=json'
+ "&type=json"
"&streamId=abc125"
"&logEventIdx=102"
-
),📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| '"message":"Log with millisecond precision"}\n, ' | |
| "link: http://localhost:4000/streamFile" | |
| "?dataset=default" | |
| '&type=json' | |
| "&streamId=abc125" | |
| "&logEventIdx=102" | |
| '"message":"Log with millisecond precision"}\n, ' | |
| "link: http://localhost:4000/streamFile" | |
| "?dataset=default" | |
| "&type=json" | |
| "&streamId=abc125" | |
| "&logEventIdx=102" | |
| ), |
🤖 Prompt for AI Agents
In components/clp-mcp-server/tests/server/test_utils.py around lines 130 to 136,
change the inconsistent single-quoted string on line 133 to use double quotes to
match the surrounding lines and remove any trailing whitespace on line 136 (and
at line ends in the block) so all concatenated string pieces consistently use
double quotes and no lines end with extra spaces.
LinZhihao-723
left a comment
There was a problem hiding this comment.
Let me know when the e2e testing is done.
Co-authored-by: Lin Zhihao <59785146+LinZhihao-723@users.noreply.github.com>
LinZhihao-723
left a comment
There was a problem hiding this comment.
Directly modified PR title. Let me know if it's ok.
… returned from KQL queries. (y-scope#1491) Co-authored-by: Lin Zhihao <59785146+LinZhihao-723@users.noreply.github.com>
Description
This PR adds the missing link field in the search result returned by the kql search query. The field is used by LLM to create hyperlinks when displaying the log messages. The current implementation filters this field away when formatting the query result.
Checklist
breaking change.
Validation performed
uv run --group dev pytest tests/test_utils.py -vvtask docker-images:package. Connected Claude Desktop agent to the running MCP server service in docker compose. Manually instructed LLM to perform a kql that returns a log event, the agent can successfully render the log viewer's link:And this is the interface when hitting the link - "View full log entry", which clearly shows the returned log event:
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Documentation