Skip to content

ci: adding errors to Github summary#14778

Merged
merrymercy merged 1 commit intomainfrom
adding-errors-to-git-summary
Dec 15, 2025
Merged

ci: adding errors to Github summary#14778
merrymercy merged 1 commit intomainfrom
adding-errors-to-git-summary

Conversation

@dougyster
Copy link
Copy Markdown
Collaborator

@dougyster dougyster commented Dec 10, 2025

Motivation

Adding errors to github summary table for nightly accuracy tests.

Modifications

Adding error column and error capture in the try catch block in test_mmmu_vlm_models and test_mgsm_en_all_models.

Accuracy Tests

N/A

Benchmarking and Profiling

N/A

Checklist

@dougyster dougyster changed the title ci: adding errors to Github summary CI: adding errors to Github summary Dec 10, 2025
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello @dougyster, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the debugging experience for Continuous Integration (CI) test failures. It modifies the test execution utility to capture and process standard output and error streams from test runs. By intelligently extracting and formatting critical error information, the changes ensure that detailed failure summaries are automatically posted to the GitHub Actions step summary, allowing developers to quickly identify and address issues without needing to dig through extensive logs.

Highlights

  • Enhanced Error Reporting: Implemented a new mechanism to capture stderr and stdout from test processes, providing more detailed failure information.
  • Structured Failure Data: Introduced a TestFailureInfo dataclass to store comprehensive details about each test failure, including filename, reason, and captured output.
  • Intelligent Error Extraction: Added an extract_error_lines utility function that intelligently parses stderr and stdout to identify and highlight key error messages and tracebacks.
  • GitHub Actions Integration: Integrated the detailed test failure information directly into the GitHub Actions step summary, making debugging easier and faster by presenting errors prominently in the PR checks.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a valuable feature for CI by adding a summary of test failures to the GitHub step summary, which will greatly aid in debugging. The implementation is mostly solid, capturing stdout/stderr and parsing them for errors. I've identified a potential bug in the error extraction logic that could cause some errors to be missed. I've also included a few suggestions to improve performance and maintainability.

Comment thread python/sglang/test/ci/ci_utils.py Outdated
for j in range(start, min(i + 3, len(lines))):
if len(error_lines) < max_lines:
error_lines.append(lines[j])
i += 3
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Advancing the loop counter by 3 (i += 3) will cause the loop to skip checking lines i+1 and i+2. If there are errors on these consecutive lines, they will be missed. To fix this, you should increment i by 1. This may introduce duplicate lines from overlapping contexts, which should be handled separately (see my other comment).

Suggested change
i += 3
i += 1

Comment thread python/sglang/test/ci/ci_utils.py Outdated
# Show key error lines
summary_lines.append("**Key Error Lines:**\n")
summary_lines.append("```python\n")
for line in failure.error_lines[:30]: # Limit to first 30 lines
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The magic number 30 is used here and on line 65 to limit the number of error lines in the summary. It's good practice to define this as a constant (e.g., MAX_SUMMARY_ERROR_LINES = 30) at the top of the function or module. This improves readability and makes it easier to change the value in one place.

Comment thread python/sglang/test/ci/ci_utils.py Outdated
Comment on lines +107 to +123
error_patterns = [
r"Traceback \(most recent call last\):",
r"\w*Error:", # Catches AssertionError, RuntimeError, ValueError, etc.
r"\w*Exception:",
r"FAILED",
r"ERROR",
r"Failed to",
r"raise \w+Error",
]

# Find lines matching error patterns and collect context
i = 0
while i < len(lines) and len(error_lines) < max_lines:
line = lines[i]

# Check if this line matches an error pattern
if any(re.search(pattern, line, re.IGNORECASE) for pattern in error_patterns):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For performance, it's better to compile the regex patterns once at the start of the function, rather than recompiling them for every line in the log inside the any() expression. This avoids significant overhead, especially for large logs. The compiled patterns can then be used directly.

Suggested change
error_patterns = [
r"Traceback \(most recent call last\):",
r"\w*Error:", # Catches AssertionError, RuntimeError, ValueError, etc.
r"\w*Exception:",
r"FAILED",
r"ERROR",
r"Failed to",
r"raise \w+Error",
]
# Find lines matching error patterns and collect context
i = 0
while i < len(lines) and len(error_lines) < max_lines:
line = lines[i]
# Check if this line matches an error pattern
if any(re.search(pattern, line, re.IGNORECASE) for pattern in error_patterns):
# Pattern to identify error-related lines
error_patterns = [
re.compile(p, re.IGNORECASE) for p in [
r"Traceback \(most recent call last\):",
r"\w*Error:", # Catches AssertionError, RuntimeError, ValueError, etc.
r"\w*Exception:",
r"FAILED",
r"ERROR",
r"Failed to",
r"raise \w+Error",
]
]
# Find lines matching error patterns and collect context
i = 0
while i < len(lines) and len(error_lines) < max_lines:
line = lines[i]
# Check if this line matches an error pattern
if any(pattern.search(line) for pattern in error_patterns):

Comment thread python/sglang/test/ci/ci_utils.py Outdated
else:
i += 1

return error_lines
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To handle duplicate lines that may arise from overlapping error contexts (especially after fixing the loop increment from i += 3 to i += 1), you can deduplicate the error_lines list while preserving order before returning. A concise way to do this in Python 3.7+ is list(dict.fromkeys(error_lines)).

Suggested change
return error_lines
return list(dict.fromkeys(error_lines))

@dougyster
Copy link
Copy Markdown
Collaborator Author

@dougyster dougyster force-pushed the adding-errors-to-git-summary branch 2 times, most recently from 42d6e55 to dce337e Compare December 10, 2025 22:04
@github-actions github-actions Bot added the Multi-modal multi-modal language model label Dec 10, 2025
@dougyster dougyster force-pushed the adding-errors-to-git-summary branch from dce337e to 9316a1a Compare December 11, 2025 01:12
@dougyster dougyster changed the title CI: adding errors to Github summary ci: adding errors to Github summary Dec 12, 2025
@merrymercy
Copy link
Copy Markdown
Contributor

/tag-and-rerun-ci

@merrymercy merrymercy merged commit 9e9a616 into main Dec 15, 2025
119 of 136 checks passed
@merrymercy merrymercy deleted the adding-errors-to-git-summary branch December 15, 2025 05:08
Liwansi added a commit to iforgetmyname/sglang that referenced this pull request Dec 15, 2025
…n_eagle3_npu

* 'main' of https://github.com/sgl-project/sglang: (89 commits)
  [model-gateway] Remove legacy RouterMetrics and Rename SmgMetrics to Metrics and smg_labels to metrics_labels (sgl-project#15160)
  [diffusion] fix: fix video model sp when resolution is not specified (sgl-project#15047)
  [diffusion] fix: fix pytorch non-writable array warning (sgl-project#15017)
  [diffusion] fix: cache dit with parallel (sgl-project#15163)
  chore: change npu pr-test a2 runner (sgl-project#15152)
  [Feature] Fuse mrope all in 1 kernel (sgl-project#14906)
  Fix num running requests (load) wrong cleared for ongoing requests (sgl-project#15116)
  Fused two elementwise kernels for k_nope and k_pe concat (sgl-project#14862)
  fix: adding date and fixing release name issue (sgl-project#15174)
  [CPU] Add Gemma3RMSNorm kernel in sgl-kernel and add ut (sgl-project#9324)
  feature: PR wheel (sgl-project#15170)
  [diffusion] model: support mutli-image input and qwen-image-edit-2509 (sgl-project#15005)
  fix CompressedTensorsW8A8Int8 min_capability (sgl-project#13914)
  Tiny improve summary text in `bench_one_batch_server.py` (sgl-project#15158)
  [model-gateway] add mcp and discovery metrics (sgl-project#15156)
  fix: move ci-bot (sgl-project#15154)
  Fix import warnings (sgl-project#15144)
  ci: adding errors to Github summary (sgl-project#14778)
  [model-gateway] Add streaming metrics for harmony gRPC router (sgl-project#15147)
  [model-gateway] upgrade axum and axum server (sgl-project#15146)
  ...

# Conflicts:
#	python/sglang/srt/server_args.py
tonyluj pushed a commit to openanolis/sglang that referenced this pull request Dec 17, 2025
YChange01 pushed a commit to YChange01/sglang that referenced this pull request Jan 13, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Multi-modal multi-modal language model run-ci

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants