
Add a performance dashboard server and frontend for nightly CUDA tests #17725

Merged
Kangyan-Zhou merged 5 commits into sgl-project:main from Kangyan-Zhou:performance_dashboard
Jan 28, 2026

Conversation

@Kangyan-Zhou
Collaborator

Motivation

Modifications

Accuracy Tests

Benchmarking and Profiling

Checklist

Review Process

  1. Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
  4. After green CI and required approvals, ask Merge Oncalls to merge.

@github-actions github-actions Bot added the documentation Improvements or additions to documentation label Jan 25, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello @Kangyan-Zhou, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request establishes a dedicated performance dashboard for SGLang's nightly CUDA tests. Its primary goal is to offer a clear, interactive platform for monitoring and analyzing key performance indicators like throughput, latency, and TTFT over time. This will enable developers to quickly identify performance regressions, track improvements, and make data-driven decisions regarding model and configuration optimizations.

Highlights

  • New Performance Dashboard: Introduces a web-based dashboard for visualizing SGLang nightly CUDA test performance metrics, providing a dedicated tool for performance monitoring.
  • Automated Data Fetching: Includes Python scripts (fetch_metrics.py, server.py) to automatically fetch workflow runs and consolidated performance artifacts from GitHub Actions, handling authentication and data extraction.
  • Interactive Frontend Visualization: Provides an interactive web interface (index.html, app.js) utilizing Chart.js to display performance trends for throughput, latency, and Time to First Token (TTFT) over time.
  • Filtering and Model Comparison: Enables users to filter performance data by GPU configuration, model, variant, and batch size, facilitating detailed analysis and comparison across different configurations.
  • Flexible Deployment Options: Outlines instructions for deploying the dashboard to GitHub Pages for public access or self-hosting using the provided Python development server, offering versatility in setup.
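The automated fetching step described above can be sketched roughly as follows. This is a hypothetical, simplified Python sketch, not the PR's actual fetch_metrics.py; `select_successful_runs` is an invented helper, and the run dicts only mimic the shape of the GitHub Actions REST API's `workflow_runs` entries.

```python
# Hypothetical sketch of the run-selection step in a nightly-metrics fetcher.
# Run dicts mimic the GitHub Actions REST API's "workflow_runs" entries.

def select_successful_runs(runs, limit=30):
    """Keep completed, successful runs, newest first, up to `limit`."""
    good = [
        r for r in runs
        if r.get("status") == "completed" and r.get("conclusion") == "success"
    ]
    good.sort(key=lambda r: r.get("run_started_at", ""), reverse=True)
    return good[:limit]

runs = [
    {"id": 1, "status": "completed", "conclusion": "success",
     "run_started_at": "2026-01-26T02:00:00Z"},
    {"id": 2, "status": "completed", "conclusion": "failure",
     "run_started_at": "2026-01-27T02:00:00Z"},
    {"id": 3, "status": "completed", "conclusion": "success",
     "run_started_at": "2026-01-27T02:00:00Z"},
]
print([r["id"] for r in select_successful_runs(runs)])  # → [3, 1]
```

In the real script, each selected run's artifact would then be downloaded and unpacked; the selection logic above is only the filtering idea.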


Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a comprehensive performance dashboard for visualizing nightly CUDA test metrics, including a frontend, a Python server for serving data, and a script for fetching metrics from GitHub. The implementation is well-structured. My review focuses on improving the robustness and security of the new components. I've identified a potential cross-site scripting (XSS) vulnerability in the JavaScript frontend, a UI bug related to the filtering logic, and a missing timeout in the Python data fetching script that could affect its reliability. The proposed changes will enhance security and fix the identified bugs.

Comment thread: docs/performance_dashboard/app.js (Outdated), lines +231 to +249
statsRow.innerHTML = `
    <div class="stat-card">
        <div class="label">Total Runs</div>
        <div class="value">${allMetricsData.length}</div>
    </div>
    <div class="stat-card">
        <div class="label">Models Tested</div>
        <div class="value">${totalModels}</div>
    </div>
    <div class="stat-card">
        <div class="label">Benchmarks</div>
        <div class="value">${totalBenchmarks}</div>
    </div>
    <div class="stat-card">
        <div class="label">Peak Throughput</div>
        <div class="value">${formatNumber(maxThroughput)}</div>
        <div class="change">${maxThroughputModel}</div>
    </div>
`;
Severity: high (security)

The updateStats function uses innerHTML to render statistics, including maxThroughputModel which is derived from API data. This could introduce a cross-site scripting (XSS) vulnerability if a model name contains malicious HTML. To prevent this, it's safer to build the DOM elements programmatically and use textContent to set their content.

    statsRow.innerHTML = ''; // Clear previous stats

    const addStat = (label, value, change) => {
        const card = document.createElement('div');
        card.className = 'stat-card';
        const labelEl = document.createElement('div');
        labelEl.className = 'label';
        labelEl.textContent = label;
        const valueEl = document.createElement('div');
        valueEl.className = 'value';
        valueEl.textContent = value;
        card.appendChild(labelEl);
        card.appendChild(valueEl);
        if (change) {
            const changeEl = document.createElement('div');
            changeEl.className = 'change';
            changeEl.textContent = change;
            card.appendChild(changeEl);
        }
        statsRow.appendChild(card);
    };

    addStat('Total Runs', allMetricsData.length);
    addStat('Models Tested', totalModels);
    addStat('Benchmarks', totalBenchmarks);
    addStat('Peak Throughput', formatNumber(maxThroughput), maxThroughputModel);
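The underlying concern is general: untrusted strings interpolated into markup must be escaped or set through text-only APIs. As a language-neutral illustration (Python's stdlib `html.escape`, not code from this PR), a hostile model name is neutralized like so:

```python
import html

# Hypothetical hostile "model name" arriving from API data.
model_name = '<img src=x onerror="alert(1)">'

# Interpolated raw into innerHTML-style markup, this would execute in a
# browser; escaping turns the markup characters into inert entities.
safe = html.escape(model_name)
print(safe)  # → &lt;img src=x onerror=&quot;alert(1)&quot;&gt;
```

The suggested `textContent`-based fix above achieves the same effect on the DOM side: the browser treats the value as text, never as markup.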

if event:
    params["event"] = event

response = requests.get(url, headers=get_headers(token), params=params)
Severity: medium

The call to requests.get is missing a timeout. This could cause the script to hang indefinitely if the GitHub API is slow or unresponsive. It's a good practice to always include a timeout for network requests for robustness.

Please add a timeout to this call and other requests.get calls in this file (e.g., lines 101, 116, 188). A timeout of 30 seconds is a reasonable starting point.

Suggested change
- response = requests.get(url, headers=get_headers(token), params=params)
+ response = requests.get(url, headers=get_headers(token), params=params, timeout=30)
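The reviewer's suggestion generalizes: rather than patching each call site, a thin wrapper can supply a default timeout everywhere. The sketch below is hypothetical (not code from this PR) and uses a stub in place of `requests.get` so it runs without a network:

```python
def with_default_timeout(request_fn, default=30):
    """Wrap a requests-style function so every call gets a timeout
    unless the caller explicitly passes one."""
    def wrapper(url, **kwargs):
        kwargs.setdefault("timeout", default)
        return request_fn(url, **kwargs)
    return wrapper

# Stub standing in for requests.get, so the sketch runs standalone.
def fake_get(url, **kwargs):
    return {"url": url, **kwargs}

get = with_default_timeout(fake_get)
print(get("https://api.github.com/repos/sgl-project/sglang/actions/runs"))
# timeout=30 is injected automatically; an explicit timeout still wins.
```

This keeps the per-call diffs small while guaranteeing no request can hang indefinitely.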

Comment thread: docs/performance_dashboard/index.html (Outdated)
</div>
<div class="filter-group">
<label>Model</label>
<select id="model-filter" onchange="updateCharts()">

Severity: medium

The onchange handler for the model filter dropdown directly calls updateCharts(). This creates a bug: if a model tab is already active, changing the dropdown won't update the charts because the application logic prioritizes the tab selection. This also leads to the UI being out of sync (dropdown and tabs show different models).

To fix this, the onchange event should call a new handler function that synchronizes the selected model with the tabs.

First, update the onchange attribute here. Then, add the following helper function to app.js:

function handleModelFilterChange(model) {
    // Find the corresponding tab and activate it to keep UI in sync
    const tab = Array.from(document.querySelectorAll('.tab')).find(t => t.title === model || (model === 'all' && t.textContent === 'All Models'));
    if (tab) {
        selectModelTab(model, tab);
    }
}
Suggested change
- <select id="model-filter" onchange="updateCharts()">
+ <select id="model-filter" onchange="handleModelFilterChange(this.value)">

Kangyan-Zhou and others added 4 commits January 25, 2026 16:06
- Add benchmarks_by_io_len nested structure in metrics for grouping by
  input/output length combinations (with backward compatibility)
- Add Input/Output Length dropdown filter to dashboard
- Add Output Throughput metric tab
- Add Accept Length metric tab (filters invalid -1/null values)
- Add defensive checks for null variants, malformed IO keys, and DOM elements
- Support numeric sorting for IO length options

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
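The grouping and filtering behavior described in the commit message above might look roughly like this. This is a hypothetical Python sketch (the real logic lives in the dashboard's JavaScript, and the data shapes here are assumed): IO keys of the form "input_len/output_len" are sorted numerically rather than lexically, malformed keys sort last, and invalid accept lengths (-1 or null) are dropped.

```python
def sort_io_keys(keys):
    """Sort "in/out" keys numerically, e.g. "128/8" before "1024/8"."""
    def parse(k):
        try:
            i, o = k.split("/")
            return (int(i), int(o))
        except ValueError:
            return (float("inf"), float("inf"))  # malformed keys sort last
    return sorted(keys, key=parse)

def valid_accept_lengths(records):
    """Filter out invalid accept-length values (-1 or None)."""
    return [r for r in records
            if r.get("accept_length") not in (None, -1)]

io_keys = ["1024/8", "128/8", "2048/256", "bad-key"]
print(sort_io_keys(io_keys))  # → ['128/8', '1024/8', '2048/256', 'bad-key']

records = [{"accept_length": 2.3}, {"accept_length": -1}, {"accept_length": None}]
print(valid_accept_lengths(records))  # → [{'accept_length': 2.3}]
```

Lexical sorting would put "1024/8" before "128/8"; parsing the key into an integer pair is what makes the dropdown ordering sensible.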
@Kangyan-Zhou Kangyan-Zhou marked this pull request as ready for review January 28, 2026 06:22
@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@Kangyan-Zhou Kangyan-Zhou merged commit c0b4dd6 into sgl-project:main Jan 28, 2026
58 of 60 checks passed
