
Add a performance dashboard server and frontend for nightly CUDA tests #17725

Merged
Kangyan-Zhou merged 5 commits into sgl-project:main from Kangyan-Zhou:performance_dashboard
Jan 28, 2026

Conversation

@Kangyan-Zhou
Collaborator

Motivation

Modifications

Accuracy Tests

Benchmarking and Profiling

Checklist

Review Process

  1. Ping Merge Oncalls to start the PR flow. See the PR Merge Process.
  2. Get approvals from CODEOWNERS and other reviewers.
  3. Trigger CI tests with comments or contact authorized users to do so.
    • /tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci
  4. After green CI and required approvals, ask Merge Oncalls to merge.

@github-actions github-actions Bot added the documentation Improvements or additions to documentation label Jan 25, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello @Kangyan-Zhou, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request establishes a dedicated performance dashboard for SGLang's nightly CUDA tests. Its primary goal is to offer a clear, interactive platform for monitoring and analyzing key performance indicators like throughput, latency, and TTFT over time. This will enable developers to quickly identify performance regressions, track improvements, and make data-driven decisions regarding model and configuration optimizations.

Highlights

  • New Performance Dashboard: Introduces a web-based dashboard for visualizing SGLang nightly CUDA test performance metrics, providing a dedicated tool for performance monitoring.
  • Automated Data Fetching: Includes Python scripts (fetch_metrics.py, server.py) to automatically fetch workflow runs and consolidated performance artifacts from GitHub Actions, handling authentication and data extraction.
  • Interactive Frontend Visualization: Provides an interactive web interface (index.html, app.js) utilizing Chart.js to display performance trends for throughput, latency, and Time to First Token (TTFT) over time.
  • Filtering and Model Comparison: Enables users to filter performance data by GPU configuration, model, variant, and batch size, facilitating detailed analysis and comparison across different configurations.
  • Flexible Deployment Options: Outlines instructions for deploying the dashboard to GitHub Pages for public access or self-hosting using the provided Python development server, offering versatility in setup.
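The automated fetching step described above can be sketched roughly as follows. This is a hypothetical, simplified Python sketch, not the PR's actual fetch_metrics.py; `select_successful_runs` is an invented helper, and the run dicts only mimic the shape of the GitHub Actions REST API's `workflow_runs` entries.

```python
# Hypothetical sketch of the run-selection step in a nightly-metrics fetcher.
# Run dicts mimic the GitHub Actions REST API's "workflow_runs" entries.

def select_successful_runs(runs, limit=30):
    """Keep completed, successful runs, newest first, up to `limit`."""
    good = [
        r for r in runs
        if r.get("status") == "completed" and r.get("conclusion") == "success"
    ]
    good.sort(key=lambda r: r.get("run_started_at", ""), reverse=True)
    return good[:limit]

runs = [
    {"id": 1, "status": "completed", "conclusion": "success",
     "run_started_at": "2026-01-26T02:00:00Z"},
    {"id": 2, "status": "completed", "conclusion": "failure",
     "run_started_at": "2026-01-27T02:00:00Z"},
    {"id": 3, "status": "completed", "conclusion": "success",
     "run_started_at": "2026-01-27T02:00:00Z"},
]
print([r["id"] for r in select_successful_runs(runs)])  # → [3, 1]
```

In the real script, each selected run's artifact would then be downloaded and unpacked; the selection logic above is only the filtering idea.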


Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a comprehensive performance dashboard for visualizing nightly CUDA test metrics, including a frontend, a Python server for serving data, and a script for fetching metrics from GitHub. The implementation is well-structured. My review focuses on improving the robustness and security of the new components. I've identified a potential cross-site scripting (XSS) vulnerability in the JavaScript frontend, a UI bug related to the filtering logic, and a missing timeout in the Python data fetching script that could affect its reliability. The proposed changes will enhance security and fix the identified bugs.

Comment thread: docs/performance_dashboard/app.js (Outdated), lines +231 to +249
statsRow.innerHTML = `
    <div class="stat-card">
        <div class="label">Total Runs</div>
        <div class="value">${allMetricsData.length}</div>
    </div>
    <div class="stat-card">
        <div class="label">Models Tested</div>
        <div class="value">${totalModels}</div>
    </div>
    <div class="stat-card">
        <div class="label">Benchmarks</div>
        <div class="value">${totalBenchmarks}</div>
    </div>
    <div class="stat-card">
        <div class="label">Peak Throughput</div>
        <div class="value">${formatNumber(maxThroughput)}</div>
        <div class="change">${maxThroughputModel}</div>
    </div>
`;
Severity: high (security)

The updateStats function uses innerHTML to render statistics, including maxThroughputModel which is derived from API data. This could introduce a cross-site scripting (XSS) vulnerability if a model name contains malicious HTML. To prevent this, it's safer to build the DOM elements programmatically and use textContent to set their content.

    statsRow.innerHTML = ''; // Clear previous stats

    const addStat = (label, value, change) => {
        const card = document.createElement('div');
        card.className = 'stat-card';
        const labelEl = document.createElement('div');
        labelEl.className = 'label';
        labelEl.textContent = label;
        const valueEl = document.createElement('div');
        valueEl.className = 'value';
        valueEl.textContent = value;
        card.appendChild(labelEl);
        card.appendChild(valueEl);
        if (change) {
            const changeEl = document.createElement('div');
            changeEl.className = 'change';
            changeEl.textContent = change;
            card.appendChild(changeEl);
        }
        statsRow.appendChild(card);
    };

    addStat('Total Runs', allMetricsData.length);
    addStat('Models Tested', totalModels);
    addStat('Benchmarks', totalBenchmarks);
    addStat('Peak Throughput', formatNumber(maxThroughput), maxThroughputModel);
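The underlying concern is general: untrusted strings interpolated into markup must be escaped or set through text-only APIs. As a language-neutral illustration (Python's stdlib `html.escape`, not code from this PR), a hostile model name is neutralized like so:

```python
import html

# Hypothetical hostile "model name" arriving from API data.
model_name = '<img src=x onerror="alert(1)">'

# Interpolated raw into innerHTML-style markup, this would execute in a
# browser; escaping turns the markup characters into inert entities.
safe = html.escape(model_name)
print(safe)  # → &lt;img src=x onerror=&quot;alert(1)&quot;&gt;
```

The suggested `textContent`-based fix above achieves the same effect on the DOM side: the browser treats the value as text, never as markup.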

if event:
    params["event"] = event

response = requests.get(url, headers=get_headers(token), params=params)
Severity: medium

The call to requests.get is missing a timeout. This could cause the script to hang indefinitely if the GitHub API is slow or unresponsive. It's a good practice to always include a timeout for network requests for robustness.

Please add a timeout to this call and other requests.get calls in this file (e.g., lines 101, 116, 188). A timeout of 30 seconds is a reasonable starting point.

Suggested change
- response = requests.get(url, headers=get_headers(token), params=params)
+ response = requests.get(url, headers=get_headers(token), params=params, timeout=30)
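The reviewer's suggestion generalizes: rather than patching each call site, a thin wrapper can supply a default timeout everywhere. The sketch below is hypothetical (not code from this PR) and uses a stub in place of `requests.get` so it runs without a network:

```python
def with_default_timeout(request_fn, default=30):
    """Wrap a requests-style function so every call gets a timeout
    unless the caller explicitly passes one."""
    def wrapper(url, **kwargs):
        kwargs.setdefault("timeout", default)
        return request_fn(url, **kwargs)
    return wrapper

# Stub standing in for requests.get, so the sketch runs standalone.
def fake_get(url, **kwargs):
    return {"url": url, **kwargs}

get = with_default_timeout(fake_get)
print(get("https://api.github.com/repos/sgl-project/sglang/actions/runs"))
# timeout=30 is injected automatically; an explicit timeout still wins.
```

This keeps the per-call diffs small while guaranteeing no request can hang indefinitely.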

Comment thread: docs/performance_dashboard/index.html (Outdated)
</div>
<div class="filter-group">
<label>Model</label>
<select id="model-filter" onchange="updateCharts()">

Severity: medium

The onchange handler for the model filter dropdown directly calls updateCharts(). This creates a bug: if a model tab is already active, changing the dropdown won't update the charts because the application logic prioritizes the tab selection. This also leads to the UI being out of sync (dropdown and tabs show different models).

To fix this, the onchange event should call a new handler function that synchronizes the selected model with the tabs.

First, update the onchange attribute here. Then, add the following helper function to app.js:

function handleModelFilterChange(model) {
    // Find the corresponding tab and activate it to keep UI in sync
    const tab = Array.from(document.querySelectorAll('.tab')).find(t => t.title === model || (model === 'all' && t.textContent === 'All Models'));
    if (tab) {
        selectModelTab(model, tab);
    }
}
Suggested change
- <select id="model-filter" onchange="updateCharts()">
+ <select id="model-filter" onchange="handleModelFilterChange(this.value)">

Kangyan-Zhou and others added 4 commits January 25, 2026 16:06
- Add benchmarks_by_io_len nested structure in metrics for grouping by
  input/output length combinations (with backward compatibility)
- Add Input/Output Length dropdown filter to dashboard
- Add Output Throughput metric tab
- Add Accept Length metric tab (filters invalid -1/null values)
- Add defensive checks for null variants, malformed IO keys, and DOM elements
- Support numeric sorting for IO length options

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
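The grouping and filtering behavior described in the commit message above might look roughly like this. This is a hypothetical Python sketch (the real logic lives in the dashboard's JavaScript, and the data shapes here are assumed): IO keys of the form "input_len/output_len" are sorted numerically rather than lexically, malformed keys sort last, and invalid accept lengths (-1 or null) are dropped.

```python
def sort_io_keys(keys):
    """Sort "in/out" keys numerically, e.g. "128/8" before "1024/8"."""
    def parse(k):
        try:
            i, o = k.split("/")
            return (int(i), int(o))
        except ValueError:
            return (float("inf"), float("inf"))  # malformed keys sort last
    return sorted(keys, key=parse)

def valid_accept_lengths(records):
    """Filter out invalid accept-length values (-1 or None)."""
    return [r for r in records
            if r.get("accept_length") not in (None, -1)]

io_keys = ["1024/8", "128/8", "2048/256", "bad-key"]
print(sort_io_keys(io_keys))  # → ['128/8', '1024/8', '2048/256', 'bad-key']

records = [{"accept_length": 2.3}, {"accept_length": -1}, {"accept_length": None}]
print(valid_accept_lengths(records))  # → [{'accept_length': 2.3}]
```

Lexical sorting would put "1024/8" before "128/8"; parsing the key into an integer pair is what makes the dropdown ordering sensible.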
@Kangyan-Zhou Kangyan-Zhou marked this pull request as ready for review January 28, 2026 06:22
@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@Kangyan-Zhou Kangyan-Zhou merged commit c0b4dd6 into sgl-project:main Jan 28, 2026
58 of 60 checks passed
