[CI] Fix AMD CI by inlining dummy_grok config #18044

Merged
Kangyan-Zhou merged 1 commit into sgl-project:main from sunxxuns:fix-amd-ci-dummy-grok-blob on Feb 1, 2026

Conversation

@sunxxuns
Collaborator

Motivation

The Azure blob storage endpoint (sharkpublic.blob.core.windows.net) is returning 403 Forbidden errors, causing all AMD CI jobs to fail at the "Install dependencies" step:

--2026-01-31 10:11:33--  https://sharkpublic.blob.core.windows.net/sharkpublic/sglang/dummy_grok.json
HTTP request sent, awaiting response... 403 This request is not authorized to perform this operation.

This is affecting multiple PRs including #18026.

Modifications

  • Remove the wget download from Azure blob storage
  • Create the config.json inline using a heredoc
  • Config content sourced from existing documentation (3rdparty/amd/profiling/PROFILING.md)
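
The heredoc approach in the list above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of the CI script: the scratch directory and the `get_field` helper are hypothetical, and only three fields from the config are reproduced. The closing check is an arithmetic consistency test that happens to hold for these values (48 heads of dimension 128 give 6144); it is not claimed to be something the model loader requires.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch directory; the CI script itself writes to ./dummy-grok.
workdir="$(mktemp -d)"
mkdir -p "$workdir/dummy-grok"

# Quoting the delimiter ('EOF') disables parameter and command expansion,
# so the JSON body is written exactly as typed.
cat > "$workdir/dummy-grok/config.json" << 'EOF'
{
  "hidden_size": 6144,
  "num_attention_heads": 48,
  "head_dim": 128
}
EOF

# Hypothetical helper: extract an integer field from the flat JSON above.
get_field() {
  sed -n "s/.*\"$1\": *\([0-9][0-9]*\).*/\1/p" "$workdir/dummy-grok/config.json"
}

hidden_size="$(get_field hidden_size)"
heads="$(get_field num_attention_heads)"
head_dim="$(get_field head_dim)"

# Consistency check on the values written above.
if [ "$((heads * head_dim))" -eq "$hidden_size" ]; then
  echo "config ok"
else
  echo "config mismatch" >&2
  exit 1
fi
```

Because the file is generated by the script itself, the step no longer depends on any network endpoint being reachable or authorized.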

Checklist

Made with Cursor

@gemini-code-assist
Contributor

Summary of Changes

Hello @sunxxuns, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a recurring issue in the AMD continuous integration pipeline where jobs were failing due to an inability to access a necessary configuration file hosted on Azure blob storage. The change stabilizes the CI by embedding the configuration directly into the installation script, thereby removing a brittle external dependency and ensuring the reliable execution of AMD CI jobs.

Highlights

  • CI Fix for AMD Jobs: This pull request resolves a critical failure in AMD CI jobs caused by a '403 Forbidden' error when attempting to download dummy_grok.json from Azure blob storage.
  • Inlining Configuration: The dummy_grok.json configuration content is now inlined directly into the amd_ci_install_dependency.sh script using a heredoc, eliminating the need for an external wget command.
  • Bypassing Azure Blob Storage Issues: By embedding the configuration, the CI process bypasses the problematic Azure blob storage endpoint, ensuring the installation step completes successfully.



@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request resolves a CI failure in AMD jobs by inlining a dummy grok configuration file that was previously failing to download. The approach of using a heredoc is a direct and effective fix. I have one suggestion to refine the implementation by piping the configuration directly into the Docker container, which would make the script slightly cleaner by avoiding the creation of temporary files on the host.

Comment on lines +128 to +151
mkdir -p dummy-grok
cat > dummy-grok/config.json << 'EOF'
{
  "architectures": [
    "Grok1ModelForCausalLM"
  ],
  "embedding_multiplier_scale": 78.38367176906169,
  "output_multiplier_scale": 0.5773502691896257,
  "vocab_size": 131072,
  "hidden_size": 6144,
  "intermediate_size": 32768,
  "max_position_embeddings": 8192,
  "num_experts_per_tok": 2,
  "num_local_experts": 8,
  "num_attention_heads": 48,
  "num_hidden_layers": 64,
  "num_key_value_heads": 8,
  "head_dim": 128,
  "rms_norm_eps": 1e-05,
  "rope_theta": 10000.0,
  "model_type": "mixtral",
  "torch_dtype": "bfloat16"
}
EOF

medium

Instead of creating a temporary directory and file on the host and then copying them to the container, you can pipe the configuration directly into a file inside the container. This approach is cleaner as it avoids creating temporary artifacts on the host.

Important: If you apply this suggestion, you must also manually remove the docker cp ./dummy-grok ci_sglang:/ command on line 152, as it will no longer be needed and will cause an error.

cat << 'EOF' | docker exec -i ci_sglang sh -c 'mkdir -p /dummy-grok && cat > /dummy-grok/config.json'
{
  "architectures": [
    "Grok1ModelForCausalLM"
  ],
  "embedding_multiplier_scale": 78.38367176906169,
  "output_multiplier_scale": 0.5773502691896257,
  "vocab_size": 131072,
  "hidden_size": 6144,
  "intermediate_size": 32768,
  "max_position_embeddings": 8192,
  "num_experts_per_tok": 2,
  "num_local_experts": 8,
  "num_attention_heads": 48,
  "num_hidden_layers": 64,
  "num_key_value_heads": 8,
  "head_dim": 128,
  "rms_norm_eps": 1e-05,
  "rope_theta": 10000.0,
  "model_type": "mixtral",
  "torch_dtype": "bfloat16"
}
EOF
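
The piping pattern in this suggestion can be tried without a container. The sketch below is an illustration under stated assumptions: `run_in_target` is a hypothetical stand-in for `docker exec -i ci_sglang`, and a temporary directory plays the role of the container filesystem. It creates the target directory before writing, on the assumption that the directory may not already exist, since shell redirection does not create parent directories.

```shell
#!/bin/sh
set -eu

# Hypothetical stand-in for `docker exec -i ci_sglang`: it runs a command
# string in a child shell that inherits stdin from the pipeline.
run_in_target() {
  sh -c "$1"
}

# Temporary directory standing in for the container filesystem.
target_dir="$(mktemp -d)"
export target_dir

# The heredoc becomes stdin of the pipeline; the inner `cat` writes it out.
# The single-quoted command is expanded by the child shell, not this one.
cat << 'EOF' | run_in_target 'mkdir -p "$target_dir/dummy-grok" && cat > "$target_dir/dummy-grok/config.json"'
{
  "model_type": "mixtral",
  "torch_dtype": "bfloat16"
}
EOF

grep -q '"model_type": "mixtral"' "$target_dir/dummy-grok/config.json" && echo "written"
```

With a real container, the `-i` flag is what keeps stdin open so the inner `cat` receives the heredoc body.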

The Azure blob storage endpoint (sharkpublic.blob.core.windows.net)
is returning 403 Forbidden errors, causing all AMD CI jobs to fail
at the "Install dependencies" step.

This fix removes the wget download and creates the config.json inline,
eliminating the dependency on external blob storage.

Co-authored-by: Cursor <cursoragent@cursor.com>
@sunxxuns force-pushed the fix-amd-ci-dummy-grok-blob branch from 7bfc49d to 7e0db9d on January 31, 2026 20:01
@github-actions (bot) added the diffusion (SGLang Diffusion) label on Jan 31, 2026
@Kangyan-Zhou merged commit 47592a2 into sgl-project:main on Feb 1, 2026
107 of 110 checks passed
charlesHsuGG pushed a commit to charlesHsuGG/sglang that referenced this pull request Feb 2, 2026
Co-authored-by: root <root@mi300x8-005.atl1.do.cpe.ice.amd.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
sfiisf pushed a commit to sfiisf/sglang that referenced this pull request Feb 5, 2026
Co-authored-by: root <root@mi300x8-005.atl1.do.cpe.ice.amd.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Johnsonms pushed a commit to Johnsonms/sglang that referenced this pull request Feb 14, 2026
Co-authored-by: root <root@mi300x8-005.atl1.do.cpe.ice.amd.com>
Co-authored-by: Cursor <cursoragent@cursor.com>

3 participants