ci: unit test for srt/observability module#21002
Conversation
Summary of Changes: Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed: this pull request introduces a comprehensive set of unit tests for the srt/observability module.
Code Review
This pull request introduces a comprehensive suite of unit tests for the srt/observability module, covering cpu_monitor, func_timer, label_transform, req_time_stats, startup_func_log_and_timer, and trace. The tests are well-structured, leveraging mocks and stubs effectively to ensure deterministic and lightweight execution. This is a great step towards improving the robustness and maintainability of the observability components. My feedback is minor and focuses on improving code style by centralizing imports.
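To make the "deterministic and lightweight" point concrete, here is a minimal sketch of the timing-mock pattern such tests typically rely on. This is an illustration only: `measure_elapsed` is a hypothetical stand-in, not the actual func_timer API.

```python
import time
from unittest import mock


def measure_elapsed(fn):
    """Hypothetical stand-in for a timed call (illustration only)."""
    start = time.monotonic()
    fn()
    return time.monotonic() - start


def test_elapsed_is_deterministic():
    # Patch time.monotonic so its two lookups return fixed values,
    # making the measured duration deterministic and independent of
    # real wall-clock time.
    with mock.patch("time.monotonic", side_effect=[100.0, 100.5]):
        assert measure_elapsed(lambda: None) == 0.5
```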
srt/observability module (force-pushed: 556b959 to c41c699)
Remove some comments (force-pushed: c41c699 to 16fd19c)
Hi @ispobock @ping1jing2 @iforgetmyname @Edwardf0t1, thanks for reviewing! Happy to iterate on any feedback. Looking forward to contributing more to SGLang.

/tag-and-rerun-ci
Hi @ispobock, thanks for helping run the CI. 8 jobs failed, but only 2 are from this PR (both already fixed and pushed).

My fixes:

Unrelated to this PR:
```python
_ensure_module("sglang.srt.model_executor")
_fbi = _ensure_module("sglang.srt.model_executor.forward_batch_info")
_fbi.ForwardMode = _ForwardMode
```
How should we handle stub drift if ForwardMode in forward_batch_info is changed or refactored?
Thanks for the callout; this is a good point. The stub-drift risk is real but bounded: if ForwardMode changes, the behavior of req_time_stats.py would change too, and the affected tests would need updating at that point, which would surface the stale stub.
That said, the stubs exist only because forward_batch_info.py transitively pulls in torch via sglang.srt.distributed, which isn't available in the CPU-only test environment. If ForwardMode (a simple IntEnum) were ever extracted into a lightweight module without heavy dependencies, we could drop the stub entirely and import the real class. Until then, I would suggest keeping the stub minimal to reduce the drift surface.
To reduce coordination overhead, I added a single comment pointing to the source of truth. When ForwardMode changes, the developer updating req_time_stats.py will see the stub, see the comment, and know to update it. I think this is the usual way stubs coexist with heavy dependencies.
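For reviewers skimming the thread, here is a minimal sketch of the pattern under discussion. Only the three wiring lines at the bottom come from the diff above; the `_ensure_module` body and the enum members shown are illustrative assumptions, not the code in this PR.

```python
import sys
import types
from enum import IntEnum


def _ensure_module(name: str) -> types.ModuleType:
    """Return the named module, creating an empty placeholder if needed.

    Plausible implementation sketch: the placeholder lets the module under
    test import sglang.srt.model_executor.forward_batch_info without
    dragging in torch via sglang.srt.distributed.
    """
    module = sys.modules.get(name)
    if module is None:
        module = types.ModuleType(name)
        sys.modules[name] = module
    return module


# Stub of ForwardMode. Source of truth:
# sglang.srt.model_executor.forward_batch_info -- update this stub if that
# enum is changed or refactored. The members below are illustrative only.
class _ForwardMode(IntEnum):
    EXTEND = 1
    DECODE = 2


# Wiring, as in the diff above: register the stub so imports resolve.
_ensure_module("sglang.srt.model_executor")
_fbi = _ensure_module("sglang.srt.model_executor.forward_batch_info")
_fbi.ForwardMode = _ForwardMode
```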
Motivation
Issue
#20865
Adds unit tests for srt/observability/: label_transform, cpu_monitor, func_timer, req_time_stats, request_metrics_exporter, startup_func_log_and_timer, trace.
Coverage
- label_transform.py
- cpu_monitor.py
- func_timer.py
- utils.py
- startup_func_log_and_timer.py
- req_time_stats.py
- request_metrics_exporter.py
- trace.py
Test
Short Version
Full
Two warnings are unrelated to this PR:
Accuracy Tests
Benchmarking and Profiling
Checklist
Review Process
/tag-run-ci-label, /rerun-failed-ci, /tag-and-rerun-ci