[Metrics] Add Prometheus counters for Model FLOPs Utilization (MFU)#30950

Merged
markmc merged 2 commits into vllm-project:main from markmc:mfu-metrics-prometheus
Feb 23, 2026
@markmc
Member

@markmc markmc commented Dec 18, 2025

See #30738. This is a follow-on that exports these metrics via Prometheus in addition to the console logging.

The metrics are only calculated and exposed when `--enable-mfu-metrics` is set.
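
As a rough illustration of why cumulative counters suit this kind of metric: a consumer can take two scrapes of a monotonically increasing FLOPs counter and derive utilization over the interval as achieved FLOPs per second divided by the hardware's peak FLOPs per second. The sketch below uses only the Python standard library. The `CounterSample` type, the `mfu_between` helper, and the peak throughput value are all hypothetical names and numbers chosen for illustration; they are not the metric names or code from this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CounterSample:
    """One scrape of a monotonically increasing counter (hypothetical type)."""

    timestamp_s: float  # scrape time, in seconds
    total_flops: float  # cumulative FLOPs reported at that time


def mfu_between(
    prev: CounterSample, curr: CounterSample, peak_flops_per_s: float
) -> float:
    """Return achieved FLOPs/s over the interval as a fraction of peak FLOPs/s."""
    elapsed = curr.timestamp_s - prev.timestamp_s
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    if peak_flops_per_s <= 0:
        raise ValueError("peak_flops_per_s must be positive")

    delta = curr.total_flops - prev.total_flops
    if delta < 0:
        # A counter only decreases after a reset (e.g. a process restart),
        # so count everything accumulated since the reset.
        delta = curr.total_flops

    return delta / elapsed / peak_flops_per_s


if __name__ == "__main__":
    # Assumed figures: 5e15 FLOPs executed in 10 s on hardware
    # with an assumed peak of 1e15 FLOPs/s.
    earlier = CounterSample(timestamp_s=0.0, total_flops=0.0)
    later = CounterSample(timestamp_s=10.0, total_flops=5e15)
    print(f"MFU over interval: {mfu_between(earlier, later, 1e15):.2%}")
```

A monitoring system would normally do this differencing itself with a rate-style query over the scraped counters; the helper only makes the arithmetic explicit.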

@markmc markmc requested a review from hmellor as a code owner December 18, 2025 08:41
@mergify
Contributor

mergify Bot commented Dec 18, 2025

Documentation preview: https://vllm--30950.org.readthedocs.build/en/30950/

@mergify mergify Bot added the documentation and v1 labels Dec 18, 2025
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request successfully adds Prometheus counters for Model FLOPs Utilization (MFU) metrics, making them available for monitoring. The changes are well-integrated with the existing metrics system, including support for Ray environments. The documentation has also been updated accordingly. I've identified one area for improvement related to code duplication that would enhance maintainability.

Comment thread vllm/v1/metrics/perf.py
@mergify
Contributor

mergify Bot commented Dec 18, 2025

Hi @markmc, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then, commit the changes and push to your branch.

For future commits, pre-commit will run automatically on changed files before each commit.

Tip

Is mypy or markdownlint failing?
mypy and markdownlint are run differently in CI. If the failure is related to either of these checks, please use the following commands to run them locally:
# For mypy (substitute "3.10" with the failing version if needed)
pre-commit run --hook-stage manual mypy-3.10
# For markdownlint
pre-commit run --hook-stage manual markdownlint

@markmc markmc added the ready label Dec 18, 2025
@markmc markmc requested a review from zhuohan123 December 19, 2025 13:57
@markmc markmc moved this from Backlog to In Review in Metrics & Tracing Dec 19, 2025
@markmc markmc moved this from In Review to Ready in Metrics & Tracing Dec 19, 2025
@chaunceyjiang
Collaborator

hi @markmc any updates?

@hmellor
Member

hmellor commented Jan 2, 2026

It looks like all the tests passed but it's not been reviewed

@hmellor
Member

hmellor commented Jan 2, 2026

I don't see anything that adds --enable-mfu-metrics though

@markmc
Member Author

markmc commented Jan 13, 2026

> It looks like all the tests passed but it's not been reviewed

Yes, this is ready for review/merge.

> I don't see anything that adds --enable-mfu-metrics though

That flag was added by #30738, which only added console logging of these metrics. This PR follows on to add Prometheus support.

Contributor

@hickeyma hickeyma left a comment

Thanks @markmc, it looks good bar a few small nits. Should a unit test be added for the PerfMetricsProm class?

Comment thread vllm/v1/metrics/loggers.py
Comment thread vllm/v1/metrics/perf.py
Contributor

@hickeyma hickeyma left a comment

LGTM, thanks @markmc

@markmc markmc force-pushed the mfu-metrics-prometheus branch 3 times, most recently from c69c0c5 to bd84989 Compare February 5, 2026 07:52
@markmc markmc force-pushed the mfu-metrics-prometheus branch from bd84989 to 76f98f9 Compare February 19, 2026 08:14
Member

@hmellor hmellor left a comment

Otherwise the docs build will complain that perf.md can't be navigated to

Comment thread docs/mkdocs/hooks/generate_metrics.py Outdated
Comment thread docs/usage/metrics.md Outdated
markmc and others added 2 commits February 23, 2026 13:21
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Mark McLoughlin <markbmc@gmail.com>
@markmc markmc force-pushed the mfu-metrics-prometheus branch from 57df7a6 to fb98350 Compare February 23, 2026 13:21
@markmc markmc enabled auto-merge (squash) February 23, 2026 13:26
@markmc markmc merged commit 5cc7c44 into vllm-project:main Feb 23, 2026
46 checks passed
@github-project-automation github-project-automation Bot moved this from Ready to Done in Metrics & Tracing Feb 23, 2026
Copilot AI pushed a commit to machov/vllm that referenced this pull request Mar 10, 2026
…llm-project#30950)

Export the existing Model FLOPs Utilization (MFU) metrics via Prometheus.

`--enable-mfu-metrics` is required for these to be exposed.

Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
jiangkuaixue123 pushed a commit to jiangkuaixue123/vllm that referenced this pull request Apr 28, 2026
mystous pushed a commit to mystous/vllm_hybrid that referenced this pull request May 10, 2026
Labels

documentation, ready, v1

Projects

Status: Done

5 participants