Skip to content

common/TrackedOp: rename and raise prio of slow op perfcounter#52068

Merged
yuriw merged 1 commit intoceph:mainfrom
YiteGu:fix-trackedop-perfcounter
Jul 31, 2024
Merged

common/TrackedOp: rename and raise prio of slow op perfcounter#52068
yuriw merged 1 commit intoceph:mainfrom
YiteGu:fix-trackedop-perfcounter

Conversation

@YiteGu
Copy link
Member

@YiteGu YiteGu commented Jun 15, 2023

  1. TrackedOp is a common class, mon and mds alse use it.
  2. Set prio level of l_trackedop_slow_op_count to PRIO_USEFUL, we can get change of slow op of daemon from prometheus metrics, it's very interesting

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows

@YiteGu
Copy link
Member Author

YiteGu commented Jun 15, 2023

# ceph -c ceph.conf daemon asok/osd.0.asok perf dump | grep -A 5 trackedop
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
    "trackedop": {
        "slow_ops_count": 0
    }
}
# ceph -c ceph.conf daemon asok/mon.a.asok perf dump | grep -A 5 trackedop
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
    "trackedop": {
        "slow_ops_count": 0
    }
}
# ./ceph -c ceph.conf daemon asok/mds.c.asok perf dump | grep -A 5 trackedop
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
    "trackedop": {
        "slow_ops_count": 0
    }
}

@YiteGu
Copy link
Member Author

YiteGu commented Jun 15, 2023

from metrics:

# HELP ceph_trackedop_slow_ops_count Number of operations taking over ten second
# TYPE ceph_trackedop_slow_ops_count counter
ceph_trackedop_slow_ops_count{ceph_daemon="mds.c"} 0.0
ceph_trackedop_slow_ops_count{ceph_daemon="mon.a"} 0.0
ceph_trackedop_slow_ops_count{ceph_daemon="mon.b"} 0.0
ceph_trackedop_slow_ops_count{ceph_daemon="mon.c"} 0.0
ceph_trackedop_slow_ops_count{ceph_daemon="osd.0"} 0.0
ceph_trackedop_slow_ops_count{ceph_daemon="osd.1"} 0.0
ceph_trackedop_slow_ops_count{ceph_daemon="osd.2"} 0.0

@YiteGu
Copy link
Member Author

YiteGu commented Jun 15, 2023

@rzarzynski @athanatos have a code review, urgent

1. TrackedOp is a common class, mon and mds alse use it.
2. Set prio level of l_trackedop_slow_op_count to PRIO_USEFUL,
   we can get change of slow op of daemon from prometheus metrics,
   it's very interesting

Signed-off-by: Yite Gu <yitegu0@gmail.com>
@athanatos
Copy link
Contributor

So, the main change here is to set the priority for l_osd_slow_op_count to PRIO_USEFUL. Seems reasonable to me. @rzarzynski @batrick Does this seem reasonable to you?

Copy link
Member

@batrick batrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rewrite commit title as: common/TrackedOp: rename and raise prio of slow op perfcounter

@batrick
Copy link
Member

batrick commented Jun 20, 2023

So, the main change here is to set the priority for l_osd_slow_op_count to PRIO_USEFUL. Seems reasonable to me. @rzarzynski @batrick Does this seem reasonable to you?

Yup!

Copy link
Contributor

@rzarzynski rzarzynski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM assuming the Patrick's review gets addressed.

@YiteGu YiteGu force-pushed the fix-trackedop-perfcounter branch from 54bcdc1 to e1a0284 Compare June 21, 2023 01:20
@YiteGu YiteGu changed the title osd/TrackedOp: Redefine TrackedOp perfcounter common/TrackedOp: rename and raise prio of slow op perfcounter Jun 21, 2023
@YiteGu
Copy link
Member Author

YiteGu commented Jun 21, 2023

@batrick Rewrite commit title done

@YiteGu YiteGu requested a review from batrick June 21, 2023 01:24
@YiteGu
Copy link
Member Author

YiteGu commented Aug 1, 2023

results show
截屏2023-08-01 下午1 56 15

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Oct 30, 2023
@YiteGu
Copy link
Member Author

YiteGu commented Oct 31, 2023

Hi @ljflores this pr need testing

@github-actions github-actions bot removed the stale label Oct 31, 2023
@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Dec 30, 2023
@athanatos athanatos removed the stale label Jan 2, 2024
@github-actions
Copy link

github-actions bot commented Mar 2, 2024

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Mar 2, 2024
@github-actions
Copy link

github-actions bot commented Apr 1, 2024

This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution!

@github-actions github-actions bot closed this Apr 1, 2024
@YiteGu
Copy link
Member Author

YiteGu commented Jul 1, 2024

@rzarzynski I think we need to fix this problem, otherwise, the slow op of mon and mds will be represented by the keyword osd, this will cause ambiguity

@batrick batrick reopened this Jul 22, 2024
@batrick batrick added core and removed stale labels Jul 22, 2024
@batrick
Copy link
Member

batrick commented Jul 22, 2024

ping @ljflores @rzarzynski @yuriw please QA.

@rzarzynski
Copy link
Contributor

jenkins test make check

@rzarzynski
Copy link
Contributor

needs-qa is already assigned.

@yuriw, @ljflores: let's take this into the very next QA batch. Pinging also in Slack.

@rzarzynski
Copy link
Contributor

Looks unrelated at first glance:

Traceback (most recent call last):
  File "/home/jenkins-build/build/workspace/ceph-pull-requests/build/src/pybind/mgr/dashboard/frontend/node-env/bin/nodeenv", line 8, in 
    sys.exit(main())
  File "/home/jenkins-build/build/workspace/ceph-pull-requests/build/src/pybind/mgr/dashboard/frontend/node-env/lib/python3.10/site-packages/nodeenv.py", line 1130, in main
    create_environment(env_dir, args)
  File "/home/jenkins-build/build/workspace/ceph-pull-requests/build/src/pybind/mgr/dashboard/frontend/node-env/lib/python3.10/site-packages/nodeenv.py", line 1006, in create_environment
    install_node(env_dir, src_dir, args)
  File "/home/jenkins-build/build/workspace/ceph-pull-requests/build/src/pybind/mgr/dashboard/frontend/node-env/lib/python3.10/site-packages/nodeenv.py", line 763, in install_node
    install_node_wrapped(env_dir, src_dir, args)
  File "/home/jenkins-build/build/workspace/ceph-pull-requests/build/src/pybind/mgr/dashboard/frontend/node-env/lib/python3.10/site-packages/nodeenv.py", line 786, in install_node_wrapped
    download_node_src(node_url, src_dir, args)
  File "/home/jenkins-build/build/workspace/ceph-pull-requests/build/src/pybind/mgr/dashboard/frontend/node-env/lib/python3.10/site-packages/nodeenv.py", line 618, in download_node_src
    dl_contents = _download_node_file(node_url)
  File "/home/jenkins-build/build/workspace/ceph-pull-requests/build/src/pybind/mgr/dashboard/frontend/node-env/lib/python3.10/site-packages/nodeenv.py", line 610, in _download_node_file
    raise e
  File "/home/jenkins-build/build/workspace/ceph-pull-requests/build/src/pybind/mgr/dashboard/frontend/node-env/lib/python3.10/site-packages/nodeenv.py", line 602, in _download_node_file
    return io.BytesIO(urlopen(node_url).read())
  File "/usr/lib/python3.10/http/client.py", line 482, in read
    s = self._safe_read(self.length)
  File "/usr/lib/python3.10/http/client.py", line 633, in _safe_read
    raise IncompleteRead(data, amt-len(data))
http.client.IncompleteRead: IncompleteRead(12623872 bytes read, 34097368 more expected)

@rzarzynski
Copy link
Contributor

jenkins test make check

@ljflores
Copy link
Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants