mds: dump locks when printing mutation ops#52547
Conversation
lxbsz
left a comment
There was a problem hiding this comment.
Verrrrrrry helpful improvements.
|
Actually, it just occurred to me that this could create infinite recursion due to the "xlock_by" field in the lock. I'll have to think about how we should handle that. Probably, it's a good idea to just print the request string. |
5a205bd to
dca585a
Compare
| f->open_object_section("xlock_by"); | ||
| if (get_xlock_by()) { | ||
| get_xlock_by()->dump(f); | ||
| if (auto mut = get_xlock_by(); mut) { |
There was a problem hiding this comment.
nit: Maybe this should just be:
if (auto mut = get_xlock_by(); mut != MutationRef()) {
to avoid dumping just an instance of MutationRef()?
There was a problem hiding this comment.
The intrusive pointer MutationRef() should be parsed as nullptr already IMO.
There was a problem hiding this comment.
I think so too, a null MutationRef() should be boolean false?
There was a problem hiding this comment.
hmm.. there are two kinds of checks then - one which checks for nullptr and the other for MutationRef() instance. It's a bit confusing (esp. since its a shared_ptr).
There was a problem hiding this comment.
Sorry, you're confusing me Venky. Can you give an example of what you're concerned about?
There was a problem hiding this comment.
Sorry, you're confusing me Venky. Can you give an example of what you're concerned about?
Ok. I haven't checked this closely, but in some places there is a mix of checks for the value returned by get_xlock_by() - nullptr vs MutationRef() and I wasn't 100% sure if the results of those checks are the same.
|
Changes: fairly extensive rework to allow dumping ops while holding the mds_lock. The "ops" asok command now accepts a |
56b36e9 to
a52fac8
Compare
|
Integration branch - https://shaman.ceph.com/builds/ceph/wip-vshankar-testing-20230808.093601/ Test runs in ~3 hrs. |
|
@vshankar I just realized there is a stray empty "test" file in the first commit that got added by accident. I'll rebase + remove that. |
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
It is not safe to "cache" a member which may be changed by a racing thread. This reworks the locking so we can do a light-weight check if the description is already generated wihtout acquiring the heavier TrackedOp::lock. If it's not available yet or it needs regenerated, then get the proper locks to generate it. Fixes: e45f5c2 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Also: fix return to std::string. A string_view is not a thread-safe return. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
The MDS does not generally bother locking a Mutation before changing anything so this lock provides weak protection. In any case, try to improve on that... Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
TrackedOp::lock is already suitable for this purpose. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Otherwise you can't easily know which lock it is in the corresponding MDSCacheObject. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
This avoids infinite recursion when dumping locks with xlock_by non-null. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
When dumping an op, it may be desirable to alter how it is dumped depending on which locks are held. As it happens, I plan to dump extra information if the mds_lock is held! Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
For debugging where an operation may be stuck. Fixes: https://tracker.ceph.com/issues/62086 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
To test they work, not that the output is useful. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com>
|
Looks like:
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "pacific"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
Checklist
Show available Jenkins commands
jenkins retest this pleasejenkins test classic perfjenkins test crimson perfjenkins test signedjenkins test make checkjenkins test make check arm64jenkins test submodulesjenkins test dashboardjenkins test dashboard cephadmjenkins test apijenkins test docsjenkins render docsjenkins test ceph-volume alljenkins test ceph-volume toxjenkins test windows