osd: avoid two copy with same src cancel each other #39593

Merged
tchaikov merged 1 commit into ceph:master from mychoxin:fix_promote_dead_loop
Mar 8, 2021

Conversation

@mychoxin
Contributor

@mychoxin mychoxin commented Feb 21, 2021

osd: avoid two copy with same src cancel each other and run into dead loop

For a cache tier, suppose a head object has two snaps that share the same clone object, and the
clone object has been flushed/evicted from the cache pool. If a rollback request and a snap-read
request targeting these two snaps arrive at the same time, they generate two promote requests for
the same clone object, and hence two copy ops with the same src. The second copy op cancels the
first by calling cancel_copy and kick_object_context_blocked; but after kick_object_context_blocked
runs, the promote request behind the first copy op is restarted and generates a new copy op, which
cancels the second copy op in turn. The two promote requests thus keep cancelling each other's copy
op and run into a dead loop.

Fixes: https://tracker.ceph.com/issues/49409

Signed-off-by: YuanXin <yuanxin@didiglobal.com>
Signed-off-by: mychoxin <mychoxin@gmail.com>
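The cancellation ping-pong described above can be sketched with a toy model. This is a hypothetical simulation, not Ceph's real API: `run_promotes` stands in for the two racing promotes, and `kick_restarts_other` models whether cancelling a same-src copy op also kicks (and thereby restarts) the other promote.

```python
# Toy model of the dead loop: two promotes ("rollback" and "read-snap") race
# for the same clone src. Cancelling the in-flight copy op with a kick
# restarts the cancelled promote from scratch, so the two ping-pong forever.

def run_promotes(kick_restarts_other, max_steps=10):
    """Return (number of cancels, steps taken) before settling or hitting the cap."""
    inflight = None                        # the single in-flight copy op for this src
    pending = ["rollback", "read-snap"]    # two promotes racing for one clone object
    cancels = 0
    steps = 0
    while pending and steps < max_steps:
        steps += 1
        op = pending.pop(0)
        if inflight is not None:           # same src: cancel the existing copy op
            cancels += 1
            if kick_restarts_other:        # kick_object_context_blocked restarts
                pending.append(inflight)   # the cancelled promote from scratch
        inflight = op                      # start the new copy op
    return cancels, steps

# Pre-fix behavior: the promotes keep re-cancelling each other (hits the cap).
print(run_promotes(kick_restarts_other=True))
# With the kick moved out of the same-src cancel path, the race settles.
print(run_promotes(kick_restarts_other=False))
```

In the real code the fix is more subtle (the first promote's op stays blocked until the surviving copy completes and kicks it then), but the model captures why kicking from the per-copy cancel path loops forever.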

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug

Available Jenkins commands:
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox

@github-actions github-actions Bot added the core label Feb 21, 2021
@mychoxin
Contributor Author

@tchaikov could you please take a look?

@mychoxin
Contributor Author

@liewegas could you please take a look?

@liewegas
Member

I wonder if it would be cleaner to pull the kick_object_context_blocked(cop->obc); into the callers? There are only two:

  • start_copy wouldn't kick
  • cancel_copy_ops would. (It could be renamed cancel_and_kick_copy_ops for clarity)
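A minimal sketch of the suggested split, with stand-in names and dictionaries in place of the real PrimaryLogPG machinery (`CopySketch`, the string obc handles, and the argument lists are all illustrative):

```python
# Illustrative model of the refactor: cancel_copy() only cancels; the kick
# moves into the one caller that needs it (cancel_and_kick_copy_ops), while
# start_copy() cancels a same-src copy op WITHOUT kicking.

class CopySketch:
    def __init__(self):
        self.copy_ops = {}    # src -> cop (stand-in for the in-flight copy ops)
        self.kicked = []      # record of kicked object contexts

    def kick_object_context_blocked(self, obc):
        self.kicked.append(obc)

    def cancel_copy(self, cop):
        # after the refactor: cancel only, no kick here
        self.copy_ops.pop(cop["src"], None)

    def start_copy(self, src, obc):
        old = self.copy_ops.get(src)
        if old:
            self.cancel_copy(old)          # the caller that must NOT kick
        self.copy_ops[src] = {"src": src, "obc": obc}

    def cancel_and_kick_copy_ops(self):
        # the caller that does kick (e.g. on interval change / recovery)
        for cop in list(self.copy_ops.values()):
            self.cancel_copy(cop)
            self.kick_object_context_blocked(cop["obc"])

s = CopySketch()
s.start_copy("clone", "obc1")
s.start_copy("clone", "obc2")   # same src: cancels the first, no kick
print(s.kicked)                 # [] -- no restart, so no ping-pong
s.cancel_and_kick_copy_ops()
print(s.kicked)                 # ['obc2']
```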

@mychoxin mychoxin force-pushed the fix_promote_dead_loop branch 2 times, most recently from c24105e to c2318c6 Compare February 23, 2021 04:56
@mychoxin
Contributor Author

@liewegas good idea, done; please review.

@liewegas liewegas changed the title avoid two copy with same src cancel each other osd: avoid two copy with same src cancel each other Feb 23, 2021
@liewegas
Member

Code looks good. Is this a case you can reproduce? It would be great to add a test for it.

@mychoxin
Contributor Author

Let me try it.

@github-actions github-actions Bot added the tests label Feb 26, 2021
@mychoxin mychoxin force-pushed the fix_promote_dead_loop branch from a972035 to 2296f83 Compare February 26, 2021 13:32
@mychoxin
Contributor Author

@liewegas I reproduced it and updated the description: it is not two rollback ops; the first op is a rollback and the second is a snap read. Please review.

Member

@liewegas liewegas left a comment


This looks right to me!

@liewegas liewegas requested a review from neha-ojha February 26, 2021 13:54
@mychoxin
Contributor Author

@neha-ojha Please help to review, thanks.

@mychoxin mychoxin force-pushed the fix_promote_dead_loop branch from 2296f83 to 88031b5 Compare February 27, 2021 15:40
@mychoxin
Contributor Author

@tchaikov I have rebased it, is that ok?

@tchaikov
Contributor

@mychoxin thanks! looks great!

@mychoxin mychoxin force-pushed the fix_promote_dead_loop branch from 80c890c to 9e77de1 Compare March 1, 2021 00:25
Member

@neha-ojha neha-ojha left a comment


makes sense to me, @myoungwon @athanatos WDYT?

nit: the description in #39593 (comment) looks great, can we please add the same to the commit description as well?

For a cache tier, suppose a head object has two snaps that share the same clone object, and the
clone object has been flushed/evicted from the cache pool. If a rollback request and a snap-read
request targeting these two snaps arrive at the same time, they generate two promote requests for
the same clone object, and hence two copy ops with the same src. The second copy op cancels the
first by calling cancel_copy and kick_object_context_blocked; but after kick_object_context_blocked
runs, the promote request behind the first copy op is restarted and generates a new copy op, which
cancels the second copy op in turn. The two promote requests thus keep cancelling each other's copy
op and run into a dead loop.

Fixes: https://tracker.ceph.com/issues/49409

Signed-off-by: YuanXin <yuanxin@didiglobal.com>
@mychoxin mychoxin force-pushed the fix_promote_dead_loop branch from 9e77de1 to 617f711 Compare March 1, 2021 23:55
@mychoxin
Contributor Author

mychoxin commented Mar 1, 2021

makes sense to me, @myoungwon @athanatos WDYT?

nit: the description in #39593 (comment) looks great, can we please add the same to the commit description as well?

done

@athanatos
Contributor

LGTM

@myoungwon
Member

lgtm

@mychoxin
Contributor Author

mychoxin commented Mar 5, 2021

@liewegas Are there any remaining questions?

@tchaikov
Contributor

tchaikov commented Mar 5, 2021

@mychoxin it's just pending on a rados suite run.

@tchaikov tchaikov self-assigned this Mar 7, 2021
@mychoxin
Contributor Author

mychoxin commented Mar 8, 2021

OK, it reports that ceph_test_rados crashed, but there are no details showing where it crashed.

@athanatos
Contributor

@myoungwon Can you help @mychoxin interpret the test failure?

@tchaikov
Contributor

tchaikov commented Mar 8, 2021

@mychoxin @myoungwon following is an excerpt from /a//kchai-2021-03-07_12:30:50-rados-wip-kefu-testing-2021-03-07-1842-distro-basic-smithi/5944406/teuthology.log

2021-03-07T14:01:07.092 DEBUG:teuthology.orchestra.run.smithi085:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 30 ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-osd.4.asok dump_ops_in_flight
2021-03-07T14:01:07.093 INFO:tasks.rados.rados.0.smithi083.stdout:2716:  finishing copy_from racing read to smithi08322634-40
2021-03-07T14:01:07.093 INFO:tasks.rados.rados.0.smithi083.stdout:2725:  finishing rollback tid 0 to smithi08322634-26
2021-03-07T14:01:07.093 INFO:tasks.rados.rados.0.smithi083.stdout:2716:  finishing copy_from to smithi08322634-40
2021-03-07T14:01:07.094 INFO:tasks.rados.rados.0.smithi083.stderr:/build/ceph-17.0.0-1695-gd5f415b9/src/test/osd/RadosModel.h: In function 'virtual void CopyFromOp::_finish(TestOp::CallbackInfo*)' thread 7f5da9b02700 time 2021-03-07T14:01:06.784131+0000
2021-03-07T14:01:07.094 INFO:tasks.rados.rados.0.smithi083.stderr:/build/ceph-17.0.0-1695-gd5f415b9/src/test/osd/RadosModel.h: 1938: FAILED ceph_assert(!version || comp->get_version64() == version)
2021-03-07T14:01:07.094 INFO:tasks.rados.rados.0.smithi083.stderr: ceph version 17.0.0-1695-gd5f415b9 (d5f415b9b67a6d66ef39db91322ef188bb9f833e) quincy (dev)
2021-03-07T14:01:07.094 INFO:tasks.rados.rados.0.smithi083.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14b) [0x7f5daff7f29d]
2021-03-07T14:01:07.095 INFO:tasks.rados.rados.0.smithi083.stderr: 2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7f5daff7f478]
2021-03-07T14:01:07.095 INFO:tasks.rados.rados.0.smithi083.stderr: 3: (CopyFromOp::_finish(TestOp::CallbackInfo*)+0x52b) [0x5627f03acefb]
2021-03-07T14:01:07.095 INFO:tasks.rados.rados.0.smithi083.stderr: 4: (write_callback(void*, void*)+0x19) [0x5627f03cb9a9]
2021-03-07T14:01:07.095 INFO:tasks.rados.rados.0.smithi083.stderr: 5: /usr/lib/librados.so.2(+0x99fc6) [0x7f5db8c6bfc6]
2021-03-07T14:01:07.095 INFO:tasks.rados.rados.0.smithi083.stderr: 6: /usr/lib/librados.so.2(+0xb3c25) [0x7f5db8c85c25]
2021-03-07T14:01:07.096 INFO:tasks.rados.rados.0.smithi083.stderr: 7: /usr/lib/librados.so.2(+0xb695f) [0x7f5db8c8895f]
2021-03-07T14:01:07.096 INFO:tasks.rados.rados.0.smithi083.stderr: 8: /usr/lib/librados.so.2(+0xb6c96) [0x7f5db8c88c96]
2021-03-07T14:01:07.096 INFO:tasks.rados.rados.0.smithi083.stderr: 9: /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xbd6df) [0x7f5daf8326df]
2021-03-07T14:01:07.096 INFO:tasks.rados.rados.0.smithi083.stderr: 10: /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7f5dafb056db]

@myoungwon
Member

@athanatos @tchaikov I'll take a look.

@tchaikov
Contributor

tchaikov commented Mar 8, 2021

@mychoxin @myoungwon I am not able to reproduce the failure when rerunning the failed test.

@tchaikov tchaikov merged commit 1b36d57 into ceph:master Mar 8, 2021
@neha-ojha
Member

> (quoting tchaikov's teuthology.log excerpt above)

https://tracker.ceph.com/issues/49726 is showing up in almost every rados run on master since this PR was merged. @mychoxin can you please address this?

@tchaikov
Contributor

#40057 has been created until we have a fix.

@athanatos
Contributor

If it's that common, probably worth just reverting it.

@tchaikov
Contributor

@athanatos it is pretty reproducible; see https://pulpito.ceph.com/kchai-2021-03-12_05:42:44-rados-wip-kefu-testing-2021-03-12-1106-distro-basic-smithi/5958481/.

I just added Signed-off-by and Fixes tags to #40057, and will test it in my next batch.

@myoungwon
Member

myoungwon commented Mar 12, 2021

@athanatos @tchaikov @neha-ojha
I think I found the root cause. Can you double-check this?
This looks like a sequence inversion caused by moving kick_object_context_blocked into cancel_copy_ops().

Please look at the following. During a copy-from op, ceph_test_rados also issues a stat op to check for racing reads. So:

  1. copy-from to OID A
  2. stat to OID A
  3. The OSD receives the copy-from, then issues a copy request to the target
  4. The stat is received, but it is blocked
  5. Something happens that triggers recovery
  6. cancel_copy_ops() is called, so the copy-from is canceled and re-queued

At this point, the original code invoked kick_object_context_blocked() and then cop->cb->complete(), in that order. Both re-queue ops at the front of the op queue: kick_object_context_blocked() re-enqueues the old blocked op, and cop->cb->complete() enqueues the current op via enqueue_front(). With this PR the two calls happen in reverse order, which means the stat (as described above) is served before the copy-from, because the stat now lands in the first position in the queue.

So I think we can avoid this, and still resolve the issue this PR addresses, via #40067?
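The ordering inversion can be modeled with a toy front-of-queue requeue (a sketch only: `requeue` and the deque stand in for the OSD op queue's enqueue_front behavior; whichever op is pushed to the front last runs first):

```python
from collections import deque

# Toy model: both the kicked blocked op ("stat") and the cancelled copy-from
# are pushed to the FRONT of the op queue, so the call order between the kick
# and cop->cb->complete() decides which op is served first.

def requeue(call_order):
    """Push ops to the front in call order; return the resulting queue order."""
    q = deque()
    for op in call_order:
        q.appendleft(op)     # enqueue_front
    return list(q)

# Original order: kick the blocked stat first, then enqueue_front the copy-from.
print(requeue(["stat", "copy-from"]))   # copy-from ends up first (correct)
# Order after this PR: complete() first, then the kick in cancel_copy_ops().
print(requeue(["copy-from", "stat"]))   # stat races ahead of the copy-from
```

With the inverted order the stat observes the object before the re-queued copy-from runs, which is consistent with the `FAILED ceph_assert(!version || comp->get_version64() == version)` racing-read assertion in the teuthology log.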

@tchaikov
Contributor

tchaikov commented Mar 12, 2021

@mychoxin I just reverted this change in #40057.

@myoungwon thanks for looking into this. At first glance, your analysis makes sense to me! But before your fix lands on master, I think we'd better address the test failure first. Also, could you include this change in #40067, so we can test them at the same time?

@myoungwon
Member

@tchaikov ok.
