rbd: When Ceph cluster becomes full, should allow user to remove rbd … #12627
yuriw merged 3 commits into ceph:master from liupan1111:wip-fix-remove-when-full
Conversation
At this moment, the behavior is to hang the client.
src/include/rados/librados.hpp (Outdated)

```cpp
int get_object_pg_hash_position2(const std::string& oid, uint32_t *pg_hash_position);
config_t cct();
IoCtxImpl *get_impl() {return io_ctx_impl;}
```
I don't think it's a good idea to expose the implementation to the user. Maybe adding an interface for what you want would be better.
Agreed, will modify later, thanks.
src/tools/rbd/Utils.h (Outdated)

```diff
 int init(const std::string &pool_name, librados::Rados *rados,
-         librados::IoCtx *io_ctx);
+         librados::IoCtx *io_ctx, bool force = false);
```
The parameter name `force` doesn't really align with what it is actually doing behind the scenes, which is to force-apply a special flag to the OSDs. Also, assuming there are no issues with the librados changes by others, I don't think this is the correct place to do this. It seems like you should be able to set this flag after opening the image for the remove actions, so you don't need to put it in this common location.
@dillaman, thank you for the review! I agree this change is better not placed in such a common place as init. I will modify it.
@dillaman, I've committed a new one, please help take a look.
The rbd changes look good to me -- a member of the core team needs to evaluate the changes to librados and the OSDs.
src/osd/PrimaryLogPG.cc (Outdated)

```diff
-  if (!(m->get_source().is_mds()) && osd->check_failsafe_full() && write_ordered) {
+  if (!(m->get_source().is_mds()) && osd->check_failsafe_full() &&
+      !m->has_flag(CEPH_OSD_FLAG_FULL_FORCE) && write_ordered) {
```
I don't think we want to touch the failsafe condition!
The ops of deleting an image will be discarded here if we don't touch the failsafe condition.
```cpp
void set_honor_osdmap_full();
void unset_honor_osdmap_full();
```
Instead of exposing this, you can use the existing librados op flag:
src/include/rados/librados.hpp: OPERATION_FULL_TRY = LIBRADOS_OPERATION_FULL_TRY,
FULL_TRY makes the OSD assemble the transaction and proceed only if it does not result in a net increase in utilization. This should be true for deleting objects.
There's also a FORCE variant, but I don't think you should need it...
In Objecter::_prepare_osd_op(), CEPH_OSD_FLAG_FULL_FORCE is set before an op is sent to an OSD if honor_osdmap_full is false. When removing an image, several ops (open, delete, watch) are sent to the OSD, and it's difficult to set the FULL_TRY flag for these ops one by one.
Hmm, in that case the FULL_TRY flag can be set in the same place as the FULL_FORCE flag: Objecter::_prepare_osd_op(). So you can add {set,unset}_osdmap_full_try(), set the flag in that one place, and then use that.
I really think we should use FULL_TRY unless we find there is some reason that can't work... if that's the case we have a bigger problem!
It would still be far preferable to use the existing librados flags, though! Please try that approach first... hopefully there is a central place where that can be done, or add an ImageCtx 'rados_flags' variable that is |'d into every request.
Right, but you shouldn't be reaching the failsafe condition. This is usually a much higher threshold than the osdmap cluster FULL flag that stops writes; it's there to prevent some other process (an errant backfill or something) from filling up an OSD. In theory it should never be triggered.
In that case, let's do one patch that renames honor_osdmap_full to set_full_force() (with the opposite argument: false by default, true to enable), and then another patch that adds a new set_full_try().
@liewegas, do you mean I need to rename honor_osdmap_full or set_honor_osdmap_full()?
@liewegas, this failsafe is not triggered by my modification... it would be hit even without my changes...
@liewegas Ping
@liewegas, could you help take a look? Thanks.
@liupan1111 I don't see any changes. I thought Sage was suggesting that you reuse the existing librados flag for allowing ops to proceed on full(?). Assuming that is the case, I think you would want to modify librbd to set the flag where needed when opening the image, and set the flag on the individual ops issued on the remove / snap remove paths.
src/tools/rbd/Utils.cc (Outdated)

```diff
 int init(const std::string &pool_name, librados::Rados *rados,
-         librados::IoCtx *io_ctx) {
+         librados::IoCtx *io_ctx, bool force) {
```
Like @dillaman said, this should also have a more descriptive name... probably bool force_full_try.
Yes, I will modify it.
Signed-off-by: Pan Liu <liupan1111@gmail.com>
Signed-off-by: Pan Liu <liupan1111@gmail.com>
…full. Signed-off-by: Pan Liu <liupan1111@gmail.com>
RBD: the TestLibRBD.FlattenNoEmptyObjects failures are unrelated and also occur on master, per @jasondillaman. RADOS: http://pulpito.ceph.com/yuriw-2017-03-02_01:20:35-rados-wip-yuri-testing_2017_3_2---basic-smithi/ and reproduce http://pulpito.ceph.com/yuriw-2017-03-02_01:56:48-rados-wip-yuri-testing_2017_3_2---basic-ovh/ — the same job passed on manual rerun, and the pass was confirmed.
…and snapshot in order to release storage space.
Signed-off-by: Pan Liu pan.liu@istuary.com