[4.0] memcached: Make config non-HA-aware (bsc#1038223) #1340
cmurphy wants to merge 1 commit into crowbar:stable/4.0
Conversation
Cloud8 version is here: #1341 (not cherry-picked)
nicolasbock left a comment:
If I understood this correctly, then we won't use several memcached instances within HA but only the on-node instance. That is giving up quite a lot of potential performance, isn't it? This sounds like a pretty fundamental limitation of the oslo cache code.
@nicolasbock what performance gains did we get from using multiple cache instances? I can't find information on how configuring memcached in a cluster improves performance, and in fact from https://github.com/memcached/memcached/wiki/Performance#maximum-number-of-nodes-in-a-cluster there is the potential for it to impede performance. The issue is actually in python-memcached, not oslo.cache: https://github.com/linsomniac/python-memcached/blob/1.58/memcache.py#L444-L448
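The serial-fallback behavior linked above can be sketched like this (a toy model only, not the real python-memcached code; `pick_server` and `fake_connect` are invented names):

```python
import socket
import time

def pick_server(servers, connect, timeout=3.0):
    """Return the first server that accepts a connection, trying serially.

    A down-but-routable node costs a full connect timeout before the
    next candidate is even attempted, so latency grows with each dead node.
    """
    for host in servers:
        try:
            connect(host, timeout)
            return host
        except socket.timeout:
            continue  # only now move on to the next server
    return None

def fake_connect(host, timeout):
    """Stand-in for a TCP connect; 'dead-*' hosts hang until the timeout."""
    if host.startswith("dead"):
        time.sleep(timeout)  # emulate a hung connect to a partitioned node
        raise socket.timeout(host)
    # 'live' hosts connect immediately

start = time.monotonic()
chosen = pick_server(["dead-1:11211", "dead-2:11211", "live:11211"],
                     fake_connect, timeout=0.05)
elapsed = time.monotonic() - start
# chosen is the live server, reached only after waiting out two full timeouts
```

With a default timeout of several seconds per dead node, every cache lookup behind such a client stalls for the whole fallback chain, which is the slowdown described in the bug.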
In a multiprocess keystone setup in which all keystone instances access the same pool of memcached servers, I would expect a potential increase in keystone performance because of the shared cache. It sounds like the memcached design does not consider failed nodes though (I couldn't find anything in their documentation), and waiting for a connection to time out definitely decreases performance. Since you found that we can't tune anything to change that behavior, I agree that it's better to simply use the local memcached only. Thanks for the additional details!
stefannica: @cmurphy there were two `memcached_servers.join` calls in the nova config file
Force-pushed from 159d168 to 5562d2f
@stefannica thanks, fixed
Without this patch, the keystone and nova barclamps set their cache servers to all of the memcached servers in the cluster, in lexicographical order. This is not an optimal way to configure memcached servers: if part of the cluster is down, the memcached servers living on it will be inaccessible. The python-memcached backend is not tied to pacemaker and has no way of knowing that a server is down, so it attempts to connect to each server serially, not attempting the next one until the first times out. The effect is that any query to the OpenStack service will take a very long time.

This patch fixes the issue by using only the local memcached server for keystonemiddleware instead of all of the servers in the cluster. This means every controller in the cluster will use only its own memcached server, similar to how it would work if it were using an in-process cache.
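Concretely, the change amounts to something like this in each service's `[keystone_authtoken]` section (a hypothetical before/after sketch; the node names are invented, and `memcached_servers` is the keystonemiddleware auth_token option being set):

```ini
[keystone_authtoken]
# Before: every memcached in the cluster, in lexicographical order.
# If node-1 is down, every lookup waits out its timeout first.
#memcached_servers = node-1:11211,node-2:11211,node-3:11211

# After: each controller talks only to its own local instance.
memcached_servers = localhost:11211
```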
Force-pushed from 5562d2f to 764c1b9
I'm pretty sure the failure here must be caused by this change
@nicolasbock after doing a lot of reading I understand better what you were saying about performance gains: if we have two separate caches, then each controller has to cache everything itself, doubling or tripling the number of writes we have to do. Not ideal.

The HA job is failing here because when the ceph cookbook tries to make a role assignment, it makes two requests: one to GET the role assignments for the ceph user and one to PUT the new role assignment. Previously it was fairly consistent about which controller ended up receiving the requests. Now a GET for role assignments would go to one controller and produce a cache hit containing just the 'member' role (which comes from the ceph user having a default tenant set), which is not the intended assignment of 'admin'. It would then issue a PUT to try to correct the role assignment and fail with an HTTP 409, because it had already created this role assignment in an earlier chef run; that request had just gone to a different controller and was therefore only cached on that controller.

I think this particular issue could be corrected by using the keystone v3 API for role assignments (which we already do in master), which wouldn't consider a default project to be a role assignment and would therefore have a cache miss and seek the role assignments from the database. But this illustrates the potential for a sort of split-brain problem that is not really acceptable, in addition to the performance hit.

I commented on the bug that I think the problem that prompted this is not really a problem any more since we switched to the memcache_pool backend, so closing this.
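The split-brain scenario described above can be modeled in a few lines (a toy sketch, not Crowbar or Keystone code; `Controller`, `get_roles`, and `put_role` are invented names standing in for the real services):

```python
class Controller:
    """Toy model: controllers share a database but each has a private cache."""

    def __init__(self, db):
        self.db = db        # shared backing store (the Keystone DB stand-in)
        self.cache = {}     # node-local memcached stand-in

    def get_roles(self, user):
        if user not in self.cache:              # cache miss: read through
            self.cache[user] = set(self.db.get(user, set()))
        return self.cache[user]

    def put_role(self, user, role):
        roles = self.db.setdefault(user, set())
        if role in roles:                       # mirrors the HTTP 409 in the story
            raise ValueError("409 Conflict: assignment already exists")
        roles.add(role)
        self.cache[user] = set(roles)           # only *this* node's cache updates

db = {}
a, b = Controller(db), Controller(db)

a.get_roles("ceph")           # controller A caches an empty assignment set
b.put_role("ceph", "admin")   # B writes; A's cache is never invalidated
stale = a.get_roles("ceph")   # A still serves its stale cached value
```

A retried run that repeats the `put_role` against controller B hits the 409, while controller A keeps answering from its stale cache, which is the divergence the HA job tripped over.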
That's interesting @cmurphy. Nice analysis!