memcached: Make config non-HA-aware (bsc#1038223) #1341

Closed
cmurphy wants to merge 1 commit into crowbar:master from cmurphy:fix-memcached

@cmurphy
Contributor

@cmurphy cmurphy commented Oct 2, 2017

Without this patch, when deployed in an HA configuration, all of the
barclamps set their cache servers to all of the memcached servers in the
cluster in lexicographical order. This is not an optimal way to
configure memcached servers, since if part of the cluster is down, the
memcached servers living on it will be inaccessible. The OpenStack
service is not tied to pacemaker and has no way of knowing that, so it
passes the entire list to python-memcached, which attempts to connect to
each server serially, not attempting the next one until the previous one
times out. The effect is that any query to the OpenStack service will
take a very long time if the first memcached server in the list is down.
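
To illustrate the delay described above, here is a minimal sketch that models the serial-connect behavior as this description states it. It is not python-memcached's actual code: the `lookup_latency` function, the node names, the 3-second timeout, and the 1 ms round trip are all assumptions chosen for illustration.

```python
from typing import Dict, List


def lookup_latency(servers: List[str], up: Dict[str, bool],
                   timeout: float = 3.0, rtt: float = 0.001) -> float:
    """Return the seconds spent before a reachable server answers.

    Simplified model: servers are tried in list order, and each
    unreachable server costs one full connect timeout (assumed value).
    """
    elapsed = 0.0
    for server in servers:
        if up.get(server, False):
            return elapsed + rtt  # reachable server answers after one round trip
        elapsed += timeout        # unreachable server: wait out the timeout
    return elapsed                # no server reachable at all


# Hypothetical three-node cluster, listed in lexicographical order.
cluster = sorted(["node2:11211", "node1:11211", "node3:11211"])
# The first server in the sorted list is down.
state = {"node1:11211": False, "node2:11211": True, "node3:11211": True}

all_servers_latency = lookup_latency(cluster, state)
local_only_latency = lookup_latency(["node2:11211"], state)
```

Under these assumptions, a service on node2 that is given the full list pays one timeout per lookup while node1 is down, whereas a service that only uses its local memcached does not.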

This patch fixes the issue by only using the local memcached server
instead of using all in the cluster. This also adjusts the
get_memcached_servers helper method to only accept one node as input
since, knowing what we now know, we're unlikely to need more than one.
The get_memcached_servers method was implemented while updating the
barclamps to prevent deprecation warnings emitted by keystonemiddleware
in Ocata[1] and was mimicking old behavior used to set the cache servers
for keystone and nova.

[1] https://docs.openstack.org/releasenotes/keystonemiddleware/ocata.html

@cmurphy
Contributor Author

cmurphy commented Oct 2, 2017

Cloud7 version is here: #1340 (not cherry-picked)

stefannica previously approved these changes Oct 2, 2017
nicolasbock previously approved these changes Oct 2, 2017
@dirkmueller
Contributor

NoMethodError: undefined method `get_memcached_servers' for MemcachedHelper:Module

we need some other patch elsewhere?

where is the code that sets up memcached replication? basically when starting the memcached we need to tell it which one is the replication master and how to reach it.

@cmurphy cmurphy dismissed stale reviews from nicolasbock and stefannica via a16aa9a October 4, 2017 14:40
@cmurphy
Contributor Author

cmurphy commented Oct 4, 2017

@dirkmueller I had forgotten that the swift barclamp was still using that method name.

I decided that since swift is special I would rather leave it alone, so I changed the method name back to plural and reverted the variable name changes, so this patch is a lot smaller now.

Without this patch, when deployed in an HA configuration, all of the
barclamps set their cache servers to all of the memcached servers in the
cluster in lexicographical order. This is not an optimal way to
configure memcached servers, since if part of the cluster is down, the
memcached servers living on it will be inaccessible. The OpenStack
service is not tied to pacemaker and has no way of knowing that, so it
passes the entire list to python-memcached, which attempts to connect to
each server serially, not attempting the next one until the previous one
times out. The effect is that any query to the OpenStack service will
take a very long time if the first memcached server in the list is down.

This patch fixes the issue by only using the local memcached server
instead of using all in the cluster. This is done for all barclamps
using memcached except for swift, since swift has its own way of doing
HA without pacemaker and also implements its own memcached client, so we
might as well leave it alone.
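
The selection rule described above could be sketched roughly as follows. This is only an illustration in Python, not the barclamps' Ruby helper: the `memcached_servers` function, its `ha_aware` flag, and the addresses are hypothetical.

```python
from typing import List

MEMCACHED_PORT = 11211  # memcached's default port


def memcached_servers(local_addr: str, cluster_addrs: List[str],
                      ha_aware: bool = False) -> List[str]:
    """Return the cache server list a service would be configured with.

    Hypothetical helper: ha_aware=True mimics the old behavior (every
    cluster node, sorted lexicographically), ha_aware=False mimics the
    patched behavior (local node only).
    """
    addrs = sorted(cluster_addrs) if ha_aware else [local_addr]
    return [f"{addr}:{MEMCACHED_PORT}" for addr in addrs]


# Example addresses, invented for illustration.
nodes = ["192.168.124.83", "192.168.124.81", "192.168.124.82"]

patched = memcached_servers("192.168.124.82", nodes)
old_style = memcached_servers("192.168.124.82", nodes, ha_aware=True)
```

In this sketch a service such as swift, which is left alone by the patch, would keep the `ha_aware=True` style list.
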
@cmurphy
Contributor Author

cmurphy commented Oct 12, 2017

Closing for the reasons given here: #1340 (comment)
