mgr/cephadm: use real IP as host.addr whenever possible #41483
Merged
liewegas merged 7 commits into ceph:master on May 27, 2021
sebastian-philipp suggested changes on May 25, 2021
jenkins test api
We prefer to always have a real IP for hosts in the cluster. This avoids a reliance on DNS for most operations. Perhaps more importantly, it means we are less sensitive to inconsistent host lookup results, for example due to (1) mismatched /etc/hosts files between machines, or (2) a lookup of the local hostname that returns 127.0.1.1.
Adjust with_hosts() fixture to take an addr, and adjust tests accordingly.
Signed-off-by: Sage Weil <sage@newdream.net>
If the host IP/addr is known, use that. The addr might even be a FQDN instead of an IP address, in which case we want to look that up instead of the bare hostname.
Signed-off-by: Sage Weil <sage@newdream.net>
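The preference order described in this commit message can be sketched roughly as follows. This is only an illustration, not the code from the PR: `resolve_host_ip` and `_default_resolver` are hypothetical names, and the resolver is injectable purely so the behaviour can be exercised without real DNS.

```python
import ipaddress
import socket
from typing import Callable, Optional


def _default_resolver(name: str) -> str:
    # socket.gethostbyname() returns an IPv4 address string for a name.
    return socket.gethostbyname(name)


def resolve_host_ip(hostname: str,
                    addr: Optional[str] = None,
                    resolver: Callable[[str], str] = _default_resolver) -> str:
    """Hypothetical helper: prefer the known addr over the bare hostname."""
    # Use the stored addr when there is one; fall back to the hostname.
    target = addr or hostname
    try:
        # An IP literal needs no lookup at all.
        return str(ipaddress.ip_address(target))
    except ValueError:
        # Not an IP literal (for example an FQDN), so resolve it.
        return resolver(target)
```

With a stored addr of 10.0.0.5 no lookup happens; with an FQDN as addr, the FQDN is resolved rather than the short hostname.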
Signed-off-by: Sage Weil <sage@newdream.net>
- Use a centralized method get_mgr_ip()
- Look up the hostname via DNS. This is a bit more reliable than getfqdn() since it will work even when podman adds the container name to /etc/hosts.
Signed-off-by: Sage Weil <sage@newdream.net>
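For readers unfamiliar with the lookup involved, here is a minimal sketch of resolving the local hostname with socket.getaddrinfo() instead of socket.getfqdn(). It is an assumption about the general shape of such a helper, not the PR's actual get_mgr_ip(): the names `pick_ip` and `get_mgr_ip_sketch` are hypothetical, and preferring IPv4 over IPv6 is a choice made here for illustration. Whether the lookup consults /etc/hosts depends on the system resolver configuration.

```python
import socket
from typing import List, Optional, Tuple


def pick_ip(addrinfo: List[Tuple]) -> Optional[str]:
    """Pick the first IPv4 address from getaddrinfo() results, else the first IPv6 one."""
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    v4 = [info[4][0] for info in addrinfo if info[0] == socket.AF_INET]
    v6 = [info[4][0] for info in addrinfo if info[0] == socket.AF_INET6]
    candidates = v4 + v6
    return candidates[0] if candidates else None


def get_mgr_ip_sketch() -> Optional[str]:
    """Hypothetical stand-in for a centralized 'what is my IP' method."""
    hostname = socket.gethostname()
    try:
        infos = socket.getaddrinfo(hostname, None, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # The name could not be resolved at all.
        return None
    return pick_ip(infos)
```

Keeping the selection logic in `pick_ip` separate from the lookup makes it possible to check the address preference without touching the network.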
Previously we allowed the host.addr to be a DNS name (short or fqdn). This is problematic because of the inconsistent way that docker and podman handle /etc/hosts, and undesirable because relying on external DNS is an external source of failure for the cluster without any benefit in return (simply updating DNS is not sufficient to make ceph behave).
So: update any non-IP to an IP as soon as we start up (presumably on upgrade). If we get a loopback address (127.0.0.1 or 127.0.1.1), then wait and hope that the next instance of the manager has better luck.
Signed-off-by: Sage Weil <sage@newdream.net>
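The startup migration described in this commit message could look something like the sketch below. It is a simplified illustration under stated assumptions, not the PR's implementation: `is_ip` and `upgraded_addr` are hypothetical names, the resolver is passed in so the logic can be tested offline, and the loopback check uses ipaddress's is_loopback, which covers all of 127.0.0.0/8 and ::1 rather than only the two addresses named above.

```python
import ipaddress
from typing import Callable, Optional


def is_ip(addr: str) -> bool:
    """Return True if addr parses as an IPv4 or IPv6 literal."""
    try:
        ipaddress.ip_address(addr)
        return True
    except ValueError:
        return False


def upgraded_addr(addr: str, resolver: Callable[[str], str]) -> Optional[str]:
    """Return the IP to store for a host, or None to leave the entry unchanged for now."""
    if is_ip(addr):
        # Already an IP, nothing to migrate.
        return None
    try:
        ip = resolver(addr)
    except OSError:
        # Lookup failed (socket.gaierror is an OSError); try again on a later start.
        return None
    if ipaddress.ip_address(ip).is_loopback:
        # A loopback result is useless to other hosts; wait for a better answer.
        return None
    return ip
```

Returning None for both "nothing to do" and "try again later" keeps the caller simple: it only rewrites host.addr when it gets a usable, non-loopback IP back.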
Signed-off-by: Sage Weil <sage@newdream.net>
This reverts cfc1f91, which is no longer necessary because (1) we don't use socket.getfqdn(), and (2) we generally do not rely on DNS or /etc/hosts at all anymore (with the exception of the upgrade transition).
Signed-off-by: Sage Weil <sage@newdream.net>
Follow-up: #42793
Any reliance on DNS or /etc/hosts is fraught:
Instead, use our own known IP addresses wherever possible.
NOTE: This PR does not remove the --no-hosts flag to podman, because without it podman insists on adding an entry for the hostname and container name with 127.0.0.1. Mostly that doesn't matter, except that other modules may (and mgr/nfs does) use resolve_ip to look up the IP of a hostname. Docker does not fuss with /etc/hosts by default, so this is not a problem there. Unfortunately, there is no way to make podman behave the same way, that is, to pass the host's /etc/hosts through unmolested without adding magic entries to it. This may have some effect on an upgraded octopus cephadm cluster that relied heavily on /etc/hosts, but that should be a very rare case, and such clusters are already broken with the current pacific release.
TODO: