Skip to content

kubeadm: HA documentation#12923

Closed
Klaven wants to merge 1 commit intokubernetes:dev-1.14from
Klaven:dev-1.14
Closed

kubeadm: HA documentation#12923
Klaven wants to merge 1 commit intokubernetes:dev-1.14from
Klaven:dev-1.14

Conversation

@Klaven
Copy link
Copy Markdown

@Klaven Klaven commented Mar 1, 2019

this PR will update the kubeadm reference documentation regarding HA for 1.14.

xref: kubernetes/kubeadm#1422

/sig cluster-lifecycle

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Mar 1, 2019
@k8sio-netlify-preview-bot
Copy link
Copy Markdown
Collaborator

k8sio-netlify-preview-bot commented Mar 1, 2019

Deploy preview for kubernetes-io-vnext-staging processing.

Building with commit 0da3d32

https://app.netlify.com/sites/kubernetes-io-vnext-staging/deploys/5c87d2e3d18c47000808dfa5

@k8s-ci-robot k8s-ci-robot added language/en Issues or PRs related to English language size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Mar 1, 2019
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: bradtopol

If they are not already assigned, you can assign the PR to them by writing /assign @bradtopol in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested review from jbeda and luxas March 1, 2019 15:38
@Klaven Klaven mentioned this pull request Mar 1, 2019
3 tasks
@neolit123
Copy link
Copy Markdown
Member

/sig cluster-lifecycle
/assign

@zacharysarah zacharysarah added this to the 1.14 milestone Mar 7, 2019
@neolit123
Copy link
Copy Markdown
Member

neolit123 commented Mar 8, 2019

in this PR we are going to have to document that there is no config field for the copy-certs functionality and we have to either pass everything as flags for HA join or to only use the --config .... --certificate-key ... mixture.
/cc @ereslibre

@k8s-ci-robot k8s-ci-robot requested a review from ereslibre March 8, 2019 21:03
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Mar 12, 2019
@neolit123
Copy link
Copy Markdown
Member

@kubernetes/sig-cluster-lifecycle-pr-reviews

@Klaven Klaven changed the title [WIP] kubeadm: HA documentation placeholder kubeadm: HA documentation Mar 13, 2019
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 13, 2019
@neolit123
Copy link
Copy Markdown
Member

/assign @fabriziopandini

Copy link
Copy Markdown
Member

@yagonobre yagonobre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Klaven

cluster.
- Copy this output to a text file. You will need it later to join control plane and worker nodes to the cluster.
- When `--experimental-upload-certs` is used with `kubeadm init`, the certificates of the primary control plane
are uploaded to the cluster in a Secret and encrypted.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe are encrypted and uploaded to the kubeadm-certs secret

- It's recommended that you join new control plane nodes in a sequence, only after the first node has finished initializing.

1. Copy the certificate files from the first control plane node to the rest:
1. Optionally copy the `admin.conf` file from the first control plane node to the rest:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we should copy this config? [1]

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to use kubectl on the other CP nodes.
i'm OK with removing this but waiting on review by @fabriziopandini

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The join don't generate a config?

Copy link
Copy Markdown
Member

@neolit123 neolit123 Mar 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's actually a good point...
possibly best to remove this anyway.

@Klaven please remove these lines/steps

  1. Optionally copy the admin.conf...

and

  1. Optionally move the admin.conf file created...

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not necessary to copy the admin.conf file
However, I think we should still provide user instruction for manually copy certs if they want to opt out from the --experimental-upload-certs feature, so my suggestion is to move everyting related to "configure SSH" and scp certs (without the admin.conf file) in a dedicated paragraph "manual certs distribution"

### Steps for the rest of the control plane nodes

1. Move the files created by the previous step where `scp` was used:
1. Optionally move the `admin.conf` file created by the previous step where `scp` was used:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[1]

@neolit123
Copy link
Copy Markdown
Member

/priority important-soon

@k8s-ci-robot k8s-ci-robot added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Mar 14, 2019
@jimangel jimangel force-pushed the dev-1.14 branch 2 times, most recently from 2534806 to ead0a28 Compare March 14, 2019 03:47
Copy link
Copy Markdown
Member

@fabriziopandini fabriziopandini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Klaven great to see this work!

I think we can simplify the doc by moving "Configure SSH", and all the manual copy instruction under a separated paragraph "manual certs distribution".
In other words, the main workflow described should focus on using automatic copy, but there should be a note "if instead you want to copy certs manually/using automation tools, please refer to "manual copy certs" paragraph.

Additionally, if you don't mind, there are also following possible improvements to this doc:

  • L22 (before your change) Change "Your clusters must run Kubernetes version 1.12 or later." into "This document refers to kubeadm version 1.14; similar documents are available for preview version as well"
  • L60 (before your change) Remove "{{< note >}} The following examples run Calico ..." Not true anymore
  • L71 (before your change) Remove "{{< note >}} {{< note >}} All commands ...". It seems out of context here and from a quick check there is already sudo before all command
  • L163 (before your change) Remove apiServer field and nested CertsSANs list (not necessary)
  • L295 (before your change) Same (remove apiServer/CertsSANs)
  • L343 (before your change) Remove Install a pod network, it is already part of the instructions above (eventually the content of the paragraph con be moved above)

sudo kubeadm init --config=kubeadm-config.yaml --experimental-upload-certs
```

- The `--experimental-upload-certs` flags is used to upload the control plane
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about:
"The --experimental-upload-certs flags is used to upload the certificates that should be shared across all the control-plane instances to the cluster; If instead, you prefer to copy certs across control-plane nodes manually or using automation tools, please remove this flag and refer to "manual certs distribution" paragraph"

- Copy this output to a text file. You will need it later to join control plane and worker nodes to the cluster.
- When `--experimental-upload-certs` is used with `kubeadm init`, the certificates of the primary control plane
are uploaded to the cluster in a Secret and encrypted.
- When joining new control plane nodes using `kubeadm join ... --experimental-control-plane`,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move this point after the kubeadm join --experimental-control-plane command (L256 before your changes)

```

- It's recommended that you join new control plane nodes only after the first node has finished initializing.
- It's recommended that you join new control plane nodes in a sequence, only after the first node has finished initializing.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move this point before/after the kubeadm join --experimental-control-plane command (L256 before your changes)

- It's recommended that you join new control plane nodes in a sequence, only after the first node has finished initializing.

1. Copy the certificate files from the first control plane node to the rest:
1. Optionally copy the `admin.conf` file from the first control plane node to the rest:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not necessary to copy the admin.conf file
However, I think we should still provide user instruction for manually copy certs if they want to opt out from the --experimental-upload-certs feature, so my suggestion is to move everyting related to "configure SSH" and scp certs (without the admin.conf file) in a dedicated paragraph "manual certs distribution"

- `ETCD_2_IP`

1. Run `kubeadm init --config kubeadm-config.yaml` on this node.
1. Run `kubeadm init --config kubeadm-config.yaml --experimental-upload-certs` on this node.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yagonobre I don't remember if certificates for the external etcd are copied automatically or not...
If not, we should add a note that the user must take care of distributing those certs on all control-plane nodes

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's upload, we only skip certs upload in case of external ca.

...
You can now join any number of machines by running the following on each node
as root:
You can now join any number of control-plane node running the following command on each as a root:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
You can now join any number of control-plane node running the following command on each as a root:
You can now join any number of control-plane nodes by running the following command on each as a root:

Copy link
Copy Markdown
Member

@neolit123 neolit123 Mar 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a CLI typo, can't fix here.
edit: well we can, but we need to fix in kubeadm too.

kubeadm join 192.168.0.200:6443 --token 9vr73a.a8uxyaju799qwdjv --discovery-token-ca-cert-hash sha256:7c2e69131a36ae2a042a339b33381c6d0d43887e2de83720eff5359e26aec866 --experimental-control-plane --certificate-key f8902e114ef118304e561c3ecd4d0b543adc226b7a07f675f56564185ffe0c07

kubeadm join 192.168.0.200:6443 --token j04n3m.octy8zely83cy2ts --discovery-token-ca-cert-hash sha256:84938d2a22203a8e56a787ec0c6ddad7bc7dbd52ebabc62fd5f4dbea72b14d1f
Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe highlight this a bit more by wrapping it in note tags

```

- It's recommended that you join new control plane nodes only after the first node has finished initializing.
- It's recommended that you join new control plane nodes in a sequence, only after the first node has finished initializing.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- It's recommended that you join new control plane nodes in a sequence, only after the first node has finished initializing.
- It's recommended that you join new control plane nodes sequentially, only after the first node has finished initializing.

@neolit123
Copy link
Copy Markdown
Member

/close
in favor of #13191

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

@neolit123: Closed this PR.

Details

In response to this:

/close
in favor of #13191

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. language/en Issues or PRs related to English language priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants