Conversation

@href href commented Sep 8, 2023

Introduction

With this change, Kubernetes can annotate nodes with metadata that we provide, detect whether a node is shut down, and automatically clean up nodes whose servers have been deleted.
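
Concretely, these capabilities map onto the methods of the cloudprovider.InstancesV2 interface from k8s.io/cloud-provider. The following is only a minimal sketch of what such an implementation has to provide, not the code from this PR; the package and type names and the return values are illustrative:

package ccm // hypothetical package name

import (
	"context"

	v1 "k8s.io/api/core/v1"
	cloudprovider "k8s.io/cloud-provider"
)

// exampleInstances is an illustrative InstancesV2 implementation.
type exampleInstances struct{}

// InstanceExists reports whether the backing server still exists. Returning
// false allows the CCM to clean up the corresponding node.
func (e *exampleInstances) InstanceExists(ctx context.Context, node *v1.Node) (bool, error) {
	return true, nil // a real implementation would query the cloud API here
}

// InstanceShutdown reports whether the backing server is powered off.
func (e *exampleInstances) InstanceShutdown(ctx context.Context, node *v1.Node) (bool, error) {
	return false, nil
}

// InstanceMetadata supplies the metadata (provider ID, instance type,
// node addresses, zone/region) that the CCM applies to the node.
func (e *exampleInstances) InstanceMetadata(ctx context.Context, node *v1.Node) (*cloudprovider.InstanceMetadata, error) {
	return &cloudprovider.InstanceMetadata{ProviderID: "example://server-id"}, nil
}

// Compile-time check that the sketch satisfies the interface.
var _ cloudprovider.InstancesV2 = (*exampleInstances)(nil)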

There is currently no documentation on how to migrate an existing cluster, nor on how to use this properly on a new one. However, you can use helpers/run-in-test-cluster to spawn a cluster that has the CCM installed.

Cluster Requirements

Basically, two things are needed. First, kubelets must be started with `--cloud-provider=external`, which signals to the kubelet that it should taint new nodes as uninitialized on startup and defer their initialization to a cloud controller manager. Second, you need to install the cloud controller manager using the manifest in the deploy directory.

I'll add documentation about this in a later step, when load balancers have been implemented as well. Migration would have to be tested separately anyway, and maybe cross-checked with our customers.

Automated Testing

I integrated linting, unit tests, and integration tests. They are run whenever a new change is pushed. I rely more heavily on unit tests, as integration tests are slow and somewhat cumbersome.

Table-Based Testing

I also changed my stance on "table-based" testing. I've been using the following approach in my own code, and found that others do the same:

func TestSomething(t *testing.T) {
    // Define the assertion helper first, then list the parameters as plain calls.
    assertSomething := func(input string) {
        // ... assertions on input go here ...
    }

    assertSomething("foo")
    assertSomething("bar")
}

This fits my brain better (I like to write the test method before the test parameters), it allows setting breakpoints for specific parameters (which is cumbersome with tables), and it works the same for static lists of parameters and dynamically generated ones (e.g. by looping over something, as sketched below).
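
For example, the same helper pattern works when the parameters are generated on the fly (a minimal sketch; assertSomething stands in for the helper above and fmt is assumed to be imported):

func TestSomethingGenerated(t *testing.T) {
    assertSomething := func(input string) {
        // ... assertions on input go here ...
    }

    // Parameters generated in a loop work exactly like a static list of calls.
    for i := 0; i < 5; i++ {
        assertSomething(fmt.Sprintf("input-%d", i))
    }
}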

Example of this being used elsewhere:
https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/lib/logstorage/parser_test.go

I would have discussed this with you beforehand, but in your absence I decided to go ahead with this approach. I hope that is okay with you.

href added 9 commits August 22, 2023 09:37
Without this option, the instance metadata updates generally don't
work. Kubernetes still seems to ask our endpoints for these, but it
does not apply them anymore.

Our customers will have to use `--cloud-provider=external`, which has
the effect of tainting new nodes as uninitialized, until our CCM
confirms that the node exists.

It's technically possible to have multiple CCMs in a single cluster
(multi-cloud), but it's neither well supported nor tested by anyone
from what I can tell.
This is not automatically tested yet, but it implements everything
we need to provide to have InstancesV2 work as expected in a
test-cluster:

* The nodes are initialized once the CCM is ready.
* Node metadata is updated after changes.
* When a server is shut down, Kubernetes learns about it.
* When a server is deleted, Kubernetes deletes the node.
@href href requested a review from alakae September 8, 2023 07:30
@href href self-assigned this Sep 8, 2023

@alakae alakae left a comment

Thanks for your hard work! I have only added a small number of minor comments.

About the table-based testing: I would not say it suits my taste, and I assume it makes it impossible to write machine-readable test result files that report the outcome of each test case individually?

But as long as we are consistent within one repository, your solution is fine with me.

timeout time.Duration

// CCM endpoints
instances *instances
Contributor

nit: instances_v2? But also: I don't care.

Contributor Author

Technically it would have to be instancesV2, given Go's naming guidelines. Since I know of other CCMs that use instances as well, I think I'll leave it as is. But no strong stance on my end either.

func (i *instances) InstanceMetadata(ctx context.Context, node *v1.Node) (
	*cloudprovider.InstanceMetadata, error) {

	server, err := ensureOne(i.serverMapper.byNode(ctx, node))
Contributor

nit: why not move the logic of ensureOne to serverMapper? Then here you could just call i.serverMapper.byNodeSingle, and byNodeSingle could call byNode.

To me this looks cleaner.
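
A minimal sketch of that suggestion, assuming byNode returns all matching servers plus an error and ensureOne narrows that result down to exactly one (the signatures below are inferred from the call site above, not taken from the PR):

func (m *serverMapper) byNodeSingle(ctx context.Context, node *v1.Node) (*server, error) {
	// Delegate to byNode, then reduce the result to a single server
	// (or an error if there is no unambiguous match).
	return ensureOne(m.byNode(ctx, node))
}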


href commented Sep 19, 2023

About the table-based testing: I would not say it suits my taste, and I assume it makes it impossible to write machine-readable test result files that report the outcome of each test case individually?

Actually, you can. It works exactly like table-driven tests. You use a t.Run call to create the subtest that can then be targeted individually. For example:

	assertMatch := func(name string, node *v1.Node) {
		t.Run(name, func(t *testing.T) {
			match, err := mapper.byNode(context.Background(), node)
			assert.NoError(t, err)
			assert.Equal(t, name, match.Name)
		})
	}

I haven't done this as I haven't really seen any need for it, but it could be done.


alakae commented Sep 19, 2023

okay.. thanks for letting me know.. this is interesting :)

The previous approach was used to avoid having to mock HTTP responses,
but in practice this turned out to be more cumbersome.
@href href merged commit e815b10 into main Sep 25, 2023
@href href deleted the denis/instancesv2 branch January 5, 2024 13:17