improved documentations and increased test coverage for init container#193
improved documentations and increased test coverage for init container#193gflarity wants to merge 2 commits into
Conversation
Signed-off-by: Geoff Flarity <gflarity@nvidia.com>
0c69575 to
ce9847c
Compare
Signed-off-by: Geoff Flarity <gflarity@nvidia.com>
|
Tagged @renormalize as looks like you coded the init container dir. |
| "github.com/NVIDIA/grove/operator/internal/version" | ||
| ) | ||
|
|
||
| // log is the global logger instance configured for the grove init container. |
There was a problem hiding this comment.
This comment is not required.
| // TestSetupSignalHandler tests the signal handling setup and context cancellation behavior. | ||
| // It validates proper signal registration, context cancellation, and graceful shutdown behavior. | ||
| func TestSetupSignalHandler(t *testing.T) { | ||
| tests := []struct { |
There was a problem hiding this comment.
lets avoid creating table driven tests when the test function just has a single test.
| {}, | ||
| } | ||
|
|
||
| for i := range tests { |
There was a problem hiding this comment.
this test is not really useful as it does not really test the signal handler.
There was a problem hiding this comment.
Perhaps the signature of setupSignalHandler could be changed to accept the channel as a parameter, which would enable us in unit testing this function correctly. Right now, the test just establishes things like an error hasn't occured when a context is created without its channel being closed, which will always be the case.
However, I think we can delegate testing graceful termination of the init-container (a consquence of the signal handler) to an end-to-end test.
There was a problem hiding this comment.
Not exactly sure how to adjust the prompt here. Please let me know if you have any suggestions.
|
|
||
| // TestSetupSignalHandlerContextProperties tests the properties of the returned context. | ||
| // It validates that the context behaves correctly before any signals are sent. | ||
| func TestSetupSignalHandlerContextProperties(t *testing.T) { |
There was a problem hiding this comment.
this test is not really useful as it does not really test the signal handler.
| ) | ||
|
|
||
| // CLIOptions defines the configuration that is passed to the init container. | ||
| // It contains the PodClique dependencies with their minimum available replica requirements. |
There was a problem hiding this comment.
this line is not required as it is also repeated at the field level
|
|
||
| // WaitForReadyWithClient waits for all parent PodCliques to reach their minimum ready replica count | ||
| // using the provided Kubernetes client. This enables testing by allowing client injection. | ||
| func (c *ParentPodCliqueDependencies) WaitForReadyWithClient(ctx context.Context, client kubernetes.Interface, log logr.Logger) error { |
There was a problem hiding this comment.
instead of creating this function, create the client when constructing ParentPodCliqueDependencies and make it part of the struct itself.
| } | ||
|
|
||
| // createClient creates and returns a Kubernetes clientset using the in-cluster configuration. | ||
| // This function is designed to be called from within a pod running in a Kubernetes cluster. |
There was a problem hiding this comment.
Redundant comment line as in-cluster configuration automatically means its going to be running withing a Pod. Also no code runs outside a Pod.
| // createClient creates and returns a Kubernetes clientset using the in-cluster configuration. | ||
| // This function is designed to be called from within a pod running in a Kubernetes cluster. | ||
| func createClient() (*kubernetes.Clientset, error) { | ||
| // Get the in-cluster REST configuration |
There was a problem hiding this comment.
no need for this comment. When code is simple and self-documentation additional documentation on top is just noise.
| ) | ||
| } | ||
|
|
||
| // Create the Kubernetes clientset from the REST configuration |
There was a problem hiding this comment.
same remove such comments as they serve no purpose.
| // registerEventHandler registers pod lifecycle event handlers with the shared informer factory. | ||
| // It handles pod addition, updates, and deletion events to track readiness state changes. | ||
| func (c *ParentPodCliqueDependencies) registerEventHandler(factory informers.SharedInformerFactory, log logr.Logger) error { | ||
| // Get the pod informer from the factory |
renormalize
left a comment
There was a problem hiding this comment.
Thank you for raising a much needed PR, @gflarity!
Reading through the test cases made me find a couple hidden parsing bugs, since the test cases were written such that these bugs were considered as correct behavior.
I've added comments which correct the test cases, and suggest the fixes in the code.
I've also found a few tests to be redundant, which I've suggested the removal of. We can keep them if you wish to.
| log = logger.MustNewLogger(false, configv1alpha1.InfoLevel, configv1alpha1.LogFormatJSON).WithName("grove-initc") | ||
| ) | ||
|
|
||
| // main is the entry point for the grove init container. |
There was a problem hiding this comment.
| // main is the entry point for the grove init container. |
| ) | ||
|
|
||
| // main is the entry point for the grove init container. | ||
| // It parses CLI options, sets up signal handling, and waits for parent PodCliques to be ready. |
There was a problem hiding this comment.
| // It parses CLI options, sets up signal handling, and waits for parent PodCliques to be ready. | |
| // Parse CLI options, set up signal handling, and wait for parent PodCliques to be ready. |
| {}, | ||
| } | ||
|
|
||
| for i := range tests { |
There was a problem hiding this comment.
Perhaps the signature of setupSignalHandler could be changed to accept the channel as a parameter, which would enable us in unit testing this function correctly. Right now, the test just establishes things like an error hasn't occured when a context is created without its channel being closed, which will always be the case.
However, I think we can delegate testing graceful termination of the init-container (a consquence of the signal handler) to an end-to-end test.
| }, | ||
| // Valid with whitespace around values | ||
| { | ||
| input: []string{"podclique-whitespace:2"}, |
There was a problem hiding this comment.
I don't think the white space has been added to this test input?
| input: []string{"podclique-whitespace:2"}, | |
| input: []string{"podclique-whitespace : 2"}, |
Correcting this test case has caught one potential bug:
replicas, err := strconv.Atoi(nameAndMinAvailable[1])
does not trim the whitespace for the subtring containing the replicas.
replicas, err := strconv.Atoi(strings.TrimSpace(nameAndMinAvailable[1]))
| } | ||
|
|
||
| // Parse the replica count as an integer | ||
| replicas, err := strconv.Atoi(nameAndMinAvailable[1]) |
There was a problem hiding this comment.
Context: #193 (comment)
| replicas, err := strconv.Atoi(nameAndMinAvailable[1]) | |
| replicas, err := strconv.Atoi(strings.TrimSpace(nameAndMinAvailable[1])) |
This hasn't been a problem till now (and potentially won't be either), since the arguments for the initc are generated in the operator. Still, it is always better to make the code more robust.
| if tt.expectCreateClientErr { | ||
| // Should error during client creation since we're not in a K8s environment | ||
| assert.Error(t, err) | ||
| assert.Contains(t, err.Error(), "unable to load in-cluster configuration") |
There was a problem hiding this comment.
Can we use the ErrNotInCluster constant in the rest package instead? This would avoid test breakage if the string is changed in the future.
| { | ||
| name: "wrapper with empty dependencies", | ||
| podCliqueDependencies: map[string]int{}, | ||
| namespaceContent: "empty-namespace", | ||
| podGangContent: "empty-podgang", | ||
| expectError: false, | ||
| }, |
There was a problem hiding this comment.
Can be removed since empty dependencies will not be a case as discussed.
|
|
||
| // TestNewPodCliqueState tests the wrapper function with real file operations. | ||
| // It validates that the wrapper correctly calls the testable version with default paths. | ||
| func TestNewPodCliqueState(t *testing.T) { |
There was a problem hiding this comment.
Not so big on tests like this, which just check the creation of a struct instance from the passed fields to the constructing function. We can keep this if you prefer checking this in.
| // Zero requirements should always be ready | ||
| { | ||
| dependencies: map[string]int{"zero-clique": 0}, | ||
| readyCounts: map[string]int{"zero-clique": 0}, | ||
| expectedReady: true, | ||
| }, | ||
| // Empty dependencies should be ready | ||
| { | ||
| dependencies: map[string]int{}, | ||
| readyCounts: map[string]int{}, | ||
| expectedReady: true, | ||
| }, |
There was a problem hiding this comment.
Testcases not needed since, as discussed in my previous comments.
| for cliqueName, readyCount := range tt.readyCounts { | ||
| podSet := sets.New[string]() | ||
| // Add dummy pod names to reach the desired count | ||
| for i := 0; i < readyCount; i++ { |
There was a problem hiding this comment.
Range loops are simpler and are the convention for [0, n) iterations.
| for i := 0; i < readyCount; i++ { | |
| for i := range readyCount { |
|
I'm going to close this PR and open separate PR for documentation and testing using updated prompts. |
What type of PR is this?
/kind documentation
/kind tests
What this PR does / why we need it:
As discussed during our sync, I've used Cursor/Claude to:
I'll include the prompts below for those who are curious. Please feel free to just suggest changes to anything that doesn't look right (ideally just make your changes using the suggest changes github UI).
I realize the documentation is a bit verbose but I think it's ultimately a win as long as it's accurate. It helps new users get up to speed with the codebase, improving documentation also helps the models generate better tests/code when it's well documented.
If you'd like me to take another stab at a particular test or test case, just let me know. I've read through them but I'm just getting acquainted with the code base so it's best to get more eyes on it.
Special notes for your reviewer:
Prompts used:
Documentation:
Improve the inline documentation for for all golang files under this directory/package. Use concise idiomatic go documentation principles. Ensure correctness. Please add some minimal inline documentation above code blocks when to explain the intent of the code. Don't change the code or variable names. All functions to should have concise accurate documentation. Blocks of code should have a comment describing them if practical.
Tests:
Please create an idiomatic golang unit tests for this specific directory/package. Please be sure to document the fields in the anonymous test structs and how they'll be used below. Each Test* function should also have some concise documentation as well. Each test case should have a brief explanation and expectation explained. Rather than having a description field, the description should just be in comments above the name in the struct declaration. Don't forget to run the tests and confirm they're working, but focus on just testing this package as others might have issues. Please avoid creating skipped tests. Feel free to refactor some code to make it more testable. Specifically hardcoded constants etc are good candidates to be refactor out of testable code. As are interface that can be mocked in a straight forward way. Please ensure backwards compatibility and new functions rather than changing the signature of public functions. Keep these refactors to a minimum though, so only do it where there's a good return on investment.
Test Quality And Coverage Check:
Please review all the tests for this package/directory. Do they make sense? Is coverage good? Are all tests and test cases well documented?
Does this PR introduce a API change?
NONE
Additional documentation e.g., enhancement proposals, usage docs, etc.: