Backport for V0.2.12 release #387

Merged
ArangoGutierrez merged 22 commits into release-0.2 from main on Jun 2, 2025

@ArangoGutierrez
Collaborator

No description provided.

ArangoGutierrez and others added 22 commits May 30, 2025 14:58
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Expand retry time out on Kubernetes templates
…init

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Add --ignore-preflight-errors= to kubernetes template during kubeadm
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Bumps [github.com/aws/aws-sdk-go-v2/service/ec2](https://github.com/aws/aws-sdk-go-v2) from 1.222.0 to 1.224.0.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Changelog](https://github.com/aws/aws-sdk-go-v2/blob/main/changelog-template.json)
- [Commits](aws/aws-sdk-go-v2@service/ec2/v1.222.0...service/ec2/v1.224.0)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2/service/ec2
  dependency-version: 1.224.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
….com/aws/aws-sdk-go-v2/service/ec2-1.224.0

Bump github.com/aws/aws-sdk-go-v2/service/ec2 from 1.222.0 to 1.224.0
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Only generate ginkgo logs on gpu test
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Normalize retry/timeouts for kubernetes installation
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Fix test data to use version instead of kubernetesVersion

copy-pr-bot bot commented Jun 2, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

ArangoGutierrez requested a review from Copilot on June 2, 2025 15:51

Copilot AI left a comment

Pull Request Overview

This PR backports changes for the V0.2.12 release with updates spanning test configurations, provisioning templates, dependency versions, and documentation improvements.

  • Update YAML test files for AWS configuration by renaming the "kubernetesVersion" field to "version".
  • Refactor AWS environment end-to-end tests to use table-driven patterns and adjust cleanup logic.
  • Increment version numbers and adjust retry intervals in provisioning templates and tests.
  • Update Makefile and GitHub workflows to refine test reporting and add coverage upload steps.
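
As a rough illustration of the table-driven pattern mentioned in the overview, the sketch below lists one row per environment file and runs a shared body over every row. The file paths come from the per-file summary in this review; the `envCase` type, `runCase`, and `runAll` are hypothetical names chosen for the example and are not taken from `tests/aws_test.go`, which uses Ginkgo and may be structured differently.

```go
package main

import (
	"fmt"
	"strings"
)

// envCase is a hypothetical table row: one entry per environment file.
type envCase struct {
	name     string // short case name used in failure reports
	filePath string // test data file named in the per-file summary
}

// cases lists the two data files touched by this PR. The real table may
// carry more fields (labels, timeouts, expected state).
var cases = []envCase{
	{name: "legacy", filePath: "tests/data/test_aws_legacy.yml"},
	{name: "dra", filePath: "tests/data/test_aws_dra.yml"},
}

// runCase stands in for the shared provision/verify/cleanup body.
// Here it only checks the file extension so the sketch stays runnable.
func runCase(c envCase) error {
	if !strings.HasSuffix(c.filePath, ".yml") {
		return fmt.Errorf("%s: unexpected file %q", c.name, c.filePath)
	}
	return nil
}

// runAll executes every row with the same body and returns the names of
// the rows that failed, so one bad row does not hide the others.
func runAll(cs []envCase, run func(envCase) error) []string {
	var failed []string
	for _, c := range cs {
		if err := run(c); err != nil {
			failed = append(failed, c.name)
		}
	}
	return failed
}

func main() {
	failed := runAll(cases, runCase)
	fmt.Printf("ran %d cases, %d failed\n", len(cases), len(failed))
}
```

Adding a new environment then means adding one row to `cases` rather than duplicating the test body.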

Reviewed Changes

Copilot reviewed 47 out of 47 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| tests/data/test_aws_legacy.yml | Renames YAML key from "kubernetesVersion" to "version" for legacy tests. |
| tests/data/test_aws_dra.yml | Similar YAML key renaming update for DRA tests. |
| tests/aws_test.go | Refactors AWS environment tests to a table-driven approach and cleanup. |
| tests/Makefile | Removes the default --json-report flag from the test command. |
| pkg/provisioner/templates/kubernetes.go | Increases retry intervals and adds an explicit wait for kube-apiserver. |
| pkg/provisioner/templates/containerd*.go | Updates version defaults and dependency versions in template code. |
| pkg/provisioner/provisioner.go | Lowers maxRetries and increases retryInterval for node reboot logic. |
| go.mod | Increments EC2 service dependency version. |
| Docs files | Adjusts links and content for improved documentation and guide clarity. |
| .github/workflows/* | Enhances CI workflows with additional steps and matrix configurations. |

Comments suppressed due to low confidence (2)

tests/Makefile:30

  • The removal of the '--json-report ginkgo.json' option may conflict with downstream workflows or artifacts expecting a JSON report; consider reintroducing it when the '--json-report' flag is specified via GINKGO_ARGS.
$(GINKGO_BIN) $(GINKGO_ARGS) -v ./tests/...

pkg/provisioner/provisioner.go:96

  • Reducing maxRetries from 30 to 10 with an increased retry interval could cause node reboot detection to be too aggressive in environments where reboot delays exceed the new thresholds; please verify that this change aligns with expected node recovery times.
maxRetries := 10

ArangoGutierrez merged commit aa690b0 into release-0.2 on Jun 2, 2025
19 checks passed

2 participants