fix: Resolve transient kueue webhook failures #5549
Summary of Changes
This pull request addresses transient failures that occur during the Kueue webhook installation process. It introduces a mandatory wait period after the installation module, which delays the configuration phase so that the webhook service has time to initialize and stabilize before it is used, avoiding the race condition.
Code Review
This pull request adds a 60-second wait period using the time_sleep resource to allow the Kueue webhook to become ready before configuration. It also updates the module's dependencies and documentation to include the time provider. A review comment suggests applying a conditional count to the time_sleep resource so that the delay is only triggered when Kueue is actually being installed, avoiding unnecessary wait times for other configurations.
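The review suggestion above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual diff: the variable `install_kueue` and the two `terraform_data` placeholders are hypothetical stand-ins for the module's real install and configuration steps. Only the `time_sleep` resource, its `create_duration` argument, and the 60-second value come from the review summary.

```hcl
terraform {
  required_providers {
    time = {
      source = "hashicorp/time"
    }
  }
}

# Hypothetical input: whether this module installs Kueue at all.
variable "install_kueue" {
  type    = bool
  default = true
}

# Hypothetical placeholder for the step that installs Kueue.
resource "terraform_data" "install_kueue" {
  count = var.install_kueue ? 1 : 0
}

# Wait 60 seconds after installation so the webhook can become ready.
# The conditional count skips the delay when Kueue is not installed.
resource "time_sleep" "wait_for_kueue_webhook" {
  count = var.install_kueue ? 1 : 0

  create_duration = "60s"

  depends_on = [terraform_data.install_kueue]
}

# Hypothetical placeholder for the step that applies the Kueue config
# (ResourceFlavor, ClusterQueue, and so on). It only starts after the sleep.
resource "terraform_data" "configure_kueue" {
  count = var.install_kueue ? 1 : 0

  depends_on = [time_sleep.wait_for_kueue_webhook]
}
```

With `install_kueue` set to `false`, all three resources have a count of zero, so no delay is added to the apply. A fixed sleep is a simple mitigation and does not itself check that the webhook is serving.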
ed03b6d to 8067f85
8067f85 to 5646192
ea18149 into GoogleCloudPlatform:develop
This PR resolves a transient Kueue webhook failure caused by a race condition that occurs frequently: Terraform attempts to apply Kueue custom resources (such as ResourceFlavor or ClusterQueue) from the Kueue config file before the Kueue webhook is fully provisioned.
The race arises because Kueue needs a few extra seconds after installation to finish starting its internal components in the background.
Submission Checklist
NOTE: Community submissions can take up to 2 weeks to be reviewed.
Please take the following actions before submitting this pull request.