Check for stockouts during bulkInsert in integration tests.#1880

Merged
cdunbar13 merged 1 commit into GoogleCloudPlatform:develop from cdunbar13:stockout-check-fix on Oct 26, 2023

@cdunbar13 cdunbar13 commented Oct 25, 2023

Collaborator

The previous version of this change was reverted because it broke tests. This PR reintroduces the check and attempts to diagnose stockouts more accurately in the output of the integration tests. This is useful for the flake tool and, potentially, for retrying tests automatically in the future.

When stockouts are found, the output should look similar to the following (error messages and instance IDs may vary):

Step #2 - "hcls": TASK [Check for stockout errors] ***********************************************
Step #2 - "hcls": included: /workspace/tools/cloud-build/daily-tests/ansible_playbooks/tasks/check_stockout.yml for 130.211.102.173
Step #2 - "hcls": 
Step #2 - "hcls": TASK [Assert variables are defined] ********************************************
Step #2 - "hcls": ok: [130.211.102.173 -> localhost] => {
Step #2 - "hcls":     "changed": false
Step #2 - "hcls": }
Step #2 - "hcls": 
Step #2 - "hcls": MSG:
Step #2 - "hcls": 
Step #2 - "hcls": All assertions passed
Step #2 - "hcls": 
Step #2 - "hcls": TASK [Check logs for stockout on compute nodes] ********************************
Step #2 - "hcls": ok: [130.211.102.173 -> localhost] => (item=Region does not currently have sufficient capacity for the requested resources.)
Step #2 - "hcls": ok: [130.211.102.173 -> localhost] => (item=No eligible zone could be found in this region for given properties)
Step #2 - "hcls": 
Step #2 - "hcls": TASK [Log compute stockout error] **********************************************
Step #2 - "hcls": skipping: [130.211.102.173] => (item=Region does not currently have sufficient capacity for the requested resources.) 
Step #2 - "hcls": ok: [130.211.102.173 -> localhost] => (item=No eligible zone could be found in this region for given properties) => {}
Step #2 - "hcls": 
Step #2 - "hcls": MSG:
Step #2 - "hcls": 
Step #2 - "hcls": "Abbreviated listing of nodes that could not be created:"
Step #2 - "hcls": "INSTANCE_ID                   ERROR_MESSAGE
Step #2 - "hcls": hcls7636d5-gpu-ghpc-9  No eligible zone could be found in this region for given properties
Step #2 - "hcls": hcls7636d5-gpu-ghpc-1  No eligible zone could be found in this region for given properties
Step #2 - "hcls": hcls7636d5-gpu-ghpc-3  No eligible zone could be found in this region for given properties
Step #2 - "hcls": hcls7636d5-gpu-ghpc-5  No eligible zone could be found in this region for given properties
Step #2 - "hcls": hcls7636d5-gpu-ghpc-8  No eligible zone could be found in this region for given properties"
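
For tooling that consumes this output (for example, a flake tool), the listing above could be parsed roughly as follows. This is only a sketch, not the code in `check_stockout.yml`: the function name `find_stockouts` and the step-prefix pattern are hypothetical, and the two messages are copied from the sample output, so the real list may differ.

```python
import re

# Stockout messages copied from the sample output above (assumed, may not be exhaustive).
STOCKOUT_MESSAGES = (
    "Region does not currently have sufficient capacity for the requested resources.",
    "No eligible zone could be found in this region for given properties",
)

# Hypothetical pattern for the Cloud Build step prefix, e.g. 'Step #2 - "hcls": '.
STEP_PREFIX = re.compile(r'^Step #\d+ - "[^"]*": ?')


def find_stockouts(log_text):
    """Return {instance_id: message} for listing rows that end in a stockout message."""
    failures = {}
    for raw_line in log_text.splitlines():
        # Drop the step prefix, surrounding whitespace, and any stray quote characters.
        line = STEP_PREFIX.sub("", raw_line).strip().strip('"')
        for message in STOCKOUT_MESSAGES:
            if line.endswith(message):
                instance_id = line[: -len(message)].strip()
                # Keep only rows shaped like "<instance_id>  <message>";
                # Ansible "(item=...)" lines end in ")" and never reach this point.
                if instance_id and " " not in instance_id:
                    failures[instance_id] = message
    return failures


if __name__ == "__main__":
    sample = (
        'Step #2 - "hcls": hcls7636d5-gpu-ghpc-9  '
        "No eligible zone could be found in this region for given properties"
    )
    for instance_id, message in find_stockouts(sample).items():
        print(f"{instance_id}: {message}")
```

A non-empty result would indicate that the run failed because of capacity rather than a code regression, which is the distinction a retry mechanism would need.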

@cdunbar13 cdunbar13 added the release-bugfix Added to release notes under the "Bug fixes" heading. label Oct 25, 2023
@cdunbar13 cdunbar13 requested a review from nick-stroud October 25, 2023 18:44
@cdunbar13 cdunbar13 changed the title Fix stockout task issue Check for stockouts during bulkInsert in integration tests. Oct 25, 2023
@nick-stroud nick-stroud assigned cdunbar13 and unassigned nick-stroud Oct 25, 2023
@cdunbar13 cdunbar13 merged commit 7588379 into GoogleCloudPlatform:develop Oct 26, 2023
@cdunbar13 cdunbar13 deleted the stockout-check-fix branch October 26, 2023 12:44
@mr0re1 mr0re1 mentioned this pull request Nov 6, 2023