roachprod: azure GetVMSpecs() implementation#154467
roachprod: azure GetVMSpecs() implementation#154467craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
|
willchoe-1759266579-01-n1cpu4-0001.json |
da5b3e4 to
6b3dedb
Compare
|
err i have no idea why i'm getting diffs in |
DarrylWong
left a comment
There was a problem hiding this comment.
Lets change it to informs instead of resolves, since the issue is still outstanding for ibm
| }) | ||
|
|
||
| // Manual test for verifying framework behavior in a test success scenario | ||
| r.Add(registry.TestSpec{ |
There was a problem hiding this comment.
nit: it's good that you're testing this e2e but it seems generic enough you don't really need to commit it. Running any roachtest with --always-collect-artifacts on azure would do the trick as well.
There was a problem hiding this comment.
I just wanted an easy / basically "no op" test to use when running things E2E. I was using one of the http validation tests before but thought this would make my workflow simpler. I'd like to leave this in because I feel like i'd resuse this again, but if you feel strongly about it i can take it out. Also if there's a way to run the framework in a "test no op" mode, that works for me too
There was a problem hiding this comment.
but if you feel strongly about it i can take it out
Just a nit, I don't feel that strongly about it if you think it improves your workflow.
Also if there's a way to run the framework in a "test no op" mode
I usually just run one of the acceptance tests. Something like acceptance/build-info takes about 5 seconds longer to run than a noop test and goes through spinning up a CRDB cluster; doesn't really matter for your change here but it might catch a bug in other changes.
| // disableIssue disables posting GitHub issues for test failures. | ||
| disableIssue bool | ||
| // dryRunIssuePosting enables dry-run mode for GitHub issue posting. | ||
| dryRunIssuePosting bool |
There was a problem hiding this comment.
I think this needs a rebase off of master?
pkg/roachprod/vm/azure/azure.go
Outdated
| l *logger.Logger, vms vm.List, | ||
| ) (map[string]map[string]interface{}, error) { | ||
| return nil, nil | ||
| // Boiler Plate |
There was a problem hiding this comment.
nit: kind of a redundant comment, if you see an opportunity to reduce the boiler plate then go for it?
There was a problem hiding this comment.
o oops that was also from my notes, i'll take out the comment forgot to clean it up, as for the boiler plate, it's 2 function calls to setup the client, i think it's fine it just looked verbose to me at first but i think it's just go being go
pkg/roachprod/vm/azure/azure.go
Outdated
| azureVmId, err := parseAzureID(vmInstance.ProviderID) | ||
| if err != nil { | ||
| return nil, err | ||
| } else if azureVmId.resourceGroup == "" { |
There was a problem hiding this comment.
Based on other usages of parseAzureID it seems safe to assume this won't be empty? If not then it makes more sense to move the empty string checking inside parseAzureID itself.
There was a problem hiding this comment.
did some research, those fields are all required for a valid Azure Resource ID so yea it makes sense to do the validation in the struct constructor
There was a problem hiding this comment.
I'm looking the regex:
/subscriptions/([^/]+)/resourceGroups/([^/]+)/providers/([^/]+)/([^/]+)/([^/]+)(?:/(.*))?)`
And it seems like all the fields are guaranteed to be non-empty (assuming FindStringSubmatch doesn't return nil) except for the final child resource group. So we just need to check if that one is empty or not.
That being said, I'm not really sure what the child resource type does? Do we need it? Or can we just leave a comment explaining why we drop it?
There was a problem hiding this comment.
btw that code is outdated, I took out the AzureID conditional checks from GetVMSpecs and all the error handling regarding the AzureID happens in parseAzureID now
And it seems like all the fields are guaranteed to be non-empty ... except for the final child resource group
Yes, that is my understanding too
So we just need to check if that one is empty or not
Check the final child resource group / the last matching group of the regex? Not currently doing a check if it's present, but just saving it to azureID.childResourceTypeNamePairs, and assuming for now a future caller can check if it's present or not
type azureID struct {
// childResourceTypeNamePairs does not have its name pairs parsed into
// struct fields because if present, the number of pairs is ambiguous
childResourceTypeNamePairs string
provider string
resourceGroup string
resourceName string
resourceType string
subscription string
}
If found, the child resource type name pairs will be in the childResourceTypeNamePairs field, but currently I'm not using it in GetVMSpecs nor is it used anywhere else afaik. I only added this for "correctness" because previously the child resource would've just been appended to the resourceName field i.e.
resourceName = lb01/frontendIPConfigurations/ipconfig1
for AzureID /subscriptions/1111/resourceGroups/rg1/providers/Microsoft.Network/loadBalancers/lb01/frontendIPConfigurations/ipconfig1 for the old regex
\/subscriptions\/(.+)\/resourceGroups\/(.+)\/providers\/(.+?)\/(.+?)\/(.+) which isn't correct
why we drop it
not currently dropped, but could also do that too, just did a bit of future proofing incase we need it
There was a problem hiding this comment.
removed redundant validation check in parseAzureID and removed childResourceTypeNamePairs field from azureID struct because not currently used
added comments to explain
There was a problem hiding this comment.
manually verified provisioning / GetVmSpecs still works as expected
132b246 to
b11bb0d
Compare
pkg/roachprod/vm/azure/azure.go
Outdated
| azureVmId, err := parseAzureID(vmInstance.ProviderID) | ||
| if err != nil { | ||
| return nil, err | ||
| } else if azureVmId.resourceGroup == "" { |
There was a problem hiding this comment.
else ifs are redundant since control exits in each case. Humans prefer straight-line code.
pkg/roachprod/vm/azure/azure.go
Outdated
| return nil, errors.Errorf("resource name not found in azure id: %s", azureVmId.String()) | ||
| } | ||
|
|
||
| // expand with InstanceViewTypesInstanceView and InstanceViewTypesUserData |
There was a problem hiding this comment.
Misplaced comment? The line immediately below it just creates a client.
There was a problem hiding this comment.
oops those were my notes for code that no longer exists
|
after looking into Azure resource IDs more while thinking about the error handling, noticed that the parsing and struct wouldn't work with child resource ids, so made the pattern more flexible and adjusted the struct. see description and code comments for more details |
| }) | ||
|
|
||
| // Manual test for verifying framework behavior in a test success scenario | ||
| r.Add(registry.TestSpec{ |
There was a problem hiding this comment.
but if you feel strongly about it i can take it out
Just a nit, I don't feel that strongly about it if you think it improves your workflow.
Also if there's a way to run the framework in a "test no op" mode
I usually just run one of the acceptance tests. Something like acceptance/build-info takes about 5 seconds longer to run than a noop test and goes through spinning up a CRDB cluster; doesn't really matter for your change here but it might catch a bug in other changes.
pkg/roachprod/vm/azure/azure.go
Outdated
| azureVmId, err := parseAzureID(vmInstance.ProviderID) | ||
| if err != nil { | ||
| return nil, err | ||
| } else if azureVmId.resourceGroup == "" { |
There was a problem hiding this comment.
I'm looking the regex:
/subscriptions/([^/]+)/resourceGroups/([^/]+)/providers/([^/]+)/([^/]+)/([^/]+)(?:/(.*))?)`
And it seems like all the fields are guaranteed to be non-empty (assuming FindStringSubmatch doesn't return nil) except for the final child resource group. So we just need to check if that one is empty or not.
That being said, I'm not really sure what the child resource type does? Do we need it? Or can we just leave a comment explaining why we drop it?
pkg/roachprod/vm/azure/ids_test.go
Outdated
| name string | ||
| input string | ||
| expectErr bool | ||
| want azureID |
There was a problem hiding this comment.
super nit: expected and actual is more idiomatic for our codebase
DarrylWong
left a comment
There was a problem hiding this comment.
Wouldn't hurt to kick off a SELECT_PROBABILITY=0.1 smoke test on TC since the regex pattern changed, but otherwise lgtm.
25d5dc4 to
bff1d3c
Compare
|
Rerunning Smoke test (azure tc job working again): https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_Nightlies_RoachtestNightlyAzureBazel?buildTypeTab=overview&branch=154467&mode=builds |
|
Smoke Test Failure Looks like If my understanding is correct, this doesn't seem to be great, but it looks unrelated to Will re-run the smoke |
|
The unrecovered panic should be handled by #155981 which this PR doesn't include. |
|
Ah thanks, will rebase |
Implementation for previously unimplemented interface method `GetVMSpecs()` for Azure. This will allow for cluster information to be fetched in roachtest via `FetchVMSpecs` Informs cockroachdb#146202 Epic: None Release note: None
bff1d3c to
67e3bed
Compare
|
tftr! bors r=DarrylWong |
154467: roachprod: azure GetVMSpecs() implementation r=DarrylWong a=williamchoe3 Informs #146202 Implementation for previously unimplemented interface method `GetVMSpecs()` for Azure. This will allow for cluster information to be fetched in roachtest via `FetchVMSpecs` Also changed `azureIDPattern` to match the optional type name pairs that may exist in a fully qualified azure resource id. See comments for more details. Previously, if these optional type name pairs existed, they were just a part of the `resourceName` field. Currently, I didn't see any of these child resource ids being parsed, but if we were to ever parse those in the future the previous implementation wouldn't make sense. * Note: Although the regex selects child resource type and names in the last matching group, they are not saved to AzureId struct because they are not being used For Azure, `client.Get(...)` allows you to pass in an optional `expander` parameter which seems to provide additional information for `"instanceView"` and runs an optionally passed in user script (passed in on vm creation) when `"userData"` is passed in. * `"userData"` doesn't seem relevant for us "retrieves the UserData property as part of the VM model view that was provided by the user during the VM Create/Update operation" https://learn.microsoft.com/en-us/rest/api/compute/virtual-machines/get?view=rest-compute-2025-04-01&tabs=HTTP * "The `instanceView` of an Azure Virtual Machine (VM) provides a snapshot of the runtime properties and status of the VM" https://learn.microsoft.com/en-us/rest/api/compute/virtual-machines/instance-view?view=rest-compute-2025-04-01&tabs=HTTP Seems nice, but there's no diff in the json returned by `"instanceView"` and the default when `expander` is not set (referred to `model view` in some places on thier docs) ```go InstanceViewTypesInstanceView InstanceViewTypes = "instanceView" InstanceViewTypesUserData InstanceViewTypes = "userData" ``` IBM PR: #155368 155368: roachprod: implement GetVMSpecs for IBM r=DarrylWong,golgeek a=williamchoe3 Informs #146202 Implementation for previously unimplemented interface method GetVMSpecs() for IBM. This will allow for cluster information to be fetched in roachtest via FetchVMSpecs * API: https://cloud.ibm.com/apidocs/resource-controller/resource-controller?code=go#get-resource-instance Also added comments to help contextualize some parts that I thought would be helpful while doing some debug. See comment for e.g. vm specs 156504: sql: enable test tenants r=yuzefovich a=yuzefovich **sql: remove some redundant kv.TestingIsRangeLookupRequest calls in tests** These were redundant because we have GetRequests in all call sites. **sql: enable test tenants** Some tests have been adjusted to work with test tenants, others have specific issues to investigate further. Fixes: #143114 Epic: CRDB-48945 156547: sql: add ColumnGeneratedAsIdentity as a new schema change element r=shghasemi a=shghasemi Previously, GeneratedAsIdentity was part of the Column element. Creating a new schema change element will allow the declarative schema changer to add/remove/update such columns without dropping the Column element. Epic: [CRDB-31283](https://cockroachlabs.atlassian.net/browse/CRDB-31283) Part of: #142918 Release note: None 156940: util/taskset: add generic work distribution mechanism r=spilchen a=spilchen Adds a new taskset package that provides task coordination for a number of workers. TaskSet hands out integer identifiers (TaskIDs) that workers claim and process, with the caller responsible for mapping TaskIDs to actual work (file indices, key ranges, batches, etc.). Features round-robin initial distribution across workers and locality when getting a task ID. Not thread-safe; callers provide external synchronization. This will be used by the new distributed merge pipeline. Closes #156578 Epic: CRDB-48845 Release note: none Co-authored by: `@jeffswenson` 156987: sql: remove now-stale growstack for internal executor r=yuzefovich a=yuzefovich As of 4a5c503, we now grow stack unconditionally for all async tasks. Some time ago we added an ability to growstack in the async task of the internal executor when used in the LDR, and now that has become redundant, so we remove this special case. Epic: None Release note: None Co-authored-by: William Choe <williamchoe3@gmail.com> Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com> Co-authored-by: Shadi Ghasemitaheri <shadi.ghasemitaheri@cockroachlabs.com> Co-authored-by: Matt Spilchen <matt.spilchen@cockroachlabs.com>
|
Build failed (retrying...): |
|
Build succeeded: |
Informs #146202
Implementation for previously unimplemented interface method
GetVMSpecs()for Azure. This will allow for cluster information to be fetched in roachtest viaFetchVMSpecsAlso changed
azureIDPatternto match the optional type name pairs that may exist in a fully qualified azure resource id. See comments for more details. Previously, if these optional type name pairs existed, they were just a part of theresourceNamefield. Currently, I didn't see any of these child resource ids being parsed, but if we were to ever parse those in the future the previous implementation wouldn't make sense.For Azure,
client.Get(...)allows you to pass in an optionalexpanderparameter which seems to provide additional information for"instanceView"and runs an optionally passed in user script (passed in on vm creation) when"userData"is passed in."userData"doesn't seem relevant for us "retrieves the UserData property as part of the VM model view that was provided by the user during the VM Create/Update operation" https://learn.microsoft.com/en-us/rest/api/compute/virtual-machines/get?view=rest-compute-2025-04-01&tabs=HTTPinstanceViewof an Azure Virtual Machine (VM) provides a snapshot of the runtime properties and status of the VM" https://learn.microsoft.com/en-us/rest/api/compute/virtual-machines/instance-view?view=rest-compute-2025-04-01&tabs=HTTP Seems nice, but there's no diff in the json returned by"instanceView"and the default whenexpanderis not set (referred tomodel viewin some places on thier docs)IBM PR: #155368