Skip to content

[ACR] check-health: add support to verify dns routings to private endpoints #17746

Merged
fengzhou-msft merged 17 commits intoAzure:devfrom
yugangw-msft:pe-health
Apr 26, 2021
Merged

[ACR] check-health: add support to verify dns routings to private endpoints #17746
fengzhou-msft merged 17 commits intoAzure:devfrom
yugangw-msft:pe-health

Conversation

@yugangw-msft
Copy link
Copy Markdown
Contributor

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@toddysm, please check out the description and error messages below and see whether it makes sense and friendly to users

Copy link
Copy Markdown
Contributor

@hkuadithya hkuadithya Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yugangw-msft Correct me if I'm wrong - please double-check the usage of VNET throughout the code. Users are expected to provide the Subnet Resource ID or (Subnet name and Vnet name). If they provide the VNET ID, the code may not work since subnetID != VnetID.

Vnet Resource ID: /subscriptions/{subID}/resourceGroups/{resourceGroup}/providers/Microsoft.Network/virtualNetworks/{vnetName}

Subnet Resource ID:
/subscriptions/{subId}/resourceGroups/adupadhyRG/providers/Microsoft.Network/virtualNetworks/{vnetName}/subnets/{subnetName}

Usually, subnet name is default but it could be different based on what subnet users create.

Copy link
Copy Markdown
Contributor

@hkuadithya hkuadithya Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For texts and command line args, maybe we can use az network private-endpoint create -h for reference.

image
image

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--subnet is more accurate, which I can follow your suggestion.
The context of only matching vnet is having 2+ PE in the same vnet (even different subnet) is an anti-pattern we want to capture, as dns routing might become unpredictable.
So, it is your call. Let me know we go with vnet or subnet

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, let's go with VNET.

Copy link
Copy Markdown
Contributor

@hkuadithya hkuadithya Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How expensive is an instantiation of network_client? It might be better to instantiate one network client outside the loop and use it for all requests.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is cheap. The reason of initializing each time is PE can be cross tenant which mean the subscription value might be different. We can use dictionary with keying on subscription for cache, but that might be overkill.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I missed the res['subscription'] part.

Copy link
Copy Markdown
Contributor

@hkuadithya hkuadithya Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry my python knowledge might be a bit rusty. If we're only extracting resource Ids, shouldn't this have an if condition at the end? Does e.private_endpoint and e.private_endpoint.id achieve the same as the code below?

[e.private_endpoint.id for e in registry.private_endpoint_connections if e.private_endpoint]

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

python's and is a bit different with non boolean value, but I can use your suggestion for better code maintainability

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to understand why we're using not in here. result will contain IP V4 address string right? dns_mappings[fqdn] is also string not a list. Any reason for not doing if result != dns_mappings[fqdn]:?

Also, if an invalid/non-existing FQDN is passed to socket.gethostbyname, it leads to an error. Should the error be caught?

image

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will address both comments

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a regex module might be overkill. Can we do vnet_of_private_endpoint.lower() == pe.subnet.id.lower()?

Also, variable name subnet_of_private_endpoint might avoid confusions?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see why you used regex module here. re.match will return non-null object if vnet resource ID is a part of subnet resource ID. I would do a simple prefix matching to make it more readable but either of the two options is fine.

pe.subnet.id.lower().startswith(vnet_of_private_endpoint.lower())

Copy link
Copy Markdown
Contributor

@hkuadithya hkuadithya Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yugangw-msft Correct me if I'm wrong - please double-check the usage of VNET throughout the code. Users are expected to provide the Subnet Resource ID or (Subnet name and Vnet name). If they provide the VNET ID, the code may not work since subnetID != VnetID.

Vnet Resource ID: /subscriptions/{subID}/resourceGroups/{resourceGroup}/providers/Microsoft.Network/virtualNetworks/{vnetName}

Subnet Resource ID:
/subscriptions/{subId}/resourceGroups/adupadhyRG/providers/Microsoft.Network/virtualNetworks/{vnetName}/subnets/{subnetName}

Usually, subnet name is default but it could be different based on what subnet users create.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, like I mentioned, we mainly care about the vnet

Copy link
Copy Markdown
Contributor

@hkuadithya hkuadithya Apr 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yugangw-msft Do you think we should validate if the user has entered a valid VNET resource ID? Right now, we only check if it's a valid resource ID.

For instance, if a user (by mistake) provides a non-VNET resource ID, the current code flow will log an error: Registry ... doesn't have private endpoints in the vnet of {nonVnetResourceId}

If invalid, we can request user to enter the VNET resource ID in the format /subscriptions/{subID}/resourceGroups/{resGroup}/providers/Microsoft.Network/virtualNetworks/{Vnet}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--subnet is more accurate, which I can follow your suggestion.
The context of only matching vnet is having 2+ PE in the same vnet (even different subnet) is an anti-pattern we want to capture, as dns routing might become unpredictable.
So, it is your call. Let me know we go with vnet or subnet

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, like I mentioned, we mainly care about the vnet

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

python's and is a bit different with non boolean value, but I can use your suggestion for better code maintainability

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is cheap. The reason of initializing each time is PE can be cross tenant which mean the subscription value might be different. We can use dictionary with keying on subscription for cache, but that might be overkill.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will address both comments

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, let's go with VNET.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I missed the res['subscription'] part.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see why you used regex module here. re.match will return non-null object if vnet resource ID is a part of subnet resource ID. I would do a simple prefix matching to make it more readable but either of the two options is fine.

pe.subnet.id.lower().startswith(vnet_of_private_endpoint.lower())

Copy link
Copy Markdown
Contributor

@hkuadithya hkuadithya Apr 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question only: Do you see any value in printing the DNS mappings for ok scenarios?

registry.azurecr.io             10.0.0.1
registry.eastus.data.azurecr.io 10.0.0.2
registry.westus.data.azurecr.io 10.0.0.3

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now we only have stderr for non structured info(stdout is for json data), I suggest keep it terse so error can be spotted easily.
However if feedback shows users don't get the error easily, we can incorporate errors in the pretty formating

registry.azurecr.io                        10.0.0.1
registry.eastus.data.azurecr.io            10.0.0.2
registry.westus.data.azurecr.io            10.0.0.3
registry.westeurope.data.azurecr.io    (dns routing error)

Copy link
Copy Markdown
Contributor

@hkuadithya hkuadithya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yugangw-msft Overall LGTM. Last two things.

  1. Please confirm the Help text and description for the --vnet command line arg
  2. Please evaluate if we need to validate if vnet_resource_id is a valid VNET resource id. Right now, we only verify if it's a valid ARM resource ID.

@yugangw-msft
Copy link
Copy Markdown
Contributor Author

@fengzhou-msft, please review, and merge if you are happy

@fengzhou-msft fengzhou-msft merged commit 7b7f326 into Azure:dev Apr 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants