Skip to content

sources/azure: ensure retries on IMDS request failure#1271

Merged
TheRealFalcon merged 1 commit intocanonical:mainfrom
cjp256:azure-fix-imds-retry
Feb 18, 2022
Merged

sources/azure: ensure retries on IMDS request failure#1271
TheRealFalcon merged 1 commit intocanonical:mainfrom
cjp256:azure-fix-imds-retry

Conversation

@cjp256
Copy link
Copy Markdown
Contributor

@cjp256 cjp256 commented Feb 16, 2022

sources/azure: ensure retries on IMDS request failure

There are two issues with IMDS retries:

  1. IMDS_VER_WANT will never be attempted if retries=0, such as
    when fetching network metadata with infinite=True.

  2. get_imds_data_with_api_fallback() will attempt one request with
    IMDS_VER_WANT. If the connection fails due to a timeout, connection
    issue, or error code other than 400, an empty dictionary will be
    returned without attempting the requested number of retries.

This PR:

  • Updates get_imds_data_with_api_fallback() to invoke
    get_metadata_from_imds() with the specified retries and infinite
    parameters.

  • Updates retry_on_url_exc to take a configurable set of HTTP error
    codes and exception types to retry on.

  • Add IMDS_RETRY_CODES set to retry with when fetching data from IMDS:

    • 404 not found (yet)
    • 410 gone / unavailable (yet)
    • 429 rate-limited/throttled
    • 500 server error
  • Replace default callback with imds_readurl_exception_callback,
    which configures retry_on_url_exc() with these error codes and
    error instances.

  • Add new pytests for IMDS to eventually replace the unittest
    equivalents and improve existing coverage.

@cjp256 cjp256 force-pushed the azure-fix-imds-retry branch from 384bb5d to 3c4d27c Compare February 16, 2022 18:25
Copy link
Copy Markdown
Contributor

@anhvoms anhvoms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure you update the commit message (currently it mentions 404 only)

@cjp256 cjp256 force-pushed the azure-fix-imds-retry branch 2 times, most recently from d6e2281 to 02edef5 Compare February 16, 2022 19:16
@cjp256
Copy link
Copy Markdown
Contributor Author

cjp256 commented Feb 16, 2022

Make sure you update the commit message (currently it mentions 404 only)

Squashed and updated PR.

@cjp256 cjp256 force-pushed the azure-fix-imds-retry branch from 02edef5 to de47678 Compare February 16, 2022 19:23
There are two issues with IMDS retries:

1. IMDS_VER_WANT will never be attempted if retries=0, such as
when fetching network metadata with infinite=True.

2. get_imds_data_with_api_fallback() will attempt one request with
IMDS_VER_WANT.  If the connection fails due to a timeout, connection
issue, or error code other than 400, an empty dictionary will be
returned without attempting the requested number of retries.

This PR:

- Updates get_imds_data_with_api_fallback() to invoke
get_metadata_from_imds() with the specified retries and infinite
parameters.

- Updates retry_on_url_exc to take a configurable set of HTTP error
codes and exception types to retry on.

- Add IMDS_RETRY_CODES set to retry with when fetching data from IMDS:

  - 404 not found (yet)
  - 410 gone / unavailable (yet)
  - 429 rate-limited/throttled
  - 500 server error

- Replace default callback with imds_readurl_exception_callback,
which configures retry_on_url_exc() with these error codes and
instances.

- Add new pytests for IMDS to eventually replace the unittest
equivalents and improve existing coverage.

Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
@cjp256 cjp256 force-pushed the azure-fix-imds-retry branch from de47678 to c410b9c Compare February 16, 2022 19:40
@cjp256
Copy link
Copy Markdown
Contributor Author

cjp256 commented Feb 16, 2022

I'm happy with the changes now if you are @anhvoms

@cjp256
Copy link
Copy Markdown
Contributor Author

cjp256 commented Feb 17, 2022

@TheRealFalcon if we could get this into the upcoming SRU, it'd be very much appreciated!

@holmanb
Copy link
Copy Markdown
Member

holmanb commented Feb 18, 2022

This looks good to me.

Copy link
Copy Markdown
Contributor

@TheRealFalcon TheRealFalcon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@TheRealFalcon TheRealFalcon merged commit c1a2047 into canonical:main Feb 18, 2022
@cjp256 cjp256 deleted the azure-fix-imds-retry branch February 18, 2022 02:50
holmanb pushed a commit to holmanb/cloud-init that referenced this pull request Feb 25, 2022
There are two issues with IMDS retries:

1. IMDS_VER_WANT will never be attempted if retries=0, such as
when fetching network metadata with infinite=True.

2. get_imds_data_with_api_fallback() will attempt one request with
IMDS_VER_WANT.  If the connection fails due to a timeout, connection
issue, or error code other than 400, an empty dictionary will be
returned without attempting the requested number of retries.

This PR:

- Updates get_imds_data_with_api_fallback() to invoke
get_metadata_from_imds() with the specified retries and infinite
parameters.

- Updates retry_on_url_exc to take a configurable set of HTTP error
codes and exception types to retry on.

- Add IMDS_RETRY_CODES set to retry with when fetching data from IMDS:

  - 404 not found (yet)
  - 410 gone / unavailable (yet)
  - 429 rate-limited/throttled
  - 500 server error

- Replace default callback with imds_readurl_exception_callback,
which configures retry_on_url_exc() with these error codes and
instances.

- Add new pytests for IMDS to eventually replace the unittest
equivalents and improve existing coverage.

Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
holmanb pushed a commit to holmanb/cloud-init that referenced this pull request Feb 25, 2022
There are two issues with IMDS retries:

1. IMDS_VER_WANT will never be attempted if retries=0, such as
when fetching network metadata with infinite=True.

2. get_imds_data_with_api_fallback() will attempt one request with
IMDS_VER_WANT.  If the connection fails due to a timeout, connection
issue, or error code other than 400, an empty dictionary will be
returned without attempting the requested number of retries.

This PR:

- Updates get_imds_data_with_api_fallback() to invoke
get_metadata_from_imds() with the specified retries and infinite
parameters.

- Updates retry_on_url_exc to take a configurable set of HTTP error
codes and exception types to retry on.

- Add IMDS_RETRY_CODES set to retry with when fetching data from IMDS:

  - 404 not found (yet)
  - 410 gone / unavailable (yet)
  - 429 rate-limited/throttled
  - 500 server error

- Replace default callback with imds_readurl_exception_callback,
which configures retry_on_url_exc() with these error codes and
instances.

- Add new pytests for IMDS to eventually replace the unittest
equivalents and improve existing coverage.

Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
blackboxsw pushed a commit to holmanb/cloud-init that referenced this pull request Feb 25, 2022
There are two issues with IMDS retries:

1. IMDS_VER_WANT will never be attempted if retries=0, such as
when fetching network metadata with infinite=True.

2. get_imds_data_with_api_fallback() will attempt one request with
IMDS_VER_WANT.  If the connection fails due to a timeout, connection
issue, or error code other than 400, an empty dictionary will be
returned without attempting the requested number of retries.

This PR:

- Updates get_imds_data_with_api_fallback() to invoke
get_metadata_from_imds() with the specified retries and infinite
parameters.

- Updates retry_on_url_exc to take a configurable set of HTTP error
codes and exception types to retry on.

- Add IMDS_RETRY_CODES set to retry with when fetching data from IMDS:

  - 404 not found (yet)
  - 410 gone / unavailable (yet)
  - 429 rate-limited/throttled
  - 500 server error

- Replace default callback with imds_readurl_exception_callback,
which configures retry_on_url_exc() with these error codes and
instances.

- Add new pytests for IMDS to eventually replace the unittest
equivalents and improve existing coverage.

Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants