Skip to content

Conversation

@kiashok
Copy link
Contributor

@kiashok kiashok commented Jul 5, 2023

This PR does the following:

  • Uses the function introduced by Add support for platform compatibility check for windows microsoft/hcsshim#1821 to invoke stable ABI compliant function in windows platform matcher. (In this PR, this change is limited to platforms/default_windows.go file. platform/default_windows_test.go file has some new tests added).
    Exact platform match is required for OS Versions < WS2022. The exact version match is not needed in versions > WS2022 due to stable ABI compliance.

@k8s-ci-robot
Copy link

Hi @kiashok. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@kiashok kiashok changed the title Invoke stable ABI compliant platform check in windows matcher Invoke stable ABI compliant platform check in windows matcher + update hcsshim tag Jul 5, 2023
@kiashok
Copy link
Contributor Author

kiashok commented Jul 5, 2023

@jterry75 @dmcgowan @jsturtevant @dcantah could you please take a look at this PR when you have some time please?

@kiashok kiashok force-pushed the stableABI_WindowsMatcher branch from 6fbde64 to 63bc24c Compare July 5, 2023 21:52
@dcantah
Copy link
Member

dcantah commented Jul 6, 2023

@kiashok Can you check the project check go.mod errors?

@kiashok kiashok force-pushed the stableABI_WindowsMatcher branch 2 times, most recently from c0a88ac to 78fd736 Compare July 11, 2023 23:29
@dcantah
Copy link
Member

dcantah commented Jul 12, 2023

The project checks are still unhappy with the vendoring

@kiashok
Copy link
Contributor Author

kiashok commented Jul 12, 2023

The project checks are still unhappy with the vendoring

go mod tidy is messing with the versions after making the change to integration/client/go.mod file . Taking a look

@kiashok kiashok force-pushed the stableABI_WindowsMatcher branch 2 times, most recently from 24f74bd to e6bb2e7 Compare July 13, 2023 17:27
@kiashok
Copy link
Contributor Author

kiashok commented Jul 13, 2023

The project checks are still unhappy with the vendoring

resolved :)

@kiashok
Copy link
Contributor Author

kiashok commented Jul 13, 2023

@dmcgowan could you take a look when you have some time please?

@kiashok
Copy link
Contributor Author

kiashok commented Jul 13, 2023

@estesp the test failures seem to be unrelated to my change. Could you please restart them?

@dcantah
Copy link
Member

dcantah commented Jul 13, 2023

I restarted them, but the failures odd that it's every time for all windows runs

@kiashok
Copy link
Contributor Author

kiashok commented Jul 13, 2023

I restarted them, but the failures odd that it's every time for all windows runs

Created a draft PR with just the hcsshim tag update changes and the same test seems to fail: #8811. Pretty unrelated to my change. Investigating..

@kiashok
Copy link
Contributor Author

kiashok commented Jul 14, 2023

I restarted them, but the failures odd that it's every time for all windows runs

Created a draft PR with just the hcsshim tag update changes and the same test seems to fail: #8811. Pretty unrelated to my change. Investigating..

@dcantah doesn't seem like these are related to any of the changes in this PR. Is there a way to kick off the windows integration runs for existing current main branch to see if the tests pass or are flaky there?

@kiashok kiashok force-pushed the stableABI_WindowsMatcher branch from e6bb2e7 to 31d7205 Compare July 18, 2023 00:42
@kiashok kiashok changed the title Invoke stable ABI compliant platform check in windows matcher + update hcsshim tag Invoke stable ABI compliant platform check in windows matcher Jul 18, 2023
@kiashok kiashok force-pushed the stableABI_WindowsMatcher branch from 31d7205 to 0e83df7 Compare July 18, 2023 23:00
@kiashok
Copy link
Contributor Author

kiashok commented Jul 19, 2023

@dmcgowan Hi Derek there are some test failures in this PR related to the tests you added in import_test.go with #7964 . I see that this image is creating image with linux as the OS. Typically these TestTransferImport() functions should be skipped for windows OS runs right? Is my understanding correct? The tests are failing with some race condition while calling client.Transfer() and I am not sure how my changes are triggering this failures and was looking into this.

@kiashok kiashok force-pushed the stableABI_WindowsMatcher branch from 0e83df7 to de8f2b2 Compare July 19, 2023 18:29
@kiashok kiashok force-pushed the stableABI_WindowsMatcher branch 2 times, most recently from 768a0ce to 2d3061d Compare July 21, 2023 04:10
@kiashok
Copy link
Contributor Author

kiashok commented Jul 21, 2023

@dmcgowan Hi Derek there are some test failures in this PR related to the tests you added in import_test.go with #7964 . I see that this image is creating image with linux as the OS. Typically these TestTransferImport() functions should be skipped for windows OS runs right? Is my understanding correct? The tests are failing with some race condition while calling client.Transfer() and I am not sure how my changes are triggering this failures and was looking into this.

The test failures here definitely seem to be a race in the logrus or transfer service and I've opened a bug here - #8853 . This is an already existing race condition which is just being exposed due to the slightly longer time that WindowsPlatformMatcher.Match() will take with the new changes. The TestMain() in integration/client/client_test.go calls client.Pull() in https://github.com/containerd/containerd/blob/main/integration/client/client_test.go#L152 where the platform matcher will be invoked before calling m.Run() that starts the different Transfer service tests. While I continue to take a look at how this data race can be resolved, could we unblock this PR as it is not causing these failures? Thoughts @jterry75 @dcantah ?

@kiashok
Copy link
Contributor Author

kiashok commented Jul 21, 2023

@dmcgowan Hi Derek there are some test failures in this PR related to the tests you added in import_test.go with #7964 . I see that this image is creating image with linux as the OS. Typically these TestTransferImport() functions should be skipped for windows OS runs right? Is my understanding correct? The tests are failing with some race condition while calling client.Transfer() and I am not sure how my changes are triggering this failures and was looking into this.

The test failures here definitely seem to be a race in the logrus or transfer service and I've opened a bug here - #8853 . This is an already existing race condition which is just being exposed due to the slightly longer time that WindowsPlatformMatcher.Match() will take with the new changes. The TestMain() in integration/client/client_test.go calls client.Pull() in https://github.com/containerd/containerd/blob/main/integration/client/client_test.go#L152 where the platform matcher will be invoked before calling m.Run() that starts the different Transfer service tests. While I continue to take a look at how this data race can be resolved, could we unblock this PR as it is not causing these failures? Thoughts @jterry75 @dcantah @mikebrow ?

@kiashok
Copy link
Contributor Author

kiashok commented Jul 26, 2023

@dmcgowan Hi Derek there are some test failures in this PR related to the tests you added in import_test.go with #7964 . I see that this image is creating image with linux as the OS. Typically these TestTransferImport() functions should be skipped for windows OS runs right? Is my understanding correct? The tests are failing with some race condition while calling client.Transfer() and I am not sure how my changes are triggering this failures and was looking into this.

The test failures here definitely seem to be a race in the logrus or transfer service and I've opened a bug here - #8853 . This is an already existing race condition which is just being exposed due to the slightly longer time that WindowsPlatformMatcher.Match() will take with the new changes. The TestMain() in integration/client/client_test.go calls client.Pull() in https://github.com/containerd/containerd/blob/main/integration/client/client_test.go#L152 where the platform matcher will be invoked before calling m.Run() that starts the different Transfer service tests. While I continue to take a look at how this data race can be resolved, could we unblock this PR as it is not causing these failures? Thoughts @jterry75 @dcantah @mikebrow ?

Ping on this :)

@mikebrow
Copy link
Member

mikebrow commented Aug 8, 2023

restarted failed buckets..

@kiashok
Copy link
Contributor Author

kiashok commented Aug 8, 2023

restarted failed buckets..

the transfer service tests that are failing due to data race in transfer service do not even run for windows (checked recent PRs that are open and merged). I am not sure how a change in windows platform matcher is causing these tests to run on windows. Seems pretty unrelated to the changes I am making. Could this PR be merged as the changes are unrelated?
Opened a bug in containerd to track this #8853 and will look into this hopefully this week or the next.

cc @dmcgowan

@mikebrow
Copy link
Member

mikebrow commented Aug 8, 2023

you could rebase to pick up the latest ci/cd

@kiashok kiashok force-pushed the stableABI_WindowsMatcher branch from 2d3061d to 14e61d5 Compare August 9, 2023 16:30
@kiashok
Copy link
Contributor Author

kiashok commented Aug 9, 2023

you could rebase to pick up the latest ci/cd

Hi @mikebrow , the failure does not show up as data race anymore. It looks like there is some trouble loading transfer service and there is a premature closure of the pipe? Has anything changed in transfer service lately? Would really appreciate your help with this!

image

@kiashok kiashok force-pushed the stableABI_WindowsMatcher branch from 14e61d5 to f49947d Compare August 10, 2023 16:48
@kiashok
Copy link
Contributor Author

kiashok commented Aug 10, 2023

you could rebase to pick up the latest ci/cd

Hi @mikebrow , the failure does not show up as data race anymore. It looks like there is some trouble loading transfer service and there is a premature closure of the pipe? Has anything changed in transfer service lately? Would really appreciate your help with this!

image

@mikebrow there is a still a data race after rebasing this PR. One strange thing I do see is that TestTransferImport (oneo of the failing test) is being skipped for windows tests - https://github.com/containerd/containerd/actions/runs/5813022917/job/15759858411#step:19:104 as testing.short() evaluates to true. But in this PR its attempting to run these tests and I am not sure why (as this PR is doing nothing to change behavior). This makes me think that the data race showing up in PR test failures existed from day 1?

Signed-off-by: Kirtana Ashok <kiashok@microsoft.com>
@kiashok kiashok force-pushed the stableABI_WindowsMatcher branch 2 times, most recently from 2789a8e to bcc8269 Compare August 15, 2023 20:22
@kiashok
Copy link
Contributor Author

kiashok commented Aug 15, 2023

@mikebrow the windows integration test failures have now been fixed. The CI/Binaries (windows-2022, 1.19.12) says failed but when I access the logs, I do not see any failures/errors being reported. Do you know why it could be flagging the test suites as failed? Could you please restart this suite if needed?

@kiashok kiashok force-pushed the stableABI_WindowsMatcher branch from bcc8269 to 884af45 Compare August 15, 2023 21:30
@kiashok
Copy link
Contributor Author

kiashok commented Aug 15, 2023

@mikebrow the windows integration test failures have now been fixed. The CI/Binaries (windows-2022, 1.19.12) says failed but when I access the logs, I do not see any failures/errors being reported. Do you know why it could be flagging the test suites as failed? Could you please restart this suite if needed?

Please ignore. I was able to do a force push to the PR to restart the tests without intervention. Thanks!

- Fill OSVersion field of ocispec.Platform for windows OS in
transfer service plugin init()
- Do not return error from transfer service ReceiveStream if
stream.Recv() returned context.Canceled error

Signed-off-by: Kirtana Ashok <kiashok@microsoft.com>
@kiashok kiashok force-pushed the stableABI_WindowsMatcher branch from 884af45 to 823e042 Compare August 15, 2023 22:33
Copy link
Contributor

@jsturtevant jsturtevant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, one minor comment about test scenarios.

This looks like it would also fix #8563 with option number 3 in the issue (repopluate OS version during Parse). This works for now and we can address the Hyper-v scenarios later.

@kiashok
Copy link
Contributor Author

kiashok commented Aug 15, 2023

All CI runs are now passing

Copy link
Member

@fuweid fuweid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cherry-picked/1.7.x PR commits are cherry-picked into release/1.7 branch needs-ok-to-test platform/windows Windows

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

9 participants