Skip to content

Applying patch from https://github.com/moby/buildkit/pull/2369#3

Merged
jmacelroy merged 2 commits intov0.8.3-okteto8from
jdm/v0.8.3-okteto8
Oct 19, 2021
Merged

Applying patch from https://github.com/moby/buildkit/pull/2369#3
jmacelroy merged 2 commits intov0.8.3-okteto8from
jdm/v0.8.3-okteto8

Conversation

@jmacelroy
Copy link
Copy Markdown

closes moby#2367
closes moby#2088

hosts mutex is called on initialization, meaning GetResolver might
block if it is in the middle of auth exchange. This is currently bad
in the case where the Job initialization needs to register a name before
timeout is reached.

Authorizer stores the current session.Group so if it is
overwritten for another resolver it means that session might
have been dropped and authentication will fail. I haven't been able
to reproduce this issue so not 100% it fixes the case but even if it is
a different race it looks incorrect anyway.

For the Job registration, we might want to do some follow-up change
that avoids running things before job ID is registered.

Jacob MacElroy added 2 commits October 18, 2021 14:38
closes moby#2088

hosts mutex is called on initialization, meaning GetResolver might
block if it is in the middle of auth exchange. This is currently bad
in the case where the Job initialization needs to register a name before
timeout is reached.

Authorizer stores the current session.Group so if it is
overwritten for another resolver it means that session might
have been dropped and authentication will fail. I haven't been able
to reproduce this issue so not 100% it fixes the case but even if it is
a different race it looks incorrect anyway.

For the Job registration, we might want to do some follow-up change
that avoids running things before job ID is registered.
@jmacelroy
Copy link
Copy Markdown
Author

We can't go with v0.9.0 release due to the 401 issue that we hit often with preview environments. With this commit the intention is to pull in the fix for moby#2367 which is affecting Sirona and apply it to our custom v0.8.3 branch that is now more custom.

I have run some simple tests on my dev machine using us.gcr.io/development-300207/buildkit:v0.8.3-okteto8-rootless

Feel free to pull down a us.gcr image @pchico83 and once this is approved I'll merge and create a build for hub.docker.com

Copy link
Copy Markdown

@pchico83 pchico83 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jmacelroy not sure if we should release it to cloud, or send the new image to sirona to validate if it's working first

@ifbyol
Copy link
Copy Markdown
Member

ifbyol commented Oct 19, 2021

@jmacelroy not sure if we should release it to cloud, or send the new image to sirona to validate if it's working first

@pchico83 before sending it to Sirona, maybe we can update staging and try some website preview deployments to make sure it keeps working and we don't face new (different) issues.

@pchico83
Copy link
Copy Markdown

@jmacelroy not sure if we should release it to cloud, or send the new image to sirona to validate if it's working first

@pchico83 before sending it to Sirona, maybe we can update staging and try some website preview deployments to make sure it keeps working and we don't face new (different) issues.

Right, what I mean is that we don't need an OE release to send it to Sirona. Nor even updating Cloud with it.

@jmacelroy
Copy link
Copy Markdown
Author

@jmacelroy not sure if we should release it to cloud, or send the new image to sirona to validate if it's working first

@pchico83 before sending it to Sirona, maybe we can update staging and try some website preview deployments to make sure it keeps working and we don't face new (different) issues.

Right, what I mean is that we don't need an OE release to send it to Sirona. Nor even updating Cloud with it.

While the change doesn't seem high risk I also don't want to cause churn for Sirona if this build has issues. I'd like to try out some more manual testing and I'd like to soak in staging at least a day. I'll mention this to Sirona and let them decide, I know they want the fix but I don't know how aggressive they want to be. What do you think @pchico83 @ifbyol ?

@jmacelroy jmacelroy merged commit 8682c99 into v0.8.3-okteto8 Oct 19, 2021
@pchico83
Copy link
Copy Markdown

@jmacelroy not sure if we should release it to cloud, or send the new image to sirona to validate if it's working first

@pchico83 before sending it to Sirona, maybe we can update staging and try some website preview deployments to make sure it keeps working and we don't face new (different) issues.

Right, what I mean is that we don't need an OE release to send it to Sirona. Nor even updating Cloud with it.

While the change doesn't seem high risk I also don't want to cause churn for Sirona if this build has issues. I'd like to try out some more manual testing and I'd like to soak in staging at least a day. I'll mention this to Sirona and let them decide, I know they want the fix but I don't know how aggressive they want to be. What do you think @pchico83 @ifbyol ?

Perfect!

@rberrelleza
Copy link
Copy Markdown
Member

There is already a lot of discussions inside of Sirona around Okteto's stability. Anything we can do on our side to mitigate this would be great.

I agree with @jmacelroy idea of putting it first in cloud (assuming we can monitor build failure rates or something like that) to increase our confidence that this change won't cause regressions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants