OCPBUGS-19400,OCPBUGS-21721: install/0000_90_machine-config-operator_90_deletion: Drop this file #3983
|
@wking: This pull request references Jira Issue OCPBUGS-10924, which is invalid:
The bug has been updated to refer to the pull request using the external bug tracker.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
ace637f (OCPBUGS-10924: Switch default SA to machine-config-operator, 2023-06-23, openshift#3740) moved the 4.14 machine-config operator to a non-default ServiceAccount and ClusterRoleBinding. But 4.13 and earlier remain on the default ServiceAccount.

1cdb75f (install: Recreate and delayed default ServiceAccount deletion, 2023-09-19, openshift#3923, OCPBUGS-19400) brought Recreate logic back to 4.13.14 [1] and later (good), but also brought back a 'delete' manifest for the default ClusterRoleBinding, which leads to the 4.13 cluster-version operator fighting with itself over whether that ClusterRoleBinding should exist (it should exist on 4.13) [2]. For example, [3] updates from 4.12.36 to 4.13.14, and has:

  $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/1707415968109563904/artifacts/e2e-aws-upgrade/clusterversion.json | jq -r '.items[].status.conditions[] | select(.type == "Upgradeable") | .lastTransitionTime + " " + .type + "=" + .status + " " + .reason + ": " + .message'
  2023-09-28T17:09:41Z Upgradeable=False ResourceDeletesInProgress: Cluster minor level upgrades are not allowed while resource deletions are in progress; resources=clusterrolebinding "default-account-openshift-machine-config-operator"

By dropping the deletion manifest from 4.13, we avoid contention between two manifests, and leave the default ClusterRoleBinding alone until a later update to 4.14 will remove it.

[1]: https://amd64.ocp.releases.ci.openshift.org/releasestream/4-stable/release/4.13.14
[2]: https://issues.redhat.com/browse/OCPBUGS-21721
[3]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/1707415968109563904
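For readers without jq at hand, the filter in the commit message above can be approximated in Python. This is only a sketch and not part of the PR: the function name `upgradeable_lines` is hypothetical, and the sample document is trimmed to the single Upgradeable condition quoted in the commit message.

```python
def upgradeable_lines(clusterversion_list):
    """Render Upgradeable conditions in the same shape as the jq filter.

    `clusterversion_list` is the parsed clusterversion.json, a List object
    with an `items` array. The function name is hypothetical.
    """
    lines = []
    for item in clusterversion_list.get("items", []):
        for cond in item.get("status", {}).get("conditions", []):
            if cond.get("type") != "Upgradeable":
                continue  # mirrors select(.type == "Upgradeable")
            lines.append(
                f"{cond.get('lastTransitionTime', '')} {cond['type']}="
                f"{cond.get('status', '')} {cond.get('reason', '')}: "
                f"{cond.get('message', '')}"
            )
    return lines


# Sample trimmed to the one condition quoted in the commit message.
SAMPLE = {
    "items": [{
        "status": {
            "conditions": [{
                "type": "Upgradeable",
                "status": "False",
                "lastTransitionTime": "2023-09-28T17:09:41Z",
                "reason": "ResourceDeletesInProgress",
                "message": (
                    "Cluster minor level upgrades are not allowed while "
                    "resource deletions are in progress; "
                    'resources=clusterrolebinding '
                    '"default-account-openshift-machine-config-operator"'
                ),
            }]
        }
    }]
}

if __name__ == "__main__":
    for line in upgradeable_lines(SAMPLE):
        print(line)
```

Feeding it the real artifact would mean loading the downloaded clusterversion.json with `json.load` and passing the result to the same function.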
Force-pushed from 7a4c537 to 58a087b.
|
@wking: This pull request references Jira Issue OCPBUGS-21721, which is invalid:
The bug has been updated to refer to the pull request using the external bug tracker. |
|
/jira refresh |
|
@wking: This pull request references Jira Issue OCPBUGS-21721, which is invalid: |
|
Ignore the bootstrap-unit failure, I caused that via a previous PR and haven't had a chance to fix it. Regardless, not a blocking job |
|
/jira refresh |
|
@wking: This pull request references Jira Issue OCPBUGS-21721, which is valid. The bug has been moved to the POST state. 6 validation(s) were run on this bug
Requesting review from QA contact: |
|
@wking: The following tests failed, say |
|
/label cherry-pick-approved |
|
pre-merge testing, build ci image with this PR
$ cv version -o yaml | yq -y '.status.conditions'
- lastTransitionTime: '2023-10-17T02:24:27Z'
  message: 'Unable to retrieve available updates: currently reconciling cluster version
    4.13.0-0.ci.test-2023-10-17-020658-ci-ln-ztgfry2-latest not found in the "stable-4.12"
    channel'
  reason: VersionNotFound
  status: 'False'
  type: RetrievedUpdates
- lastTransitionTime: '2023-10-17T02:50:37Z'
  message: Capabilities match configured spec
  reason: AsExpected
  status: 'False'
  type: ImplicitlyEnabledCapabilities
- lastTransitionTime: '2023-10-17T02:24:27Z'
  message: Payload loaded version="4.13.0-0.ci.test-2023-10-17-020658-ci-ln-ztgfry2-latest"
    image="registry.build05.ci.openshift.org/ci-ln-ztgfry2/release:latest" architecture="amd64"
  reason: PayloadLoaded
  status: 'True'
  type: ReleaseAccepted
- lastTransitionTime: '2023-10-17T02:43:14Z'
  message: Done applying 4.13.0-0.ci.test-2023-10-17-020658-ci-ln-ztgfry2-latest
  status: 'True'
  type: Available
- lastTransitionTime: '2023-10-17T03:56:12Z'
  status: 'False'
  type: Failing
- lastTransitionTime: '2023-10-17T03:58:27Z'
  message: Cluster version is 4.13.0-0.ci.test-2023-10-17-020658-ci-ln-ztgfry2-latest
  status: 'False'
  type: Progressing
upgrade history
$ cv version -o yaml | yq -y '.status.history'
- acceptedRisks: 'Target release version="" image="registry.build05.ci.openshift.org/ci-ln-ztgfry2/release:latest"
    cannot be verified, but continuing anyway because the update was forced: release
    images that are not accessed via digest cannot be verified
    Forced through blocking failures: Multiple precondition checks failed:
    * Precondition "ClusterVersionUpgradeable" failed because of "AdminAckRequired":
    Kubernetes 1.26 and therefore OpenShift 4.13 remove several APIs which require
    admin consideration. Please see the knowledge article https://access.redhat.com/articles/6958394
    for details and instructions.
    * Precondition "EtcdRecentBackup" failed because of "ControllerStarted": RecentBackup:
    The etcd backup controller is starting, and will decide if recent backups are
    available or if a backup is required
    * Precondition "ClusterVersionRecommendedUpdate" failed because of "UnknownUpdate":
    RetrievedUpdates=False (VersionNotFound), so the recommended status of updating
    from 4.12.38 to 4.13.0-0.ci.test-2023-10-17-020658-ci-ln-ztgfry2-latest is unknown.'
  completionTime: '2023-10-17T03:58:27Z'
  image: registry.build05.ci.openshift.org/ci-ln-ztgfry2/release:latest
  startedTime: '2023-10-17T02:50:34Z'
  state: Completed
  verified: false
  version: 4.13.0-0.ci.test-2023-10-17-020658-ci-ln-ztgfry2-latest
- completionTime: '2023-10-17T02:43:14Z'
  image: quay.io/openshift-release-dev/ocp-release@sha256:09e50f5d863fdcecdb4a7e3fe2c78e5bec992f10be032acadc317c0d66c79700
  startedTime: '2023-10-17T02:24:27Z'
  state: Completed
  verified: false
  version: 4.12.38
check default clusterrolebinding
$ oc get clusterrolebinding/default-account-openshift-machine-config-operator -n openshift-machine-config-operator
NAME                                                ROLE                        AGE
default-account-openshift-machine-config-operator   ClusterRole/cluster-admin   118m |
|
this makes sense. Thanks Trevor |
|
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: sinnykumari, wking |
|
@wking: Jira Issue OCPBUGS-21721: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-21721 has been moved to the MODIFIED state. |
|
/retitle OCPBUGS-19400,OCPBUGS-21721: install/0000_90_machine-config-operator_90_deletion: Drop this file |
|
@wking: Jira Issue OCPBUGS-19400: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-19400 has been moved to the MODIFIED state. Jira Issue OCPBUGS-21721 is in an unrecognized state (MODIFIED) and will not be moved to the MODIFIED state. |
|
/jira refresh |
|
@rioliu-rh: Jira Issue OCPBUGS-19400 is in an unrecognized state (MODIFIED) and will not be moved to the MODIFIED state. Jira Issue OCPBUGS-21721 is in an unrecognized state (MODIFIED) and will not be moved to the MODIFIED state. |
ace637f (#3740, OCPBUGS-10924) moved the 4.14 machine-config operator to a non-default ServiceAccount and ClusterRoleBinding. But 4.13 and earlier remain on the default ServiceAccount.
1cdb75f (#3923, OCPBUGS-19400) brought Recreate logic back to 4.13.14 and later (good), but also brought back a delete manifest for the default ClusterRoleBinding, which leads to the 4.13 cluster-version operator fighting with itself over whether that ClusterRoleBinding should exist (it should exist on 4.13, OCPBUGS-21721). For example, this CI run updates from 4.12.36 to 4.13.14, and has:
- What I did
By dropping the deletion manifest from 4.13, we avoid contention between two manifests, and leave the default ClusterRoleBinding alone until a later update to 4.14 will remove it.
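To illustrate the kind of contention this change avoids, here is a hedged Python sketch that scans a set of release manifests for objects that one manifest asks the cluster-version operator to maintain while another asks it to delete. It assumes deletion manifests are marked with the `release.openshift.io/delete: "true"` annotation. The helper names and the sample manifests are hypothetical and are not copied from the repository.

```python
# Assumption: deletion manifests carry this annotation with the value "true".
DELETE_ANNOTATION = "release.openshift.io/delete"


def object_key(manifest):
    """Identify an object by (kind, namespace, name); hypothetical helper."""
    meta = manifest.get("metadata", {})
    return (
        manifest.get("kind", ""),
        meta.get("namespace", ""),
        meta.get("name", ""),
    )


def find_contended(manifests):
    """Return keys that appear both as a regular manifest and as a deletion."""
    wanted, deleted = set(), set()
    for manifest in manifests:
        annotations = manifest.get("metadata", {}).get("annotations") or {}
        if annotations.get(DELETE_ANNOTATION) == "true":
            deleted.add(object_key(manifest))
        else:
            wanted.add(object_key(manifest))
    return sorted(wanted & deleted)


# Hypothetical manifests for illustration only.
CRB = "default-account-openshift-machine-config-operator"
KEEP = {"kind": "ClusterRoleBinding", "metadata": {"name": CRB}}
DROP = {
    "kind": "ClusterRoleBinding",
    "metadata": {"name": CRB, "annotations": {DELETE_ANNOTATION: "true"}},
}

if __name__ == "__main__":
    print(find_contended([KEEP, DROP]))  # both manifests present: contention
    print(find_contended([KEEP]))        # deletion manifest dropped: none
```

With both sample manifests present the ClusterRoleBinding is reported as contended; once the deletion manifest is removed from the set, nothing is reported.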
- How to verify it
Update from 4.12 to 4.13. Confirm that the default ClusterRoleBinding exists and that ClusterVersion's Upgradeable is not complaining about "resource deletions are in progress".
- Description for the changelog
Are we doing an MCO change-log? I'd expect this to get handled via Jira fields.
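The two checks listed under "How to verify it" could be automated roughly as follows. This is a sketch under stated assumptions: `verify` is a hypothetical helper, its inputs are a parsed ClusterVersion object and the set of ClusterRoleBinding names already fetched from the cluster by whatever means, and the sample conditions are abbreviated from the outputs quoted earlier in this conversation.

```python
CRB_NAME = "default-account-openshift-machine-config-operator"


def verify(clusterversion, clusterrolebinding_names):
    """Return a list of problems; an empty list means both checks pass.

    Hypothetical helper: gathering the inputs from a live cluster is out
    of scope for this sketch.
    """
    problems = []
    if CRB_NAME not in clusterrolebinding_names:
        problems.append(f"ClusterRoleBinding {CRB_NAME} is missing")
    for cond in clusterversion.get("status", {}).get("conditions", []):
        if (
            cond.get("type") == "Upgradeable"
            and cond.get("status") == "False"
            and "resource deletions are in progress" in cond.get("message", "")
        ):
            problems.append("Upgradeable=False: " + cond["message"])
    return problems


# Abbreviated from the pre-merge test output: no Upgradeable complaint.
FIXED = {"status": {"conditions": [
    {"type": "Failing", "status": "False"},
    {"type": "Progressing", "status": "False"},
]}}

# Abbreviated from the 4.12.36 to 4.13.14 CI run in the commit message.
BROKEN = {"status": {"conditions": [{
    "type": "Upgradeable",
    "status": "False",
    "reason": "ResourceDeletesInProgress",
    "message": "Cluster minor level upgrades are not allowed while "
               "resource deletions are in progress",
}]}}

if __name__ == "__main__":
    print(verify(FIXED, {CRB_NAME}))
    print(verify(BROKEN, {CRB_NAME}))
```

Returning a list of problems rather than a boolean keeps the failure reason visible, which matches how the condition message was used to diagnose the original bug.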