NE 484: Use ingress-operator subcommand instead of hello-openshift by candita · Pull Request #561 · openshift/cluster-ingress-operator

candita · 2021-02-25T21:58:22Z

NE 484: Use ingress-operator subcommand instead of hello-openshift for canary server

Add new subcommand serve-healthcheck
Add servehealthcheck code, same functionality as hello-openshift
Change CANARY_IMAGE to openshift/origin-cluster-ingress-operator:latest
Add a check for Canary Image when computing the operator progressing condition
Update tests

candita · 2021-02-25T23:00:09Z

/retest

candita · 2021-02-26T15:15:04Z

Operator unavailable (OAuthServiceCheckEndpointAccessibleController_EndpointUnavailable::OAuthServiceEndpointsCheckEndpointAccessibleController_EndpointUnavailable::OAuthVersionDeployment_MissingDeployment::ReadyIngressNodes_NoReadyIngressNodes::WellKnown_NotReady): OAuthServiceEndpointsCheckEndpointAccessibleControllerAvailable: Failed to get oauth-openshift enpoints
ReadyIngressNodesAvailable: Authentication requires functional ingress which requires at least one schedulable and ready node. Got 0 worker nodes, 3 master nodes, 0 custom target nodes (none are schedulable or ready for ingress pods).
OAuthServiceCheckEndpointAccessibleControllerAvailable: Get "https://172.30.71.136:443/healthz": dial tcp 172.30.71.136:443: connect: connection refused
WellKnownAvailable: The well-known endpoint is not yet available: failed to get oauth metadata from openshift-config-managed/oauth-openshift ConfigMap: configmap "oauth-openshift" not found (check authentication operator, it is supposed to create this)

/retest

candita · 2021-03-08T16:17:03Z

/retest

candita · 2021-03-08T16:50:10Z

/retest

candita · 2021-03-09T15:20:24Z

fail [github.com/onsi/ginkgo@v4.7.0-origin.0+incompatible/internal/leafnodes/runner.go:64]: kube-apiserver reports a non-graceful termination: ... Message:"Previous pod kube-apiserver-ip-10-0-236-113.ec2.internal started at 2021-03-08 23:10:05.437889569 +0000 UTC did not terminate gracefully",... Probably kubelet or CRI-O is not giving the time to cleanly shut down. This can lead to connection refused and network I/O timeout errors in other components.
/retest

candita · 2021-03-09T18:03:29Z

Observed 1.0416666666666667 leader changes in 22m55s: Leader changes are a result of stopping the etcd leader process or from latency (disk or network), review etcd performance metrics
occurred
/retest

candita · 2021-03-10T04:56:52Z

fail [github.com/onsi/ginkgo@v4.7.0-origin.0+incompatible/internal/leafnodes/runner.go:64]: kube-apiserver reports a non-graceful termination
/retest

candita · 2021-03-10T20:19:03Z

fail [github.com/openshift/origin/test/extended/util/disruption/controlplane/controlplane.go:118]: Mar 10 19:27:57.021: API "oauth-api-available-new-connections" was unreachable during disruption for at least 32s of 57m30s (1%):
/retest

candita · 2021-03-10T22:37:15Z

fail [github.com/openshift/origin/test/extended/util/disruption/controlplane/controlplane.go:118]: Mar 10 21:59:37.344: API "kubernetes-api-available-reused-connections" was unreachable during disruption for at least 44s of 1h0m46s (1%):
/retest e2e-upgrade

openshift-ci-robot · 2021-03-10T22:37:29Z

@candita: The /retest command does not accept any targets.
The following commands are available to trigger jobs:

/test e2e-aws
/test e2e-aws-operator
/test e2e-aws-single-node
/test e2e-azure
/test e2e-azure-operator
/test e2e-gcp-operator
/test e2e-upgrade
/test images
/test unit
/test verify

Use /test all to run the following jobs:

pull-ci-openshift-cluster-ingress-operator-master-e2e-aws
pull-ci-openshift-cluster-ingress-operator-master-e2e-aws-operator
pull-ci-openshift-cluster-ingress-operator-master-e2e-upgrade
pull-ci-openshift-cluster-ingress-operator-master-images
pull-ci-openshift-cluster-ingress-operator-master-unit
pull-ci-openshift-cluster-ingress-operator-master-verify

Details

In response to this:

fail [github.com/openshift/origin/test/extended/util/disruption/controlplane/controlplane.go:118]: Mar 10 21:59:37.344: API "kubernetes-api-available-reused-connections" was unreachable during disruption for at least 44s of 1h0m46s (1%):
/retest e2e-upgrade

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

candita · 2021-03-15T21:46:50Z

Mar 15 19:52:06.573: INFO: cluster upgrade is Progressing: Unable to apply 4.8.0-0.ci.test-2021-03-15-182041-ci-op-ivhwh5xi: an unknown error has occurred: MultipleErrors
Mar 15 19:52:06.573: INFO: cluster upgrade is Failing: Multiple errors are preventing progress:

Could not update clusterrole "cloud-credential-operator-role" (210 of 669)
...
/retest

candita · 2021-03-16T13:50:32Z

Mar 15 23:32:05.850: FAIL: API "oauth-api-available-new-connections" was unreachable during disruption for at least 3s of 1h3m21s (0%):
/retest

candita · 2021-03-16T19:09:16Z

fail [github.com/openshift/origin/test/extended/util/disruption/controlplane/controlplane.go:118]: Mar 16 15:44:32.024: API "openshift-api-available-new-connections" was unreachable during disruption for at least 13s of 1h3m26s (0%):
/retest

Miciah

Looks great over all. Some very minor comments.

Miciah · 2021-03-16T19:36:46Z

-func TestDesiredCanaryDaemonSet(t *testing.T) {
-	canaryImage := "openshift/hello-openshift:latest"
+// canaryImageName is the ingress-operator image
+var canaryImageName = "openshift/origin-cluster-ingress-operator:latest"


Suggested change

var canaryImageName = "openshift/origin-cluster-ingress-operator:latest"

const canaryImageName = "openshift/origin-cluster-ingress-operator:latest"

Moved it to local variable in the only function it is used.

Miciah · 2021-03-16T19:53:56Z

Oh, one question. The PR's description has the following

Add CANARY_COMMAND, set at runtime to the new subcommand

Was that supposed to be an environment variable? I don't see that change.

candita · 2021-03-17T14:22:57Z

pods should never transition back to pending
/retest

candita · 2021-03-17T14:48:53Z

Oh, one question. The PR's description has the following

Add CANARY_COMMAND, set at runtime to the new subcommand

Was that supposed to be an environment variable? I don't see that change.

No, it is an artifact of a different design. I will remove that from the description.

candita · 2021-03-17T20:17:06Z

failed to acquire lease
/retest

Miciah · 2021-03-17T21:15:34Z

+	if err == nil {
+		fmt.Println("Servicing request.")
+	} else {
+		log.Error(err, "could not serve health check")


Hm, if we can use the logger for errors, can we use it also (log.Info) for success?

Changed to use fmt.Println for both to be consistent with a web server pattern.

…r canary server

candita · 2021-03-18T18:39:59Z

E0318 16:32:18.173056 25 portforward.go:400] an error occurred forwarding 37587 -> 80: error forwarding port 80 to pod 2dbbd646c9283c765567808154bf8078e306d5411aa5d9f7f5e3bb7fd208b598, uid : port forward into network namespace "/var/run/netns/0025377f-b51d-4534-8602-cced2e88d979": read tcp 127.0.0.1:45086->127.0.0.1:80: read: connection reset by peer

/test e2e-aws-operator

candita · 2021-03-18T23:22:45Z

fail [github.com/openshift/origin/test/extended/util/disruption/controlplane/controlplane.go:118]: Mar 18 17:18:03.551: API "kubernetes-api-available-reused-connections" was unreachable during disruption for at least 22s of 1h4m12s (1%):
/retest

candita · 2021-03-19T17:06:35Z

error: failed to push image registry.build02.ci.openshift.org/ci-op-8sfgxk35/release:latest: unable to upload the new image manifest: received unexpected HTTP status: 500 Internal Server Error
{"component":"entrypoint","error":"wrapped process failed: exit status 1","file":"prow/entrypoint/run.go:80","func":"k8s.io/test-infra/prow/entrypoint.Options.Run","level":"error","msg":"Error executing test process","severity":"error","time":"2021-03-18T23:29:15Z"}
/retest

Miciah · 2021-03-19T17:12:30Z

/lgtm

openshift-ci-robot · 2021-03-19T17:12:44Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: candita, Miciah

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [Miciah,candita]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-bot · 2021-03-21T02:41:37Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-03-21T04:25:37Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-03-21T07:01:37Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2021-03-21T07:14:37Z

/retest

Please review the full test history for this PR and help us cut down flakes.

All consumers have been moved to other images: openshift/cluster-ingress-operator#561 openshift/console#8459

ART hasn't been building these images since 4.8's a6286e7 (rm hello-openshift, 2021-03-29), with the router/ingress folks pivoting away in openshift/cluster-ingress-operator@8332af2f56 (NE 484: Use ingress-operator subcommand instead of hello-openshift for canary server, 2021-03-18, openshift/cluster-ingress-operator#561). Samples had been installing a hello-openshift ImageStream, but they also got away from that in 2021 with openshift/cluster-samples-operator@d94ad97e49 (delete hello-openshift in payload imagestream via CVO annotation, 2021-06-24, openshift/cluster-samples-operator#380). Origin hasn't entirely gotten out of the hello-openshift business yet [1], but regardless of whether they still have an interest, neither ART nor ingress is involved anymore, and it's been a long time since 4.7 went end-of-life: $ curl -s 'https://access.redhat.com/product-life-cycles/api/v1/products?name=Openshift+Container+Platform+4' | jq -r '.data[].versions[] | select(.name == "4.7").phases[] | .date + " " + .name' 2021-02-24T00:00:00.000Z General availability 2021-10-27T00:00:00.000Z Full support 2022-08-24T00:00:00.000Z Maintenance support N/A Extended update support N/A Extended update support Term 2 N/A Extended life phase So I'm dropping those lines from the mapping. [1]: https://github.com/openshift/origin/tree/94a4d2a6f202f64f89a1b747b44484c358f8299d/examples/hello-openshift

openshift-ci-robot requested review from danehans and knobunc February 25, 2021 21:58

openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 25, 2021

sgreene570 reviewed Mar 1, 2021

View reviewed changes

Comment thread cmd/ingress-operator/start.go

Comment thread pkg/operator/controller/canary/controller.go Outdated

Comment thread pkg/operator/operator.go Outdated

Comment thread pkg/operator/controller/canary/http.go

Comment thread manifests/02-deployment.yaml Outdated

sgreene570 reviewed Mar 1, 2021

View reviewed changes

Comment thread .gitignore Outdated

candita changed the title ~~NE 484: Use ingress-operator subcommand instead of hello-openshift~~ [WIP] NE 484: Use ingress-operator subcommand instead of hello-openshift Mar 3, 2021

openshift-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Mar 3, 2021

candita force-pushed the NE-484-ChangeCanaryCommand branch 2 times, most recently from 718c8f3 to e266bc4 Compare March 5, 2021 18:01

openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 5, 2021

sgreene570 reviewed Mar 5, 2021

View reviewed changes

candita force-pushed the NE-484-ChangeCanaryCommand branch 2 times, most recently from fae1c1a to a4e07bc Compare March 5, 2021 23:58

candita force-pushed the NE-484-ChangeCanaryCommand branch from a4e07bc to 24441ba Compare March 8, 2021 16:25

candita force-pushed the NE-484-ChangeCanaryCommand branch 3 times, most recently from fbf5afe to b96e462 Compare March 8, 2021 22:41

sgreene570 reviewed Mar 10, 2021

View reviewed changes

Comment thread pkg/operator/controller/canary/daemonset.go

candita force-pushed the NE-484-ChangeCanaryCommand branch from b96e462 to 7edb66d Compare March 10, 2021 17:54

candita changed the title ~~[WIP] NE 484: Use ingress-operator subcommand instead of hello-openshift~~ NE 484: Use ingress-operator subcommand instead of hello-openshift Mar 16, 2021

openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 16, 2021

Miciah reviewed Mar 16, 2021

View reviewed changes

candita force-pushed the NE-484-ChangeCanaryCommand branch from 7edb66d to 63bbc10 Compare March 17, 2021 17:29

Miciah reviewed Mar 17, 2021

View reviewed changes

NE 484: Use ingress-operator subcommand instead of hello-openshift fo…

8332af2

…r canary server

candita force-pushed the NE-484-ChangeCanaryCommand branch from 63bbc10 to 8332af2 Compare March 18, 2021 15:27

openshift-ci-robot assigned Miciah Mar 19, 2021

openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Mar 19, 2021

openshift-merge-robot merged commit 7d8cc57 into openshift:master Mar 21, 2021

yselkowitz added a commit to yselkowitz/cluster-samples-operator that referenced this pull request Mar 26, 2021

Remove hello-openshift imagestream

86fd0f8

All consumers have been moved to other images: openshift/cluster-ingress-operator#561 openshift/console#8459

candita mentioned this pull request Jun 23, 2021

[release-4.7] Bug 1973983: Canary: Handle downgrades from 4.8 to 4.7 properly #627

Merged

wking mentioned this pull request May 27, 2025

product: Drop ose-hello-openshift-container mapping openshift-eng/ocp-build-data#6829

Merged

	var canaryImageName = "openshift/origin-cluster-ingress-operator:latest"
	const canaryImageName = "openshift/origin-cluster-ingress-operator:latest"

Conversation

candita commented Feb 25, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

candita commented Feb 25, 2021

Uh oh!

candita commented Feb 26, 2021

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

candita commented Mar 8, 2021

Uh oh!

candita commented Mar 8, 2021

Uh oh!

candita commented Mar 9, 2021

Uh oh!

candita commented Mar 9, 2021

Uh oh!

candita commented Mar 10, 2021

Uh oh!

Uh oh!

candita commented Mar 10, 2021

Uh oh!

candita commented Mar 10, 2021

Uh oh!

openshift-ci-robot commented Mar 10, 2021

Uh oh!

candita commented Mar 15, 2021

Uh oh!

candita commented Mar 16, 2021

Uh oh!

candita commented Mar 16, 2021

Uh oh!

Miciah left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Miciah Mar 16, 2021

Choose a reason for hiding this comment

Uh oh!

candita Mar 17, 2021

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Miciah commented Mar 16, 2021

Uh oh!

candita commented Mar 17, 2021

Uh oh!

candita commented Mar 17, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

candita commented Mar 17, 2021

Uh oh!

Miciah Mar 17, 2021

Choose a reason for hiding this comment

Uh oh!

candita Mar 18, 2021

Choose a reason for hiding this comment

Uh oh!

candita commented Mar 18, 2021

Uh oh!

candita commented Mar 18, 2021

candita commented Feb 25, 2021 •

edited

Loading

candita commented Mar 17, 2021 •

edited

Loading