Skip to content

NE 484: Use ingress-operator subcommand instead of hello-openshift #561

Merged
openshift-merge-robot merged 1 commit into
openshift:masterfrom
candita:NE-484-ChangeCanaryCommand
Mar 21, 2021
Merged

NE 484: Use ingress-operator subcommand instead of hello-openshift #561
openshift-merge-robot merged 1 commit into
openshift:masterfrom
candita:NE-484-ChangeCanaryCommand

Conversation

@candita

@candita candita commented Feb 25, 2021

Copy link
Copy Markdown
Contributor

NE 484: Use ingress-operator subcommand instead of hello-openshift for canary server

  • Add new subcommand serve-healthcheck
  • Add servehealthcheck code, same functionality as hello-openshift
  • Change CANARY_IMAGE to openshift/origin-cluster-ingress-operator:latest
  • Add a check for Canary Image when computing the operator progressing condition
  • Update tests

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 25, 2021
@candita

candita commented Feb 25, 2021

Copy link
Copy Markdown
Contributor Author

/retest

@candita

candita commented Feb 26, 2021

Copy link
Copy Markdown
Contributor Author

Operator unavailable (OAuthServiceCheckEndpointAccessibleController_EndpointUnavailable::OAuthServiceEndpointsCheckEndpointAccessibleController_EndpointUnavailable::OAuthVersionDeployment_MissingDeployment::ReadyIngressNodes_NoReadyIngressNodes::WellKnown_NotReady): OAuthServiceEndpointsCheckEndpointAccessibleControllerAvailable: Failed to get oauth-openshift enpoints
ReadyIngressNodesAvailable: Authentication requires functional ingress which requires at least one schedulable and ready node. Got 0 worker nodes, 3 master nodes, 0 custom target nodes (none are schedulable or ready for ingress pods).
OAuthServiceCheckEndpointAccessibleControllerAvailable: Get "https://172.30.71.136:443/healthz": dial tcp 172.30.71.136:443: connect: connection refused
WellKnownAvailable: The well-known endpoint is not yet available: failed to get oauth metadata from openshift-config-managed/oauth-openshift ConfigMap: configmap "oauth-openshift" not found (check authentication operator, it is supposed to create this)

/retest

Comment thread cmd/ingress-operator/start.go
Comment thread pkg/operator/controller/canary/controller.go Outdated
Comment thread pkg/operator/operator.go Outdated
Comment thread pkg/operator/controller/canary/http.go
Comment thread manifests/02-deployment.yaml Outdated
Comment thread .gitignore Outdated
@candita candita changed the title NE 484: Use ingress-operator subcommand instead of hello-openshift [WIP] NE 484: Use ingress-operator subcommand instead of hello-openshift Mar 3, 2021
@openshift-ci-robot openshift-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Mar 3, 2021
@candita candita force-pushed the NE-484-ChangeCanaryCommand branch 2 times, most recently from 718c8f3 to e266bc4 Compare March 5, 2021 18:01
@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 5, 2021
Comment thread pkg/operator/controller/names.go Outdated
Comment thread pkg/operator/controller/canary/daemonset_test.go Outdated
Comment thread cmd/ingress-operator/start.go Outdated
Comment thread pkg/operator/config/config.go Outdated
Comment thread assets/canary/daemonset.yaml
Comment thread pkg/operator/controller/canary/controller.go Outdated
Comment thread pkg/operator/controller/status/controller.go Outdated
@candita candita force-pushed the NE-484-ChangeCanaryCommand branch 2 times, most recently from fae1c1a to a4e07bc Compare March 5, 2021 23:58
@candita

candita commented Mar 8, 2021

Copy link
Copy Markdown
Contributor Author

/retest

@candita candita force-pushed the NE-484-ChangeCanaryCommand branch from a4e07bc to 24441ba Compare March 8, 2021 16:25
@candita

candita commented Mar 8, 2021

Copy link
Copy Markdown
Contributor Author

/retest

@candita candita force-pushed the NE-484-ChangeCanaryCommand branch 3 times, most recently from fbf5afe to b96e462 Compare March 8, 2021 22:41
@candita

candita commented Mar 9, 2021

Copy link
Copy Markdown
Contributor Author

fail [github.com/onsi/ginkgo@v4.7.0-origin.0+incompatible/internal/leafnodes/runner.go:64]: kube-apiserver reports a non-graceful termination: ... Message:"Previous pod kube-apiserver-ip-10-0-236-113.ec2.internal started at 2021-03-08 23:10:05.437889569 +0000 UTC did not terminate gracefully",... Probably kubelet or CRI-O is not giving the time to cleanly shut down. This can lead to connection refused and network I/O timeout errors in other components.
/retest

@candita

candita commented Mar 9, 2021

Copy link
Copy Markdown
Contributor Author

Observed 1.0416666666666667 leader changes in 22m55s: Leader changes are a result of stopping the etcd leader process or from latency (disk or network), review etcd performance metrics
occurred
/retest

@candita

candita commented Mar 10, 2021

Copy link
Copy Markdown
Contributor Author

fail [github.com/onsi/ginkgo@v4.7.0-origin.0+incompatible/internal/leafnodes/runner.go:64]: kube-apiserver reports a non-graceful termination
/retest

Comment thread pkg/operator/controller/canary/daemonset.go
@candita candita force-pushed the NE-484-ChangeCanaryCommand branch from b96e462 to 7edb66d Compare March 10, 2021 17:54
@candita

candita commented Mar 10, 2021

Copy link
Copy Markdown
Contributor Author

fail [github.com/openshift/origin/test/extended/util/disruption/controlplane/controlplane.go:118]: Mar 10 19:27:57.021: API "oauth-api-available-new-connections" was unreachable during disruption for at least 32s of 57m30s (1%):
/retest

@candita

candita commented Mar 10, 2021

Copy link
Copy Markdown
Contributor Author

fail [github.com/openshift/origin/test/extended/util/disruption/controlplane/controlplane.go:118]: Mar 10 21:59:37.344: API "kubernetes-api-available-reused-connections" was unreachable during disruption for at least 44s of 1h0m46s (1%):
/retest e2e-upgrade

@openshift-ci-robot

Copy link
Copy Markdown
Contributor

@candita: The /retest command does not accept any targets.
The following commands are available to trigger jobs:

  • /test e2e-aws
  • /test e2e-aws-operator
  • /test e2e-aws-single-node
  • /test e2e-azure
  • /test e2e-azure-operator
  • /test e2e-gcp-operator
  • /test e2e-upgrade
  • /test images
  • /test unit
  • /test verify

Use /test all to run the following jobs:

  • pull-ci-openshift-cluster-ingress-operator-master-e2e-aws
  • pull-ci-openshift-cluster-ingress-operator-master-e2e-aws-operator
  • pull-ci-openshift-cluster-ingress-operator-master-e2e-upgrade
  • pull-ci-openshift-cluster-ingress-operator-master-images
  • pull-ci-openshift-cluster-ingress-operator-master-unit
  • pull-ci-openshift-cluster-ingress-operator-master-verify
Details

In response to this:

fail [github.com/openshift/origin/test/extended/util/disruption/controlplane/controlplane.go:118]: Mar 10 21:59:37.344: API "kubernetes-api-available-reused-connections" was unreachable during disruption for at least 44s of 1h0m46s (1%):
/retest e2e-upgrade

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@candita

candita commented Mar 15, 2021

Copy link
Copy Markdown
Contributor Author

Mar 15 19:52:06.573: INFO: cluster upgrade is Progressing: Unable to apply 4.8.0-0.ci.test-2021-03-15-182041-ci-op-ivhwh5xi: an unknown error has occurred: MultipleErrors
Mar 15 19:52:06.573: INFO: cluster upgrade is Failing: Multiple errors are preventing progress:

  • Could not update clusterrole "cloud-credential-operator-role" (210 of 669)
    ...
    /retest

@candita

candita commented Mar 16, 2021

Copy link
Copy Markdown
Contributor Author

Mar 15 23:32:05.850: FAIL: API "oauth-api-available-new-connections" was unreachable during disruption for at least 3s of 1h3m21s (0%):
/retest

@candita

candita commented Mar 16, 2021

Copy link
Copy Markdown
Contributor Author

fail [github.com/openshift/origin/test/extended/util/disruption/controlplane/controlplane.go:118]: Mar 16 15:44:32.024: API "openshift-api-available-new-connections" was unreachable during disruption for at least 13s of 1h3m26s (0%):
/retest

@candita candita changed the title [WIP] NE 484: Use ingress-operator subcommand instead of hello-openshift NE 484: Use ingress-operator subcommand instead of hello-openshift Mar 16, 2021
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 16, 2021

@Miciah Miciah left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great over all. Some very minor comments.

Comment thread cmd/ingress-operator/serve-healthcheck.go Outdated
Comment thread assets/canary/daemonset.yaml
Comment thread cmd/ingress-operator/serve-healthcheck.go Outdated
Comment thread cmd/ingress-operator/serve-healthcheck.go
func TestDesiredCanaryDaemonSet(t *testing.T) {
canaryImage := "openshift/hello-openshift:latest"
// canaryImageName is the ingress-operator image
var canaryImageName = "openshift/origin-cluster-ingress-operator:latest"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
var canaryImageName = "openshift/origin-cluster-ingress-operator:latest"
const canaryImageName = "openshift/origin-cluster-ingress-operator:latest"

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved it to local variable in the only function it is used.

Comment thread pkg/operator/controller/status/controller_test.go
Comment thread pkg/operator/controller/status/controller_test.go
Comment thread pkg/operator/controller/status/controller_test.go Outdated
Comment thread pkg/operator/controller/canary/http.go
Comment thread pkg/operator/operator.go
@Miciah

Miciah commented Mar 16, 2021

Copy link
Copy Markdown
Contributor

Oh, one question. The PR's description has the following

  • Add CANARY_COMMAND, set at runtime to the new subcommand

Was that supposed to be an environment variable? I don't see that change.

@candita

candita commented Mar 17, 2021

Copy link
Copy Markdown
Contributor Author

pods should never transition back to pending
/retest

@candita

candita commented Mar 17, 2021

Copy link
Copy Markdown
Contributor Author

Oh, one question. The PR's description has the following

  • Add CANARY_COMMAND, set at runtime to the new subcommand

Was that supposed to be an environment variable? I don't see that change.

No, it is an artifact of a different design. I will remove that from the description.

@candita candita force-pushed the NE-484-ChangeCanaryCommand branch from 7edb66d to 63bbc10 Compare March 17, 2021 17:29
@candita

candita commented Mar 17, 2021

Copy link
Copy Markdown
Contributor Author

failed to acquire lease
/retest

if err == nil {
fmt.Println("Servicing request.")
} else {
log.Error(err, "could not serve health check")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, if we can use the logger for errors, can we use it also (log.Info) for success?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to use fmt.Println for both to be consistent with a web server pattern.

@candita candita force-pushed the NE-484-ChangeCanaryCommand branch from 63bbc10 to 8332af2 Compare March 18, 2021 15:27
@candita

candita commented Mar 18, 2021

Copy link
Copy Markdown
Contributor Author

E0318 16:32:18.173056 25 portforward.go:400] an error occurred forwarding 37587 -> 80: error forwarding port 80 to pod 2dbbd646c9283c765567808154bf8078e306d5411aa5d9f7f5e3bb7fd208b598, uid : port forward into network namespace "/var/run/netns/0025377f-b51d-4534-8602-cced2e88d979": read tcp 127.0.0.1:45086->127.0.0.1:80: read: connection reset by peer

/test e2e-aws-operator

@candita

candita commented Mar 18, 2021

Copy link
Copy Markdown
Contributor Author

fail [github.com/openshift/origin/test/extended/util/disruption/controlplane/controlplane.go:118]: Mar 18 17:18:03.551: API "kubernetes-api-available-reused-connections" was unreachable during disruption for at least 22s of 1h4m12s (1%):
/retest

@candita

candita commented Mar 19, 2021

Copy link
Copy Markdown
Contributor Author

error: failed to push image registry.build02.ci.openshift.org/ci-op-8sfgxk35/release:latest: unable to upload the new image manifest: received unexpected HTTP status: 500 Internal Server Error
{"component":"entrypoint","error":"wrapped process failed: exit status 1","file":"prow/entrypoint/run.go:80","func":"k8s.io/test-infra/prow/entrypoint.Options.Run","level":"error","msg":"Error executing test process","severity":"error","time":"2021-03-18T23:29:15Z"}
/retest

@Miciah

Miciah commented Mar 19, 2021

Copy link
Copy Markdown
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Mar 19, 2021
@openshift-ci-robot

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: candita, Miciah

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot

Copy link
Copy Markdown
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

3 similar comments
@openshift-bot

Copy link
Copy Markdown
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot

Copy link
Copy Markdown
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot

Copy link
Copy Markdown
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit 7d8cc57 into openshift:master Mar 21, 2021
yselkowitz added a commit to yselkowitz/cluster-samples-operator that referenced this pull request Mar 26, 2021
wking added a commit to wking/ocp-build-data that referenced this pull request May 27, 2025
ART hasn't been building these images since 4.8's a6286e7 (rm
hello-openshift, 2021-03-29), with the router/ingress folks pivoting
away in openshift/cluster-ingress-operator@8332af2f56 (NE 484: Use
ingress-operator subcommand instead of hello-openshift for canary
server, 2021-03-18, openshift/cluster-ingress-operator#561).  Samples
had been installing a hello-openshift ImageStream, but they also got
away from that in 2021 with
openshift/cluster-samples-operator@d94ad97e49 (delete hello-openshift
in payload imagestream via CVO annotation, 2021-06-24,
openshift/cluster-samples-operator#380).  Origin hasn't entirely
gotten out of the hello-openshift business yet [1], but regardless of
whether they still have an interest, neither ART nor ingress is
involved anymore, and it's been a long time since 4.7 went
end-of-life:

$ curl -s 'https://access.redhat.com/product-life-cycles/api/v1/products?name=Openshift+Container+Platform+4' | jq -r '.data[].versions[] | select(.name == "4.7").phases[] | .date + " " + .name'
2021-02-24T00:00:00.000Z General availability
2021-10-27T00:00:00.000Z Full support
2022-08-24T00:00:00.000Z Maintenance support
N/A Extended update support
N/A Extended update support Term 2
N/A Extended life phase

So I'm dropping those lines from the mapping.

[1]: https://github.com/openshift/origin/tree/94a4d2a6f202f64f89a1b747b44484c358f8299d/examples/hello-openshift
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants