Modify all health checks to be specified via enums #2078
Merged
The set of health checks to be executed was dependent on a combination of check enums and boolean options. This change modifies the health checks to be governed strictly by a set of enums.

Next steps:
- tightly couple category IDs to names
- tightly couple checks to their parent categories
- programmatic control over check ordering

Signed-off-by: Andrew Seigner <siggy@buoyant.io>
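To illustrate the idea of selecting checks strictly by enum, here is a minimal sketch. The `Checks` type, its values, the `checkersFor` helper, and the check descriptions are hypothetical names chosen for illustration; they are not taken from the Linkerd codebase, and the check bodies are placeholders.

```go
package main

import "fmt"

// Checks is a hypothetical enum; each value names one toggle-able set of checks.
type Checks int

const (
	KubernetesAPIChecks Checks = iota
	PreInstallChecks
	DataPlaneChecks
)

// checker pairs a human-readable description with the check itself.
type checker struct {
	description string
	check       func() error
}

// checkersFor builds the list of checkers from the requested enums alone.
// No boolean option influences which checks are included.
func checkersFor(requested []Checks) []checker {
	pass := func() error { return nil } // placeholder check body (assumption)
	var out []checker
	for _, c := range requested {
		switch c {
		case KubernetesAPIChecks:
			out = append(out, checker{"can query the Kubernetes API", pass})
		case PreInstallChecks:
			out = append(out, checker{"can create ClusterRoles", pass})
		case DataPlaneChecks:
			out = append(out, checker{"data plane proxies are ready", pass})
		}
	}
	return out
}

func main() {
	// The caller states exactly which sets of checks it wants.
	for _, c := range checkersFor([]Checks{KubernetesAPIChecks, DataPlaneChecks}) {
		fmt.Printf("%s: err=%v\n", c.description, c.check())
	}
}
```

With this shape, a caller that wants a different set of checks passes a different slice of enums rather than flipping a boolean alongside them.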
siggy
added a commit
that referenced
this pull request
Jan 14, 2019
The `linkerd check` command organized the various checks via loosely coupled category IDs, category names, and checkers themselves, all with ordering defined by consumers of this code. This change removes category IDs in favor of category names, groups all checkers by category, and enforces ordering at the `HealthChecker` level. Part of #1471, depends on #2078. Signed-off-by: Andrew Seigner <siggy@buoyant.io>
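A rough sketch of what grouping checkers by category and fixing the order inside `HealthChecker` could look like follows. Only the names `HealthChecker` and `checker` appear in this thread; the `category` struct, the `RunChecks` signature, the category names, and the check descriptions are assumptions made for illustration and may not match the actual change.

```go
package main

import (
	"errors"
	"fmt"
)

// checker pairs a description with the check to run.
type checker struct {
	description string
	check       func() error
}

// category (hypothetical layout) owns its checkers, identified by name only.
type category struct {
	name     string
	checkers []checker
}

// HealthChecker holds categories in a slice, so the slice order is the
// execution order and consumers cannot reorder checks.
type HealthChecker struct {
	categories []category
}

// RunChecks runs every checker, category by category, reporting each result
// to observe. It returns true only if all checks passed.
func (hc *HealthChecker) RunChecks(observe func(categoryName, description string, err error)) bool {
	success := true
	for _, cat := range hc.categories {
		for _, c := range cat.checkers {
			err := c.check()
			if err != nil {
				success = false
			}
			observe(cat.name, c.description, err)
		}
	}
	return success
}

// newExampleChecker builds a checker with illustrative categories and checks.
func newExampleChecker() *HealthChecker {
	pass := func() error { return nil }
	fail := func() error { return errors.New("namespace not found") }
	return &HealthChecker{categories: []category{
		{name: "kubernetes-api", checkers: []checker{
			{"can initialize the client", pass},
			{"can query the Kubernetes API", pass},
		}},
		{name: "linkerd-existence", checkers: []checker{
			{"control plane namespace exists", fail},
		}},
	}}
}

func main() {
	ok := newExampleChecker().RunChecks(func(cat, desc string, err error) {
		fmt.Printf("[%s] %s: err=%v\n", cat, desc, err)
	})
	fmt.Println("all checks passed:", ok)
}
```

Because each checker lives inside its category, there is no separate category ID to keep in sync with a name, and the output order follows the order in which categories are declared.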
klingerf
approved these changes
Jan 15, 2019
Contributor
klingerf
left a comment
⭐️ This is great! Much easier to reason about now that those boolean variables are gone.
// TODO: refactor with LinkerdPreInstallSingleNamespaceChecks
roleType := "ClusterRole"
roleBindingType := "ClusterRoleBinding"
Contributor
Now that you've split the RBAC checks into multiple separate methods, I think it's clearer to hardcode everything, rather than worrying about code reuse. I'm inclined to just remove these local vars. Something like:
diff --git a/pkg/healthcheck/healthcheck.go b/pkg/healthcheck/healthcheck.go
index 24b9722e..31c99ce2 100644
--- a/pkg/healthcheck/healthcheck.go
+++ b/pkg/healthcheck/healthcheck.go
@@ -316,23 +316,19 @@ func (hc *HealthChecker) addLinkerdPreInstallClusterChecks() {
 		},
 	})
-	// TODO: refactor with LinkerdPreInstallSingleNamespaceChecks
-	roleType := "ClusterRole"
-	roleBindingType := "ClusterRoleBinding"
-
 	hc.checkers = append(hc.checkers, &checker{
 		category: LinkerdPreInstallClusterCategory,
-		description: fmt.Sprintf("can create %ss", roleType),
+		description: "can create ClusterRoles",
 		check: func() error {
-			return hc.checkCanCreate("", "rbac.authorization.k8s.io", "v1beta1", roleType)
+			return hc.checkCanCreate("", "rbac.authorization.k8s.io", "v1beta1", "ClusterRole")
 		},
 	})
 	hc.checkers = append(hc.checkers, &checker{
 		category: LinkerdPreInstallClusterCategory,
-		description: fmt.Sprintf("can create %ss", roleBindingType),
+		description: "can create ClusterRoleBindings",
 		check: func() error {
-			return hc.checkCanCreate("", "rbac.authorization.k8s.io", "v1beta1", roleBindingType)
+			return hc.checkCanCreate("", "rbac.authorization.k8s.io", "v1beta1", "ClusterRoleBinding")
 		},
 	})
Same goes for the checks in the addLinkerdPreInstallSingleNamespaceChecks func.
Member
Author
heh, i did exactly that in the next PR: https://github.com/linkerd/linkerd2/pull/2080/files#diff-d4056ff163bcf2aeacefb2a34164563cR270
siggy
added a commit
that referenced
this pull request
Jan 15, 2019
The linkerd check command organized the various checks via loosely coupled category IDs, category names, and checkers themselves, all with ordering defined by consumers of this code. This change removes category IDs in favor of category names, groups all checkers by category, and enforces ordering at the HealthChecker level. Part of #1471, depends on #2078. Signed-off-by: Andrew Seigner <siggy@buoyant.io>
hawkw
added a commit
that referenced
this pull request
Aug 3, 2023
In 2.13, the default inbound and outbound HTTP request queue capacity decreased from 10,000 requests to 100 requests (in PR #2078). This change results in proxies shedding load much more aggressively while under high load to a single destination service, resulting in increased error rates in comparison to 2.12 (see #11055 for details).

This commit changes the default HTTP request queue capacities for the inbound and outbound proxies back to 10,000 requests, the way they were in 2.12 and earlier. In manual load testing I've verified that increasing the queue capacity results in a substantial decrease in 503 Service Unavailable errors emitted by the proxy: with a queue capacity of 100 requests, the load test described [here] observed a failure rate of 51.51% of requests, while with a queue capacity of 10,000 requests, the same load test observes no failures.

Note that I did not modify the TCP connection queue capacities, or the control plane request queue capacity. These were previously configured by the same variable before #2078, but were split out into separate vars in that change. I don't think the queue capacity limits for TCP connection establishment or for control plane requests are currently resulting in instability the way the decreased request queue capacity is, so I decided to make a more focused change to just the HTTP request queues for the proxies.

[here]: #11055 (comment)

---

* Increase HTTP request queue capacity (linkerd/linkerd2-proxy#2449)

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
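As a rough illustration of why a smaller queue capacity sheds more load, the sketch below models a bounded request queue that rejects requests once it is full. It is not the proxy's implementation; `requestQueue`, `enqueue`, `countShed`, and the burst size are hypothetical, and the numbers do not reproduce the load test cited above.

```go
package main

import (
	"errors"
	"fmt"
)

// errOverloaded stands in for the error a caller would see when a request is shed.
var errOverloaded = errors.New("queue full: request shed")

// requestQueue is a hypothetical bounded queue backed by a buffered channel.
type requestQueue struct {
	slots chan int
}

func newRequestQueue(capacity int) *requestQueue {
	return &requestQueue{slots: make(chan int, capacity)}
}

// enqueue accepts the request if a slot is free and sheds it otherwise,
// without ever blocking the caller.
func (q *requestQueue) enqueue(id int) error {
	select {
	case q.slots <- id:
		return nil
	default:
		return errOverloaded
	}
}

// countShed sends a burst of requests into a queue that nothing is draining
// (a stalled destination) and counts how many are rejected.
func countShed(capacity, burst int) int {
	q := newRequestQueue(capacity)
	shed := 0
	for i := 0; i < burst; i++ {
		if q.enqueue(i) != nil {
			shed++
		}
	}
	return shed
}

func main() {
	fmt.Println("capacity 100:  ", countShed(100, 1000))   // 900 requests shed
	fmt.Println("capacity 10000:", countShed(10000, 1000)) // 0 requests shed
}
```

In this simplified model, a burst larger than the queue capacity is rejected immediately, while a larger capacity absorbs the same burst at the cost of holding more requests in memory.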
Merged
hawkw
added a commit
that referenced
this pull request
Aug 9, 2023
In 2.13, the default inbound and outbound HTTP request queue capacity decreased from 10,000 requests to 100 requests (in PR #2078). This change results in proxies shedding load much more aggressively while under high load to a single destination service, resulting in increased error rates in comparison to 2.12 (see #11055 for details).

This commit changes the default HTTP request queue capacities for the inbound and outbound proxies back to 10,000 requests, the way they were in 2.12 and earlier. In manual load testing I've verified that increasing the queue capacity results in a substantial decrease in 503 Service Unavailable errors emitted by the proxy: with a queue capacity of 100 requests, the load test described [here] observed a failure rate of 51.51% of requests, while with a queue capacity of 10,000 requests, the same load test observes no failures.

Note that I did not modify the TCP connection queue capacities, or the control plane request queue capacity. These were previously configured by the same variable before #2078, but were split out into separate vars in that change. I don't think the queue capacity limits for TCP connection establishment or for control plane requests are currently resulting in instability the way the decreased request queue capacity is, so I decided to make a more focused change to just the HTTP request queues for the proxies.

[here]: #11055 (comment)

---

* Increase HTTP request queue capacity (linkerd/linkerd2-proxy#2449)
Modify all health checks to be specified via enums
The set of health checks to be executed was dependent on a combination
of check enums and boolean options.

This change modifies the health checks to be governed strictly by a set
of enums. This change does not add or remove any checks, but rather
moves checks into more granular categories, such that any set of checks
that are toggle-able are defined together under a single category.

This is a first step in cleaning up the `linkerd check` code, and
moving towards #1471.

Next steps:
- tightly couple category IDs to names
- tightly couple checks to their parent categories
- programmatic control over check ordering

Signed-off-by: Andrew Seigner <siggy@buoyant.io>