Description
When using a gRPC service behind a Knative service, I do not observe the expected load balancing pattern. To reproduce this I used your example https://github.com/knative/docs/tree/main/code-samples/serving/grpc-ping-go
I patched service.yaml to force scaling to 2 replicas:
```yaml
metadata:
  annotations:
    autoscaling.knative.dev/min-scale: "2"
    autoscaling.knative.dev/max-scale: "2"
```

This, as any Knative service, creates a k8s Service named `grpc-ping`. When I use the client to send multiple pings to the service behind the "DNS" name `grpc-ping.default:80`, all of the ping requests are routed to the same pod. I would expect to observe "fair load balancing" of the ping requests.
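My best guess at why (I haven't verified this against the client source): gRPC multiplexes every RPC over a single long-lived HTTP/2 connection, and kube-proxy only picks a backend pod when that connection is established, so all subsequent pings stick to one pod. A minimal sketch of such a client, assuming the sample dials once and reuses the connection:

```go
package main

import (
	"log"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

func main() {
	// One Dial -> one HTTP/2 connection. The backend pod is chosen when
	// the connection is opened; every RPC multiplexed over it afterwards
	// lands on that same pod.
	conn, err := grpc.Dial("grpc-ping.default:80",
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatalf("dial: %v", err)
	}
	defer conn.Close()
	// ... build the Ping client from conn and send repeated requests ...
}
```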
I didn't have time to dive deeper into the source, but what I would expect is to be able to set an "option" declaring that my service is indeed gRPC-based. You could then create headless Services for those kinds of "apps".
For example, at the moment the Knative controller will create something like a `<service-name>-<orderedNum?>-private` Service to resolve the actual pod IPs to send requests to once the pods are up and running; in this case we have `grpc-ping-00001-private`. When dealing with gRPC, you could additionally create a `grpc-ping-00001-private-headless` Service and simply have the proxy that routes the traffic use gRPC's `dns:///` lookup to do round-robin load balancing, as sketched below.
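To illustrate the idea (a sketch under my assumptions, not actual controller behavior; the selector label and ports are guesses), the headless variant would just be the private Service with `clusterIP: None`, so cluster DNS returns the individual pod IPs:

```yaml
# Hypothetical headless counterpart of grpc-ping-00001-private.
apiVersion: v1
kind: Service
metadata:
  name: grpc-ping-00001-private-headless
  namespace: default
spec:
  clusterIP: None # headless: DNS A records resolve to pod IPs
  selector:
    serving.knative.dev/revision: grpc-ping-00001
  ports:
    - name: h2c
      port: 8080
      targetPort: 8080
```

The proxy (or any gRPC client) could then use the `dns:///` scheme with the built-in `round_robin` balancer, e.g. in Go:

```go
package main

import (
	"log"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

func main() {
	// Headless DNS returns the pod IPs themselves, so we dial the pods'
	// own listening port (8080 here), not a Service port. round_robin
	// then rotates RPCs across every resolved address.
	conn, err := grpc.Dial(
		"dns:///grpc-ping-00001-private-headless.default.svc.cluster.local:8080",
		grpc.WithTransportCredentials(insecure.NewCredentials()),
		grpc.WithDefaultServiceConfig(`{"loadBalancingConfig": [{"round_robin":{}}]}`),
	)
	if err != nil {
		log.Fatalf("dial: %v", err)
	}
	defer conn.Close()
	// ... successive RPCs on conn now rotate across the pods ...
}
```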
I know there are probably good reasons why you don't want to special-case this, but it would be a great feature to allow much more control for users.
What version of Knative?
knative-v1.8.6
Expected Behavior
To observe traffic reaching both replicas when the client sends pings.
Actual Behavior
All ping requests are routed to the same pod.
Steps to Reproduce the Problem
Use the simple gRPC service from the official Knative docs:
https://github.com/knative/docs/tree/main/code-samples/serving/grpc-ping-go
In the changes below I use my own images (built following the README, but published on my Docker Hub).
To replicate the issue, change the service.yaml to:
```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: grpc-ping
  namespace: default
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/min-scale: "2"
        autoscaling.knative.dev/max-scale: "2"
    spec:
      containers:
        - image: docker.io/lrotim/grpc-ping-go
          ports:
            - name: h2c
              containerPort: 8080
```

And use the following manifest to spin up pods that ping the service:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: go-grpc-client
  namespace: default
spec:
  parallelism: 10 # Replace with the number of nodes in your cluster
  completions: 10 # Number of pods to successfully complete
  template:
    metadata:
      labels:
        app: go-grpc-client
    spec:
      containers:
        - name: go-grpc-client
          image: lrotim/grpc-ping-go
          command:
            - '/client'
          # Add your container image details above
          args:
            - --insecure
            - --skip_verify
            - --server_addr=grpc-ping.default:80
          resources:
            requests:
              cpu: "10m"
              memory: "50Mi"
      restartPolicy: Never
```

When I inspect the logs, I see that all traffic was routed to one pod.
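For reference, this is how I checked which pods served traffic (assuming Knative's default labels and the default `user-container` container name):

```bash
# --prefix shows which pod each log line came from
kubectl logs -n default -l serving.knative.dev/service=grpc-ping \
  -c user-container --prefix
```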