
gRPC load balancing #14011

@lrotim

Description


When using a gRPC service behind a Knative service, I do not observe the expected load-balancing pattern. To reproduce this I used your example https://github.com/knative/docs/tree/main/code-samples/serving/grpc-ping-go

I patched service.yaml to pin the service at 2 replicas:

    metadata:
      annotations:
        autoscaling.knative.dev/min-scale: "2"
        autoscaling.knative.dev/max-scale: "2"

This, as with any Knative service, creates a k8s Service named grpc-ping. When I use the client to send multiple pings to the service behind the DNS name grpc-ping.default:80, all of the ping requests are routed to the same pod. I would expect to observe fair load balancing of the ping requests across both replicas.
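
For context, the sample's client presumably dials the plain Service address more or less like the sketch below (a simplified assumption on my part; the real client lives in the repo linked above, and the address comes from the Job manifest further down). With no resolver scheme and no balancing configuration, grpc-go opens a single long-lived HTTP/2 connection and multiplexes every ping over it, which would explain why connection-level load balancing in the cluster never gets a chance to spread the requests:

    package main

    import (
        "log"

        "google.golang.org/grpc"
        "google.golang.org/grpc/credentials/insecure"
    )

    func main() {
        // No resolver scheme and no balancing config: grpc-go opens one
        // HTTP/2 connection to this address and multiplexes every RPC over
        // it, so all pings land on whichever pod accepted that connection.
        conn, err := grpc.Dial(
            "grpc-ping.default:80",
            grpc.WithTransportCredentials(insecure.NewCredentials()),
        )
        if err != nil {
            log.Fatalf("dial: %v", err)
        }
        defer conn.Close()
        // ... issue ping RPCs on conn
    }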

I didn't have time to dive deeper into the source, but what I would expect is an option to declare that my service is gRPC-based. You could then create headless services for those kinds of apps.
For example, at the moment the knative controller creates something like a <service-name>-<orderedNum?>-private service to resolve the actual IPs to send the requests to once the pods are up and running; in this case we have grpc-ping-00001-private. When dealing with gRPC, you could additionally create grpc-ping-00002-private-headless and simply have the proxy routing the traffic use gRPC's dns:/// lookup to do round-robin load balancing.
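
To make the proposal concrete, here is a minimal client-side sketch of the mechanism, assuming the hypothetical grpc-ping-00002-private-headless service existed and exposed the pods' h2c port (8080) directly. The dns:/// resolver would return all pod IPs behind the headless service, and the round_robin policy would spread RPCs across them:

    package main

    import (
        "log"

        "google.golang.org/grpc"
        "google.golang.org/grpc/credentials/insecure"
    )

    func main() {
        // dns:/// resolves the headless service to the individual pod IPs;
        // round_robin then keeps a connection to each endpoint and
        // alternates RPCs between them.
        conn, err := grpc.Dial(
            "dns:///grpc-ping-00002-private-headless.default.svc.cluster.local:8080",
            grpc.WithTransportCredentials(insecure.NewCredentials()),
            grpc.WithDefaultServiceConfig(`{"loadBalancingConfig": [{"round_robin":{}}]}`),
        )
        if err != nil {
            log.Fatalf("dial: %v", err)
        }
        defer conn.Close()
        // ... ping RPCs on conn would now alternate between the two pods
    }

In practice the proxy that routes the traffic would do this lookup rather than the client, but the client-side version shows the behavior I am after.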

I know there are probably good reasons why you don't want to special-case this, but it would be a great feature that gives users much more control.

What version of Knative?

knative-v1.8.6

Expected Behavior

Traffic from the client is spread across both replicas.

Actual Behavior

All ping requests are routed to the same pod.

Steps to Reproduce the Problem

Use the simple grpc service from official knative docs
https://github.com/knative/docs/tree/main/code-samples/serving/grpc-ping-go

In the changes below I use my own images (built following the README, but published on my Docker Hub).

To replicate the issue, change service.yaml to:

    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: grpc-ping
      namespace: default
    spec:
      template:
        metadata:
          annotations:
            autoscaling.knative.dev/min-scale: "2"
            autoscaling.knative.dev/max-scale: "2"
        spec:
          containers:
          - image: docker.io/lrotim/grpc-ping-go
            ports:
              - name: h2c
                containerPort: 8080

And use the following manifest to spin up pods that ping the service:

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: go-grpc-client
      namespace: default
    spec:
      parallelism: 10  # Number of client pods to run in parallel
      completions: 10  # Number of pods that must complete successfully
      template:
        metadata:
          labels:
            app: go-grpc-client
        spec:
          containers:
          - name: go-grpc-client
            image: lrotim/grpc-ping-go
            command:
            - '/client'
            args:
              - --insecure
              - --skip_verify
              - --server_addr=grpc-ping.default:80
            resources:
              requests:
                cpu: "10m"
                memory: "50Mi"
          restartPolicy: Never

When I inspect the pod logs, I see that all traffic was routed to a single pod.

Labels

kind/bug, triage/accepted
