
ScaledJob is not considering pendingJobCount in the "accurate" strategy calculation #7329

@daleksandrowicz

Description

Report

Problem

ScaledJob does not consider pendingJobCount in the accurate strategy calculation when the sum of maxScale + runningJobCount is greater than maxReplicaCount. As a result, when runningJobCount equals pendingJobCount, we get redundant "idle" jobs: they are created and immediately finish, since there are no messages left to pull; the existing messages are already being handled by the previously created jobs.

The link to the function: https://github.com/kedacore/keda/blob/main/pkg/scaling/executor/scale_jobs.go#L492

// Current implementation: pendingJobCount is ignored whenever the
// maxReplicaCount cap is hit.
func (s accurateScalingStrategy) GetEffectiveMaxScale(maxScale, runningJobCount, pendingJobCount, maxReplicaCount, scaleTo int64) (int64, int64) {
	if (maxScale + runningJobCount) > maxReplicaCount {
		// Pending jobs already cover part of maxScale, but are not subtracted here.
		return maxReplicaCount - runningJobCount, scaleTo
	}
	return maxScale - pendingJobCount, scaleTo
}
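
For illustration, a minimal standalone sketch (hypothetical package main; the method body is copied out, since accurateScalingStrategy is unexported) that reproduces scenario 1.2 from the Actual Behavior section below:

package main

import "fmt"

// Copy of the current logic, for demonstration only.
func getEffectiveMaxScale(maxScale, runningJobCount, pendingJobCount, maxReplicaCount, scaleTo int64) (int64, int64) {
	if (maxScale + runningJobCount) > maxReplicaCount {
		return maxReplicaCount - runningJobCount, scaleTo
	}
	return maxScale - pendingJobCount, scaleTo
}

func main() {
	// 3 messages on the topic, 3 jobs already created and still pending.
	effectiveMaxScale, _ := getEffectiveMaxScale(3, 3, 3, 5, 3)
	fmt.Println(effectiveMaxScale) // prints 2: two redundant jobs get scheduled
}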

This is especially impactful when a new node is being provisioned and pods stay pending for some time, e.g. 3-4 minutes. We don't want to increase the polling interval, since we want each message processed as soon as possible (even if we occasionally have to wait for node creation).

Solution

We can fix it by updating the calculation logic:

func (s accurateScalingStrategy) GetEffectiveMaxScale(maxScale, runningJobCount, pendingJobCount, maxReplicaCount, scaleTo int64) (int64, int64) {
	// Subtract pendingJobCount up front: pending jobs will consume part of the
	// demand represented by maxScale, so only the remainder counts against the cap.
	if (maxScale + runningJobCount - pendingJobCount) > maxReplicaCount {
		return maxReplicaCount - runningJobCount, scaleTo
	}
	return maxScale - pendingJobCount, scaleTo
}
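
To guard the fix, a minimal table-driven test sketch (hypothetical; assuming it sits next to scale_jobs.go in package executor) covering the scenarios listed under Expected Behavior below:

package executor

import "testing"

func TestAccurateStrategyConsidersPendingJobCount(t *testing.T) {
	cases := []struct {
		maxScale, running, pending, maxReplica, want int64
	}{
		{3, 0, 0, 5, 3}, // 1.1: fresh scale-up
		{3, 3, 3, 5, 0}, // 1.2: all messages already claimed by pending jobs
		{4, 3, 3, 5, 1}, // 1.3: one new message
		{6, 3, 3, 5, 2}, // 1.4: capped by maxReplicaCount
		{3, 4, 2, 6, 1}, // 2: mixed running and pending jobs
	}
	s := accurateScalingStrategy{}
	for _, c := range cases {
		got, _ := s.GetEffectiveMaxScale(c.maxScale, c.running, c.pending, c.maxReplica, c.maxScale)
		if got != c.want {
			t.Errorf("GetEffectiveMaxScale(%d, %d, %d, %d) = %d, want %d",
				c.maxScale, c.running, c.pending, c.maxReplica, got, c.want)
		}
	}
}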

Expected Behavior

1.1. We send 3 messages to the Kafka topic:

maxScale = 3
runningJobCount = 0
pendingJobCount = 0
maxReplicaCount = 5

we get 3 jobs, which is correct:
(3 + 0 - 0 = 3) < 5 => 3 - 0 = 3

1.2. Then the scaler polls the queue again (the message count is unchanged because the pods are still pending):

maxScale = 3
runningJobCount = 3
pendingJobCount = 3
maxReplicaCount = 5

we get 0, so no more jobs are scheduled, which is correct:
(3 + 3 - 3 = 3) < 5 => 3 - 3 = 0

1.3. When one more message is added to the queue while there are still 3 pending and running jobs:

maxScale = 4
runningJobCount = 3
pendingJobCount = 3
maxReplicaCount = 5

we get 1, so it's correct:
(4 + 3 - 3 = 4) < 5 => 4 - 3 = 1

1.4. With a total of 6 messages in the queue, still assuming 3 pending and running jobs:

maxScale = 6
runningJobCount = 3
pendingJobCount = 3
maxReplicaCount = 5

we get 2, which is correct, because only two more jobs can be added before we reach the maximum of 5 jobs:
(6 + 3 - 3 = 6) > 5 => 5 - 3 = 2

2. Another case

maxScale = 3
runningJobCount = 4
pendingJobCount = 2
maxReplicaCount = 6

we get 1, so it's correct:
(3 + 4 - 2 = 5) < 6 => 3 - 2 = 1

Actual Behavior

1.1. We send 3 messages to the Kafka topic:

maxScale = 3
runningJobCount = 0
pendingJobCount = 0
maxReplicaCount = 5

we get 3 jobs, which is correct:
(3 + 0 = 3) < 5 => 3 - 0 = 3

1.2. Then the scaler polls the queue again (the message count is unchanged because the pods are still pending):

maxScale = 3
runningJobCount = 3
pendingJobCount = 3
maxReplicaCount = 5

we get 2 more jobs, which is incorrect: they will immediately exit, as there are no more messages to process (the existing messages will be handled by the 3 previously created jobs):
(3 + 3 = 6) > 5 => 5 - 3 = 2

Steps to Reproduce the Problem

  1. Set up Kafka (e.g. AWS MSK) with a topic (e.g. 3 partitions) and a consumer group to be used by the Kafka scaler
  2. Set up a KEDA ScaledJob with the Kafka scaler (example below)
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: test
spec:
  jobTargetRef:
    parallelism: 1
    completions: 1
    backoffLimit: 3
    template:
      spec:
        containers:
          - name: test
            image: alpine
            command: ['sh', '-c', 'sleep 30']
  pollingInterval: 15
  successfulJobsHistoryLimit: 1
  failedJobsHistoryLimit: 3
  maxReplicaCount: 5
  scalingStrategy:
    strategy: "accurate"
    pendingPodConditions:
      - "Ready"
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka.svc:9092
        consumerGroup: my-group
        topic: test-topic
        lagThreshold: '1'
        offsetResetPolicy: latest
  3. Send messages to Kafka (matching the scenarios above; see the producer sketch below)
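
For step 3, a hedged producer sketch using github.com/segmentio/kafka-go (any producer works; the broker address and topic here match the example ScaledJob above):

package main

import (
	"context"
	"log"

	"github.com/segmentio/kafka-go"
)

func main() {
	w := &kafka.Writer{
		Addr:  kafka.TCP("kafka.svc:9092"),
		Topic: "test-topic",
	}
	defer w.Close()

	// Send 3 messages to trigger scenario 1.1.
	err := w.WriteMessages(context.Background(),
		kafka.Message{Value: []byte("msg-1")},
		kafka.Message{Value: []byte("msg-2")},
		kafka.Message{Value: []byte("msg-3")},
	)
	if err != nil {
		log.Fatal(err)
	}
}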

KEDA Version

2.18.0

Kubernetes Version

1.33

Platform

Microsoft Azure

Scaler Details

Kafka

Would you be open to contributing a fix?

Maybe

Anything else?

I've noticed this issue was already raised by others in these two PRs: #1227, #1391.

This solution was even mentioned in this comment: #1227 (comment)
