Slow reconcile causes a deleted pod to never be recreated

### What happened?

There is a scenario in which a manually deleted pod is never recreated.

A PCLQ has 3 replicas (pod1, pod2, pod3).
1. T0: While pod1 is pending (possibly because of a resource limit or some other issue), it is deleted manually with `kubectl delete pod`. Pod1 is deleted successfully.
2. T1: The informer cache is updated and now contains only pod2 and pod3.
3. T2: The PCLQ controller reconciles and computes:
`diff := len(sc.existingPCLQPods) + len(createExpectations) - int(sc.pclq.Spec.Replicas) - len(deleteExpectations) // diff = 2 + 1 - 3 - 0 = 0`

In this case, pod1 is never recreated. The cause is that the reconcile can be slow under some conditions, so the informer cache is updated before the reconcile runs. `SyncExpectations` works on the assumption that the deleted pod is still terminating, in other words that pod deletion is slower than the reconcile. The controller cannot tell these two scenarios apart:

1. The informer has not yet seen the created pod.
2. The pod has already been deleted.





### What did you expect to happen?

Pod1 should be recreated.

### Environment

- Kubernetes version
- Grove version
- Scheduler details
- Cloud provider or hardware configuration
- Tools that you are using Grove together with
- Anything else that is relevant


Issue #457