clp-package: Scheduling jitter is much larger than the scheduling interval, resulting in slow search.

### Bug

Our query scheduler is currently designed to run a scheduling loop on a fixed polling interval: on each iteration we gather finished batches of tasks and dispatch new batches. This design has some fundamental problems, but those will be addressed in a separate design doc that goes over a proper solution.

The immediate issue, then, is that this scheduling loop currently has so much jitter that the effective rate at which we can dispatch new batches of tasks for each job is governed by the jitter rather than by the configured polling interval.

The following chart breaks down how time is spent in the main part of the scheduling loop over a small, illustrative time slice of a longer search job as batches of tasks complete, and shows how task completion affects jitter. (Apologies for the ad-hoc chart.)

<img width="800" height="600" alt="Image" src="https://github.com/user-attachments/assets/543d50c8-6560-43c6-929e-e280277300cc" />

As you can see from the chart, each time a batch of tasks is completed we spend a suspicious amount of time waiting for results from Celery.

As it turns out, this is because even after `task.ready()` returns true, Celery still seems to enter a polling loop to retrieve results from Redis. Since the default polling interval is 0.5s, we seem to always incur this 0.5s delay when retrieving results. Reducing the polling interval directly reduces the time we spend in `task.get()`; experimentally, even when it is reduced significantly, we still seem to spend roughly `polling_interval` in `get()`.
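As a rough sketch of the mitigation described above: Celery's `AsyncResult.get()` accepts an `interval` argument (default 0.5 seconds), so one option is to pass a shorter value when collecting results that are already ready. Whether this is the exact knob that should be tuned in the scheduler is an assumption. `FakeResult` and `collect_finished` are hypothetical names; `FakeResult` only imitates the observed behavior (one polling interval spent in `get()`) so the example runs without a broker.

```python
import time


class FakeResult:
    """Hypothetical stand-in for a Celery result object (no broker needed).

    It imitates the behavior observed in this issue: get() waits one polling
    interval before returning, even though ready() is already True.
    """

    def __init__(self, value):
        self._value = value

    def ready(self):
        return True

    def get(self, interval=0.5):
        # Same keyword and default as Celery's AsyncResult.get(interval=...).
        time.sleep(interval)
        return self._value


def collect_finished(results, poll_interval_s):
    """Fetch the values of finished results using a short polling interval."""
    return [r.get(interval=poll_interval_s) for r in results if r.ready()]


start = time.monotonic()
values = collect_finished([FakeResult(1), FakeResult(2)], poll_interval_s=0.005)
elapsed = time.monotonic() - start
print(values, f"{elapsed:.3f}s")
```

With the default interval, the same two calls would take about a second in this simulation, which mirrors the per-batch delay visible in the chart.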

Besides this issue with retrieving the results of celery tasks, we can also reduce jitter by changing how we sleep in our main scheduling loop.

Currently the loop looks something like
```
while True:
    # do stuff
    await asyncio.sleep(polling_interval)
```
but to reduce jitter we really want something like
```
while True:
    # do stuff
    await asyncio.sleep(polling_interval - time_spent_doing_stuff)
```
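A minimal runnable sketch of the second loop, assuming the work is an awaitable callable. `run_fixed_rate`, `do_stuff`, `fake_work`, and the `iterations` limit are hypothetical and exist only so the example terminates; the real scheduler loop runs forever. The sleep is clamped at zero for the case where the work takes longer than the polling interval.

```python
import asyncio
import time


async def run_fixed_rate(do_stuff, polling_interval, iterations):
    """Run do_stuff() roughly every polling_interval seconds.

    The time spent in do_stuff() is subtracted from the sleep so that the
    loop period stays close to polling_interval.
    """
    loop_starts = []
    for _ in range(iterations):
        loop_start = time.monotonic()
        loop_starts.append(loop_start)
        await do_stuff()
        time_spent_doing_stuff = time.monotonic() - loop_start
        # Never sleep a negative amount if the work overran the interval.
        await asyncio.sleep(max(0.0, polling_interval - time_spent_doing_stuff))
    return loop_starts


async def fake_work():
    # Hypothetical workload that takes about 20 ms per iteration.
    await asyncio.sleep(0.02)


starts = asyncio.run(run_fixed_rate(fake_work, polling_interval=0.05, iterations=4))
gaps = [later - earlier for earlier, later in zip(starts, starts[1:])]
print([f"{gap:.3f}" for gap in gaps])
```

Without the subtraction, each iteration in this example would take about 70 ms (20 ms of work plus a 50 ms sleep) instead of about 50 ms.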

### CLP version

0.8.0

### Environment

Package build started with docker-compose.

### Reproduction steps

1. Compress enough data to form at least a few archives.
2. Make sure the configured batch size is less than the total number of archives.
3. Dispatch any search across all archives. (Note that because of another issue, command-line searches that don't invoke the reducer end up with all tasks in a single batch.)
