Bug
When a query job is cancelled, the query scheduler might crash with a TypeError:
2025-12-18 23:52:23,790 search-job-handler [ERROR] handle_job_updates completed unexpectedly.
2025-12-18 23:52:23,790 search-job-handler [ERROR] handle_job_updates failed.
Traceback (most recent call last):
  File "/opt/clp/lib/python3/site-packages/job_orchestration/scheduler/query/query_scheduler.py", line 1137, in handle_jobs
    handle_updating_task.result()
  File "/opt/clp/lib/python3/site-packages/job_orchestration/scheduler/query/query_scheduler.py", line 1098, in handle_job_updates
    await handle_cancelling_search_jobs(db_conn_pool)
  File "/opt/clp/lib/python3/site-packages/job_orchestration/scheduler/query/query_scheduler.py", line 372, in handle_cancelling_search_jobs
    duration=(datetime.datetime.now() - job.start_time).total_seconds(),
TypeError: unsupported operand type(s) for -: 'datetime.datetime' and 'NoneType'
2025-12-18 23:52:23,792 search-job-handler [ERROR] job_handler completed unexpectedly.
The root cause of the bug is as follows:
- When a query job that contains both search and reduce is first handled by the query scheduler, a new local SearchJob is created with no start_time set and is added to active_jobs. The search job is in the waiting-for-reducer state.
- The start_time is expected to be set once the corresponding reducer is started, at which point the search job becomes ready for dispatch.
- If the job is cancelled before the reducer is created, the query scheduler calculates the job duration using the job's start_time, which is not yet set and is still None, triggering the TypeError.
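The failure can be sketched in isolation as follows. This is a minimal illustration, not the scheduler's actual code: the SearchJob class here is a hypothetical stand-in with only the start_time field, and unguarded_duration and guarded_duration are made-up helper names. Reporting a duration of 0.0 for a job that never started is an assumption about one possible fix, not a confirmed design decision.

```python
import datetime
from dataclasses import dataclass
from typing import Optional


@dataclass
class SearchJob:
    # Hypothetical stand-in for the scheduler's local SearchJob.
    # start_time stays None until the reducer has been started.
    start_time: Optional[datetime.datetime] = None


def unguarded_duration(job: SearchJob, now: datetime.datetime) -> float:
    # Same subtraction as in the traceback; raises TypeError when
    # start_time is None.
    return (now - job.start_time).total_seconds()


def guarded_duration(job: SearchJob, now: datetime.datetime) -> float:
    # Assumed fix: a job cancelled before it started has no elapsed
    # run time, so report 0.0 instead of subtracting None.
    if job.start_time is None:
        return 0.0
    return (now - job.start_time).total_seconds()
```

Calling unguarded_duration on a freshly created SearchJob() raises the same TypeError as in the log above, while guarded_duration returns 0.0 for that job and the normal elapsed time once start_time is set.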
CLP version
9650b8c
Environment
Ubuntu 22.04
Docker 29.0.0
Docker compose 1.29.2
Reproduction steps
sitaowang1998@abb53d3. This is a modified version that adds a sleep when creating the reducer to trigger the bug more consistently.