Tuning job timeout not firing #464

@ericbuckley

Summary

The tuning calculation is launched as an asynchronous task with a timeout using asyncio. The timeout is designed to safeguard against long-running or potentially infinite loops that would otherwise prevent the task from completing. However, because the tune function is CPU-bound and does not yield control back to the event loop (e.g., via await or cooperative scheduling), the event loop is unable to enforce the timeout effectively. As a result, the task cannot be cancelled cleanly once the timeout is reached.
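The behavior described above can be reproduced in isolation. The following is a minimal sketch, not the project's actual code: `busy_work`, `blocking_tune`, and `run_with_timeout` are hypothetical stand-ins, and `asyncio.wait_for` is assumed to be representative of how the timeout is applied.

```python
import asyncio
import time


def busy_work(seconds: float) -> None:
    """Hypothetical stand-in for the CPU-bound tune function: spins without yielding."""
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        pass


async def blocking_tune(seconds: float) -> str:
    # There is no await in this coroutine, so the event loop never regains
    # control (and cannot deliver a cancellation) until busy_work returns.
    busy_work(seconds)
    return "finished"


async def run_with_timeout(work_seconds: float, timeout: float) -> tuple[str, float]:
    """Run the blocking coroutine under a timeout; report outcome and elapsed time."""
    start = time.monotonic()
    try:
        outcome = await asyncio.wait_for(blocking_tune(work_seconds), timeout)
    except asyncio.TimeoutError:
        outcome = "timed out"
    return outcome, time.monotonic() - start


if __name__ == "__main__":
    # The timeout (0.05 s) is far shorter than the work (0.5 s).
    print(asyncio.run(run_with_timeout(0.5, 0.05)))
```

With a 0.05 second timeout and 0.5 seconds of work, the coroutine is expected to run to completion and report "finished" after the full 0.5 seconds, which mirrors the tuning job finishing in normal time.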

Impact

A long-running or infinite loop could consume all available CPU in an environment, forcing IT teams to restart the servers.

Steps to reproduce

  1. Set the TUNING_JOB_TIMEOUT value to 1 second
  2. Run a normal tuning job (default parameters) on a medium to large database (>100,000 patient records)
  3. Observe that the job finishes in its normal time instead of being cancelled

Expected behavior

The job should be cancelled after about 1 second, and an appropriate failure message should be recorded in the tuning_job table.
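One possible way to get this behavior, sketched here under assumptions, is to have the work yield to the event loop periodically so that a pending cancellation can be delivered. `cooperative_tune` and `run_job` are hypothetical names, and the 1 ms busy slice stands in for one unit of real tuning work; the actual tune function may not be divisible this way.

```python
import asyncio
import time


async def cooperative_tune(total_steps: int) -> str:
    """Hypothetical tuning loop that yields to the event loop between steps."""
    for _ in range(total_steps):
        # One small slice of CPU-bound work (about 1 ms of spinning).
        slice_end = time.monotonic() + 0.001
        while time.monotonic() < slice_end:
            pass
        # Yield control so the event loop can fire the timeout and cancel us.
        await asyncio.sleep(0)
    return "finished"


async def run_job(total_steps: int, timeout: float) -> tuple[str, float]:
    """Run the cooperative loop under a timeout; report status and elapsed time."""
    start = time.monotonic()
    try:
        status = await asyncio.wait_for(cooperative_tune(total_steps), timeout)
    except asyncio.TimeoutError:
        # In the real job, this is where a failure message would be recorded.
        status = "timed out"
    return status, time.monotonic() - start


if __name__ == "__main__":
    # About 10 s of work in total, with a 1 s timeout.
    print(asyncio.run(run_job(10_000, 1.0)))
```

If the tuning loop cannot be broken into slices, an alternative (not shown) would be to run it in a separate process that can be terminated when the timeout expires; running it in a thread would free the event loop but would not stop the work itself.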

Labels

bug (Something isn't working)
