[Serve] limit num_workers in replica's ThreadPoolExecutor to num_cpus #59750

@abrarsheikh

Description

When asyncio.to_thread is used in user code, the number of threads is governed by the default ThreadPoolExecutor, whose size is derived from the number of CPUs on the machine. It may be better to cap this at the num_cpus declared for the deployment, to limit oversubscription.
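As a rough illustration of the idea (not Serve's actual implementation), the sketch below replaces the event loop's default executor with a ThreadPoolExecutor capped at a declared CPU count. In recent CPython versions the default max_workers is min(32, cpu_count + 4), and asyncio.to_thread dispatches to the loop's default executor. The helper names make_limited_executor, blocking_work, and run_with_limit are hypothetical, and rounding a fractional num_cpus up to at least one worker is an assumption rather than a decided policy.

```python
import asyncio
import math
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def make_limited_executor(num_cpus: float) -> ThreadPoolExecutor:
    # Hypothetical helper: round a (possibly fractional) CPU request up,
    # and never go below one worker thread.
    max_workers = max(1, math.ceil(num_cpus))
    return ThreadPoolExecutor(max_workers=max_workers)


def blocking_work(seen: set) -> None:
    # Record which pool thread ran this call, then block briefly.
    seen.add(threading.get_ident())
    time.sleep(0.05)


async def run_with_limit(num_cpus: float, num_tasks: int = 8) -> int:
    # asyncio.to_thread uses the loop's default executor, so replacing
    # that executor caps the threads available to user code.
    loop = asyncio.get_running_loop()
    loop.set_default_executor(make_limited_executor(num_cpus))

    seen: set = set()
    await asyncio.gather(
        *(asyncio.to_thread(blocking_work, seen) for _ in range(num_tasks))
    )
    # Number of distinct threads that actually executed the work.
    return len(seen)


if __name__ == "__main__":
    used = asyncio.run(run_with_limit(num_cpus=2))
    print(f"distinct worker threads used: {used}")
```

With this approach the cap applies to everything that goes through the default executor, including loop.run_in_executor(None, ...), so a deployment declaring a small num_cpus would also see less concurrency for I/O-bound blocking calls. Whether that trade-off is acceptable is part of what this issue asks.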
