katago 1.14.0 TRT plan cache boots significantly slower than 1.13.2 #879

@kinfkong

Here are the test environments I used to compare boot times for 1.13.2 and 1.14.0 when the plan cache is hit (that is, the plan cache files already exist).

For 1.13.2: using TensorRT 8.5.2
For 1.14.0: using TensorRT 8.6.1
Network weights loaded: 18b

On a machine with 5 RTX 3080 cards, 1.14.0 takes 40 seconds to boot to GTP ready, while 1.13.2 needs only 17 seconds.
On a machine with 8 RTX 4070 cards, 1.14.0 takes 63 seconds to boot to GTP ready, while 1.13.2 needs only 26 seconds.

I also tried different weights and different machines; 1.14.0 generally boots much slower than 1.13.2.
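
For anyone who wants to reproduce the comparison, here is a minimal sketch of one way to time the boot. It is not the exact method behind the numbers above. It assumes that "GTP ready" can be approximated by the moment the engine answers its first GTP command, and `time_to_gtp_ready` is a hypothetical helper name. The command line in the trailing comment uses placeholder model and config paths.

```python
import subprocess
import time


def time_to_gtp_ready(cmd, probe="name"):
    """Return seconds from process launch until the first GTP reply.

    cmd is the full engine command line as a list of strings. The probe
    command is queued on stdin right away, so the reply can only arrive
    once the engine has finished starting up.
    """
    start = time.perf_counter()
    with subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # start-up logs are ignored in this sketch
        text=True,
    ) as proc:
        proc.stdin.write(probe + "\n")
        proc.stdin.flush()
        try:
            for line in proc.stdout:
                # A GTP reply starts with "=" (success) or "?" (failure).
                if line.startswith(("=", "?")):
                    return time.perf_counter() - start
        finally:
            # Stop the engine whether or not a reply was seen.
            proc.terminate()
    raise RuntimeError("engine exited before answering the probe command")


# Example call (placeholder paths, adjust to your setup):
# time_to_gtp_ready(["katago", "gtp", "-model", "model.bin.gz", "-config", "gtp.cfg"])
```

Running it once per version against the same warm plan cache, and a few times per machine, should show whether the gap is consistent.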

@lightvector or @hyln9, have you observed this? Thanks!
