Loading model weights more efficiently #119

@kerthcet

What would you like to be added:

Right now we can download model weights directly from a model hub, but each time a pod starts or restarts, it downloads the model weights again. Without loading accelerators such as Fluid or Dragonfly, we should find a way to handle this more efficiently. Let's focus on three things:

  • downloading a model for the first time should be as quick as possible
  • model weights should not need to be downloaded again when a pod restarts
  • the model cache should be handled efficiently
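
To make the second goal concrete, here is a minimal sketch of a download-once check, not a proposed design. It assumes the cache directory lives on storage that survives pod restarts (for example a persistent volume), which this issue does not specify. The names `ensure_model`, `cache_root`, the `.complete` marker file, and the `download` callback are all hypothetical; the callback stands in for whatever client actually fetches weights from the model hub.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def ensure_model(cache_root: Path, model_id: str,
                 download: Callable[[Path], None]) -> Path:
    """Return the cached directory for model_id, downloading only on a miss.

    Hypothetical helper: cache_root is assumed to be on storage that
    outlives the pod, and download(dest) is assumed to write the weights
    into the directory dest.
    """
    # Flatten "org/name" style ids into a single directory name.
    target = cache_root / model_id.replace("/", "--")
    marker = target / ".complete"

    # Cache hit: a finished download left its marker behind.
    if marker.exists():
        return target

    cache_root.mkdir(parents=True, exist_ok=True)
    # Download into a staging directory so an interrupted download
    # never looks like a complete cache entry.
    staging = Path(tempfile.mkdtemp(dir=cache_root, prefix=".staging-"))
    try:
        download(staging)
        (staging / ".complete").touch()
        if target.exists():
            # Leftover directory without a marker: discard it.
            shutil.rmtree(target)
        # Publish the finished download under its final name.
        os.replace(staging, target)
    except BaseException:
        shutil.rmtree(staging, ignore_errors=True)
        raise
    return target
```

With this shape, a restarted pod that mounts the same `cache_root` returns immediately on the marker check instead of fetching the weights again. Speeding up the first download and evicting old cache entries are separate concerns that this sketch does not address.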

Why is this needed:

Completion requirements:

This enhancement requires the following artifacts:

  • Design doc
  • API change
  • Docs update

The artifacts should be linked in subsequent comments.

Labels

  • feature: Categorizes issue or PR as related to a new feature.
  • needs-priority: Indicates a PR lacks a label and requires one.
  • needs-triage: Indicates an issue or PR lacks a label and requires one.