Background Description
Currently, load_hparams and load_tensors are long switch..case blocks that could be moved into model-specific files under src/models/*.cpp
The benefit would be to make each model definition more self-contained, improving the developer experience.
Possible Refactor Approaches
I'm currently working on this idea on my fork: https://github.com/ngxson/llama.cpp/tree/xsn/model_def_self_contained
The goals are:
- Make as few changes to the code as possible: prefer moving code around over editing it directly
- Make it possible to auto-migrate via a script, so that any git conflicts are easier to resolve
For the actual implementation, the idea is to expose these functions in llama_model:
// model must define these
virtual void load_hparams(llama_model_loader & ml) = 0;
virtual void load_tensors(llama_model_loader & ml) = 0;
virtual std::unique_ptr<llm_graph_context> build_graph_context(const llm_graph_params & params) const = 0;
And each model needs to define them:
struct llama_model_demo : public llama_model {
    llama_model_demo(const struct llama_model_params & params) : llama_model(params) {}
    void load_hparams(llama_model_loader & ml) override {
        hparams.n_layer = 100;
    }
    void load_tensors(llama_model_loader & ml) override {
        output = nullptr;
    }
    struct graph : public llm_graph_context {
        graph(const llama_model & model, const llm_graph_params & params) : llm_graph_context(params) {
            ggml_build_forward_expand(gf, nullptr);
        }
    };
    std::unique_ptr<llm_graph_context> build_graph_context(const llm_graph_params & params) const override {
        return std::make_unique<graph>(*this, params);
    }
};
Migration rules:
- Create a llama_model_* class for each model
- Move llm_build_ARCH to llama_model_ARCH::graph
- Move code from switch..case to load_hparams and load_tensors
- Write llama_model_create(), which is a big switch..case that automatically selects the correct llama_model_ARCH
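To illustrate the last rule, here is a minimal, self-contained sketch of what a llama_model_create() factory could look like. Everything in it is a stand-in so that it compiles on its own: the llm_arch values, the empty llama_model_params, the name() helper, and the factory signature are assumptions for illustration, not the definitions used in llama.cpp or on the fork.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Stand-in types for illustration only; the real llama.cpp definitions differ.
enum llm_arch { LLM_ARCH_DEMO, LLM_ARCH_OTHER, LLM_ARCH_UNKNOWN };

struct llama_model_params {};

struct llama_model {
    explicit llama_model(const llama_model_params & params) : params(params) {}
    virtual ~llama_model() = default;

    // Hypothetical helper, present only so this sketch has observable behavior.
    virtual std::string name() const = 0;

    llama_model_params params;
};

// One class per architecture, as described in the migration rules.
struct llama_model_demo : public llama_model {
    using llama_model::llama_model;
    std::string name() const override { return "demo"; }
};

struct llama_model_other : public llama_model {
    using llama_model::llama_model;
    std::string name() const override { return "other"; }
};

// The single remaining switch..case: map an architecture to its model class.
// The signature is an assumption; the issue does not specify it.
std::unique_ptr<llama_model> llama_model_create(llm_arch arch, const llama_model_params & params) {
    switch (arch) {
        case LLM_ARCH_DEMO:  return std::make_unique<llama_model_demo>(params);
        case LLM_ARCH_OTHER: return std::make_unique<llama_model_other>(params);
        default:
            throw std::runtime_error("unsupported model architecture");
    }
}
```

With this shape, the caller only needs the architecture to obtain the right subclass, and all per-model logic stays behind the virtual interface. Adding a new model would then mean adding one class and one case label.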