Refactor: move load_hparams and load_tensors to per-model definition #21966

@ngxson

Background Description

Currently, load_hparams and load_tensors are each a long switch..case that could be moved into the model-specific files under src/models/*.cpp

The benefit is that each model definition becomes more self-contained, which improves the developer experience.

Possible Refactor Approaches

I'm currently working on this idea on my fork: https://github.com/ngxson/llama.cpp/tree/xsn/model_def_self_contained

The goals are:

  1. Make as few changes to the code as possible --> prefer moving code around over editing it directly
  2. Make it possible to auto-migrate via a script, so that git conflicts, if any, are easier to resolve

For the actual implementation, the idea is to expose these functions in llama_model:

    // model must define these
    virtual void load_hparams(llama_model_loader & ml) = 0;
    virtual void load_tensors(llama_model_loader & ml) = 0;
    virtual std::unique_ptr<llm_graph_context> build_graph_context(const llm_graph_params & params) const = 0;

And each model needs to define them:

    struct llama_model_demo : public llama_model {
        llama_model_demo(const struct llama_model_params & params) : llama_model(params) {}
        void load_hparams(llama_model_loader & ml) override {
            hparams.n_layer = 100; // placeholder value for the demo
        }
        void load_tensors(llama_model_loader & ml) override {
            output = nullptr; // placeholder; no tensors are created in the demo
        }
        // replaces the standalone llm_build_ARCH struct
        struct graph : public llm_graph_context {
            graph(const llama_model & model, const llm_graph_params & params) : llm_graph_context(params) {
                ggml_build_forward_expand(gf, nullptr); // placeholder graph body
            }
        };
        std::unique_ptr<llm_graph_context> build_graph_context(const llm_graph_params & params) const override {
            return std::make_unique<graph>(*this, params);
        }
    };

Migration rules:

  • Create a llama_model_* class for each model
  • Move llm_build_ARCH to llama_model_ARCH::graph
  • Move code from switch..case to load_hparams and load_tensors
  • Write llama_model_create(), a big switch..case that automatically selects the correct llama_model_ARCH
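
To illustrate the last rule, here is a minimal sketch of what such a factory could look like. It is not taken from the fork. The signature of llama_model_create(), the LLM_ARCH_DEMO_* values, the llama_model_demo_a / llama_model_demo_b classes, the tensors_loaded flag and the load_model_sketch() helper are all hypothetical, and the llama.cpp types are replaced by empty stand-ins so that the block compiles on its own.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <stdexcept>

// Hypothetical stand-ins for the real llama.cpp types (assumptions, not the actual definitions).
struct llama_model_params {};
struct llama_model_loader {};
struct llama_hparams { uint32_t n_layer = 0; };

// Hypothetical architecture ids; the real enum lists every supported architecture.
enum llm_arch { LLM_ARCH_DEMO_A, LLM_ARCH_DEMO_B, LLM_ARCH_UNKNOWN };

struct llama_model {
    llama_hparams hparams;
    bool tensors_loaded = false; // hypothetical flag, only here to make the sketch observable

    explicit llama_model(const llama_model_params & /*params*/) {}
    virtual ~llama_model() = default;

    // model must define these (build_graph_context is omitted to keep the sketch short)
    virtual void load_hparams(llama_model_loader & ml) = 0;
    virtual void load_tensors(llama_model_loader & ml) = 0;
};

struct llama_model_demo_a : public llama_model {
    explicit llama_model_demo_a(const llama_model_params & params) : llama_model(params) {}
    void load_hparams(llama_model_loader & /*ml*/) override { hparams.n_layer = 100; }
    void load_tensors(llama_model_loader & /*ml*/) override { tensors_loaded = true; }
};

struct llama_model_demo_b : public llama_model {
    explicit llama_model_demo_b(const llama_model_params & params) : llama_model(params) {}
    void load_hparams(llama_model_loader & /*ml*/) override { hparams.n_layer = 32; }
    void load_tensors(llama_model_loader & /*ml*/) override { tensors_loaded = true; }
};

// The only remaining switch..case: it selects the class and does nothing else.
std::unique_ptr<llama_model> llama_model_create(llm_arch arch, const llama_model_params & params) {
    switch (arch) {
        case LLM_ARCH_DEMO_A: return std::make_unique<llama_model_demo_a>(params);
        case LLM_ARCH_DEMO_B: return std::make_unique<llama_model_demo_b>(params);
        default: throw std::runtime_error("unknown model architecture");
    }
}

// Hypothetical caller: after creation, loading goes through the virtual functions only.
std::unique_ptr<llama_model> load_model_sketch(llm_arch arch,
                                               const llama_model_params & params,
                                               llama_model_loader & ml) {
    auto model = llama_model_create(arch, params);
    assert(model != nullptr);
    model->load_hparams(ml);
    model->load_tensors(ml);
    return model;
}
```

With this shape, adding a model would touch one new llama_model_* class plus one case line in the factory, which fits the goal of moving code around rather than editing it.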
