Refactor: move `load_hparams` and `load_tensors` to per-model definition

### Background Description

Currently, `load_hparams` and `load_tensors` are each one long switch..case. The per-model branches can be moved into the corresponding model files under `src/models/*.cpp`.

This would make each model definition more self-contained and improve the developer experience.

### Possible Refactor Approaches

I'm currently working on this idea on my fork: https://github.com/ngxson/llama.cpp/tree/xsn/model_def_self_contained

The goals are:
1. Change as little code as possible: prefer moving code around over editing it directly
2. Make the migration automatable via a script, so that git conflicts, if any, are easier to resolve

For the actual implementation, the idea is to expose these functions in `llama_model`:

```cpp
    // model must define these
    virtual void load_hparams(llama_model_loader & ml) = 0;
    virtual void load_tensors(llama_model_loader & ml) = 0;
    virtual std::unique_ptr<llm_graph_context> build_graph_context(const llm_graph_params & params) const = 0;
```

Each model then needs to define them:

```cpp
struct llama_model_demo : public llama_model {
    llama_model_demo(const struct llama_model_params & params) : llama_model(params) {}
    // placeholder body: a real model would read its hyperparameters via `ml`
    void load_hparams(llama_model_loader & ml) override {
        hparams.n_layer = 100;
    }
    // placeholder body: a real model would create its tensors here
    void load_tensors(llama_model_loader & ml) override {
        output = nullptr;
    }
    // replaces the former `llm_build_ARCH` struct
    struct graph : public llm_graph_context {
        graph(const llama_model & model, const llm_graph_params & params) : llm_graph_context(params) {
            // placeholder: a real graph would pass its final tensor instead of nullptr
            ggml_build_forward_expand(gf, nullptr);
        }
    };
    std::unique_ptr<llm_graph_context> build_graph_context(const llm_graph_params & params) const override {
        return std::make_unique<graph>(*this, params);
    }
};
```
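To illustrate why these three virtual functions are enough for architecture-agnostic loading code, here is a minimal sketch using stand-in types. Everything prefixed with `mock_` (and `load_and_build`) is hypothetical and only mirrors the shape of the interface above; it is not the real `llama_model`, `llama_model_loader`, or `llm_graph_context`.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins, reduced to what the sketch needs.
struct mock_loader { int n_layer_in_file; };
struct mock_graph  { int n_nodes; };

struct mock_model {
    int n_layer = 0;
    std::vector<std::string> tensors;

    virtual ~mock_model() = default;

    // model must define these (same shape as the proposed interface)
    virtual void load_hparams(mock_loader & ml) = 0;
    virtual void load_tensors(mock_loader & ml) = 0;
    virtual std::unique_ptr<mock_graph> build_graph_context() const = 0;
};

struct mock_model_demo : mock_model {
    void load_hparams(mock_loader & ml) override {
        n_layer = ml.n_layer_in_file;
    }
    void load_tensors(mock_loader &) override {
        // one made-up tensor name per layer
        for (int i = 0; i < n_layer; ++i) {
            tensors.push_back("blk." + std::to_string(i) + ".attn_q.weight");
        }
    }
    std::unique_ptr<mock_graph> build_graph_context() const override {
        return std::make_unique<mock_graph>(mock_graph{ (int) tensors.size() });
    }
};

// Generic driver: it only sees the base class, so it needs no switch..case
// over architectures. Returns the node count of the built graph.
inline int load_and_build(mock_model & model, mock_loader & ml) {
    model.load_hparams(ml);   // hparams first, tensors may depend on them
    model.load_tensors(ml);
    std::unique_ptr<mock_graph> g = model.build_graph_context();
    assert(g != nullptr);
    return g->n_nodes;
}
```

With this shape, the generic code path stays the same when a new model is added; only a new subclass is needed.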

Migration rules:
- Create `llama_model_*` class for each model
- Move `llm_build_ARCH` to `llama_model_ARCH::graph`
- Move code from switch..case to `load_hparams` and `load_tensors`
- Write `llama_model_create()`, a big switch..case that automatically selects the correct `llama_model_ARCH`
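The last rule could look roughly like the sketch below. It is only an illustration under assumptions: the `demo_` names, the enum values, and the error handling are made up for this example, and the real `llama_model_create()` may take different parameters and report unknown architectures differently.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical architecture ids, for illustration only.
enum demo_arch { DEMO_ARCH_A, DEMO_ARCH_B, DEMO_ARCH_UNKNOWN };

struct demo_model_base {
    virtual ~demo_model_base() = default;
    virtual std::string name() const = 0;
};

// One class per architecture, each living in its own file in the real layout.
struct demo_model_a : demo_model_base {
    std::string name() const override { return "a"; }
};
struct demo_model_b : demo_model_base {
    std::string name() const override { return "b"; }
};

// The only remaining switch..case: map the architecture id to its class.
inline std::unique_ptr<demo_model_base> demo_model_create(demo_arch arch) {
    switch (arch) {
        case DEMO_ARCH_A: return std::make_unique<demo_model_a>();
        case DEMO_ARCH_B: return std::make_unique<demo_model_b>();
        default:          throw std::runtime_error("unknown model architecture");
    }
}

// Usage: callers hold a base pointer and never branch on the architecture.
inline std::string demo_arch_name(demo_arch arch) {
    std::unique_ptr<demo_model_base> model = demo_model_create(arch);
    assert(model != nullptr);
    return model->name();
}
```

Keeping the switch to a single return per case is also what would make it straightforward to generate or update from a script.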

Refactor: move `load_hparams` and `load_tensors` to per-model definition #21966
