Feature Request: Support Maincoder Architecture #18346

@ThatGuyWhoAsked

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

Running b7530
https://huggingface.co/Maincode/Maincoder-1B
https://www.reddit.com/r/LocalLLaMA/comments/1puf614/new_1b_parameter_opensource_coding_model_getting/
As an enhancement, I would like llama.cpp to support the Maincoder architecture so that Maincoder models can be run.

Motivation

I believe that support for the Maincoder model architecture would be a very helpful addition to llama.cpp.

  1. The current model has 1B parameters and scores very well on benchmarks for its size.
  2. It is ideal for ultra-low-latency Fill-In-The-Middle (FIM) and local IDE completion on any hardware.
  3. It offers high-quality QA and coding assistance (for its size) at a size that runs smoothly on CPUs and mobile devices.
  4. It can run locally or on constrained hardware.

Possible Implementation

https://huggingface.co/Maincode/Maincoder-1B
Maincoder uses a modern transformer decoder architecture with:

  • Rotary Position Embeddings: with a theta of 1,000,000.
  • RMSNorm: pre-normalization for stable training.
  • Grouped Query Attention: 4:1 ratio of query to key-value heads.
  • QK Normalization: RMSNorm applied to attention queries and keys.
  • SwiGLU MLP: gated linear units with SiLU activation.
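
To make the components listed above concrete, here is a minimal pure-Python sketch of each building block. It is only an illustration under stated assumptions, not the Maincoder reference code and not llama.cpp's implementation: the epsilon value, the interleaved pairing used for the rotary embedding, and all function names are assumptions, and the actual layout should be checked against the model's published configuration. Only the theta of 1,000,000 and the 4:1 head ratio come from the model card.

```python
import math

ROPE_THETA = 1_000_000.0  # from the model card
GQA_RATIO = 4             # query heads per key-value head, from the model card


def rms_norm(x, weight=None, eps=1e-6):
    """RMSNorm: divide x by its root mean square (eps is an assumed value)."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    if weight is None:
        weight = [1.0] * len(x)
    return [w * v / rms for v, w in zip(x, weight)]


def rope(x, pos, theta=ROPE_THETA):
    """Rotate pairs (x[i], x[i+1]) by the angle pos * theta**(-i/d).

    Interleaved pairing is an assumption; some implementations pair
    element i with element i + d/2 instead.
    """
    d = len(x)
    out = list(x)
    for i in range(0, d, 2):
        angle = pos * theta ** (-i / d)
        c, s = math.cos(angle), math.sin(angle)
        out[i] = x[i] * c - x[i + 1] * s
        out[i + 1] = x[i] * s + x[i + 1] * c
    return out


def qk_norm_then_rope(q, k, pos):
    """QK normalization: RMSNorm on the query and key, then rotary embedding."""
    return rope(rms_norm(q), pos), rope(rms_norm(k), pos)


def silu(v):
    return v / (1.0 + math.exp(-v))


def swiglu(gate, up):
    """Element-wise SiLU(gate) * up; the linear projections are omitted."""
    return [silu(g) * u for g, u in zip(gate, up)]


def kv_head_for(q_head, ratio=GQA_RATIO):
    """Grouped Query Attention: consecutive query heads share one KV head."""
    return q_head // ratio
```

With a 4:1 ratio, query heads 0 to 3 read from key-value head 0, heads 4 to 7 from key-value head 1, and so on, which shrinks the KV cache to a quarter of what it would be with one KV head per query head.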
