RWKV4neo 

### Model description

RWKV - Receptance Weighted Key Value

RWKV is a sequence-to-sequence model that combines the best features of Generative Pre-Training (GPT) and Recurrent Neural Networks (RNN) to perform language modelling (LM). It generates text in an autoregressive (AR) manner.

This is a hybrid model.

It has Transformer-level performance without the quadratic attention mechanism. It borrows ideas from Attention Free Transformers, so its attention is linear in complexity. This allows for infinite context through the hidden state in RWKV_RNN.
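To illustrate how a fixed-size hidden state can replace quadratic attention, here is a minimal sketch of an exponentially decayed weighted average computed one token at a time. It is a simplified, single-channel illustration in the spirit of Attention Free Transformers, not the exact kernel from the RWKV-LM repository; the function name `decayed_average` and the scalar `decay` parameter are assumptions made for this example.

```python
import math


def decayed_average(keys, values, decay):
    """Weighted average of values, with weights exp(k_i) that decay over time.

    The state is only the pair (num, den), so memory per step is constant
    and the total cost is linear in the sequence length.
    NOTE: simplified illustration, not the RWKV-LM implementation.
    """
    num, den = 0.0, 0.0
    outputs = []
    for k, v in zip(keys, values):
        # Shrink the contribution of all earlier tokens, then add the new one.
        num = math.exp(-decay) * num + math.exp(k) * v
        den = math.exp(-decay) * den + math.exp(k)
        outputs.append(num / den)
    return outputs, (num, den)


# With zero keys and no decay this reduces to a plain running mean.
outputs, state = decayed_average([0.0, 0.0, 0.0], [1.0, 2.0, 3.0], decay=0.0)
print(outputs)  # [1.0, 1.5, 2.0]
```

Because each step only touches `num` and `den`, the sequence can be arbitrarily long without the state growing.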

RWKV has two models, which are referred to as modes:

- RWKV_RNN: This mode is designed for running inference quickly.
- RWKV_GPT: This mode is for training or fine-tuning your model quickly.

In the first pass we will implement RWKV_RNN, although we can share weights so that RWKV_GPT generates the initial context for RWKV_RNN.
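The handoff described above can be sketched with a toy decayed weighted average: a parallel pass processes the prompt and returns a state, and a recurrent step then continues from that state. This is only an assumption-laden illustration, not the RWKV-LM code; `parallel_mode`, `recurrent_step`, and the scalar `decay` (standing in for the shared weights) are hypothetical names chosen for this example.

```python
import math


def parallel_mode(keys, values, decay):
    """GPT-like pass: each position is computed directly from its whole prefix."""
    outputs = []
    num, den = 0.0, 0.0
    for t in range(len(keys)):
        # Weight of token i as seen from position t: exp(k_i), decayed by distance.
        weights = [math.exp(keys[i] - decay * (t - i)) for i in range(t + 1)]
        num = sum(w * values[i] for i, w in enumerate(weights))
        den = sum(weights)
        outputs.append(num / den)
    # (num, den) after the last token is the state handed to the recurrent mode.
    return outputs, (num, den)


def recurrent_step(state, k, v, decay):
    """RNN-like step: update the constant-size state with one new token."""
    num, den = state
    num = math.exp(-decay) * num + math.exp(k) * v
    den = math.exp(-decay) * den + math.exp(k)
    return num / den, (num, den)


keys = [0.1, -0.3, 0.5, 0.2]
values = [1.0, 2.0, 3.0, 4.0]
decay = 0.5

# Process the first two tokens in parallel, then continue token by token.
_, state = parallel_mode(keys[:2], values[:2], decay)
continued = []
for k, v in zip(keys[2:], values[2:]):
    out, state = recurrent_step(state, k, v, decay)
    continued.append(out)

# Reference: the parallel pass over the full sequence.
full, _ = parallel_mode(keys, values, decay)
print(all(math.isclose(a, b) for a, b in zip(continued, full[2:])))
```

Both functions use the same `decay`, so in this toy setting the two modes agree on every position up to floating-point error, which is what makes the state handoff possible.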


### Open source status

- [X] The model implementation is available
- [X] The model weights are available
- [ ] Scaffolding
- [ ] API Discussion

### Provide useful links for the implementation

More from the Research and Development Repository: https://github.com/BlinkDL/RWKV-LM

RWKV4neo #20737
