Model description
RWKV - Receptance Weighted Key Value
RWKV is a sequence-to-sequence model that combines the best features of Generative Pre-Training (GPT) and Recurrent Neural Networks (RNN) to perform language modelling (LM). It generates text in an auto-regressive (AR) manner.
This is a hybrid model.
It has Transformer-level performance without the quadratic attention mechanism. It borrows ideas from Attention Free Transformers, so its attention is linear in complexity, which in principle allows unbounded context length through the hidden state in RWKV_RNN.
RWKV has two models, referred to as modes:
RWKV_RNN: This mode is designed for running inference quickly.
RWKV_GPT: This mode is for training or fine-tuning your model quickly.
In the first pass we will be implementing RWKV_RNN, although we can share weights so that RWKV_GPT generates the initial context for RWKV_RNN.
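To illustrate why the two modes can share weights, here is a minimal sketch of a decayed, key-weighted average computed two ways: from the full history, and step by step from a fixed-size state. This is a simplified stand-in, not the actual RWKV formula or the code in the repository; the function names `wkv_direct` and `wkv_recurrent`, the scalar `decay` parameter, and the toy inputs are all assumptions made for this example.

```python
import math


def wkv_direct(keys, values, decay):
    """Reference form: each output is computed from the whole history."""
    outputs = []
    for t in range(len(keys)):
        num = 0.0
        den = 0.0
        for i in range(t + 1):
            # Older positions are down-weighted by exp(-(t - i) * decay).
            weight = math.exp(-(t - i) * decay + keys[i])
            num += weight * values[i]
            den += weight
        outputs.append(num / den)
    return outputs


def wkv_recurrent(keys, values, decay):
    """Recurrent form: the same outputs from a fixed-size (num, den) state."""
    num, den = 0.0, 0.0
    shrink = math.exp(-decay)  # per-step decay applied to the running state
    outputs = []
    for k, v in zip(keys, values):
        num = shrink * num + math.exp(k) * v
        den = shrink * den + math.exp(k)
        outputs.append(num / den)
    return outputs


# Toy inputs (hypothetical values, one channel only).
keys = [0.1, -0.3, 0.7, 0.2]
values = [1.0, 2.0, -1.0, 0.5]
direct = wkv_direct(keys, values, decay=0.5)
recurrent = wkv_recurrent(keys, values, decay=0.5)
```

Both functions produce the same outputs up to floating-point error. The recurrent form only carries two numbers per channel between steps, which is the property that lets an RNN-style mode continue from a context processed elsewhere. The direct form is written naively here for readability and does not reflect how a training-time implementation would be optimized.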
Open source status
Provide useful links for the implementation
More from the Research and Development Repository: https://github.com/BlinkDL/RWKV-LM