Code for scaling Transformers
Why do you think that https://github.com/BlinkDL/RWKV-LM is a good alternative to outperformer?