It should give you an idea of how hard it is to build a SOTA model from scratch!
If you relax the SOTA aspect, Karpathy's nanochat has you covered: https://github.com/karpathy/nanochat