blixt
I think most people have read it and agree it makes an astute observation about which methods survive, but my point is that we now use it to complain about new methods, as if they should just skip all the in-between stuff so that The Bitter Lesson doesn't come for them. At best you can use it as inspiration. Anyway, this was mostly a complaint about the use of "The Bitter Lesson" in the context of this article; the article still deserves credit for all the great information about tokenization methods and how one evolutionary branch of them is the Byte Latent Transformer.