Are you saying you disagree that a new architectural leap is needed and just more compute for training is enough? Or are you saying a new architectural leap is needed and that or those new architectures will only be possible to train with insane amounts of compute?
If the latter I dont understand how you could know that about an innovation that’s not yet been made
Are you saying you disagree that a new architectural leap is needed and just more compute for training is enough? Or are you saying a new architectural leap is needed and that or those new architectures will only be possible to train with insane amounts of compute?
If the latter I dont understand how you could know that about an innovation that’s not yet been made