Wouldn't a model that can recite training data verbatim be larger than necessary? The exact text has to be stored somewhere, however efficiently the bits are encoded, and the same capability should be achievable by compressing those portions of the model.

zeven7
Maybe we are all just LLMs. If the books were written by a language producing algorithm in a human mind, maybe there’s not as much raw data there as it seems, and the total information can in fact be stored in a surprisingly small set of weights.
ethbr1 OP
I imagine it's not inconceivable that at very high dimensions and with the right architectures stochastic compression can be unexpectedly efficient. It would be strange if the end result of AI research is realizing we're solving a compression problem (and that our brains do too).