He understands the technical part, of course, I was referring to his prediction that large models will be always be necessary.
There is a point where an LLM is good enough for most tasks, I don’t need a megamind AI in order to greet clients, and both large and small/medium model size are getting there, with the large models hitting a computing/energy demand barrier. The small models won’t hit that barrier anytime soon.
If anything, you're showing a lack of understanding of what he was talking about. The context is this specific time, where we're early in a ecosystem and things are expensive and likely centralized (ala mainframes) but if his analogy/prediction is correct, we'll have a "Linux" moment in the future where that equation changes (again) and local models are competitive.
And while I'm a huge fan of local models run them for maybe 60-70% of what I do with LLMs, they're nowhere near proprietary ones today, sadly. I want them to, really badly, but it's important to be realistic here and realize the differences of what a normal consumer can run, and what the current mainframes can run.