
His dismissal of smaller and local models suggests he underestimates their improvement potential. Give phi4 a run and see what I mean.

mprovost
You can disagree with his conclusions but I don't think his understanding of small models is up for debate. This is the person who created micrograd/makemore/nanoGPT and who has produced a ton of educational materials showing how to build small and local models.
khalic OP
I’m going to edit it; it was badly formulated. What I meant is that he underestimates their potential for growth.
diggan
> underestimates their potential for growth

As far as I understood the talk and the analogies, he's saying that local models will eventually replace the current popular "mainframe" architecture. How is that underestimating them?

diggan
> suggests a lack of understanding of these smaller models capabilities

If anything, you're showing a lack of understanding of what he was talking about. The context is this specific time, where we're early in an ecosystem and things are expensive and likely centralized (a la mainframes), but if his analogy/prediction is correct, we'll have a "Linux" moment in the future where that equation changes (again) and local models are competitive.

And while I'm a huge fan of local models and run them for maybe 60-70% of what I do with LLMs, they're nowhere near proprietary ones today, sadly. I want them to be, really badly, but it's important to be realistic here and recognize the difference between what a normal consumer can run and what the current mainframes can run.

khalic OP
He understands the technical part, of course. I was referring to his prediction that large models will always be necessary.

There is a point where an LLM is good enough for most tasks. I don’t need a megamind AI in order to greet clients, and both large and small/medium models are getting there, with the large models hitting a computing/energy demand barrier. The small models won’t hit that barrier anytime soon.

vikramkr
Did he predict they'd always be necessary? He mostly seemed to predict the opposite: that we're at the early stage of a trajectory that has yet to have its Linux moment.
khalic OP
I understand, thanks for pointing that out
khalic OP
I edited to make it clearer
sriram_malhar
Of all the things you could suggest, a lack of understanding is not one that can be pinned on Karpathy. He does know his technical stuff.
khalic OP
We all have blind spots
diggan
Sure, but maybe suggesting that the person who literally spent countless hours educating others on how to build small models locally from scratch is lacking knowledge about small local models goes a bit beyond "people have blind spots".
khalic OP
Their potential, not how they work. It was very badly formulated; I just corrected it.
TeMPOraL
He ain't dismissing them. Comparing local/"open" models to Linux (and closed services to Windows and macOS) is high praise. It's also accurate.
khalic OP
This is a bad comparison
dist-epoch
I tried the local small models. They are slow, much less capable, and ironically much more expensive to run than the frontier cloud models.
khalic OP
Phi4-mini runs on a basic laptop CPU at 20T/s… how is that slow? Without optimization…
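If you want to try it yourself, here's a minimal sketch using llama-cpp-python, assuming you've downloaded a quantized GGUF build of Phi-4-mini (the file name below is illustrative, not an exact release name):

    from llama_cpp import Llama

    # Load a quantized Phi-4-mini build; this runs on a plain laptop CPU.
    llm = Llama(model_path="phi-4-mini-instruct-q4_k_m.gguf", n_ctx=4096)

    # Generate a short completion and print the text.
    out = llm("Write a short greeting for a client.", max_tokens=100)
    print(out["choices"][0]["text"])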
dist-epoch
I was running Qwen3-32B locally even faster, at 70 T/s, and it was still way too slow for me. I'm generating thousands of tokens of output per request (not coding); running locally I could get about 6 million tokens per day and pay for the electricity, or I can get more tokens per day from Google Gemini 2.5 Flash for free.
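The 6 million figure is just decode speed times a full day of nonstop generation; a quick sanity check, assuming the machine does nothing else all day:

    # Back-of-the-envelope: sustained decode speed -> daily token budget.
    tokens_per_second = 70            # measured Qwen3-32B decode speed above
    seconds_per_day = 24 * 60 * 60    # 86,400
    print(f"{tokens_per_second * seconds_per_day:,}")  # 6,048,000 tokens/day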

Running models locally is a privilege for the rich and those with too much disposable time.

yencabulator
Try Qwen3-30B-A3B. It's MoE to the point that its memory-bandwidth use looks more like a 3B model's, so it typically runs faster.
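A rough way to see why, assuming decode is memory-bandwidth bound and a ~4-bit quant (the bandwidth number here is illustrative):

    # Rule of thumb: decode speed ~ memory bandwidth / bytes read per token,
    # and bytes per token scale with *active* parameters, not total ones.
    bandwidth_gb_s = 100      # illustrative host memory bandwidth
    bytes_per_param = 0.5     # ~4-bit quantization
    for name, active_params_billions in [("dense 32B", 32), ("30B MoE, 3B active", 3)]:
        bytes_per_token = active_params_billions * 1e9 * bytes_per_param
        print(name, round(bandwidth_gb_s * 1e9 / bytes_per_token, 1), "tok/s")

The MoE model streams roughly a tenth of the weights per token compared to a dense 32B, which is where the speedup comes from.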
