
His dismissal of smaller and local models suggests he underestimates their improvement potential. Give phi4 a run and see what I mean.

mprovost
You can disagree with his conclusions but I don't think his understanding of small models is up for debate. This is the person who created micrograd/makemore/nanoGPT and who has produced a ton of educational materials showing how to build small and local models.
khalic OP
I’m going to edit it; it was badly formulated. What I meant is that he underestimates their potential for growth.
diggan
> underestimates their potential for growth

As far as I understood the talk and the analogies, he's saying that local models will eventually replace the current popular "mainframe" architecture. How is that underestimating them?

diggan
> suggests a lack of understanding of these smaller models capabilities

If anything, you're showing a lack of understanding of what he was talking about. The context is this specific time, where we're early in an ecosystem and things are expensive and likely centralized (a la mainframes), but if his analogy/prediction is correct, we'll have a "Linux" moment in the future where that equation changes (again) and local models are competitive.

And while I'm a huge fan of local models and run them for maybe 60-70% of what I do with LLMs, they're nowhere near proprietary ones today, sadly. I want them to be, really badly, but it's important to be realistic here and recognize the difference between what a normal consumer can run and what the current mainframes can run.

khalic OP
He understands the technical part, of course. I was referring to his prediction that large models will always be necessary.

There is a point where an LLM is good enough for most tasks. I don’t need a megamind AI in order to greet clients, and both large and small/medium models are getting there, with the large models hitting a computing/energy demand barrier. The small models won’t hit that barrier anytime soon.

vikramkr
Did he predict they'd always be necessary? He mostly seemed to predict the opposite: that we're at the early stage of a trajectory that has yet to have its Linux moment.
khalic OP
I understand, thanks for pointing that out
khalic OP
I edited to make it clearer
sriram_malhar
Of all the things you could suggest, a lack of understanding is not one that can be pinned on Karpathy. He does know his technical stuff.
khalic OP
We all have blind spots
diggan
Sure, but maybe suggesting that the person who literally spent countless hours educating others on how to build small models locally from scratch is lacking knowledge about small local models goes a bit beyond "people have blind spots".
khalic OP
Their potential, not how they work. It was very badly formulated; I just corrected it.
TeMPOraL
He ain't dismissing them. Comparing local/"open" models to Linux (and closed services to Windows and macOS) is high praise. It's also accurate.
khalic OP
This is a bad comparison
dist-epoch
I tried the local small models. They are slow, much less capable, and ironically much more expensive to run than the frontier cloud models.
khalic OP
Phi4-mini runs on a basic laptop CPU at 20T/s… how is that slow? Without optimization…
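If you want to try it yourself, here's a minimal sketch using llama-cpp-python, assuming you've downloaded a quantized GGUF build of Phi-4-mini (the file name below is illustrative, not an exact release name):

    from llama_cpp import Llama

    # Load a quantized Phi-4-mini build; this runs on a plain laptop CPU.
    llm = Llama(model_path="phi-4-mini-instruct-q4_k_m.gguf", n_ctx=4096)

    # Generate a short completion and print the text.
    out = llm("Write a short greeting for a client.", max_tokens=100)
    print(out["choices"][0]["text"])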
dist-epoch
I was running Qwen3-32B locally even faster, at 70 T/s, and it was still way too slow for me. I'm generating thousands of tokens of output per request (not coding); running locally I could get about 6 million tokens per day and pay for the electricity, or I can get more tokens per day from Google Gemini 2.5 Flash for free.
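The 6 million figure is just decode speed times a full day of nonstop generation; a quick sanity check, assuming the machine does nothing else all day:

    # Back-of-the-envelope: sustained decode speed -> daily token budget.
    tokens_per_second = 70            # measured Qwen3-32B decode speed above
    seconds_per_day = 24 * 60 * 60    # 86,400
    print(f"{tokens_per_second * seconds_per_day:,}")  # 6,048,000 tokens/day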

Running models locally is a privilege for the rich and those with too much disposable time.

yencabulator
Try Qwen3-30B-A3B. It's MoE to the point that its memory-bandwidth use looks more like a 3B model's, so it typically runs faster.
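A rough way to see why, assuming decode is memory-bandwidth bound and a ~4-bit quant (the bandwidth number here is illustrative):

    # Rule of thumb: decode speed ~ memory bandwidth / bytes read per token,
    # and bytes per token scale with *active* parameters, not total ones.
    bandwidth_gb_s = 100      # illustrative host memory bandwidth
    bytes_per_param = 0.5     # ~4-bit quantization
    for name, active_params_billions in [("dense 32B", 32), ("30B MoE, 3B active", 3)]:
        bytes_per_token = active_params_billions * 1e9 * bytes_per_param
        print(name, round(bandwidth_gb_s * 1e9 / bytes_per_token, 1), "tok/s")

The MoE model streams roughly a tenth of the weights per token compared to a dense 32B, which is where the speedup comes from.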
