
I think Google is the only one that still produces a general-knowledge LLM right now.

Claude has been a coding model from the start, but GPT is more and more becoming a coding model too.


I agree with this observation. Gemini does feel like a code red for basically every AI company (ChatGPT, Claude, etc.), in my opinion, if the underlying model is both fast and cheap and good enough.

I hope open-source AI models catch up to Gemini 3 / Gemini 3 Flash. Or Google open-sources it, but let's be honest, Google isn't open-sourcing Gemini 3 Flash. The best bets in open source nowadays are probably GLM or DeepSeek Terminus, or maybe Qwen/Kimi.

I would expect open weights models to always lag behind; training is resource-intensive and it’s much easier to finance if you can make money directly from the result. So in a year we may have a ~700B open weights model that competes with Gemini 3, but by then we’ll have Gemini 4, and other things we can’t predict now.
There will be diminishing returns, though; future models won't be that much better, and we will reach a point where the open-source model is good enough for most things and being on the latest model no longer matters much.

For me the bigger concern, which I have mentioned on other AI-related topics, is that AI is eating all the production of computer hardware, so we should be worrying about hardware prices getting out of hand and making it harder for the general public to run open-source models. Hence I am rooting for China to reach parity on node size and crash PC hardware prices.

I had a similar opinion, that we were somewhere near the top of the sigmoid curve of model improvement that we could achieve in the near term. But given continued advancements, I’m less sure that prediction holds.
My model is a bit simpler: model quality is something like the logarithm of effort you put into making the model. (Assuming you know what you are doing with your effort.)

So I don't think we are on any sigmoid curve as such. Though if you plot the performance of the best model available at any point in time against time on the x-axis, you might see a sigmoid curve; but that's a combination of the logarithm and the amount of effort people are willing to spend on making new models.

(I'm not sure about it specifically being the logarithm. Just any curve that has rapidly diminishing marginal returns that nevertheless never go to zero, ie the curve never saturates.)
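A toy illustration of what I mean, with completely made-up effort numbers: quality as the log of cumulative effort, and effort ramping up quickly and then levelling off, which is what produces the sigmoid-ish shape when plotted against time.

    import math

    # made-up cumulative industry effort per year (arbitrary units)
    effort_by_year = {2019: 1, 2020: 10, 2021: 100, 2022: 1_000,
                      2023: 5_000, 2024: 8_000, 2025: 9_000}

    for year, effort in effort_by_year.items():
        quality = math.log10(effort)  # quality ~ log(effort): diminishing returns, never saturating
        print(year, round(quality, 2))

    # Quality rises steeply (0 -> 3) while effort grows exponentially,
    # then flattens (3 -> ~3.95) once effort growth slows: sigmoid-ish over time,
    # even though log(effort) itself never saturates.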

Yeah, I have a similar opinion. You can go back almost a year to when Claude 3.5 launched, and I said on Hacker News that it's good enough.

And now I am saying the same for Gemini 3 Flash.

I still feel the same way, though. Sure, there is an increase, but I somewhat believe that Gemini 3 is good enough and the returns on further training might not be worth that much, IMO. But I'm not sure either, and I can be wrong; I usually am.

If Gemini 3 Flash is really confirmed to be close to Opus 4.5 at coding and a similarly capable model is open weights, I want to buy a box with a USB cable that has that thing loaded, because today that's enough for a small team to run out of engineering work.
Open weights doesn't mean you can necessarily run it on a (small) box.

If Google released their weights today, it would technically be open weight; but I doubt you'd have an easy time running the whole Gemini system outside of Google's datacentres.
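To put rough numbers on that (mine, back-of-the-envelope, not Google's): even a ~700B-parameter open-weights model of the kind mentioned above needs hundreds of GB of memory just for the weights, before KV cache or any of the serving machinery around it.

    # back-of-the-envelope memory needed just for the weights of a ~700B-parameter model
    params = 700e9
    bytes_per_param = {"fp16": 2, "int8": 1, "int4": 0.5}

    for fmt, nbytes in bytes_per_param.items():
        print(f"{fmt}: ~{params * nbytes / 1e9:.0f} GB of weights")

    # fp16: ~1400 GB, int8: ~700 GB, int4: ~350 GB
    # plus KV cache, activations, and whatever tooling/routing sits around the model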

Gemini isn't code red for Anthropic. Gemini threatens none of Anthropic's positioning in the market.
Yes it does. I never use Claude anymore outside of agentic tasks.
What demographic are you in that is leaving Anthropic en masse that they care about retaining? From what I see, Anthropic is targeting enterprise and coding.

Claude Code just caught up to Cursor (No. 2) in revenue and, based on trajectories, is about to pass GitHub Copilot (No. 1) in a few more months. They just locked down Deloitte with 350k seats of Claude Enterprise.

In my Fortune 100 financial company, they just finished crushing OpenAI in a broad, enterprise-wide evaluation. Google Gemini was never in the mix, never on the table, and still isn't. Every one of our engineers has 1k a month allocated in Claude tokens for Claude Enterprise and Claude Code.

There is one leader in enterprise. There is one leader with developers. And Google has nothing to make a dent: not Gemini 3, not the Gemini CLI, not Antigravity, not Gemini itself. There is no code red for Anthropic. They have clear target markets, and nothing from Google threatens those.

I agree with your overall thesis but:

> Google Gemini was never in the mix, never on the table, and still isn't. Every one of our engineers has 1k a month allocated in Claude tokens for Claude Enterprise and Claude Code.

Does that mean y'all never evaluated Gemini at all, or just that it couldn't compete? I'd be worried that prior performance of the models prejudiced the stats away from Gemini, but I am a Claude Code and heavy Anthropic user myself, so, shrug.

Enterprise is slow. As for developers, we will be switching to Google unless the competition can catch up and deliver a similarly fast model.

Enterprise will follow.

I don't see any distinction in target markets - it's the same market.

So? Agentic tasks are where the promised AGI is for many of us.
Open-source models are riding coattails; they are basically just distilling the giant SOTA models, hence perpetually being 4-6 months behind.
If this quantification of lag is anywhere near accurate (it may be larger and/or more complex to describe), open-source models will soon be "simply good enough". Perhaps companies like Apple could be second-round AI growth companies, marketing optimized private AI devices via already capable MacBooks or the rumored appliances. While not obviating cloud AI, they could cheaply provide capable models without a subscription while driving their revenue through increased device sales. If the cost of cloud AI rises to cover its expense, this use case will act as a check on subscription prices.
Google already has dedicated hardware for running private LLMs: just look at what they're doing on the Google Pixel. The main limiting factor right now is access to hardware that's powerful enough, and especially has enough memory, to run a good LLM; that will come eventually. If trends hold, by 2031 we should have devices with 400 GB of RAM, but the current RAM crisis could throw off my calculations...
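(For what a calculation like that might look like, back-of-the-envelope with numbers I'm assuming rather than anything official: start from roughly 50 GB in a high-end device today and let capacity double about every two years.)

    # rough extrapolation: assumed ~50 GB in a high-end device in 2025,
    # capacity doubling roughly every two years
    ram_gb = 50
    for year in range(2025, 2032, 2):
        print(year, ram_gb, "GB")
        ram_gb *= 2
    # 2025: 50 GB, 2027: 100 GB, 2029: 200 GB, 2031: 400 GB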
So basically the proprietary models are devalued to almost 0 in about 4-6 months. Can they recover the training costs + profit margin every 4 months?
Coding is basically an edge case for LLMs too.

Pretty much every person in the first (and second) world is using AI now, and only a small fraction of those people are writing software. This is also reflected in OpenAI's report from a few months ago, which found programming to be only 4% of tokens.

That may be so, but I rather suspect the breakdown would be very different if you only count paid tokens. Coding is one of the few things where you can actually get enough benefit out of AI right now to justify high-end subscriptions (or high pay-per-token bills).
> Pretty much every person in the first (and second) world is using AI now

This sounds like you live in a huge echo chamber. :-(

All of my non-techy friends use it; it's the new search engine. I think at this point the people refusing to use it are the echo chamber.
Depends on what you count as AI (just googling makes you use the LLM summary), but also my mother, who is really not tech-savvy, loved what Google Lens can do after I showed her.

Apart from my very old grandmothers, I don't know anyone not using AI.

How many people do you know? Do you talk to your local shopkeeper? Or the clerk at the gas station? How are they using AI? I'm a pretty techy person with a lot of tech friends, and I know more people not using AI (on purpose, or for lack of knowledge) than using it.
I live in India and a surprising number of people here are using AI.

A lot of public religious imagery is very clearly AI generated, and you can find a lot of it on social media too. "I asked ChatGPT" is a common refrain at family gatherings. A lot of regular non-techie folks (local shopkeepers, the clerk at the gas station, the guy at the vegetable stand) have been editing their WhatsApp profile pictures using generative AI tools.

Some of my lawyer and journalist friends are using ChatGPT heavily, which is concerning. College students too. Bangalore is plastered with ChatGPT ads.

There's even a low-cost ChatGPT plan called ChatGPT Go you can get if you're in India (not sure if this is available in the rest of the world). It costs ₹399/mo or $4.41/mo, but it's completely free for the first year of use.

So yes, I'd say many people outside of tech circles are using AI tools. Even outside of wealthy first-world countries.

Hm, quite a few. Like I said, it depends on what you count as AI.

Just googling means you use AI nowadays.

I'm sort of old but not a grandmother. Not using AI.
