According to Jensen, it takes about 8,000 H100s running for 90 days to train a 1.8 trillion parameter MoE GPT-4-scale model.

Meta has about 350,000 of these GPUs and a whole bunch of A100s. This means the ability to train 50 GPT-4 scale models every 90 days or 200 such models per year.
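
Napkin math, taking Jensen's 8,000-GPU / 90-day figure at face value (and rounding up generously to get the 50/200 figures):

    fleet = 350_000            # H100-class GPUs (the figure above)
    gpus_per_run = 8_000       # Jensen's count for one GPT-4-scale run
    days_per_run = 90

    concurrent = fleet / gpus_per_run            # ~44 runs at once
    per_year = concurrent * 365 / days_per_run   # ~177 runs per year
    print(f"~{concurrent:.0f} concurrent runs, ~{per_year:.0f} per year")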

This level of overkill suggests to me that the core models will be commoditized to oblivion, making the actual profit margins from AI-centric companies close to 0, especially if Microsoft and Meta keep giving away these models for free.

This is actually terrible for investors, but amazing for builders (ironically).

The real value methinks is actually in the control of proprietary data used for training, which is the single most important factor for model output quality. And this is actually as much an issue for copyright lawyers as for software engineers once the big regulatory hammers start dropping to protect American workers.


> This means the ability to train 50 GPT-4 scale models every 90 days or 200 such models per year.

Not anywhere close to that.

Those 350k GPUs you talk about aren't linked together. They also definitely aren't all H100s.

To train a GPT-4 scale model you need a single cluster, where all the GPUs are tightly linked together. At the scale of 20k+ GPUs, the price you pay in networking to link those GPUs is basically almost the same as the price of those GPUs themselves. It's really hard and expensive to do.

FB has maybe 2 such clusters, not more than that. And I'm somewhat confident one of those clusters is an A100 cluster.

So they can train maybe 6 GPT-4s every 90 days.
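
Rough math, assuming two tightly coupled clusters of ~24k GPUs each (the scale Meta has publicly described):

    clusters = 2               # assumed cluster count
    gpus_per_cluster = 24_576  # roughly the size Meta has described publicly
    gpus_per_run = 8_000       # Jensen's GPT-4-scale figure from above
    print(clusters * gpus_per_cluster / gpus_per_run)  # ~6.1 runs per 90 days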

I had to take a second look at this: https://www.datacenterdynamics.com/en/news/meta-to-operate-6...

340,000 H100s, within a total of 600,000 H100 equivalents (perhaps AMD Instinct cards?), on top of the hundreds of thousands of legacy A100s.

And I'm certain the order for B100s will be big. Very big.

Even the philanthropic org Chan Zuckerberg Initiative currently rocks 1000 H100s, probably none used for inference.

They are going ALL OUT

> They are going ALL OUT

Just like they did for their metaverse play, and that didn't work out very well.

I honestly don't think we've seen the end of AR/VR yet. The tech continues to improve year over year. There are rumors that the prototype Zuck plans to show at Meta Connect this year is mindblowing.

Better VR tech won't make people buy VR. You could literally offer them a Star Trek holodeck and they still wouldn't buy in. People don't buy it because they don't see the point.

This was even true in Star Trek. People could do literally anything on a holodeck and the writers still had them going to Risa for a holiday.

There is no chance of VR going mainstream until someone solves the fundamental human problem of people preferring to do things in real life.

> rumors... prototype... mindblowing

Sounds like more unsubstantiated hype from a company desperate to sell a product that was very expensive to build. I guess we'll see, but I'm not optimistic for them.

Oh, I'm sure it'll be mind-blowing to see the emperor without clothes again.

> Even the philanthropic org Chan Zuckerberg Initiative currently rocks 1000 H100s, probably none used for inference.

What do they use them for?

tax writeoffs

Ok, I might have misread some rumored ballpark figures. And most of the GPUs will be used for inference rather than training. Still, 6 GPT-4s every 90 days is pretty amazing.

It's like someone thinking that they are SOOOO smart, they are going to get rich selling shovels in the gold rush. So they overpay for the land, they overpay for the factory, they overpay for their sales staff.

And then someone else starts giving away shovels for free.

> And then someone else starts giving away shovels for free.

Ah, I see -- it's more like a "level 2 gold rush".

So a level 1 gold rush is: There's some gold in the ground, nobody knows where it is, so loads of people buy random bits of land for the chance to get rich. Most people lose, a handful of people win big. But the retailers buying shovels at wholesale and selling them at a premium make a safe, tidy profit.

But now that so many people know the maxim, "In a gold rush, sell shovels", there's now a level 2 gold rush: A rush to serve the miners rushing to find the gold. So loads of retailers buy loads and loads of shovels and set up shop in various places, hoping the miners will come. Probably some miners will come, and perhaps those retailers will make a profit; but not nearly as much as they expect, because there's guaranteed to be competition. But the company making the shovels and selling them at a premium makes a tidy profit.

So NVIDIA in this story is the manufacturer selling shovels to retailers; and all the companies building out massive GPU clouds are the retailers rushing to serve miners. NVIDIA is guaranteed to make a healthy profit off the GPU cloud rush as long as they play their cards right (and they've always done a pretty decent job of that in the past); but the vast majority of those rushing to build GPU clouds are going to lose their shirts.

And basically one AI company making all the money. Weird symbiosis.

> And then someone else starts giving away shovels for free.

And their business model is shovel-fleet logistics and maintenance... :p

The platform for shovel fleet logistics startups

SaaS (shoveling as a service)

And/or exploiting the legal infrastructure around intellectual property rights to make sure only hobbyists and geologists can use the shovels without paying through the nose or getting sued into oblivion.

If your company grows to 700 million monthly active users, then most probably you can make your own AI department and train your own models. I guess people's aspirations are very high in this space, but let's be realistic.

Their business model is of course tracking all the shovels and then selling the locations of all the gold.

It's almost like you can't actually control the demand side.
> once the big regulatory hammers start dropping to protect American workers

Have we been living in the same universe the last 10 years? I don't see this ever happening. Related recent news (literally posted yesterday): https://www.axios.com/2024/07/02/chevron-scotus-biden-cyber-...

I think people wildly underestimate how protectionist people, particularly educated software engineers and PhDs, will get once an AI model directly impacts their source of wealth.

Red state blue collar workers got their candidate to pass tariffs. What happens when both blue state white collar workers and red state blue collar workers need to contend with AI? Perhaps not within the next 10 years, but certainly within 20 years!

And if you think 20 years is a long time... 2004 was when Halo 2 came out.

> I think people wildly underestimate how protectionist people, particularly educated software engineers and PhDs, will get once an AI model directly impacts their source of wealth.

I don't know what power you imagine SWEs and PhDs possess, but the last time their employers flexed their power by firing them in droves (despite record profits), the employees sure seemed powerless, and society shrugged it off and/or expressed barely-concealed schadenfreude.

They were sued for collusion and the lawyers got a massive payout and the employees got a fraction of lost wages. I was one of them. (Employees, not lawyers.)

The Apple et al. suit was many years before the post-COVID layoffs, which are what I was referring to in my comment.

That settlement favored Apple, Google and the other conspirators because they only paid out a fraction of what they would have paid in salaries absent the collusion - so the settlement was not exactly a show of force by the engineers. Additionally, this was after a judge had thrown out a lower settlement amount the lawyers representing the class had agreed to.

It's not going to stop it even if they try, though. You can't stop technical progress like this any more than you can stop piracy.

But agreed, between the unions with political pull and "AI safety" grifters, I suspect there could be some level of regulatory risk, particularly for the megacorps in California. I doubt it will be some national thing in the US absent a major political upheaval. Definitely possible in the EU, which will probably just be a price passed on to customers or reduced access, but that's nothing new for them.

I keep seeing people say you can't stop progress (social, technical, etc.) but has this really been tested? There seems to be a lot of political upheaval at least being threatened in the near future, and depending on the forces that come into power, I imagine they may be willing to do a lot to protect that power.

Tucker Carlson at one point said if FSD was going to take away trucking jobs we should stop that with regulation.

There are amistics - the voluntary non-use of technology once it's available - which all cultures engage in to greater or lesser degrees. All technology has a price and sometimes it's not worth it - see leaded gasoline for an extreme example.

But in the general sense, I think it's tautologically correct to say better models always lead to better predictions, which always give an edge in competitions on an individual or societal level. So long term I do believe learning trumps ignorance, not in all cases but on average.

Hopefully by that time AI will be working for us in our homes, stores and farms so we don't need to work as much, and this is ok.

People will still need purpose, which for better or worse is often provided by their job.

Most of the new policies and court decisions, including the one I linked in my reply, have been disjoint from the will of the people. I don't see how your point about shifting sentiment (which I very much agree with) means we'll see any sort of regulatory action from the federal government, especially in the US. Maybe in the EU, someday, after most of the damage has been done?

> What happens when both blue state white collar workers and red state blue collar workers need to contend with AI? Perhaps not within the next 10 years, but certainly within 20 years!

Populism. Probably the fascist right-wing kind, but I expect some form of populism. Related, if we're talking about a 20 year time horizon, I'm genuinely unsure if society will still exist in any recognizable fashion at the rate we're going...

The only upside is state-level minimum wage increases. The federal minimum wage is still a complete joke at $7.25 an hour.

But there's bigger fish to fry for American politics and worker obsolescence is not really top of mind for anyone.

> making the actual profit margins from AI-centric companies close to 0

The same thinking stopped many legacy tech companies from becoming a “cloud” company ~20 years ago.

Fast forward to today and the margin for cloud compute is still obscene. And they all wish in hindsight they got into the cloud business way sooner than they ultimately did.

That's not the way I remember the cloud transition at all. My company adopted an internal cloud. VMWare had some great quarters. Openstack got really big for a while and everyone was trying to hire for it. All the hosting companies started cloud offerings.

What ended up happening was Amazon was better at scale and lockin than everyone else. They gave Netflix a sweet deal and used it as a massive advertisement. It ended up being a rock rolling down a hill and all the competitors except ones with very deep pockets and the ability to cross-subsidize from other businesses (MSFT and Google) got crushed.

It still blows my mind that Microsoft is the most valuable company on the planet because of the cloud and Ballmer's long-term vision. I thought they would have gone the way of IBM.

For the record, IBM booked about $62 billion in revenue for 2023.

I thought Nvidia took that crown recently, though.

Agree. Ballmer seems to have done his job well.

While I agree, he also admits his biggest miss was phone/hardware (which is what catapulted Apple).

https://youtu.be/v9d3wp2sGPI?feature=shared

It’s not just the software, it’s the hardware too. Too many companies got good at speeding up VM deployments but ignored theory of constraints and still had a 4-6 month hardware procurement process that gave everyone and their dog veto power.

And then you come to companies that managed to streamline both and ran out of floor space in their data center because they had to hold onto assets for 3-5 years. At one previous employer, the smallest orderable unit of compute was a 44U rack. They eventually filled the primary data center and then it took them 2 years to Tetris their way out of it.

Are the second-tier cloud companies really seeing big margins? Why is it not competed away to zero like airlines?

There is essentially zero cost for a user to switch airlines. The cost to switch clouds is astronomical for any decent-sized org.

Those poor little AI clouds will never keep people reeled in unless they invent something like CUDA.

I like the sentiment of your post. I mostly agree. If you use OpenShift, doesn't that help to reduce the cost of switching cloud providers?

You're not switching cloud providers. Amazon's not going to suddenly decide to jack up rates for EC2 instances on you. So the extra complexity just isn't worth it.

There is a hypothetical "but what if we honestly actually really really do", but that's such a waste of engineering time when there are so many other problems to be solved that it's implausible. The only time multi-cloud makes sense is when you have to meet customers where they're at, and have resources in whichever cloud your customers are using. Or if you're running arbitrage between the clouds and are reselling compute.

Not really. What happens when you run a cloud on top of your cloud is that you don’t get to use any of their differentiating features and that winds up costing you money. Plus you have to pay for your own control plane when that’s already baked into the cloud provider’s charge model.

    > Plus you have to pay for your own control plane when that’s already baked into the cloud provider’s charge model.
When you say "control plane", does this mean Kubernetes?

There will probably be 2 huge winners, and everyone else will fail. Similar to the solar boom.

Who are the winners in solar?

per Zuckerberg[0], ~half of their H100s were for Reels content recommendation:

> I think it was because we were working on Reels. We always want to have enough capacity to build something that we can't quite see on the horizon yet. ... So let's order enough GPUs to do what we need to do on Reels and ranking content and feed. But let's also double that.

So there's an immense capacity inside Meta, but the _whole_ fleet isn't available for LLM training.

[0]: https://www.dwarkeshpatel.com/p/mark-zuckerberg?open=false#§...

Surely they're using some of that hardware to overcome Apple's attempts to deprive them of targeted advertising data.

In my opinion, Elsevier and others charging for access to academic publications has held back the advancement of humanity by a considerable factor. Think of how cancer could have been cured a decade ago if information was allowed to flow freely from the 50's forward - if anyone could have read scientific publications for free. I have no respect for people who want to protect the moat around information that could be used to advance humanity.
Have to disagree. Almost all researchers have essentially unfettered access to all of biomedical literature. Access to papers is therefore a tertiary annoyance wrt progress in science and the cures for cancers.

What IS a huge problem is the almost complete lack of systematically acquired quantitative data on human health (and diseases) for a very large number (1 million subjects) of diverse humans WITH multiple deep-tissue biopsies (yes, essentially impossible) that are suitable for multiomics at many ages/stages and across many environments. (Note, we can do this using mice.)

Some specific examples/questions to drive this point home: What is the largest study of mRNA expression in humans? ANSWER: The small but very expensive NIH GTEx study (n max of about 1000 Americans). This study acquired postmortem biopsies for just over 50 tissues. And what is the largest study of protein expression in humans across tissues? Oh sorry, this has never been done, although we know proteins are the work-horses of life. What about lipids, metabolites, metagenomics, epigenomics? Sorry again, there is no systematically acquired data at all.

What we have instead is a very large cottage-industry of lab-level studies that are structurally incoherent.

Some brag about the massive biomedical data we have, but it is truly a ghost, and most real data evaporates within a few years.

Here is my rant on fundamental data design flaws and fundamental data integration flaws in biomedical research:

Herding Cats: The Sociology of Data Integration https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2751652/

I also think the bottleneck today isn't access to the papers; data access and silos are more important.

But I also think the GP's claim and yours are not incompatible. I wonder how much survivorship bias this has, since it only considers those that are able to do research, and not those that would have but ended up continuing with another STEM job. We could be asking the counterfactual that I think the GP is implying: would more people have been interested in becoming cancer researchers if publications were open?

We can sort of see the effect because we have scihub now, which basically unlocks journal access for those that are comfortable with it, and I consider it plausibly having a significant effect for the population that have a research background without an academic affiliation. I've met a few biotech startup founders that switched from tech to bio and did self study+scihub outside of the university. The impetus for change I've heard a few times is a loved one got X disease, and I studied it, quit my less impactful tech job to work on bio stuff.

As much as I'd love open access to academic publications and don't think the current model is great:

> Think of how cancer could have been cured a decade ago if information was allowed to flow freely from the 50's forward

might be a bit fanciful? Unless you're referring to something particular I'm unaware of.

The people best equipped and trained to deliver a cure for cancer (and then some, since it tends not to be particularly field-restricted) do have access.

I think the loss is more likely in engineering (as opposed to the publication's science), cheaper methods, more reliably manufacturable versions of lab prototypes, etc.

I doubt there are many people capable of cancer research breakthroughs who don't have access to cancer research, personally.

(And to be clear: I'm not capable of it.)

I'll add that even if the papers we all wanted were more freely accessible, the replication and completeness of their described methods would be another source of slowdown.

Main problem is still just getting good quantitative data and metadata. Most biomedical researchers are motivated to "tell stories". Few of us care about generating huge mineable data sets.

All of the engineering companies I've worked for have not paid for IEEE or any journals. I have to go to the library and maintain membership for IEEE myself, then request reimbursement.

The schools I’ve worked with have access to everything I’ve needed. They didn’t advertise it but it’s also free for students.

Not to mention there is no singular "cancer" - there are many types and they're all sufficiently different to make the problem much more challenging.

1) First, most researchers at universities or other institutions have always had unfettered access thanks to a site license. It would be pretty hard to find a real example of a university researcher who couldn't see something.

2) There may be a few researchers who don't have unfettered access. Perhaps they paid $40 for a copy of a paper. Given the high cost of other parts of research labs, I find it hard to believe that any real possibility of curing cancer was halted because someone had to pay $40.

3) It's possible to imagine the opposite being the case. Perhaps someone had a key insight in a clever paper and decided to distribute it for free out of some info-anarchistic impulses. There it would sit in some FTP directory, uncurated, unindexed and uncared for. Perhaps the right eyes would find it. Perhaps they wouldn't. Perhaps the cancer researcher would be able to handle all of the LaTeX and FTP chores without slowing down research. Perhaps they would be distracted by sysadmin headaches and never make a crucial follow-up discovery.

The copyrighted journal system provides curation and organization. Is it wonderful? Nah. Is it better than some ad hoc collection of FTP directories? Yes!

Your opinion may be that this scenario would never happen. In my opinion, this is more likely than your vision.

99%+ of the people doing scientific work in curing cancer have access to all the relevant medical and scientific journals.

You have to not latch on to causes such as "advance humanity" and then justify making people do work for free. We decided a while ago[0] that making people work for free was a bad thing. There is high demand for curing cancer. Every company that tries it will hire scientists and lab techs and large laboratories, and have subscriptions to journals. Do you think all of those people should work for free, in the cause of advancing humanity?

[0] https://en.wikipedia.org/wiki/Slavery_Abolition_Act_1833

But anyone who needs to see these publications can reach them through libraries. One of the reasons why Elsevier can charge so much is that their customers have been institutions.

This depends a lot on the country and the library.

Eh, most published research papers are wrong anyway.

A lot of those GPUs are for their 3B users to run inferencing, no?
It’s been a very long time since I had any inside baseball, but I very much doubt that Hopper gear is in the hot inference path.

The precisions and mantissa/exponent ratios you want for inference are just different from those of a mixed-precision, fault-tolerant, model- and data-parallel training pipeline.

Hopper is for training mega-huge attention decoders: TF32, bfloat16, hot paths to the SRAM end of the cache hierarchy with cache coherency semantics that you can reason about. Parity gear for fault tolerance, it’s just a different game.
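
A rough memory sketch of the gap (illustrative numbers only: a hypothetical 70B-parameter dense model, the usual bf16 + fp32 Adam training recipe, int8 weights for serving):

    params = 70e9  # hypothetical 70B-parameter dense model

    # Mixed-precision training state per parameter:
    # bf16 weights (2 B) + fp32 master copy (4 B) + fp32 Adam m and v (8 B)
    training_tb = params * (2 + 4 + 8) / 1e12   # ~1 TB, before activations

    # Serving with int8-quantized weights: ~1 byte per parameter,
    # ignoring the KV cache and activations
    inference_gb = params / 1e9                 # ~70 GB

    print(f"training state: ~{training_tb:.1f} TB across many accelerators")
    print(f"int8 inference: ~{inference_gb:.0f} GB, a single node's worth")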

True that, but I think in a very short amount of time, using dedicated general purpose GPUs just for inferencing is going to be mega overkill.

If there's dedicated inferencing silicon (like say the thing created by Groq), all those GPUs will be power sucking liabilities, and then the REAL singularity superintelligence level training can begin.

Etched is another dedicated inference hardware company that recently announced their product. It only works for transformer-based models, but is ~20x faster than an H100.

> The real value methinks is actually in the control of proprietary data used for training, which is the single most important factor for model output quality.

Maybe. But we've barely scratched the surface of being more economical with data.

I remember back in the old days, there was lots of work on e.g. dropout and data augmentation etc. We haven't seen too much of that with the likes of ChatGPT yet.

I'm also curious to see what the future of multimodal models holds: you can create almost arbitrary amounts of extra data by pointing a webcam at the world, especially when combined with a robot, or letting your models play StarCraft or Diplomacy against each other.
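
For reference, a minimal sketch of what those two tricks look like (PyTorch; the token_drop helper is just an illustrative toy, not anyone's production recipe):

    import torch
    import torch.nn as nn

    # Dropout: randomly zero activations during training so the network
    # can't over-rely on any single feature.
    layer = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Dropout(p=0.1))
    x = torch.randn(8, 512)
    layer.train()
    y_noisy = layer(x)   # dropout active during training
    layer.eval()
    y_clean = layer(x)   # dropout disabled at inference

    # Data augmentation, toy text flavor: perturb examples to squeeze more
    # signal out of the same corpus. Random token-drop is one simple trick.
    def token_drop(tokens, p=0.1):
        keep = torch.rand(len(tokens)) >= p
        return [t for t, k in zip(tokens, keep) if k]

    print(token_drop("the quick brown fox jumps over the lazy dog".split()))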

There are more kinds of AI-centric companies than just foundation models. Making that equivalency is akin to equating internet companies during the dotcom bubble with just websites like pets.com. Now one semi-skilled person in a couple of days can make websites that entire teams back then would have taken months to build, but that doesn't mean google.com and thefacebook.com are easily commoditized or bad businesses just because they're websites.

> this means the ability to train 50 GPT-4 scale models every 90 days or 200 such models per year.

What it actually means is that they are training next gen models that are 50X larger.

And, considering MS and OpenAI are planning to build a $100 billion AI training computer, these 350K GPUs are just a tiny portion of what they are planning.

This isn't overkill. This is the current plan: throw as much compute as possible at it and hope intelligence scales with compute.

Everything hinges on whether scaling keeps making the models better. If it does, you don't train 50 GPT-4s, you have the best model.

> but amazing for builders (ironically).

Could you expand on this? Who are "the builders" here? You mean the model developers? I don't see how this situation can be "amazing" for the builders - developers will just get a wage out of their work.

Proprietary data is not necessary for training intelligence. Wikipedia, PubMed, arXiv, Reddit and GitHub are probably sufficient. And babies don't even use that.

I agree though that the returns on hardware rapidly diminish.

> once the big regulatory hammers start dropping to protect American workers.

The US Supreme Court seems determined to make sure that big regulatory hammers are not going to be dropping, from what I can tell.

Why are we assuming we're topping out at a GPT-4 scale model?

Llama isn't even in the same stratosphere as the big models when it comes to coding, logic, and other interesting tasks that I think are commercially viable.

I thought AI was supposed to put all the lawyers out of work.

Nothing will put lawyers or doctors out of work. They are powerful cartels that can easily protect themselves. Realtors are already technologically irrelevant, but they have a huge entrenched social and legal system that makes it impractical to compete.

Weren't lots of realtors recently put out of work in the US, at least?

When NAR settled the price collusion charge? Thus cartel or not, times do change.

My friend is a real estate agent, they play a major part in the psychology of buyers and sellers. Selling your dead parents home that you grew up in (for example) isn't something everyone just signs up to some website and does using a credit card without a second thought.

A good real estate agent can guide people through this process while advising them on selling at the right price and avoiding the most stress, often during an extremely difficult time in their life, such as going through a divorce or breakup. They of course also help keep buyers interested while the seller is making up their mind about the correct offer to take.

I find your comment ignorant in so many ways. Maybe have some respect?

Are you not just explaining "a huge entrenched social system" as OP said?

It takes a long time for cultures to shift and for people to start to trust information systems to entirely replace high touch stuff like that. And at some level there will always be some white glove service on top for special cases.

How is hiring a professional to help you sell a property a "huge entrenched social system", sorry? No one is forced to hire a real estate agent. I bought my house through a private sale.

Lexis+ AI and Ask Practical Law AI systems produced incorrect information more than 17% of the time, while Westlaw's AI-Assisted Research hallucinated more than 34% of the time:

https://hai.stanford.edu/news/ai-trial-legal-models-hallucin...

Just out of curiosity, what's the human lawyer baseline on that?
> Just out of curiosity, what's the human lawyer baseline on that?

Largely depends on how much money the client has.

From my experience with this, the failures are different.

Human lawyers fail by not being very zealous and most of them being very average, not having enough time to spend on any filings, and not having sufficient research skills. So really, depth-of-knowledge and talent. They generally won't get things wrong per se, but just won't find a good answer.

AI gets it wrong by just making up whole cases that it wishes existed to match the arguments it came up with, or that you are hinting that you want, perhaps subconsciously. AI just wants to "please you" and creates something to fit. Its depth-of-knowledge is unreal, its "talent" is unreal, but it has to be checked over.

It's the same arguments with AI computer code. I had AI create some amazing functions last night but it kept hallucinating the name of a method call that didn't exist. Luckily with code it's more obvious to spot an error like that because it simply won't compile, and in this case I got luckier than usual, in that the correct function did exist under another name.

It's the self-driving car problem. Humans aren't perfect either, but people like to ignore that.
True, they're similar... But what's also similar is that people make the mistake of focusing on differences in failure rates while glossing over failure modes.

Human imperfections are a family of failure modes which we have a gajillion years of experience in detecting, analyzing, preventing, and repairing. Quirks in ML models... not so much.

A quick thought-experiment to illustrate the difference: Imagine there's a self-driving car that is exactly half as likely to cause death or injury as a human driver. That's a good failure rate. The twist is that its major failure mode is totally alien, where units inexplicably attempt to chase-murder random pedestrians. It would be difficult to get people to accept that tradeoff.

No, people have the correct intuition that human errors at human speeds are very different in nature from human rate errors at machine speeds.

It's one thing if a human makes a wrong financial decision or a wrong driving decision, it's another thing if a model distributed to ten million computers in the world makes that decision five million times in one second before you can notice it's happening.

It's why if your coworker makes a weird noise you ask what's wrong, if the industrial furnace you stand next to makes a weird noise you take a few steps back.

I'm sure it's nowhere near good enough yet, but a legal model getting the answer right 83% of the time is still quite impressive, imo.
What percentage of GPUs are being used for training versus inference?

The infinitely expanding AI-generated metaverse isn't going to render itself. At least in the case of Meta, I think that might be one of the only pieces missing.
