Comment by rikafurude21

rikafurude21 Jan 25, 2025

Why do Americans think China is like a hivemind controlled by an omniscient Xi, making strategic moves to undermine them? Is it really that unlikely that a lab of genius engineers found a way to improve efficiency 10x?

faitswulff Jan 25, 2025

China is actually just one person (Xi) acting in perfect unison and its purpose is not to benefit its own people, but solely to undermine the West.

Zamicol Jan 25, 2025

If China is undermining the West by lifting up humanity, for free, while ProprietaryAI continues to use closed source AI for censorship and control, then go team China.

There's something wrong with the West's ethos if we think contributing significantly to the progress of humanity is malicious. The West's sickness is our own fault; we should take responsibility for our own disease, look critically to understand its root, and take appropriate cures, even if radical, to resolve our ailments.

Krasnol Jan 25, 2025

> There's something wrong with the West's ethos if we think contributing significantly to the progress of humanity is malicious.

Who does this?

The criticism is aimed at the dictatorship and its politics, not its open source projects. Both things can exist at once. It doesn't make China better in any way. Same goes for their "radical cures", as you call them. I'm sure Uyghurs in China would not give a damn about AI.

drysine Jan 25, 2025

> I'm sure Uyghurs in China would not give a damn about AI.

Which reminded me of "Whitey On the Moon" [0]

[0] https://www.youtube.com/watch?v=goh2x_G0ct4

bdangubic Jan 25, 2025 (dead)

dr_dshiv Jan 25, 2025

This explains so much. It’s just malice, then? Or some demonic force of evil? What does Occam’s razor suggest?

Oh dear

layer8 Jan 25, 2025

Always attribute to malice what can’t be explained by mere stupidity. ;)

pjc50 Jan 25, 2025

You missed the really obvious sarcasm.

nejsjsjsbsb Jan 26, 2025

Many a true word is said in jest

buryat Jan 25, 2025

payback for Opium Wars

mackyspace Jan 25, 2025

China is doing what it's always done and its culture far predates "the west".

colordrops Jan 25, 2025

Can't tell if sarcasm. Some people are this simple minded.

rightbyte Jan 25, 2025

Ye, but "acting in perfect unison" would be a superior trait among people who care about these things, which gives it away as sarcasm?

suraci Jan 26, 2025

many Americans do seem to view Chinese people as NPCs, from my perspective, but I don't know if it's only Chinese people or also people of all other cultures

it's quite like Trump's 'CHINA!' yelling

I don't know, just a guess

rambojohnson Jan 25, 2025

that's the McCarthy era red scare nonsense still polluting the minds of (mostly boomers / older gen-x) americans. it's so juvenile and overly simplistic.

Flemilkok Jan 25, 2025 (dead)

logicchains Jan 25, 2025

> Is it really that unlikely that a lab of genius engineers found a way to improve efficiency 10x

They literally published all their methodology. It's nothing groundbreaking; Western labs just seem slow to adopt new research. Of mixture of experts, key-value cache compression, and multi-token prediction, two out of three weren't invented by DeepSeek. They did invent a new hardware-aware distributed training approach for mixture-of-experts training that helped a lot, but there's nothing super genius about it; Western labs just never even tried to adjust their models to fit the hardware available.

rvnx Jan 25, 2025

"nothing groundbreaking"

It's extremely cheap, efficient and kicks the ass of the leader of the market, while being under sanctions on AI hardware.

Most of all, it can be downloaded for free, can be uncensored, and is usable offline.

China is really good at tech, it has beautiful landscapes, etc. It has its own political system, but to be fair, in some way it's all our future.

A bit of a dystopian future, like it was in 1984.

But the tech folks there are really, really talented. It's been a long time since China switched from producing for Western clients to selling directly to Western clients.

gpm Jan 25, 2025

The leaderboard [1] still shows the traditional AI leader, Google, winning, with Gemini-2.0-Flash-Thinking-Exp-01-21 in the lead. No one seems to know how many parameters that has, but random guesses on the internet seem to be in the low to mid tens of billions, so fewer than DeepSeek-R1. Even if those general guesses are wrong, they probably aren't that wrong, and at worst it's the same class of model as DeepSeek-R1.

So yes, DeepSeek-R1 appears to not even be best in class, merely best open source. The only sense in which it is "leading the market" appears to be the sense in which "free stuff leads over proprietary stuff". Which is true and all, but not a groundbreaking technical achievement.

The DeepSeek-R1 distilled models on the other hand might actually be leading at something... but again hard to say it's groundbreaking when it's combining what we know we can do (small models like llama) with what we know we can do (thinking models).

[1] https://lmarena.ai/?leaderboard

dinosaurdynasty Jan 25, 2025

The chatbot leaderboard seems to be very affected by things other than capability, like "how nice is it to talk to" and "how likely is it to refuse requests" and "how fast does it respond" etc. Flash is literally one of Google's faster models, definitely not their smartest.

Not that the leaderboard isn't useful; I think "is in the top 10" says a lot more than the exact position in the top 10.

gpm Jan 25, 2025

I mean, sure, none of these models are being optimized for being the top of the leader board. They aren't even being optimized for the same things, so any comparison is going to be somewhat questionable.

But the claim I'm refuting here is "It's extremely cheap, efficient and kicks the ass of the leader of the market", and I think the leaderboard being topped by a cheap google model is pretty conclusive that that statement is not true. Is competitive with? Sure. Kicks the ass of? No.

whimsicalism Jan 25, 2025

google absolutely games for lmsys benchmarks with markdown styling. r1 is better than google flash thinking, you are putting way too much faith in lmsys

patrickhogan1 Jan 25, 2025

There is a wide disconnect between real world usage and leaderboards. If Gemini were so good, why are so few using it?

Having tested that model in many real world projects, it has not once been the best. Going further, it gives atrocious, nonsensical output.

whimsicalism Jan 25, 2025

i’m sorry but gemini flash thinking is simply not as good as r1. no way you’ve been playing with both

varelse Jan 25, 2025 (dead)

meltyness Jan 25, 2025

The U.S. firms let everyone skeptical go the second they had a marketable proof of concept, and replaced them with smart, optimistic, uncritical marketing people who no longer know how to push the cutting edge.

Maybe we don't need momentum right now and we can cut the engines.

Oh, you know how to develop novel systems for training and inference? Well, maybe you can find 4 people who also can do that by breathing through the H.R. drinking straw, and that's what you do now.

Scipio_Afri Jan 25, 2025

That's what they claim in the paper, at least, but that particular claim is not verifiable. The HAI-LLM framework they reference in the paper is not open sourced, and it seems they have no plans to open source it.

Additionally, there are claims, such as those by Scale AI CEO Alexandr Wang on CNBC on 1/23/2025 (time-stamped segment below), that DeepSeek has 50,000 H100s that "they can't talk about" due to economic sanctions (implying they likely got them by avoiding the sanctions somehow, when restrictions were looser). His assessment is that they will be more limited moving forward.

https://youtu.be/x9Ekl9Izd38?t=178

byefruit Jan 25, 2025

It's amazing how different the standards are here. DeepSeek has released their weights under a real open source license and published a paper on their work, which now has independent reproductions.

OpenAI literally haven't said a thing about how O1 even works.

huangruoyu Jan 27, 2025

DeepSeek's holding company is called High-Flyer. They actually do open source their AI training platform as well; here is the repo: https://github.com/HFAiLab/hai-platform

Scipio_Afri Jan 28, 2025

Last update was 2 years ago, before H100s or H800s existed. No way it has the optimized code that they used in there.

Trioxin Jan 29, 2025

Who independently reproduced it? I haven't found such a thing.

huangruoyu Jan 27, 2025

it's open source, here is their platform called hai: https://github.com/HFAiLab/hai-platform

marbli2 Jan 25, 2025

They can be more open and yet still not open source enough for their claims to be verifiable. That is the case for their optimized HAI-LLM framework.

byefruit Jan 25, 2025

That's not what I'm saying, they may be hiding their true compute.

I'm pointing out that nearly every thread covering Deepseek R1 so far has been like this. Compare to the O1 system card thread: https://www.hackerneue.com/item?id=42330666

Very different standards.

varelse Jan 25, 2025 (dead)

blackeyeblitzar Jan 25, 2025

But those approaches alone wouldn’t yield the improvements claimed. How did they train the foundational model upon which they applied RL, distillations, etc? That part is unclear and I don’t think they’ve released anything that explains the low cost.

It’s also curious why some people are seeing responses where it thinks it is an OpenAI model. I can’t find the post but someone had shared a link to X with that in one of the other HN discussions.

wumeow Jan 25, 2025

Because that’s the way China presents itself and that’s the way China boosters talk about China.

bugglebeetle Jan 25, 2025

I mean what’s also incredible about all this cope is that it’s exactly the same David-v-Goliath story that’s been lionized in the tech scene for decades now about how the truly hungry and brilliant can form startups to take out incumbents and ride their way to billions. So, if that’s not true for DeepSeek, I guess all the people who did that in the U.S. were also secretly state-sponsored operations to like make better SAAS platforms or something?

blackeyeblitzar Jan 25, 2025

Well it is like a hive mind due to the degree of control. Most Chinese companies are required by law to literally uphold the country’s goals - see translation of Chinese law, which says generative AI must uphold their socialist values:

https://www.chinalawtranslate.com/en/generative-ai-interim/

In the case of TikTok, ByteDance and the government found ways to force international workers in the US into signing agreements that mirror local laws in mainland China:

https://dailycaller.com/2025/01/14/tiktok-forced-staff-oaths...

I find that degree of control to be dystopian and horrifying but I suppose it has helped their country focus and grow instead of dealing with internal conflict.

dutchbookmaker Jan 26, 2025

I think it is because we conflate the current Chinese system with the old Mao/Soviet Union system because they all call themselves "communist".

The vast majority are completely ignorant of what Socialism with Chinese characteristics means.

I can't imagine even 5% of the US population knows who Deng Xiaoping was.

The idea that there are many parts of the Chinese economy that are more laissez-faire capitalist than anything we have had in the US in a long time would just not compute for most Americans.

MIA_Alive Jan 26, 2025

Yeah, it's mind-boggling how sinophobic online techies are. Granted, Xi is in sole control of China, but this seems like an independent group that just happened to make a breakthrough, which explains their low spend.

diego_moita Jan 25, 2025

SAY WHAT?

Do you want an Internet without conspiracy theories?

Where have you been living for the last decades?

suraci Jan 26, 2025 (dead)

mritchie712 Jan 25, 2025

think about how big the prize is, how many people are working on it and how much has been invested (and targeted to be invested, see stargate).

And they somehow yolo it for next to nothing?

yes, it seems unlikely they did it exactly the way they're claiming they did. At the very least, they likely spent more than they claim or used existing AI APIs in a way that's against the terms.
