treesciencebot
The main question is going to be the software stack. NVIDIA is already shipping NVFP4 kernels and perf is looking good. After the MI300X came out, it took a really long time for the FP8 kernels to be OK (not even good, compared to the almost perfect FP8 support on the NVIDIA side of things).

I doubt they will be able to reach 60-70% of the FLOPs in the majority of workloads (unless they hand-craft and tune a specific GEMM kernel for their benchmark shape). But I would be happy to be proven wrong, and go buy a bunch of them.
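
To make the 60-70% figure concrete: utilization for a GEMM is usually computed as achieved FLOP/s divided by the datasheet peak, counting 2*M*N*K floating-point operations per matmul. A minimal sketch of that arithmetic, where the function name, the timing, and the peak value are all hypothetical placeholders and not measurements of any real GPU:

```python
def gemm_utilization(m: int, n: int, k: int, seconds: float, peak_flops: float) -> float:
    """Fraction of peak achieved by one (m x k) @ (k x n) matmul."""
    if seconds <= 0 or peak_flops <= 0:
        raise ValueError("seconds and peak_flops must be positive")
    flops = 2 * m * n * k  # one multiply and one add per inner-product term
    return flops / seconds / peak_flops


# Hypothetical numbers: an 8192^3 GEMM taking 1 ms on a device
# whose datasheet peak is assumed to be 2 PFLOP/s.
util = gemm_utilization(8192, 8192, 8192, seconds=1e-3, peak_flops=2e15)
print(f"{util:.1%} of peak")
```

The same shape can land well above or below this fraction depending on how the kernel was tuned, which is why a single benchmark shape says little about the majority of workloads.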


pella
(related)

Tinygrad:

  "We've been negotiating a $2M contract to get AMD on MLPerf, but one of the sticking points has been confidentiality. Perhaps posting the deliverables on X will help legal to get in the spirit of open source!"

   "Contract is signed! No confidentiality, AMD has leadership that's capable of acting. Let's make this training run happen, we work in public on our Discord." https://x.com/__tinygrad__/status/1935364905949110532

LeonM
It still amazes me that George/Tinycorp somehow seems to get AMD on board every time, while being blissfully unaware that they are a very small player. See for example the top comment here [0].

Don't get me wrong, I think it's impressive what he has achieved so far, and I hope tiny can stay competitive in this market.

[0] https://www.hackerneue.com/item?id=36193625

roenxi
That top comment doesn't seem to have engaged completely with the context here. AMD fumbled trillions of dollars of value creation by mis-identifying what their hardware was for. Or perhaps it is more correct to say by being too dogmatic about what their hardware is for. They weren't in a position to be picky. They had a choice - they could continue making trillion-dollar mistakes until their board got sacked and the exec team replaced. Or they could maybe listen to some of the people who were technically correct regardless of their size in the market.

George is just some dude and I doubt AMD paid him much attention anywhere through this saga, but AMD had screwed up to the point where he could give some precise commentary about how they'd managed to duck and weave to avoid the overwhelming torrent of money trying to rush in and buy graphics hardware. They should make some time in their busy schedules to talk with people like that.

imtringued
People get on board with George Hotz because they share the frustration of using ROCm on consumer GPUs, where the experience has been insultingly dreadful to the point where I decided to postpone buying new AMD GPUs for at least a decade.

I'm not quite sure why he decided to pivot to datacenter GPUs where AMD has shown at least some commitment to ROCm. The intersection between users of tinygrad and people who use MI350s should essentially be George himself and no one else.

Most of those willing to work with AMD are very small players (with some notable exceptions). They are likely hopeful that the small players will grow.

For anyone interested in the maximum achievable matmul FLOPS for a given piece of hardware who isn't already aware of it, I highly recommend following Stas Bekman's mamf-finder results: https://github.com/stas00/ml-engineering/tree/master/compute...
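
The idea behind a "maximum achievable" number is to sweep many matmul shapes and report the best rate actually reached, not the datasheet peak. A rough sketch of that selection step only; this is not the mamf-finder code, `best_shape` is a hypothetical helper, and the timings are made-up placeholders:

```python
from typing import Dict, Tuple

Shape = Tuple[int, int, int]  # (M, N, K)


def best_shape(timings: Dict[Shape, float]) -> Tuple[Shape, float]:
    """Return the shape with the highest achieved FLOP/s, and that rate."""
    if not timings:
        raise ValueError("no timings given")
    # 2*M*N*K floating-point operations per matmul, divided by its runtime.
    rates = {
        shape: 2 * shape[0] * shape[1] * shape[2] / secs
        for shape, secs in timings.items()
    }
    shape = max(rates, key=rates.get)
    return shape, rates[shape]


# Placeholder timings in seconds, NOT measurements from real hardware.
timings = {
    (1024, 1024, 1024): 4.0e-6,
    (4096, 4096, 4096): 1.6e-4,
    (8192, 8192, 8192): 1.1e-3,
}
shape, rate = best_shape(timings)
print(shape, f"{rate / 1e12:.0f} TFLOP/s")
```

In a real sweep the timings would come from running each shape on the device several times and keeping a robust statistic such as the median.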
