fancyfredbot
This looks like a smart play from NVIDIA to recapture the local LLM market: it's very competitive with the Apple M4 and AMD AI Max, but cheaper(!).

Nvidia and cheap don't go together most of the time, so I think they must have been very worried by developers and enthusiasts buying other hardware.

Consistent with their very developer-focused strategy, and probably a very smart idea since it's low enough spec to avoid cannibalising sales of their existing products.


fancyfredbot OP
Too late to edit my post but wanted to apologise for causing confusion.

I suppose "cheapest" can be a very subjective term if the comparison is between things with different capabilities.

NVIDIA is a cheaper option than AMD only if you assume you want/need the fast networking NVIDIA are bundling with their system. According to the article the NIC would add $1500-$2000 to the price of other systems. I also failed to account for the extra memory bandwidth offered by the M4 Max. The Apple system costs more, but if you want/need that bandwidth then it's the cheapest of the three.

I guess "the system has a niche where it offers very good price/performance" is what I should have said. Not as snappy though.

bitsandboots
I'd say the DGX is the "cheapest" only if you're trying to go beyond 96GB.

The AMD system maxes out at 96GB for the GPU; the 128GB total allocates a minimum of 32GB to the CPU.

The Nvidia system is designed to be connected to a second one if so desired, making it the cheapest way to get 256GB.

If you're just going for something under 96GB, I haven't seen anything cheaper than the AMD system for workloads that can't fit on a traditional GPU. And even then, GPUs are at obscene rip-off prices lately. Here's hoping these won't be scalped too.

ezschemi
The 96GB is the maximum on Windows. On Linux, you can allocate 110GB.

tracker1
AMD AI Max+ 395 systems should be around $2k with 128GB of LPDDR5X, while the Mac Studio maxed out is just north of $12k. So it's in the middle, depending on configuration. This will also vary greatly in terms of usability for other things.

That said, it's a nice OEM option if you need something for AI workloads while the GPU market is completely soaked with scalpers. I've been considering a drive to California just so I can get a GPU directly at MicroCenter instead of paying scalper overheads.

sliken
Yes, you can spend $12k on a Mac Studio. However, the Mac Studio with the M4 Max has 128GB RAM, double the memory bandwidth, and costs $3,699. LESS than the DGX Spark. Granted, it doesn't have 200GbE.

If it's the memory bandwidth you are after, the Mac mini with the M4 Pro has similar, but maxes out at 64GB RAM.

fancyfredbot OP
The article is about the ASUS system, which is $3,000, so cheaper than the M4 Max.

sliken
Indeed, 23% more money for double the memory bandwidth.

rbanffy
> This will also vary greatly in terms of usability for other things.

This is key. Nvidia has a terrible reputation for long-term support (as the market leader, they can easily afford that). Apple only just (last November) dropped OS updates for their 2014 boxes. While a 2025 Mac Studio won't be an impressive amount of compute power in 10 years, I fully expect Nvidia to completely abandon support and software updates for this in five years, tops.

Considering the interest it has generated, I'd hope the Linux crowd will be able to carry it further, maybe with decent open-source drivers, well past the expiration date Nvidia sets.

tracker1
That'll be the hard part for sure... Nvidia is in a position where it wants to push people to abandon older tech for the new shiny. I would hope to see these machines last a decade all the same. I'm also interested to see how the level of compute compares to other pro and consumer options.

bitsandboots
> Nvidia has a terrible reputation with long term support

In what space do they have this reputation? In drivers, I see they're supporting hardware that's 10 years old right now.

scottapotamas
Their single-board computers intended for robotics/edge have had a history of being poorly supported and stuck on old kernel versions.

Curious, which single board(s) would these be? The latest Orin Nano Super cards seem to have updated software. I have read good things about Nvidia Shield support: it is still the best streaming device out there and gets bug fixes and feature enhancements even for very old builds.

rbanffy
The latest usually have updated software. As they stop being the latest, the support dwindles and, depending on reliance on proprietary code, so does your ability to maintain it yourself.

sliken
Cheaper? The DGX Spark is $3,000 with 128GB RAM. A Framework Desktop with the AMD Strix Halo 395 and 128GB RAM is $2,000 and has better memory bandwidth. No price that I can see for the identically spec'd (and nearly identical looking) Ascent GX10.

[edit] Oops, the Spark is $4,000; only the Ascent is at $3k now. Strix Halo systems vary from slightly slower (6%) to the same (on systems with LPDDR5x-8533, like the HP laptop).

walterbell
Includes RDMA 200GbE NIC.

> The NVIDIA ConnectX-7 NIC these days often sells for $1500-2200 in single-unit quantities, depending on the features and supply of the parts. At $2999 for a system with this built-in, that is awesome.

Yeah, unlike other options this is actually scalable. The question isn't whether one can outperform the Mac Studio but whether 3-4 linked together can.

Dylan16807
What kind of scaling do you have in mind there?

My naive analysis: a high-end Mac should be able to run each layer of an AI task about twice as fast because of the memory bandwidth, and the data going between layers is tiny enough to run over Thunderbolt or even normal Ethernet.

Is there an AI use case that prefers 250GB/s memory bandwidth plus 25GB/s interconnect over 500GB/s memory and 2GB/s interconnect? Are there other major use cases that prefer it?
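
A rough sketch of that trade-off (the 70B-class model size, hidden size, and layer-split scheme here are my own illustrative assumptions, not figures from the article):

```python
# Back-of-envelope model: single-stream token generation is roughly
# memory-bound, so time per token is about bytes-of-weights-read divided
# by memory bandwidth, plus the activations shipped between pipeline
# stages over the interconnect.

MODEL_BYTES = 70e9        # assumed ~70B-parameter model at 8-bit weights
HIDDEN = 8192             # assumed hidden size
ACT_BYTES = HIDDEN * 2    # fp16 activation vector crossing a stage boundary

def token_time(mem_bw, link_bw, stages):
    """Seconds per token with the model split across `stages` machines."""
    weights = MODEL_BYTES / mem_bw              # every weight read once per token
    comms = (stages - 1) * ACT_BYTES / link_bw  # one activation hop per boundary
    return weights + comms

# 250GB/s memory + 25GB/s interconnect, model split across two boxes:
split = token_time(250e9, 25e9, stages=2)
# 500GB/s memory, everything on one machine:
single = token_time(500e9, 2e9, stages=1)
print(f"{split * 1000:.1f} ms vs {single * 1000:.1f} ms per token")
```

Under these assumptions the interconnect term is under a microsecond per token, so for single-stream inference the comparison really does come down to memory bandwidth.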

fancyfredbot OP
Usually the reason you'd want the network bandwidth would be for distributed training.

For inference you can probably get by with 2GB/s assuming you can split the layers up nicely.

The interconnect can be a bottleneck for inference, but only for networks with loads of activations and large batch sizes, or if you are doing tensor-level parallelism.
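
To put rough numbers on that difference (the model shape below is an illustrative assumption): pipeline parallelism ships one activation vector per stage boundary, while tensor parallelism all-reduces activations after essentially every layer.

```python
# Illustrative per-token traffic estimate at batch size 1 (assumed shape).
HIDDEN, LAYERS, BYTES = 8192, 80, 2   # ~70B-class model, fp16 activations

# Pipeline parallel: one activation hop per token at each stage boundary.
pipeline_per_token = HIDDEN * BYTES

# Tensor parallel: roughly two all-reduces of the activations per layer
# (after attention and after the MLP).
tensor_per_token = LAYERS * 2 * HIDDEN * BYTES

print(f"pipeline: ~{pipeline_per_token / 1e3:.0f} KB/token")
print(f"tensor:   ~{tensor_per_token / 1e6:.1f} MB/token")
```

A few MB per token is nothing over a 25GB/s link, but over 2GB/s it adds around a millisecond per token, comparable to the compute time itself, which is why tensor-level parallelism is where the slow interconnect starts to hurt.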

sliken
Sort of. They come in different speeds, and those prices are for the 400GbE version. The 200GbE, like in the GX10 and Spark, is more like $1,250. Not to mention you have to actually need the 200GbE (to cluster two of them), and I'd expect the vast majority to buy a single unit, not a pair.
walterbell
With a crossover cable, a single unit could be used for local testing of software that depends on both RDMA and CUDA.

wtallis
Double-check your memory bandwidth numbers. AMD's Strix Halo is 256 bits at 8000MT/s for about 256GB/s, while NVIDIA's GB10 is 273GB/s (likely 256 bits at the more standard speed of 8533MT/s).

sliken
Depends which one; the HP ZBook Ultra uses LPDDR5x-8533, a dead match for what Nvidia claims (273GB/sec). Although the DGX Spark now costs $4,000 instead of the "starting at $2,999" mentioned a few months ago.

So the bandwidth is dead even between AMD and Spark.

wtallis
The HP ZBook Ultra uses DRAM parts rated for 8533MT/s but operates them at only 8000MT/s: 8000MT/s is the most the processor is rated for, but the memory manufacturers don't make parts in that non-standard speed grade.

sliken
Ah, thanks, I didn't know that.
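
For reference, the peak-bandwidth arithmetic behind the figures in this sub-thread is simple enough to check directly:

```python
# Peak LPDDR5X bandwidth = bus width (in bytes) * transfer rate (MT/s).
def mem_bw_gbs(bus_bits: int, mts: int) -> float:
    """Peak bandwidth in GB/s for a bus_bits-wide bus at mts MT/s."""
    return bus_bits / 8 * mts * 1e6 / 1e9

print(mem_bw_gbs(256, 8000))  # Strix Halo at 8000 MT/s: 256.0 GB/s
print(mem_bw_gbs(256, 8533))  # GB10 at 8533 MT/s: ~273 GB/s
```
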
rubatuga
Agreed, not cheaper by a long shot.
