
fc417fc802
> ASICs could optimize things like the ReLU operations, but modern GPUs already have logic and instructions for matrix multiplication and other operations.

Right, but at that point you're describing an H100 plus an additional ASIC, plus presumably a CPU and some RAM. Or a variant of an H100 with some specialized ML functions baked in. Either of those just sounds like a regular workstation to me.

Inference is certainly cheaper than training, but getting it to run quickly still requires raw horsepower (thus wattage, thus heat dissipation).

Regarding CPUs, there's a severe memory bandwidth issue. I haven't kept track of the extreme high-end hardware, but it's difficult for CPUs to compete with GPUs on raw throughput.
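
To make the bandwidth point concrete, here's a rough back-of-envelope sketch (the bandwidth figures and the 7B fp16 model are illustrative assumptions, not measured specs): single-stream LLM decode has to stream the full weight set from memory for every generated token, so peak memory bandwidth caps throughput regardless of compute.

    # Rough upper bound on single-stream decode speed, assuming the
    # whole weight set is read from memory once per generated token.
    # All numbers are illustrative assumptions.

    MODEL_PARAMS = 7e9          # assumed 7B-parameter model
    BYTES_PER_PARAM = 2         # fp16 weights
    model_bytes = MODEL_PARAMS * BYTES_PER_PARAM

    bandwidth_gbps = {          # assumed peak memory bandwidth, GB/s
        "server CPU (8ch DDR5)": 300,
        "H100 (HBM3)": 3350,
    }

    for name, bw in bandwidth_gbps.items():
        # How many times per second can the full weight set be
        # streamed through memory? That bounds tokens/sec.
        tokens_per_sec = bw * 1e9 / model_bytes
        print(f"{name}: ~{tokens_per_sec:.0f} tokens/s upper bound")

Under those assumptions the CPU tops out around ~21 tokens/s while the H100 allows ~240, which is the order-of-magnitude gap at issue; quantization shrinks the weight traffic but moves both bounds proportionally.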

