
imtringued:
The actual problem is that nobody uses these low-precision floats for training their models. When you do quantization, you are merely compressing the weights to minimize memory usage and to use memory bandwidth more efficiently. You still have to run the model at the original precision for the calculations, so nobody gives a damn about the low-precision floats for now.
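To make that concrete, here's a minimal sketch of weight-only quantization under a simple per-tensor symmetric int8 scheme (function names and shapes are illustrative, not any particular library's API): the weights are stored as int8 plus a scale, but they're dequantized back to fp32 before the matmul, so the arithmetic still runs at the original precision.

    import numpy as np

    def quantize_int8(w):
        # Per-tensor symmetric quantization: keep int8 weights plus one fp32 scale.
        scale = np.abs(w).max() / 127.0
        q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
        return q, scale

    def linear(x, q, scale):
        # Dequantize before the matmul: only storage and memory traffic are
        # low precision; the actual computation is still fp32.
        w = q.astype(np.float32) * scale
        return x @ w

    w = np.random.randn(512, 512).astype(np.float32)
    q, scale = quantize_int8(w)
    y = linear(np.random.randn(1, 512).astype(np.float32), q, scale)

That's why the memory-bandwidth win shows up even though the arithmetic itself doesn't get any cheaper.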

That's not entirely true. Current-gen Nvidia hardware can use fp8, and the newly announced Blackwell can do fp4. Lots of existing specialized inference hardware uses int8, and some int4.

You're right that low-precision training still doesn't seem to work, presumably because you lose the smoothness required for SGD-type optimization.
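As a toy illustration of one way this breaks (not how real mixed-precision training handles it, which keeps fp32 master weights): once the per-step update is smaller than the spacing between representable values around the weight, the SGD step rounds away to nothing.

    import numpy as np

    # Toy example: a tiny SGD step vanishes when the weight itself is stored in fp16.
    w = np.float16(1.0)
    lr, grad = 1e-4, 0.05              # per-step update of 5e-6
    for _ in range(1000):
        w = np.float16(w - lr * grad)  # 1.0 - 5e-6 rounds back to 1.0 in fp16
    print(w)                           # still 1.0; an fp32 copy would be ~0.995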
