tintor
Consumer-grade hardware? Even at 4 bits per parameter, you would need 500 GB of GPU VRAM just to load the weights. You also need VRAM for the KV cache.
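For anyone who wants to check the arithmetic, here is a rough sketch. The parameter count of about 1 trillion is my assumption, back-derived from the 500 GB at 4 bits figure; the function name is made up for illustration. It covers the weights only and ignores the KV cache and activations.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    bytes_total = num_params * bits_per_param / 8  # 8 bits per byte
    return bytes_total / 1e9


# Assumption: ~1 trillion parameters, inferred from "500 GB at 4 bits".
NUM_PARAMS = 1e12

for bits in (16, 8, 4):
    print(f"{bits:>2} bits/param: {weight_memory_gb(NUM_PARAMS, bits):,.0f} GB")
```

The footprint scales linearly with bits per parameter, so even aggressive quantization leaves it far above what a single consumer GPU offers.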
It's MoE-based, so you don't need that much VRAM.
Nice if you can get it, of course.