Comment by terhechte - Hacker Neue

Would this also be possible with other LLM engines / GPUs? E.g. Llama / Apple Silicon or Radeon?

saagarjha May 28, 2025

Yeah, none of this is specific to CUDA (though the relative latencies might be different).
