LLM inference is fine on ROCm. llama.cpp and vLLM both have very good ROCm support.
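To make that concrete, here is a rough sketch using vLLM's Python API; this assumes a ROCm build of vLLM is installed, and the model name is just a placeholder. The point is that the API is identical to what you'd write on a CUDA box.

```python
# Minimal sketch: the vLLM Python API is the same on CUDA and ROCm installs;
# only the wheel/container you install differs. Model name is a placeholder.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")  # any model you have access to
params = SamplingParams(temperature=0.7, max_tokens=64)

outputs = llm.generate(["Explain what ROCm is in one sentence."], params)
for out in outputs:
    print(out.outputs[0].text)
```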
LLM training is also mostly fine. I have not encountered any issues yet.
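The main reason training "just works" is that ROCm builds of PyTorch expose the same torch.cuda API with HIP underneath, so an ordinary training loop runs unchanged. A minimal sketch (arbitrary model and sizes, assuming a ROCm PyTorch install):

```python
# Minimal sketch: ROCm builds of PyTorch reuse the torch.cuda namespace (HIP underneath),
# so a standard training loop is identical to the CUDA version. Sizes are arbitrary.
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"  # "cuda" also covers ROCm/HIP
print("HIP build:", torch.version.hip)  # non-None on a ROCm build of PyTorch

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10)).to(device)
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(64, 512, device=device)
y = torch.randint(0, 10, (64,), device=device)

for step in range(10):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()

print("final loss:", loss.item())
```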
Most of the CUDA moat comes from people repeating what they heard 5-10 years ago.
For instance, we experimented with AWS Inferentia briefly, but the value prop wasn't sufficient even for ~2022 computer vision models.
The calculus is even worse for SOTA LLMs.
The more you need to eke out performance gains and ship quickly, the more you depend on CUDA and the deeper the moat becomes.