I believe their speedup is computed _assuming they can easily fix the correctness bugs in the kernels_.
In practice, even slight numerical differences in the kernels will make the model feel almost lobotomized.
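To make that concrete, here's a toy sketch (plain NumPy, nothing from the article's actual kernels) of the kind of correctness check a claimed speedup has to pass: a "fast" softmax that skips the usual max-subtraction trick looks harmless but blows up on realistic logits, and errors like that compound across every layer of the model.

```python
import numpy as np

# Numerically stable reference implementation.
def softmax_reference(x):
    z = x - x.max(axis=-1, keepdims=True)   # subtract the row max before exponentiating
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Toy stand-in for a hand-"optimized" kernel: skips the max-subtraction step.
def softmax_candidate(x):
    e = np.exp(x)                            # subtle bug: overflows to inf for large logits
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
logits = rng.normal(scale=50.0, size=(4, 1024)).astype(np.float32)

ref = softmax_reference(logits)
out = softmax_candidate(logits)

# The speedup only counts if the outputs still agree within tolerance.
# Here the candidate produces NaNs/zeros instead of probabilities, so this prints False.
print(np.allclose(out, ref, rtol=1e-3, atol=1e-5))
```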
The article is referring to a GPU compute kernel (https://en.wikipedia.org/wiki/Compute_kernel), not the term "kernel" as used in ML/NN/etc.
…aren't they the same thing?
They're not, but I also misunderstood the original question: they're referring to the correct definition of kernel. I thought they were confusing the GPU kernel with https://en.wikipedia.org/wiki/Kernel_method or https://en.wikipedia.org/wiki/Kernel_(image_processing)
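To make the distinction concrete: a compute kernel is the function you launch across a grid of GPU threads, not a small matrix of weights like a convolution/image-processing kernel. Below is a rough sketch along the lines of the standard Triton vector-add tutorial (assumes a CUDA GPU with torch and triton installed; this is illustrative, not taken from the article's kernels).

```python
import torch
import triton
import triton.language as tl

# A "compute kernel" in the article's sense: a small program launched
# across many GPU thread blocks, each handling one slice of the data.
@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)                        # which block this instance handles
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements                        # guard against out-of-bounds lanes
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = out.numel()
    grid = (triton.cdiv(n, 1024),)                     # one program per 1024 elements
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out

if __name__ == "__main__":
    x = torch.randn(4096, device="cuda")
    y = torch.randn(4096, device="cuda")
    # The same kind of correctness check any claimed kernel speedup has to pass.
    print(torch.allclose(add(x, y), x + y))
```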
Wouldn't the model fail to work properly if the kernels are even slightly off?
Weren't kernels part of the training stack for these models? Am I missing anything?