You could argue it's all the nice GPU debugging tools NVIDIA provides that make GPU programming accessible.
There are so many potential bottlenecks (usually it's just memory access patterns, but without tools to verify that, you have to design and run manual experiments).