zozbot234 parent
It depends, if the optimization is too hardware-dependent it might hurt/regress performance on other platforms. One would have to find ways to generalize and auto-tune it based on known features of the local hardware architecture.