Avoid multiplication. This gives a 20% speedup.
The result is nearly 4 times faster than the "portable" algorithm.