>> For your speed comparisons, what compiler did you use on what kind of
>> system?
>
> The numbers were collected on a x86_64 Linux, 3.5.0, system with GCC
> 4.7.2.

Please specify the family and model of the chip as given by CPUID. The
cache sizes, data cache latency, and other CPU implementation parameters
actually do matter.
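Since the system in question is Linux, one low-effort way to collect most of this is /proc/cpuinfo, which on x86 kernels reports the CPUID-derived vendor, family, model and stepping along with a cache size. The following is only a rough sketch under that assumption; the helper names (parse_cpuinfo, read_cpuinfo) are made up for illustration, and it does not report cache latency, which has to come from the vendor's documentation or from measurement.

```python
# Sketch: pull the CPUID-derived identification fields out of /proc/cpuinfo.
# Assumption: an x86 Linux kernel, which lists "vendor_id", "cpu family",
# "model", "model name", "stepping" and "cache size" per logical CPU.
# The function names are hypothetical helpers, not part of any library.

FIELDS = ("vendor_id", "cpu family", "model", "model name",
          "stepping", "cache size")


def parse_cpuinfo(text):
    """Return the selected fields of the first processor entry in text."""
    info = {}
    for line in text.splitlines():
        if not line.strip():
            if info:
                break  # a blank line ends the first processor entry
            continue
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in FIELDS:
            info[key] = value.strip()
    return info


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read and parse path; return an empty dict if it cannot be read."""
    try:
        with open(path) as f:
            return parse_cpuinfo(f.read())
    except OSError:
        return {}


if __name__ == "__main__":
    for key, value in read_cpuinfo().items():
        print(f"{key}: {value}")
```

Pasting the resulting few lines next to the compiler version would be enough to identify the chip. Per-level cache details, where the kernel exposes them, are under /sys/devices/system/cpu/cpu0/cache/.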