[Zlib-devel] [0/4][RFC Introduction] Adler32 vectorization

Jan Seiffert kaffeemonster at googlemail.com
Mon Mar 14 21:14:36 EDT 2011


Back in April 2010 there was a post with code to vectorize adler32 on x86.
Unfortunately, as far as is publicly visible, this went nowhere.

I have been looking into vectorization of adler32 on my own since 2009.
In my use case (inflating very sparse bitfields) adler32 took up a good
part of the total zlib time (and zlib needed more CPU than my program).
Over time I developed some nice tricks for dealing with adler32,
and I wrote code for:
* x86
* ARM
* PPC Altivec
* MIPS (Loongson + DSP ASE)
* IA-64
* Alpha
and even SPARC VIS/VIS2 (but it's slower than the serial code...)

Since vectorization of adler32 can achieve enormous gains (speedups of
more than 6x seen), it is a shame not to use the SIMD units available on
processors large (x86, POWER7) and small (ARM, MIPS). For numbers, please
refer to the patches.
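
For readers who have not looked at why adler32 vectorizes at all, here is
a minimal sketch in plain C. It is not taken from the patches; the names
adler32_serial, adler32_blocked and BLOCK are made up for illustration.
The serial loop updates both sums byte by byte. The blocked form uses the
identity that, for n bytes d[0..n-1], s1 grows by sum(d[i]) and s2 grows by
n*s1 + sum((n-i)*d[i]). The two inner sums are independent of each other
and of the running state, which is the kind of work a SIMD unit can do on
many bytes at once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADLER_BASE 65521u /* largest prime below 2^16 */
#define BLOCK 16u         /* illustrative block size, e.g. one 128-bit vector */

/* Byte-at-a-time reference: s1 sums the bytes, s2 sums the s1 values. */
static uint32_t adler32_serial(uint32_t adler, const unsigned char *buf,
                               size_t len)
{
    uint32_t s1 = adler & 0xffffu;
    uint32_t s2 = (adler >> 16) & 0xffffu;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        s1 = (s1 + buf[i]) % ADLER_BASE;
        s2 = (s2 + s1) % ADLER_BASE;
    }
    return (s2 << 16) | s1;
}

/* Block-wise form (assumption: scalar stand-in for what SIMD code would do). */
static uint32_t adler32_blocked(uint32_t adler, const unsigned char *buf,
                                size_t len)
{
    uint32_t s1 = adler & 0xffffu;
    uint32_t s2 = (adler >> 16) & 0xffffu;

    assert(buf != NULL || len == 0);
    while (len >= BLOCK) {
        uint32_t sum = 0;  /* plain sum of the block's bytes */
        uint32_t wsum = 0; /* weighted sum: first byte counts BLOCK times */

        for (unsigned i = 0; i < BLOCK; i++) {
            sum += buf[i];
            wsum += (BLOCK - i) * buf[i];
        }
        /* s2 must be updated with the OLD s1, so do it first. */
        s2 = (s2 + BLOCK * s1 + wsum) % ADLER_BASE;
        s1 = (s1 + sum) % ADLER_BASE;
        buf += BLOCK;
        len -= BLOCK;
    }
    /* Leftover bytes are handled serially. */
    while (len--) {
        s1 = (s1 + *buf++) % ADLER_BASE;
        s2 = (s2 + s1) % ADLER_BASE;
    }
    return (s2 << 16) | s1;
}
```

Both functions take the running checksum (1 for a fresh stream) and return
the updated one, so they should agree on any input. Real vector code would
also defer the modulo over many blocks, which this sketch does not attempt.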

So today I want to send some patches as a trial balloon to get the ball rolling.

Special care has been taken to make the integration with the zlib code very simple.
No extra tools are needed, and no earth-shattering changes to the build system.
The platform code is simply selected by the preprocessor directly from
the C code, and the platform code itself is in C, but often with the use
of inline ASM (this narrows the use somewhat to certain compilers (GCC &
compatible), but on the other hand makes the code available without any
voodoo dance).
If this is unacceptable (at least for Altivec and NEON it seems to be
the way to go), I hope we can work on a solution together.
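
To make the selection mechanism concrete, here is a rough sketch of what
"selected by the preprocessor" can look like. This is an illustration only
and not the layout of the patches: ADLER32_PLATFORM, HAVE_ADLER32_VEC,
adler32_vec, adler32_generic and adler32_sketch are hypothetical names. The
tested macros (__GNUC__, __ALTIVEC__, __ARM_NEON__, __i386__, __x86_64__)
are the usual GCC predefines. In a real tree each branch would pull in its
own platform file; in this sketch every branch falls back to the generic
C loop so that it compiles anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BASE 65521u /* largest prime below 2^16 */

/* Portable fallback, always available. */
static uint32_t adler32_generic(uint32_t adler, const unsigned char *buf,
                                size_t len)
{
    uint32_t s1 = adler & 0xffffu;
    uint32_t s2 = (adler >> 16) & 0xffffu;

    assert(buf != NULL || len == 0);
    while (len--) {
        s1 = (s1 + *buf++) % BASE;
        s2 = (s2 + s1) % BASE;
    }
    return (s2 << 16) | s1;
}

/* Compile-time selection: no configure step, no extra tools. */
#if defined(__GNUC__) && defined(__ALTIVEC__)
#  define ADLER32_PLATFORM "ppc-altivec"
   /* real code: include the Altivec file, which defines adler32_vec() */
#elif defined(__GNUC__) && (defined(__ARM_NEON__) || defined(__ARM_NEON))
#  define ADLER32_PLATFORM "arm-neon"
   /* real code: include the NEON file, which defines adler32_vec() */
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#  define ADLER32_PLATFORM "x86"
   /* real code: include the x86 file, which defines adler32_vec() */
#else
#  define ADLER32_PLATFORM "generic"
#endif

/* Assumption: a platform file would define HAVE_ADLER32_VEC itself. */
#ifndef HAVE_ADLER32_VEC
#  define adler32_vec adler32_generic
#endif

/* Single entry point; callers never see which variant was chosen. */
static uint32_t adler32_sketch(uint32_t adler, const unsigned char *buf,
                               size_t len)
{
    return adler32_vec(adler, buf, len);
}
```

The point of this shape is that an unknown compiler or CPU simply lands in
the last branch and gets the plain C code, so nothing in the build system
has to know about the platform variants.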

This series against zlib 1.2.5 contains:
* a prep. patch
* PPC Altivec code
* ARM NEON & ARMv6 code
* x86 code

Any review would be greatly appreciated.

The other code is left out of this first RFC so as not to clutter it up, and
because it is only compile-tested.
Apropos: the ARM code is also untested; I was hoping someone with an
ARM at hand could give it a try.

Greetings
Jan



