[Zlib-devel] [FYI] More zlib SIMD goodness
Jan Seiffert
kaffeemonster at googlemail.com
Thu May 12 15:33:14 EDT 2011
Looking at other possible victims for some SIMD goodness, I stumbled
over fill_window, more concretely the code that slides the hashes.
Lo and behold, the whole code there is nothing but a subtraction with
unsigned saturation, the same as needed for a lot of pixel manipulation.
So there is an app^Winstruction for that.
With a little reorg of the adler32 stuff, and a little splitout of the
hash sliding code, we can get some nice speedups.
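To illustrate the idea, here is a minimal sketch of the hash slide written as a saturating subtraction. It is not the code from the branch; the function names slide_hash_scalar and slide_hash_simd are made up for this example, and it assumes 16-bit hash entries with 0 as the "no entry" value. The SSE2 path uses _mm_subs_epu16 and is compiled only where __SSE2__ is defined; elsewhere the scalar loop does all the work.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#ifdef __SSE2__
#include <emmintrin.h>
#endif

/* Scalar reference (hypothetical helper): every 16-bit entry is reduced
 * by the window size and clamped at 0, i.e. an unsigned saturating
 * subtraction. */
static void slide_hash_scalar(uint16_t *table, size_t n, uint16_t wsize)
{
    for (size_t i = 0; i < n; i++)
        table[i] = (uint16_t)(table[i] >= wsize ? table[i] - wsize : 0);
}

/* Same operation (hypothetical helper), eight entries per step where
 * SSE2 is available. Unaligned loads/stores keep the sketch simple. */
static void slide_hash_simd(uint16_t *table, size_t n, uint16_t wsize)
{
    size_t i = 0;

    assert(table != NULL || n == 0);
#ifdef __SSE2__
    /* Broadcast wsize into all eight 16-bit lanes. */
    const __m128i vw = _mm_set1_epi16((short)wsize);
    for (; i + 8 <= n; i += 8) {
        __m128i v = _mm_loadu_si128((const __m128i *)(table + i));
        v = _mm_subs_epu16(v, vw);   /* unsigned saturating subtract */
        _mm_storeu_si128((__m128i *)(table + i), v);
    }
#endif
    /* Remaining entries (or all of them without SSE2). */
    slide_hash_scalar(table + i, n - i, wsize);
}
```

Both functions should produce identical tables for any input; the vector version just handles more entries per iteration and drops the compare-and-select from the loop body.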
The code currently lives at
https://github.com/kaffeemonster/zlib_adler32_vec
in the "slhash" branch.
For those who like numbers, compressing some $LARGE_FILE (OK, except
on ARM, low disk space there...) with minigzip:
============ orig ===============
32.50 real
33.55 real
33.51 real
========== altivec ==============
29.12 real
29.29 real
28.88 real
~12.5% speedup
============ orig ===============
user 0m4.060s
user 0m4.160s
user 0m4.140s
============ NEON =============
user 0m3.490s
user 0m3.520s
user 0m3.660s
~16% speedup
============ orig ===============
user 0m4.000s
user 0m4.000s
user 0m4.150s
========= ARM v6 DSP ============
user 0m3.780s
user 0m3.730s
user 0m3.920s
~7.2% speedup
============ orig ===============
18.90 real
18.38 real
18.48 real
============ sse2 ===============
16.69 real
16.61 real
16.38 real
~12.8% speedup
Greetings
Jan