[Zlib-devel] patch-in-progress: vectorized adler32 calculation
Gilles Vollant
info at winimage.com
Wed Apr 14 13:59:13 EDT 2010
" I'm currently not decided whether to support MMX at all. It would only be
used on processors older than 5 years, would get little test coverage and
yields only a modest performance improvement."
I suggest you support even older processors (those without MMX), regardless
of performance, just so that the code does not hang on them.
-----Original Message-----
From: zlib-devel-bounces at madler.net [mailto:zlib-devel-bounces at madler.net] On
Behalf Of Stefan Fuhrmann
Sent: Wednesday, 14 April 2010 01:39
To: zlib-devel at madler.net
Subject: Re: [Zlib-devel] patch-in-progress: vectorized adler32 calculation
Hi all,
thanks for all the feedback. It helped greatly to make up my mind
on how to proceed. So as not to spam the list, I would like to answer
in a single post.
I will port the code to C with intrinsics - probably next weekend.
That solves a number of issues:
* single source for ICC, GCC and MSVC
* widely uniform x86 / x64 code
* no hassle with different ABIs (x64)
* included in the standard build procedure
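To illustrate what "C with intrinsics" could look like, here is a minimal
sketch of an Adler-32 update that handles 16 bytes per step with SSE2
intrinsics from <emmintrin.h>. It is not the patch under discussion: the names
adler32_scalar, adler32_vec, hsum_epi32 and ADLER_SKETCH_SSE2 are made up for
this example, and the modulo is taken after every block for simplicity, which
a tuned version would avoid. On builds without SSE2 the sketch falls back to
the scalar loop.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>          /* SSE2 intrinsics */
#define ADLER_SKETCH_SSE2 1     /* hypothetical macro, used only in this sketch */
#endif

#define ADLER_MOD 65521u        /* largest prime below 2^16 */

/* Plain byte-by-byte reference implementation. */
static uint32_t adler32_scalar(uint32_t adler, const unsigned char *buf, size_t len)
{
    uint32_t a = adler & 0xffffu, b = adler >> 16;
    while (len--) {
        a = (a + *buf++) % ADLER_MOD;
        b = (b + a) % ADLER_MOD;
    }
    return (b << 16) | a;
}

#ifdef ADLER_SKETCH_SSE2
/* Horizontal sum of the four 32-bit lanes of v. */
static uint32_t hsum_epi32(__m128i v)
{
    v = _mm_add_epi32(v, _mm_shuffle_epi32(v, _MM_SHUFFLE(1, 0, 3, 2)));
    v = _mm_add_epi32(v, _mm_shuffle_epi32(v, _MM_SHUFFLE(2, 3, 0, 1)));
    return (uint32_t)_mm_cvtsi128_si32(v);
}
#endif

/* For a block d[0..15]:  a' = a + sum(d[i])
 *                        b' = b + 16*a + sum((16 - i) * d[i])            */
static uint32_t adler32_vec(uint32_t adler, const unsigned char *buf, size_t len)
{
#ifdef ADLER_SKETCH_SSE2
    uint32_t a = adler & 0xffffu, b = adler >> 16;
    const __m128i zero = _mm_setzero_si128();
    /* _mm_set_epi16 lists elements from highest to lowest lane. */
    const __m128i w_lo = _mm_set_epi16(9, 10, 11, 12, 13, 14, 15, 16);
    const __m128i w_hi = _mm_set_epi16(1, 2, 3, 4, 5, 6, 7, 8);

    while (len >= 16) {
        __m128i v    = _mm_loadu_si128((const __m128i *)buf);  /* unaligned load */
        __m128i sum  = _mm_sad_epu8(v, zero);                  /* two 8-byte sums */
        __m128i wsum = _mm_add_epi32(
            _mm_madd_epi16(_mm_unpacklo_epi8(v, zero), w_lo),
            _mm_madd_epi16(_mm_unpackhi_epi8(v, zero), w_hi));

        b = (b + 16u * a + hsum_epi32(wsum)) % ADLER_MOD;
        a = (a + hsum_epi32(sum)) % ADLER_MOD;
        buf += 16;
        len -= 16;
    }
    /* Remaining 0..15 bytes go through the scalar loop. */
    return adler32_scalar((b << 16) | a, buf, len);
#else
    return adler32_scalar(adler, buf, len);
#endif
}
```

Because the same source compiles with GCC, ICC and MSVC, and for both x86 and
x64, no per-compiler assembler syntax or calling-convention handling is needed.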
Detecting the CPU features at runtime is necessary to allow for
generic binaries (e.g. certified svn server installers) using the
respective optimal code.
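A rough sketch of such runtime detection, assuming GCC/Clang's
__builtin_cpu_supports or MSVC's __cpuid intrinsic is available. The names
cpu_has_sse2, adler32_generic, adler32_simd_placeholder, adler32_impl and
adler32_select_impl are hypothetical, and the "SIMD" routine here only
forwards to the generic one so that the example stays self-contained.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(_MSC_VER)
#include <intrin.h>             /* __cpuid */
#endif

typedef uint32_t (*adler32_fn)(uint32_t, const unsigned char *, size_t);

/* Portable byte-by-byte Adler-32, usable on any CPU. */
static uint32_t adler32_generic(uint32_t adler, const unsigned char *buf, size_t len)
{
    uint32_t a = adler & 0xffffu, b = adler >> 16;
    while (len--) {
        a = (a + *buf++) % 65521u;
        b = (b + a) % 65521u;
    }
    return (b << 16) | a;
}

/* Placeholder for a vectorized routine; forwards to the generic code here. */
static uint32_t adler32_simd_placeholder(uint32_t adler, const unsigned char *buf, size_t len)
{
    return adler32_generic(adler, buf, len);
}

/* Returns 1 if the running CPU reports SSE2, 0 otherwise or if unknown. */
static int cpu_has_sse2(void)
{
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
    int regs[4];
    __cpuid(regs, 1);
    return (regs[3] >> 26) & 1;             /* CPUID leaf 1, EDX bit 26 */
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse2") != 0;
#else
    return 0;                               /* non-x86 or unknown compiler */
#endif
}

/* Starts out pointing at the safe default. */
static adler32_fn adler32_impl = adler32_generic;

/* Call once at startup, before any other thread uses adler32_impl. */
static void adler32_select_impl(void)
{
    adler32_impl = cpu_has_sse2() ? adler32_simd_placeholder : adler32_generic;
}
```

A single binary built this way can ship both code paths and still run on CPUs
that lack the faster instructions, since those are only reached after the check.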
The MSVC deficiencies shown in the link posted by Török seem
to be limited to initialization code (setting data). As part of the
detection phase, all necessary data structures can be prepared
once, so MSVC's issues won't hurt the following runs.
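The idea of preparing constants once can be sketched as follows, assuming the
constants are the per-byte weights of a 16-byte block. The names
g_weights_lo, g_weights_hi, adler_prepare_tables and weighted_sum16 are
invented for this example. The _mm_set_epi16 calls, which are the kind of
initialization code in question, run only in the one-time setup, and the
per-block routine just reads the prepared vectors.

```c
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define TABLE_SKETCH_SSE2 1     /* hypothetical macro, used only in this sketch */

static __m128i g_weights_lo;    /* weights 16..9 for bytes 0..7  */
static __m128i g_weights_hi;    /* weights  8..1 for bytes 8..15 */
#endif

static int g_tables_ready = 0;

/* Meant to be called once, e.g. from the CPU detection phase. */
static void adler_prepare_tables(void)
{
    if (g_tables_ready)
        return;
#ifdef TABLE_SKETCH_SSE2
    g_weights_lo = _mm_set_epi16(9, 10, 11, 12, 13, 14, 15, 16);
    g_weights_hi = _mm_set_epi16(1, 2, 3, 4, 5, 6, 7, 8);
#endif
    g_tables_ready = 1;
}

/* Computes 16*p[0] + 15*p[1] + ... + 1*p[15].
 * adler_prepare_tables() must have been called first. */
static uint32_t weighted_sum16(const unsigned char *p)
{
#ifdef TABLE_SKETCH_SSE2
    const __m128i zero = _mm_setzero_si128();
    __m128i v = _mm_loadu_si128((const __m128i *)p);
    __m128i s = _mm_add_epi32(
        _mm_madd_epi16(_mm_unpacklo_epi8(v, zero), g_weights_lo),
        _mm_madd_epi16(_mm_unpackhi_epi8(v, zero), g_weights_hi));
    /* Fold the four 32-bit lanes into lane 0. */
    s = _mm_add_epi32(s, _mm_shuffle_epi32(s, _MM_SHUFFLE(1, 0, 3, 2)));
    s = _mm_add_epi32(s, _mm_shuffle_epi32(s, _MM_SHUFFLE(2, 3, 0, 1)));
    return (uint32_t)_mm_cvtsi128_si32(s);
#else
    uint32_t sum = 0;
    int i;
    for (i = 0; i < 16; i++)
        sum += (uint32_t)(16 - i) * p[i];
    return sum;
#endif
}
```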
I'm currently not decided whether to support MMX at all. It would
only be used on processors older than 5 years, would get little test
coverage and yields only a modest performance improvement.
All performance figures were gained from actual measurements
and indicate even slightly better numbers than the 15%->5%
improvement. Due to the dependency on buffer sizes, compression
rates etc., I posted somewhat more conservative figures.
The overall performance of the deflate() function is already quite
impressive: For real-world repository data, I measured ~200MB
inflated data per sec. Buffer sizes etc. seem to be o.k. (a few kB
on average) and changing the granularity would be very hard to
do anyways.
-- Stefan^2.
_______________________________________________
Zlib-devel mailing list
Zlib-devel at madler.net
http://mail.madler.net/mailman/listinfo/zlib-devel_madler.net