[Zlib-devel] Performance patch set

Stefan Fuhrmann stefanfuhrmann at alice-dsl.de
Wed May 12 04:54:28 EDT 2010


Joakim Tjernlund wrote:
>
>> * memset for dist==1
>>   -> I tried that but it had no measurable effect
>>   because it was too rare. But I may just re-add it
>>   since the 'dist is too short' case is relatively rare
>>   in my case anyway. Adding checks there won't
>>   hurt the overall performance.
>>     
>
> It is not a real memset and it is valid for dist == 2 too. It is faster than going
> back to byte copies.
>   
Oops, you are right. This coming weekend I will
experiment with bitmap compression (i.e. long
repeating patterns with dist 1 .. 8) and see how
much it can be improved.
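
For reference, a minimal sketch of the short-distance case discussed
above. It assumes the usual LZ77 setup where a match of length len
starts dist bytes behind the output pointer, so source and destination
overlap when dist < len. The helper name copy_match and its signature
are made up for illustration; this is not code from either patch or
from inffast.c, and a real version would likely cover dist 3 .. 8 and
use wider stores.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not zlib code): expand an LZ77 match of `len`
   bytes whose source starts `dist` bytes behind `out`. */
static void copy_match(unsigned char *out, unsigned dist, size_t len)
{
    const unsigned char *from = out - dist;

    if (dist == 1) {
        /* Run of a single byte: equivalent to a memset. */
        memset(out, *from, len);
    } else if (dist == 2) {
        /* Repeating two-byte pattern: read the pair once, then
           write it repeatedly instead of reading back per byte. */
        unsigned char a = from[0], b = from[1];
        while (len >= 2) {
            out[0] = a;
            out[1] = b;
            out += 2;
            len -= 2;
        }
        if (len)
            *out = a;
    } else {
        /* General fallback: byte copy, which stays correct when
           the source and destination ranges overlap. */
        while (len--)
            *out++ = *from++;
    }
}
```

Whether the special cases pay off depends on how often short
distances occur in the data, which is what the bitmap tests are
meant to show.
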
> I do think this should be done in steps.
> First add my patch, then add some other part, and so on.
> That will be much easier to review and troubleshoot.
>   

Hm. This is my proposal:

* let Gilles and Mark review & test both our patches
  -> possibly identify some fundamental issues,
     or regressions with certain data patterns

* I will try your patch and compare it with QUICK_COPY
  on Core2 32 bit, and Core i7 64 bit. Try to merge ideas from
  both sides. Test also with bitmaps as a relevant special case.
  -> give you feedback from non-PPC platforms
  -> come up with a patch that comprises both our ideas

* You may test my patch (or parts thereof) on PPC (32 bit?)
  -> get an idea which of the individual improvements work
     well with your platform & use-case and which don't

* Come up with a revised patch within 3 weeks or so.

Due to the large number of platforms and usages, we need to
give the changes broad exposure to testers. My gut feeling
is that we can do that only once or twice per release. Thus,
I would like to see the patch applied in one go, and start
slicing it up only if testers report issues.

-- Stefan^2.




