[Zlib-devel] Performance patch set
Stefan Fuhrmann
stefanfuhrmann at alice-dsl.de
Wed May 12 04:54:28 EDT 2010
Joakim Tjernlund wrote:
>
>> * memset for dist==1
>> -> I tried that but it had no measurable effect
>> because it was too rare. But I may just re-add it
>> since the 'dist is too short' case is relatively rare
>> in my case anyway. Adding checks there won't
>> hurt the overall performance.
>>
>
> It is not a real memset and it is valid for dist == 2 too. It is faster than going
> back to byte copies.
>
Oops... you are right. This coming weekend, I will
experiment with bitmap compression (i.e. long
repeating patterns with dist 1 .. 8) and see how much
it can be improved.
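
To make the short-distance case concrete, here is a minimal sketch in C. It is not taken from either patch or from zlib's inffast.c; the names copy_match_bytewise, copy_match_pattern and copy_match_agrees are hypothetical. It assumes the usual LZ77 match semantics (copy len bytes starting dist bytes behind the output pointer, where the source may overlap the destination). dist == 1 becomes a plain memset, and any other short dist is handled by repeatedly copying the already-written run, which grows on each pass, so no per-byte loop is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reference behaviour: byte-by-byte match copy, correct for any dist >= 1.
 * The source (out - dist) may overlap the destination. */
void copy_match_bytewise(unsigned char *out, size_t dist, size_t len)
{
    const unsigned char *from = out - dist;
    while (len--)
        *out++ = *from++;
}

/* Hypothetical faster variant for short distances (illustration only). */
void copy_match_pattern(unsigned char *out, size_t dist, size_t len)
{
    const unsigned char *from = out - dist;
    size_t avail = dist;   /* bytes in [from, out) that are already valid */

    assert(dist >= 1);

    if (dist == 1) {       /* run of a single repeated byte */
        memset(out, out[-1], len);
        return;
    }

    /* 'from' stays fixed. Each memcpy reads only bytes that are already
     * written (n <= avail), so source and destination never overlap.
     * After each full pass the valid run is longer, so the next copy
     * can be larger. */
    while (len > 0) {
        size_t n = len < avail ? len : avail;
        memcpy(out, from, n);
        out += n;
        len -= n;
        avail += n;
    }
}

/* Small self-check: returns 1 if both variants produce identical output
 * for the given dist (1..8) and len (0..48), 0 otherwise. */
int copy_match_agrees(size_t dist, size_t len)
{
    unsigned char a[64], b[64];
    size_t i;

    if (dist < 1 || dist > 8 || len > 48)
        return 0;
    for (i = 0; i < sizeof a; i++)
        a[i] = b[i] = (unsigned char)(i * 37 + 11);

    copy_match_bytewise(a + 8, dist, len);
    copy_match_pattern(b + 8, dist, len);
    return memcmp(a, b, sizeof a) == 0;
}
```

Whether this beats a byte loop in practice depends on the platform and on how often short distances occur in the data, which is exactly what the bitmap tests are meant to show.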
> I do think this should be done in steps.
> First add my patch, then add some other part, and so on.
> That will be much easier to review and troubleshoot.
>
Hm. This is my proposal:
* Let Gilles and Mark review & test both our patches
  -> possibly identify some fundamental issues,
     or regressions with certain data patterns
* I will try your patch and compare it with QUICK_COPY
  on Core2 32 bit and Core i7 64 bit, and try to merge ideas
  from both sides. I will also test with bitmaps as a relevant
  special case.
  -> give you feedback from non-PPC platforms
  -> come up with a patch that comprises both our ideas
* You may test my patch (or parts thereof) on PPC (32 bit?)
  -> get an idea which of the individual improvements work
     well with your platform & use-case and which don't
* Come up with a revised patch within 3 weeks or so.
Due to the large number of platforms and usages, we need to
give the changes a broad exposure to testers. My gut feeling
is that we can do that only once or twice per release. Thus,
I would like to see the patch applied in one go and only if
testers report issues, start slicing it up.
-- Stefan^2.