[Zlib-devel] Mark Adler, where are you?
Jan Seiffert
kaffeemonster at googlemail.com
Thu Jul 19 20:36:13 EDT 2012
Daniel Richard G. schrieb:
> On Thu, 19 Jul 2012, Jonathan Nieder wrote:
>
>> Perhaps one way to save Mark time: have you reviewed each other's
>> code?
>
> I'd be happy to do that if he were busy these days and he asked. (I
> wouldn't be able to do much with Andreas' patch, because I don't have
> access to an S/390, but that's practice versus the principle.
But you could test if it regresses on other architectures (maybe important
to you) then s/390 ;)
>
> Absent his request, however, unsolicited opinions from people he
> doesn't know and trust are going to be of little value to him, and
> not a good use of time for us.
>
Hmmm, i wouldn't call tests "opinions"
> In any event, all I want to know right now is whether Mark's still
> around and due to return. Him being too busy to keep up with
> zlib-devel is not the only possible explanation for his ongoing
> absence.
>
All i know is Mark is always very silent.
He's probably busy with daily business (shooting stuff into space or whatever his
job is nowadays).
Some reflections on the patches:
Longest match - Longest match is a tough nut to optimize. As a SIMD addict,
i bit my teeth out on it. The problem is that it is really at
the core of the deflate algo, and it it's performance
characteristic is crucial for overall performance. But since most
matches are small and "wildly" aligned, you have to find the
right balance of light loops and raw power big loops.
Funnily compiler do a good job at it, prop. because zlib is part of
the SPEC suite. You can only win with really nitty gritty
hand rolled scalar ASM, which has the risk to regress on the
next µArch release... Except maybe your Arch has an instruction
_exactly_ for this job (and the instruction is cool, "rep cmps"
isn't)
A patch that only gives small improvements in the unaligned_ok
case (which you should set if your arch can do so) on some fringe
arch (sorry S/390) with a regression on Core2 which he can not
explain, hmmmm.
longest match is ATM at kinda sweat spot for the general upstream
distribution. Touching it can only make it worse somewhere.
I can see why Mark is silent.
(i guess the problem is he makes some vars uLong, which makes them
64 bit, and the Core2 is AFAIR not the greatest 64Bit processor. Suggestion:
#ifdef HAVE_STDINT_H
# include <stdint.h>
typedef uint_fast32_t uFastInt;
#else
typedef unigned int uFastInt;
#endif
and use that. But i see that the glibc <stdint.h> defines uint_fast32_t to
unsigned long on 64Bit x86, so it would be no win. Maybe
#ifdef HAVE_STDINT_H
# include <stdint.h>
# ifdef __x86_64__
typedef uint32_t uFastInt;
# else
typedef uint_fast32_t uFastInt;
# endif
#else
typedef unigned int uFastInt;
#endif
)
Your valgrind cleanup - Here the "problem" is you are doing an (often) 128kb
memset (if memory serves me right). And that probably
for nothing.
If you look at deflate.c at 191 (around #define CLEAR_HASH)
it says that prev[] will be initialised on the fly.
This could be wrong, but i doubt it. Valgrind can sometimes
be very tricky. Or it may be an idiom Valgrind can't understand
like "Yes, the code does read uninitialised data, but we do
nothing with it". (cmp. deflate.c at 1358, btw, did you compile
with FASTEST or not?)
It would have been good to include the error output from valgrind.
So in essence: Stay calm. Maybe refine your patches.
>
> --Daniel
>
>
Greetings
Jan
--
yeah mechanical engineering. It's like math but louder!!
More information about the Zlib-devel
mailing list