[Zlib-devel] infnew-5 available for testing

Chris Anderson christop at fellspt.charm.net
Sat Jan 18 02:30:01 EST 2003


>
> I hacked up the gcc asm output of infnew-5/inffast.c pretty good, to see
> what I could get out of it.
>
> zbuflen  16384, clock 9.180, time 9.214
> zbuflen  16384, clock 9.150, time 9.190

Aligning next_in on a short (2-byte) boundary makes a pretty big
difference, about 6% on the clock figure:

zbuflen  16384, clock 8.620, time 8.657
zbuflen  16384, clock 8.610, time 8.640

Before entry to the do loop, add:

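        /* align in_r on word boundary */
        testl   $1, in_r                /* is the input address odd? */
        jz      .L_is_word_aligned      /* even: already aligned */
        xorl    %eax, %eax
        movb    (in_r), %al             /* load the one odd byte */
        incl    in_r
        movb    bits_r, %cl             /* shift count = bits already held */
        addb    $8, bits_r
        shll    %cl, %eax
        orl     %eax, hold_r            /* append the byte above existing bits */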

.L_is_word_aligned:
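
For anyone following along in C rather than asm, the prologue above is
roughly equivalent to the sketch below. The struct and the name
align_input are hypothetical stand-ins, not identifiers from inffast.c;
the fields only mirror in_r, hold_r and bits_r. It assumes a 32-bit
hold and that at most 24 bits are already buffered on entry.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the decoder's input-side state. */
struct bitbuf {
    const unsigned char *in;  /* next input byte (in_r in the asm) */
    uint32_t hold;            /* bit accumulator (hold_r) */
    unsigned bits;            /* number of valid bits in hold (bits_r) */
};

/* If `in` sits on an odd address, consume one byte into the accumulator
   so that later 16-bit loads from `in` are 2-byte aligned. */
static void align_input(struct bitbuf *b)
{
    if (((uintptr_t)b->in & 1) != 0) {
        assert(b->bits <= 24);                     /* room for 8 more bits */
        b->hold |= (uint32_t)*b->in++ << b->bits;  /* append above old bits */
        b->bits += 8;
    }
}
```

Called once before the main loop, this costs at most one extra byte
load and leaves an already aligned pointer untouched.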

Because of this, I believe inflate is fighting memory bandwidth
constraints. It needs bigger registers on the next_in side. On the
next_out side, there may be improvements to be had by organizing reads
and writes along cache lines.




