[Zlib-devel] infnew-5 available for testing
Chris Anderson
christop at fellspt.charm.net
Sat Jan 18 02:30:01 EST 2003
>
> I hacked up the gcc asm output of infnew-5/inffast.c pretty good, to see
> what I could get out of it.
>
> zbuflen 16384, clock 9.180, time 9.214
> zbuflen 16384, clock 9.150, time 9.190
Aligning next_in on a short (16-bit) boundary makes a pretty big
difference, about 6% less clock time on the same test:
zbuflen 16384, clock 8.620, time 8.657
zbuflen 16384, clock 8.610, time 8.640
Before entry to the do loop, add:
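/* align in_r on word (16-bit) boundary */
testl $1, in_r            /* low address bit set means odd address */
jz .L_is_word_aligned     /* already aligned, nothing to do */
xorl %eax, %eax           /* clear eax so the byte load is zero-extended */
movb (in_r), %al          /* load one input byte */
incl in_r                 /* advance input pointer to an even address */
movb bits_r, %cl          /* shift count = bits currently in hold */
addb $8, bits_r           /* hold gains 8 bits */
shll %cl, %eax            /* position the new byte above the existing bits */
orl %eax, hold_r          /* merge into the bit accumulator */
.L_is_word_aligned: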
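
For anyone who would rather read it in C, here is a rough equivalent of the
alignment step above. It is only a sketch: the struct and function names
(bitreader, align_in_to_short) are made up for illustration and are not what
inffast.c uses. It assumes at least one input byte is available and that hold
has room for 8 more bits, which the asm assumes as well.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-ins for the asm register aliases. */
typedef struct {
    const unsigned char *in;  /* next input byte (in_r) */
    unsigned long hold;       /* bit accumulator (hold_r) */
    unsigned bits;            /* number of valid bits in hold (bits_r) */
} bitreader;

/* If the input pointer is at an odd address, consume one byte into the
 * accumulator so that later reads start on a 16-bit boundary. */
static void align_in_to_short(bitreader *br)
{
    if ((uintptr_t)br->in & 1) {
        /* Assumption: the new byte still fits in the accumulator. */
        assert(br->bits + 8 <= sizeof(br->hold) * CHAR_BIT);
        br->hold |= (unsigned long)*br->in++ << br->bits;
        br->bits += 8;
    }
}
```

After the call, the pointer is even and the bit stream is unchanged: the
consumed byte sits in hold directly above the bits that were already there.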
Because of this, I believe inflate is fighting bandwidth constraints.
It needs bigger registers on the next_in side. On the next_out side,
there may be improvements to be had by organizing reads and writes
along cache lines.
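
One possible reading of "bigger registers on the next_in side" is a wider bit
accumulator that gets topped up less often. The sketch below is hypothetical
and untested against zlib: the names wide_reader, refill and take_bits are
invented here, and a 64-bit hold is an assumption, not something infnew-5 does.

```c
#include <stdint.h>

/* Hypothetical wide bit reader; not zlib's actual state layout. */
typedef struct {
    const unsigned char *in;   /* next input byte */
    const unsigned char *end;  /* one past the last input byte */
    uint64_t hold;             /* 64-bit bit accumulator */
    unsigned bits;             /* number of valid bits in hold */
} wide_reader;

/* Top up hold until it has more than 56 bits or the input runs out. */
static void refill(wide_reader *r)
{
    while (r->bits <= 56 && r->in < r->end) {
        r->hold |= (uint64_t)*r->in++ << r->bits;
        r->bits += 8;
    }
}

/* Remove and return the low n bits; caller ensures 1 <= n <= 32
 * and n <= r->bits. */
static uint32_t take_bits(wide_reader *r, unsigned n)
{
    uint32_t v = (uint32_t)(r->hold & ((UINT64_C(1) << n) - 1));
    r->hold >>= n;
    r->bits -= n;
    return v;
}
```

With a 64-bit hold, one refill can cover several codes before the input side
has to be touched again. Whether that helps in practice would have to be
measured.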