[Zlib-devel] pigz 2.1.4 output differs for -p 1 and -p 2

wpilorz at gmail.com wpilorz at gmail.com
Sun Nov 16 18:02:06 EST 2008


On Sat, Nov 15, 2008 at 01:40:30PM -0800, Mark Adler wrote:
> On Nov 13, 2008, at 5:24 PM, Mark Adler wrote:
> >This behavior is odd.  In theory, Z_FULL_FLUSH should erase any  
> >memory of the previous data, yet somehow the size of the next block  
> >is different when using Z_FULL_FLUSH as opposed to deflateReset().   
> >deflateReset() is doing a "better" job of erasing the memory of the  
> >previous compression.  This may in fact be a bug in deflate in zlib.
> 
> 
> I found where that difference was coming from.  Below is a patch to  
> deflate to correct the problem.
> 
> Now I'll look into remedies for the differences due to  
> deflateSetDictionary(), i.e. when not using -i.
> 
> Mark
> 
> 
> *** zlib-1.2.3.3/deflate.c	2006-09-03 23:57:22.000000000 -0700
> --- zlib-1.2.3.4/deflate.c	2008-11-15 13:13:58.000000000 -0800
> ***************
> *** 847,852 ****
> --- 847,854 ----
>                    */
>                   if (flush == Z_FULL_FLUSH) {
>                       CLEAR_HASH(s);             /* forget history */
> +                     if (s->lookahead == 0)
> +                         s->strstart = 0;
>                   }
>               }
>               flush_pending(strm);
> 
> 
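The comparison described in the quoted text (the block emitted after Z_FULL_FLUSH versus the same block from a freshly reset compressor) can be sketched as follows. This is a rough illustration only: it uses Python's zlib binding rather than the C API, a brand-new compressor object stands in for deflateReset(), and the sample data and the helper name full_flush_vs_fresh are made up for the example. Whether the two outputs differ depends on the zlib version Python is linked against.

```python
import zlib

def full_flush_vs_fresh(first: bytes, second: bytes, level: int = 6):
    """Compress `second` after a full flush, and again from a fresh state."""
    # Raw deflate (wbits=-15), so no zlib header or trailer gets in the way.
    shared = zlib.compressobj(level, zlib.DEFLATED, -15)
    head = shared.compress(first) + shared.flush(zlib.Z_FULL_FLUSH)
    after_flush = shared.compress(second) + shared.flush(zlib.Z_FULL_FLUSH)

    # Assumption: a new compressor object plays the role of deflateReset().
    fresh = zlib.compressobj(level, zlib.DEFLATED, -15)
    from_fresh = fresh.compress(second) + fresh.flush(zlib.Z_FULL_FLUSH)
    return head, after_flush, from_fresh

# Hypothetical sample data, not the input from this thread.
first = b"".join(b"Line_%d : some text\n" % i for i in range(2000))
second = b"".join(b"Row_%d : other text\n" % i for i in range(2000))

head, after_flush, from_fresh = full_flush_vs_fresh(first, second)
print(len(after_flush), len(from_fresh), after_flush == from_fresh)
```

Both outputs should decompress to the same bytes in either case; only their sizes or contents may differ if the flush leaves some state behind.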
I have applied two patches to zlib 1.2.3 (the one above and the one that patches lines 332-339 of deflate.c),
compiled zlib, and linked pigz (compiled with -O3) against the resulting static libz.a.

For some input, pigz with the options -n -T -i -p 1 generates invalid data (tests were
run on CentOS 5 Linux, PIII i386 CPU):

$ perl -we 'use strict; for my $i (1 .. 99_999) { my $imod= $i % 101; my $itx=""; for (1 .. $imod) { $itx .= "X". ($i^$_); }; print "Line_$i : $itx\n"}' | md5sum -b;
2d74cda70d7653466d7072d07563e55d *-

$ for op in '-i -p 1' '-i -p 2'; do perl -we 'use strict; for my $i (1 .. 99_999) { my $imod= $i % 101; my $itx=""; for (1 .. $imod) { $itx .= "X". ($i^$_); }; print "Line_$i : $itx\n"}' | /usr/local/pigz_081116/pigz  -n -T $op | md5sum -b; done
627ae1f31111d3db9c7901a3d135b71d *-
f49f22039d758be59b6fe766d2c5f154 *-

$ for op in '-i -p 1' '-i -p 2'; do perl -we 'use strict; for my $i (1 .. 99_999) { my $imod= $i % 101; my $itx=""; for (1 .. $imod) { $itx .= "X". ($i^$_); }; print "Line_$i : $itx\n"}' | /usr/local/pigz_081116/pigz  -n -T $op | zcat | md5sum -b; done

zcat: stdin: invalid compressed data--crc error

zcat: stdin: invalid compressed data--length error
c9cccdd56734dc1c48e85fbb3be40ad4 *-
2d74cda70d7653466d7072d07563e55d *-
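For anyone without Perl at hand, the test above can be approximated in Python. This is a sketch, assuming the port of the one-liner is faithful (not verified against the md5sum shown above); the function names make_test_input and survives_round_trip are made up for the example, and Python's gzip module stands in for zcat.

```python
import gzip
import hashlib
import zlib

def make_test_input(last: int = 99_999) -> bytes:
    """Port of the Perl generator: one 'Line_N : X..X..' line per N."""
    lines = []
    for i in range(1, last + 1):
        # Perl: for (1 .. $i % 101) { $itx .= "X" . ($i ^ $_) }
        itx = "".join("X%d" % (i ^ j) for j in range(1, i % 101 + 1))
        lines.append("Line_%d : %s\n" % (i, itx))
    return "".join(lines).encode("ascii")

def survives_round_trip(original: bytes, compressed: bytes) -> bool:
    """True if `compressed` is a valid gzip stream that restores `original`."""
    try:
        restored = gzip.decompress(compressed)
    except (OSError, EOFError, zlib.error):
        # Covers CRC and length mismatches, truncation, and bad deflate data.
        return False
    return hashlib.md5(restored).digest() == hashlib.md5(original).digest()

# Small sample so the example runs quickly; the full input uses the default.
sample = make_test_input(500)
print(survives_round_trip(sample, gzip.compress(sample)))
```

To check pigz itself, the bytes it writes for the full input would be passed as `compressed`; hashlib.md5(make_test_input()).hexdigest() can be compared with the md5sum of the Perl output to confirm the port first.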


I have patched plain zlib-1.2.3. Is that OK, or are there other patches that should also be applied?
(The directory names in your patches suggest that other patches may already have been applied to zlib-1.2.3.)

Also, I have disk data (about 285 MB) which generates valid but different results when compressed through
pigz -6 -n -T -p 1
and 
pigz -6 -n -T -p 2

I will try to find something that could be sent via email for that.
(PS. That data also gives an invalid stream (not accepted by zcat)
 when compressed with /usr/local/pigz_081116/pigz -6 -n -T -i -p 1.)





