[Zlib-devel] zlib ideas for 1.2.5 (fwd)
Vincent Torri
vtorri at univ-evry.fr
Wed Feb 24 01:44:17 EST 2010
Here are some ideas from Carsten Haitzler (author of the Enlightenment
window manager).
Vincent Torri
---------- Forwarded message ----------
Date: Wed, 24 Feb 2010 17:35:47 +1100
From: "Carsten Haitzler (The Rasterman)" <raster at rasterman.com>
To: vtorri at univ-evry.fr
Subject: zlib ideas
OK. One thing I do use zlib "dumbly" for is image data. I don't massage
this data before I throw it at zlib - I don't want to bother with a "massaging"
stage (making a copy of it in a different format so that spatially close pixels
are also close in linear space), as this just adds a stage and overhead to
compression/decompression. I accept that the files end up bigger. So what
might be useful for these kinds of uses, and others, is to be able to hint to
zlib that the data comes in a format it could make use of when hunting for data
to put into the dictionaries.
The first thing would be: from byte A, for B bytes, the data comes in rows of
C bytes and values are D bytes (1, 2, 4 etc.) each, i.e.:
........
A111
2222
3333
444B
......
That would imply that values in row 1 "line up" with values in row 2, row 3
etc., and that there is likely common data between rows (in images, spatially
near pixels tend to have similar values, sometimes the same value). zlib
can just ignore this info if it likes, or make use of it when compressing by
reading source data in a different order (e.g. read a value, then read the
values to the right, below and diagonally to the bottom-right, i.e.:
[V1][V2][..........]<-row1
[V3][V4][..........]<-row2
etc.)
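
To make the 2x2 reading order concrete, here is a minimal sketch in Python. It is not zlib API: zlib has no such hint, and reorder_2x2 and compress_with_hint are hypothetical names. The sketch does the reordering as a separate pass before compression, which is exactly the extra stage the message wants to avoid, so take it only as an illustration of the order zlib would read in. The parameters a, b, c, d stand for A, B, C, D above.

```python
import zlib


def reorder_2x2(region: bytes, row_bytes: int, value_bytes: int) -> bytes:
    """Return `region` with its values visited in 2x2 blocks instead of row by row."""
    if row_bytes % value_bytes or len(region) % row_bytes:
        raise ValueError("region must be whole rows of whole values")
    cols = row_bytes // value_bytes
    rows = len(region) // row_bytes
    out = bytearray()
    for by in range(0, rows, 2):
        for bx in range(0, cols, 2):
            # V1, V2 from the upper row, then V3, V4 from the row below;
            # min() clips the block at the right and bottom edges.
            for y in range(by, min(by + 2, rows)):
                for x in range(bx, min(bx + 2, cols)):
                    start = y * row_bytes + x * value_bytes
                    out += region[start:start + value_bytes]
    return bytes(out)


def compress_with_hint(data: bytes, a: int, b: int, c: int, d: int) -> bytes:
    """Compress `data`, reordering the hinted region [a, a + b) first."""
    hinted = reorder_2x2(data[a:a + b], row_bytes=c, value_bytes=d)
    return zlib.compress(data[:a] + hinted + data[a + b:])
```

A reader of the compressed stream would have to apply the inverse permutation after zlib.decompress. Whether the reordered data actually compresses better depends on the image and is not something this sketch shows.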
Of course that's the simple one - grouping into blocks of 4 values. You can
extend this into:
[V1][V2][V5][Va]
[V3][V4][V6][Vb]
[V7][V8][V9][Vc]
[Vd][Ve][Vf][Vg]
for a block of 4x4 (linearising the compression order to
V1,2,3,4...9,a,b,c...f,g). Of course this only works when columns and rows are
multiples of 4 - the remainder that is not a multiple of 4 (or of 2, as above)
can be chopped off and handled as a partial block.
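
The 4x4 numbering in the diagram can be read as growing a square one step at a time: each step adds a new column from the top down, then a new row from left to right. The sketch below generates that order and applies it per block, clipping blocks at the right and bottom edges as one possible reading of "partial block". shell_order and reorder_blocks are hypothetical names, not anything in zlib, and the growing-square rule is my interpretation of the diagram.

```python
def shell_order(n: int) -> list[tuple[int, int]]:
    """(row, col) visiting order for an n x n block, grown one shell at a time."""
    order = []
    for k in range(n):
        order.extend((r, k) for r in range(k))      # new column k, rows 0..k-1
        order.extend((k, c) for c in range(k + 1))  # new row k, cols 0..k
    return order


def reorder_blocks(region: bytes, row_bytes: int, value_bytes: int, n: int = 4) -> bytes:
    """Linearise `region` block by block using shell_order(n)."""
    if row_bytes % value_bytes or len(region) % row_bytes:
        raise ValueError("region must be whole rows of whole values")
    cols = row_bytes // value_bytes
    rows = len(region) // row_bytes
    order = shell_order(n)
    out = bytearray()
    for by in range(0, rows, n):
        for bx in range(0, cols, n):
            for r, c in order:
                y, x = by + r, bx + c
                if y < rows and x < cols:  # skip cells outside a partial block
                    start = y * row_bytes + x * value_bytes
                    out += region[start:start + value_bytes]
    return bytes(out)


# The diagram's labels, stored row by row, come back out in label order.
grid = b"125a" b"346b" b"789c" b"defg"
print(reorder_blocks(grid, row_bytes=4, value_bytes=1))  # b'123456789abcdefg'
```

With n=2 the same function gives the V1,V2,V3,V4 order of the simple case.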
Anyway - maybe this is asking zlib to become more than it should be. My aim is
not maximum compression - it's good compression with absolute minimum overhead.
Having zlib look at its input data a different way is a way of doing this. :)
Also another thing - would being able to say that values in the input data
have limits on their ranges help you? E.g. in many cases for a quartet of
bytes (v1,v2,v3,v4), v2 <= v1 && v3 <= v1 && v4 <= v1 - ALWAYS. Would it help
zlib to know these things?
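
For reference, the relation itself is cheap to state in code. The sketch below only checks it over a buffer; it reads the constraint as "each of the last three bytes is no larger than the first", which is an assumption on my part, and it says nothing about how a compressor might exploit it. quartets_bounded_by_first is a hypothetical name.

```python
def quartets_bounded_by_first(buf: bytes) -> bool:
    """True if every 4-byte group (v1, v2, v3, v4) has v2, v3, v4 <= v1."""
    if len(buf) % 4:
        raise ValueError("buffer length must be a multiple of 4")
    return all(
        max(buf[i + 1:i + 4]) <= buf[i]  # largest of v2..v4 against v1
        for i in range(0, len(buf), 4)
    )
```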
--
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler) raster at rasterman.com