[Zlib-devel] Parallel zlib

Mark Adler madler at alumni.caltech.edu
Tue Feb 16 02:14:14 EST 2010


On Feb 15, 2010, at 10:24 PM, devsk wrote:
> Can we at least start to discuss what a parallel API may look like?

Sure.

Off the top of my head, there are a few approaches depending on the control desired.

1.  Add deflateParallel(strm, nthreads, blocksize) to be called after deflateInit().  When deflate() is provided input, it fires off up to nthreads separate threads, one for each blocksize of input data.  When deflate() is provided output space, it waits for the next thread in sequence to finish and provides that compressed data.  Perhaps there could be a new flush mode for input without waiting for output.  This approach has the downside of deflate having to copy all of the input and buffer lots of output on its own, possibly duplicating the application's input and output buffers and using lots of memory.

2.  Add parallel wrapper routines on top of the existing deflate, without changing the existing deflate routines.  (Also avoids linking threads library when not using threads.)  Allow the application to fire off threads on its own, wait for them to complete when it wants to wait, and manage its own buffer space.  A routine would be provided to recombine the output and compute the trailer information (crc or adler32).  Maybe something like:

    pool = deflatePool(nthreads, deflate parameters) -- initialize a pool of threads for use by deflateDive, creating up to nthreads threads as needed and reusing them

    id = deflateDive(pool, inbuf, inlen, dict, dictlen, last, outbuf, &outlen, &lastbits, howcheck, &check) -- start a raw compression job, may use dictionary, may compute check value

    deflateRescue(pool, id) -- wait for job id to finish, then can consume output and check value and can reuse buffers

    gluestate = deflateGlue(gluestate, outlen, lastbits, glue, &gluelen, howcheck, &check) -- append output by providing glue bytes (independent of pool), first call with NULL gluestate initializes state

    deflateEmpty(pool) -- wait for all jobs to finish and release all thread resources

    deflateDrown(pool) -- kill any running jobs and release all thread resources

This has the downside that the user has to keep track of the jobs, to not mess with the buffers it provided until the jobs are done, and to correctly assemble the output.  I can already imagine all the bug reports resulting from pilot error ...

3.  Do #2 above, and then add a wrapper around *that* to provide the simplified functionality of #1.
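
The recombination step in approach 2 depends on being able to merge check values that were computed independently for each job. Below is a minimal sketch of that merge for Adler-32, written standalone so that it needs neither zlib nor threads. The names adler32_simple and adler32_join are made up for illustration and are not part of zlib; zlib itself already provides adler32_combine() and crc32_combine() for this purpose.

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER_BASE 65521u   /* largest prime below 2^16 */

/* Adler-32 of buf[0..len), continuing from the running value adler.
   Pass 1 as adler to start a fresh check. (Hypothetical helper.) */
uint32_t adler32_simple(uint32_t adler, const unsigned char *buf, size_t len)
{
    uint32_t a = adler & 0xffffu;           /* 1 + sum of bytes */
    uint32_t b = (adler >> 16) & 0xffffu;   /* sum of the running a values */

    for (size_t i = 0; i < len; i++) {
        a = (a + buf[i]) % ADLER_BASE;
        b = (b + a) % ADLER_BASE;
    }
    return (b << 16) | a;
}

/* Merge the checks of two adjacent pieces of input.
   adler1 covers the first piece; adler2 covers the second piece and was
   computed on its own, starting from 1; len2 is the length of the second
   piece in bytes. (Hypothetical helper.) */
uint32_t adler32_join(uint32_t adler1, uint32_t adler2, size_t len2)
{
    uint32_t a1 = adler1 & 0xffffu, b1 = (adler1 >> 16) & 0xffffu;
    uint32_t a2 = adler2 & 0xffffu, b2 = (adler2 >> 16) & 0xffffu;
    uint32_t rem = (uint32_t)(len2 % ADLER_BASE);

    /* (a1 - 1) mod ADLER_BASE, kept non-negative */
    uint32_t a1m1 = (a1 + ADLER_BASE - 1) % ADLER_BASE;

    /* a = a1 + a2 - 1:  the second piece's initial 1 is counted only once */
    uint32_t a = (a1 + a2 + ADLER_BASE - 1) % ADLER_BASE;

    /* b = b1 + b2 + len2 * (a1 - 1):  every byte of the second piece would
       have seen a running a that was larger by (a1 - 1) */
    uint64_t b = (uint64_t)b1 + b2 + (uint64_t)rem * a1m1;

    return ((uint32_t)(b % ADLER_BASE) << 16) | a;
}
```

Each job would compute its own check starting from 1, and the caller would fold the results together in input order with adler32_join(running, job_check, job_inlen). A crc trailer needs a different combining method, which is not shown here.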

Mark




