[Zlib-devel] zlib gzopen_w function added

jbowler at acm.org jbowler at acm.org
Sat Mar 17 12:54:37 EDT 2012


Mark wrote:
>That is exactly what I was concerned about in the first approach 
>(always do a UTF-8 to UTF-16 conversion in gzopen when on Windows),
>hence why I tried to duplicate what appears to be the standard approach in Windows. 

You got the correct solution.  Windows apps are either legacy apps that deal with all the different single- and multi-byte code pages that existed before the Unicode Consortium, or they are post-Unicode (post 1996-2000 in practice), use the "W" APIs, and work in WCHAR.

The (a) (UTF-8) approach seems attractive at first sight, but it doesn't work: most Windows app programmers, just like most Linux programmers, have no idea what code page or character set encoding the file names they get are using.  They just get them and pass them back to the operating system.

As a result, while you might think gzopen() takes a UTF-8 encoded file name, that's not what is happening.  On Windows and Linux alike, gzopen() actually gets a file name in the current code page (or whatever) of the operating system on which it runs.  So long as zlib doesn't parse the name, it is safe.  Converting it to any other form is therefore not safe, and that includes converting to a supposedly unambiguous form like Unicode.
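To illustrate why a blind conversion is risky, here is a minimal sketch.  The file names and the helper looks_like_utf8() are hypothetical and not part of zlib, and the check is deliberately simplified.  It shows that a name stored in a single-byte code page (0xE9 is "é" in Windows-1252) is not even well-formed UTF-8, so a UTF-8 to UTF-16 step would fail on it or mangle it.

```c
#include <stddef.h>

/* Hypothetical file names: the same visible name in two encodings. */
const unsigned char utf8_name[]   = "caf\xC3\xA9.gz"; /* "é" as UTF-8        */
const unsigned char cp1252_name[] = "caf\xE9.gz";     /* "é" as Windows-1252 */

/* Simplified structural UTF-8 check: a lead byte followed by the right
   number of continuation bytes.  It does not reject overlong forms or
   surrogates, so it is an illustration, not a full validator. */
int looks_like_utf8(const unsigned char *s)
{
    while (*s) {
        int extra;

        if (*s < 0x80)                extra = 0;
        else if ((*s & 0xE0) == 0xC0) extra = 1;
        else if ((*s & 0xF0) == 0xE0) extra = 2;
        else if ((*s & 0xF8) == 0xF0) extra = 3;
        else return 0;   /* stray continuation byte or invalid lead byte */
        s++;

        while (extra-- > 0) {
            /* A terminating NUL is not a continuation byte either. */
            if ((*s & 0xC0) != 0x80)
                return 0;
            s++;
        }
    }
    return 1;
}
```

Here looks_like_utf8(utf8_name) is nonzero and looks_like_utf8(cp1252_name) is zero, yet both byte strings are perfectly good file names to hand straight back to the operating system.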

The only thing I would suggest is that you might want to do what windows.h does and add a big #define switch that selects the wide version of gzopen if the wide APIs are selected by the app.  (Most apps never call fooA or fooW directly; they just call foo and get the right A or W interface.)
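A rough sketch of that kind of switch follows.  Every demo_ and DEMO_ name is a hypothetical stand-in so that the example builds anywhere; in zlib the two functions would be gzopen() and gzopen_w(), and the generic name would be whatever the maintainers choose.  The only assumption carried over from windows.h is that the app defines UNICODE to ask for the wide APIs.

```c
#include <stddef.h>
#include <wchar.h>

enum demo_variant { DEMO_NARROW = 0, DEMO_WIDE = 1 };

/* Stand-ins for the narrow and wide entry points.  They only report
   which one was called; they do not open anything. */
enum demo_variant demo_open_a(const char *path, const char *mode)
{
    (void)path; (void)mode;
    return DEMO_NARROW;
}

enum demo_variant demo_open_w(const wchar_t *path, const char *mode)
{
    (void)path; (void)mode;
    return DEMO_WIDE;
}

/* The switch itself: UNICODE selects the W variant, otherwise the A
   variant, in the same style as the foo -> fooA / fooW macros. */
#ifdef UNICODE
#  define DEMO_TEXT(s) L##s
#  define demo_open    demo_open_w
typedef wchar_t demo_char;
#else
#  define DEMO_TEXT(s) s
#  define demo_open    demo_open_a
typedef char demo_char;
#endif

/* Application code uses only the generic names and never mentions A or W. */
enum demo_variant open_archive(void)
{
    const demo_char *path = DEMO_TEXT("data.gz");
    return demo_open(path, "rb");
}
```

Built without UNICODE defined, open_archive() goes through the narrow stand-in; built with -DUNICODE, the same source goes through the wide one.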

John Bowler <jbowler at acm.org>




