[Zlib-devel] [9/8][RFC V3 Patch] SPARC VIS implementation

Jan Seiffert kaffeemonster at googlemail.com
Wed May 4 18:12:31 EDT 2011


Who said one cannot build a faster adler32 with this instruction set?
Wait, that was me...

So here it is, the SPARC VIS version of adler32.
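
For readers who do not have the algorithm in their head, here is a
byte-at-a-time sketch of what is being computed. This is not the code
from the patch and not zlib's adler32(); the name adler32_ref is made up
for illustration, and the modulo is taken on every byte for clarity
rather than speed.

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER_BASE 65521u  /* largest prime smaller than 65536 */

/* Hypothetical reference routine: updates a running Adler-32 value
 * with len bytes from buf. Start with adler = 1 for a fresh checksum. */
static uint32_t adler32_ref(uint32_t adler, const unsigned char *buf, size_t len)
{
    uint32_t s1 = adler & 0xffffu;          /* sum of bytes (plus 1)  */
    uint32_t s2 = (adler >> 16) & 0xffffu;  /* sum of all s1 values   */
    size_t i;

    for (i = 0; i < len; i++) {
        s1 = (s1 + buf[i]) % ADLER_BASE;
        s2 = (s2 + s1) % ADLER_BASE;
    }
    return (s2 << 16) | s1;
}
```

A vector version has to produce the same two sums, just many bytes per
step and with the reduction done less often.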

Numbers on a TI UltraSPARC II:
         -------- orig ------
                a: 0x0CB4B676, 10000 * 160000 bytes     t: 9080 ms
                a: 0x25BEB273, 10000 * 159999 bytes     t: 9590 ms
                a: 0x733CB174, 10000 * 159998 bytes     t: 9580 ms
                a: 0x1144AF76, 10000 * 159996 bytes     t: 10600 ms
                a: 0x3F4ECB8A, 10000 * 159992 bytes     t: 12090 ms
                a: 0x1902A382, 10000 * 159984 bytes     t: 12100 ms
         -------- vec ------
                a: 0x0CB4B676, 10000 * 160000 bytes     t: 5250 ms
                a: 0x25BEB273, 10000 * 159999 bytes     t: 5250 ms
                a: 0x733CB174, 10000 * 159998 bytes     t: 5250 ms
                a: 0x1144AF76, 10000 * 159996 bytes     t: 5250 ms
                a: 0x3F4ECB8A, 10000 * 159992 bytes     t: 5250 ms
                a: 0x1902A382, 10000 * 159984 bytes     t: 5260 ms
         speedup: 1.729524
I have seen about 1.8 on an UltraSPARC IIIi.
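
The reported speedup matches the first row of the table (9080 ms over
5250 ms). As a small sketch, the per-row ratios can be recomputed from
the timings above; the array and function names here are made up for
illustration.

```c
#include <stddef.h>

/* Timings in milliseconds, copied from the table above. */
static const double orig_ms[] = { 9080, 9590, 9580, 10600, 12090, 12100 };
static const double vec_ms[]  = { 5250, 5250, 5250,  5250,  5250,  5260 };

/* Speedup of the vector version over the original for one table row. */
static double speedup(size_t row)
{
    return orig_ms[row] / vec_ms[row];
}
```

speedup(0) gives the 1.729524 printed above; the other rows come out
higher, up to roughly 2.30 for the last one.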

The code is not enabled automatically; you have to define HAVE_VIS
yourself, for several reasons:

- Do not use it on Niagara or similar CPUs. They have a shared FPU
  (T1: 1 FPU for all of the up to 8 cores, T2: 1 FPU for 8 threads),
  and to make matters worse, the code does not seem to work there
  (even the same binary that produces correct results on other SPARC
  CPUs produces wrong results on a T1).
- There is no clear preprocessor define that tells us whether we have
  VIS. Often the toolchain even disguises a sparcv9 as a sparcv8
  (pre-UltraSPARC) so as not to confuse userspace.
- The code has a high startup cost, so it is better not used with
  NO_DIVIDE && NO_SHIFT.
- The code only handles big endian.
We cannot easily provide a dynamic runtime switch. The CPU has its make
and model encoded in the Processor Status Word (PSW), but reading
the PSW is a privileged instruction (same as on PowerPC...).

Pushed to git.

Greetings
Jan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 09-sparc.patch
Type: text/x-patch
Size: 13875 bytes
Desc: not available
URL: <http://madler.net/pipermail/zlib-devel_madler.net/attachments/20110505/1a6ecd67/attachment.bin>

