[Zlib-devel] [7/6][RFC V2.1 Patch] Blackfin implementation
Jan Seiffert
kaffeemonster at googlemail.com
Fri Apr 8 11:10:35 EDT 2011
And again, a big thanks to Mike Frysinger for providing the testing grounds.
The first shot was not that broken; mainly the trailer handling was wrong.
Fudging everything into place so that the trailer can be handled in
vector mode takes lots of instructions, so the trailer is now handled
sequentially; it's only 1-4 bytes.
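
A minimal C sketch of that split, assuming the routine being vectorized is
zlib's Adler-32 (the checksum values and the vorder name suggest it, but
this mail does not say so). It is not the code from the attached patch:
adler32_ref, adler32_blocked and BLOCK are hypothetical names, and a
4-byte block stands in for the real vector width, so the trailer here is
0-3 bytes.

```c
#include <stddef.h>
#include <stdint.h>

#define MOD_ADLER 65521u /* largest prime below 2^16 */
#define NMAX      5552u  /* max bytes between reductions without 32-bit overflow */
#define BLOCK     4u     /* hypothetical stand-in for the vector width */

/* Byte-at-a-time reference, reducing after every byte. */
static uint32_t adler32_ref(uint32_t adler, const uint8_t *buf, size_t len)
{
    uint32_t s1 = adler & 0xffffu;
    uint32_t s2 = adler >> 16;

    while (len--) {
        s1 = (s1 + *buf++) % MOD_ADLER;
        s2 = (s2 + s1) % MOD_ADLER;
    }
    return (s2 << 16) | s1;
}

/* Blocked main loop plus a sequential trailer. */
static uint32_t adler32_blocked(uint32_t adler, const uint8_t *buf, size_t len)
{
    uint32_t s1 = adler & 0xffffu;
    uint32_t s2 = adler >> 16;

    while (len >= BLOCK) {
        /* Take at most NMAX bytes, rounded down to whole blocks. */
        size_t n = len < NMAX ? len : NMAX;
        n -= n % BLOCK;
        len -= n;

        while (n) {
            /* Four byte steps folded into one: weights 4,3,2,1 on the bytes. */
            s2 += BLOCK * s1
                + 4u * buf[0] + 3u * buf[1] + 2u * buf[2] + 1u * buf[3];
            s1 += (uint32_t)buf[0] + buf[1] + buf[2] + buf[3];
            buf += BLOCK;
            n -= BLOCK;
        }
        s1 %= MOD_ADLER;
        s2 %= MOD_ADLER;
    }

    /* Trailer: the few leftover bytes, handled one at a time. */
    while (len--) {
        s1 += *buf++;
        s2 += s1;
    }
    s1 %= MOD_ADLER;
    s2 %= MOD_ADLER;

    return (s2 << 16) | s1;
}
```

The descending weights in the blocked loop play the role that a constant
weight vector would play in a real vector implementation; the trailer loop
stays scalar because so few bytes are left that setting up a partial block
would cost more than it saves.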
I'm a little bit unhappy with the performance, but ENOREGISTER.
Every clever trick I can think of only makes it slower, because it
needs more registers/accumulators. Even stealing from the non-DRegs
means more moves (out of the DRegs and at some point back again), and
there go some cycles, probably with a stall every time.
The numbers:
A Blackfin BF537 at 500 MHz
-------- orig ------
a: 0x0CB4B676, 10000 * 160000 bytes t: 19200 ms
a: 0x25BEB273, 10000 * 159999 bytes t: 19200 ms
a: 0x733CB174, 10000 * 159998 bytes t: 19200 ms
a: 0x1144AF76, 10000 * 159996 bytes t: 19190 ms
a: 0x3F4ECB8A, 10000 * 159992 bytes t: 24150 ms
a: 0x1902A382, 10000 * 159984 bytes t: 24160 ms
-------- vec ------
a: 0x0CB4B676, 10000 * 160000 bytes t: 10690 ms
a: 0x25BEB273, 10000 * 159999 bytes t: 10690 ms
a: 0x733CB174, 10000 * 159998 bytes t: 10700 ms
a: 0x1144AF76, 10000 * 159996 bytes t: 10680 ms
a: 0x3F4ECB8A, 10000 * 159992 bytes t: 10690 ms
a: 0x1902A382, 10000 * 159984 bytes t: 10670 ms
speedup: 1.796071
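
The quoted figure appears to be the ratio of the first row of each table
(19200 ms / 10690 ms); that reading is inferred from the numbers, not
stated in the mail. A small sketch of the arithmetic, with hypothetical
names (speedup, print_speedups) and the timings copied from the tables
above:

```c
#include <stddef.h>
#include <stdio.h>

/* Timings in ms, copied from the "orig" and "vec" tables above. */
static const double orig_ms[] = { 19200, 19200, 19200, 19190, 24150, 24160 };
static const double vec_ms[]  = { 10690, 10690, 10700, 10680, 10690, 10670 };

/* Speedup is simply old time over new time. */
static double speedup(double before_ms, double after_ms)
{
    return before_ms / after_ms;
}

/* Print the ratio for every row of the tables. */
static void print_speedups(void)
{
    for (size_t i = 0; i < sizeof orig_ms / sizeof orig_ms[0]; i++)
        printf("row %u: %f\n", (unsigned)i, speedup(orig_ms[i], vec_ms[i]));
}
```

Calling print_speedups() gives 1.796071 for the first row. The last two
rows, where the original code takes about 24 s, give a larger ratio.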
Putting vorder_o/e into different L1 brings that to 1.84, but since
that eats a precious resource for little gain, I left it out.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 07-bfin.patch
Type: text/x-patch
Size: 8257 bytes
Desc: not available
URL: <http://madler.net/pipermail/zlib-devel_madler.net/attachments/20110408/3f3aa839/attachment.bin>