[Zlib-devel] [7/6][RFC V2.1 Patch] Blackfin implementation
Mike Frysinger
vapier at gentoo.org
Fri Apr 8 16:15:19 EDT 2011
On Friday, April 08, 2011 15:21:44 Jan Seiffert wrote:
> 2011/4/8 Mike Frysinger <vapier at gentoo.org>:
> > the 2nd one does a lot of byte loads, so i wonder if a few more insns but
> > with 16bit (or even 32bit) loads would in practice speed things up.
>
> The second loop is the trailer handling, doing 1 to 4 bytes max. So: it
> doesn't matter.
yeah, ok, you're right of course. that'd be a drop in the bucket as the main
loop would dominate everything else.
> > the core often times is running at 5x or 6x the speed of system devices,
> > so any external memory i/o is probably going to dominate the stalls.
>
> Yes, probably.
> Unfortunately i don't think the CPU can reorder instructions (esp.
> loads) so one can fetch the next 4 Byte while doing calc. on the last
> 4 Byte.
> Or am i wrong?
there is no insn reordering in the Blackfin architecture as doing so explodes
silicon size ... which is bad for embedded. it does have an interlocked
pipeline so that loads/stores are "backgrounded", and the stall doesn't occur
until the result is actually used (if the result is already available by
then, there is no stall at all).
/* A 32bit fetch is put onto the bus and R0 is marked */
R0 = [P0];
/* ... do some stuff without R0 in the pipeline ... */
/* If the load has not yet completed, we stall here */
R0 += 1;
the Blackfin PRM explains this bit of magic starting at "Load/Store Operation"
on page 6-68.
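
in C terms, the same idea can be sketched by issuing the load for the next
word before doing the arithmetic on the current one, so the first use of each
loaded value comes as late as possible. this is only an illustration:
sum_words_pipelined() is a made-up helper, not code from zlib or from the
patch, and the compiler is free to schedule the load differently than the
source order suggests.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from zlib or the patch): sums n 32-bit words,
 * requesting word i before the arithmetic on word i-1 is done. */
uint32_t sum_words_pipelined(const uint32_t *p, size_t n)
{
    uint32_t sum = 0;
    uint32_t cur;
    size_t i;

    if (n == 0)
        return 0;

    cur = p[0];                 /* first load is issued up front */
    for (i = 1; i < n; i++) {
        uint32_t next = p[i];   /* issue the next load early ... */
        sum += cur;             /* ... and work on the previous word meanwhile */
        cur = next;             /* first real use of the newly loaded value */
    }
    return sum + cur;           /* fold in the last word */
}
```

on an interlocked pipeline like the one described above, a stall would only
be expected at the point where the newly loaded value is first consumed, and
only if the load has not completed by then.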
-mike