[Zlib-devel] [7/6][RFC V2.1 Patch] Blackfin implementation
Jan Seiffert
kaffeemonster at googlemail.com
Fri Apr 8 17:32:54 EDT 2011
2011/4/8 Mike Frysinger <vapier at gentoo.org>:
> On Friday, April 08, 2011 15:21:44 Jan Seiffert wrote:
>> 2011/4/8 Mike Frysinger <vapier at gentoo.org>:
[snip]
>> > the core often times is running at 5x or 6x the speed of system devices,
>> > so any external memory i/o is probably going to dominate the stalls.
>>
>> Yes, probably.
>> Unfortunately i don't think the CPU can reorder instructions (esp.
>> loads) so one can fetch the next 4 Byte while doing calc. on the last
>> 4 Byte.
>> Or am i wrong?
>
> there is no insn reordering in the Blackfin architecture as doing so explodes
> silicon size ... which is bad for embedded. it does have an interlocked
> pipeline so that loads/stores are "backgrounded", and the stall doesn't occur
> until the result is actually used (or the result is available in which case
> there is no stall).
>
> /* A 32bit fetch is put onto the bus and R0 is marked */
> R0 = [P0];
> ... do some stuff without R0 in the pipeline ...
> /* If the load has not yet completed, we stall here */
> R0 += 1;
>
> the Blackfin PRM explains this bit of magic starting at "Load/Store Operation"
> on page 6-68.
Nice, but, hmmm.
I tried that by moving the loads 4 instructions ahead of their use. It helped,
but not very much (1.7 -> 1.9). To do that I had to use another loop, like I
envisioned this afternoon, but after adding all the n-- it needs, it's
slower than the loop in the patch (1.7 -> 1.5).
Argh, I can only issue an Ireg -= Mreg in parallel, but I can't MAC by an Ireg.
Two MACs, nice, but if you can't feed them...
I must be missing something.
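
To make the "load early, use late" idea concrete, here is a rough sketch in
portable C rather than Blackfin assembly. It is not the code from the patch:
the names adler32_simple, adler32_pipelined, adler_step4 and load32_le are
made up for illustration, and whether a compiler actually keeps the early
load ahead of the arithmetic on Blackfin is an assumption, not something
measured here. The point is only the loop shape: fetch the next 4 bytes
before summing the current 4, so the load can complete in the background.

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER_BASE 65521u /* largest prime below 2^16 */

/* Reference: plain byte-at-a-time Adler-32. */
static uint32_t adler32_simple(uint32_t adler, const unsigned char *buf,
                               size_t len)
{
    uint32_t s1 = adler & 0xffff, s2 = adler >> 16;
    while (len--) {
        s1 = (s1 + *buf++) % ADLER_BASE;
        s2 = (s2 + s1) % ADLER_BASE;
    }
    return (s2 << 16) | s1;
}

/* Hypothetical helper: assemble 4 bytes so that byte 0 is the low byte.
 * On a little-endian target this can compile to a single 32-bit load. */
static uint32_t load32_le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: fold the 4 bytes of w into the sums, in memory order. */
static void adler_step4(uint32_t w, uint32_t *s1, uint32_t *s2)
{
    for (int i = 0; i < 4; i++) {
        *s1 += (w >> (8 * i)) & 0xff;
        *s2 += *s1;
    }
    /* One reduction per word is enough; the sums cannot overflow 32 bits. */
    *s1 %= ADLER_BASE;
    *s2 %= ADLER_BASE;
}

/* Software-pipelined variant: the load for the next word is issued
 * before the arithmetic on the current word. */
static uint32_t adler32_pipelined(uint32_t adler, const unsigned char *buf,
                                  size_t len)
{
    uint32_t s1 = adler & 0xffff, s2 = adler >> 16;
    size_t words = len / 4;
    size_t tail = len % 4;

    if (words) {
        uint32_t cur = load32_le(buf);        /* prologue: first load */
        buf += 4;
        while (--words) {
            uint32_t next = load32_le(buf);   /* start the next load early */
            buf += 4;
            adler_step4(cur, &s1, &s2);       /* work that does not need next */
            cur = next;
        }
        adler_step4(cur, &s1, &s2);           /* epilogue: last word */
    }
    while (tail--) {                          /* leftover 1..3 bytes */
        s1 = (s1 + *buf++) % ADLER_BASE;
        s2 = (s2 + s1) % ADLER_BASE;
    }
    return (s2 << 16) | s1;
}
```

Both functions should return the same checksum for any input, e.g.
0x11E60398 for the 9 bytes of "Wikipedia" starting from adler = 1. The extra
prologue/epilogue and loop bookkeeping is the same kind of overhead as the
added n-- mentioned above, so this shape only pays off if the hidden load
latency outweighs it.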
> -mike
>
Greetings
Jan
--
Murphy's Law of Combat
Rule #3: "Never forget that your weapon was manufactured by the
lowest bidder"