[PD] Raspberry Pi does denormals

Wed Jan 23 13:47:42 CET 2013

Finally some good news on this topic. Earlier I stated that 'big or
small tests' are expensive for the Pi, but that is not by definition
the case. There must have been other conditions blurring my
impression. I've now done a systematic test where other influences are
ruled out. I made a test class [lopass~] with exactly the same routine
as [lop~], but compiled with the PD_BIGORSMALL() macro enabled, and
verified that [lopass~] is not affected by denormals. A performance
comparison of [lop~] and [lopass~] shows that both objects cause
equivalent CPU load. In other words, the Raspberry Pi gives the 'big or
small checks' for free! At least in the case of this simple filter.
Please try the attached bigorsmalltest.zip on the Pi to check that I'm
not dreaming.
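
For readers without the zip at hand, here is a minimal sketch of the
kind of routine being compared: a one-pole lowpass whose stored state
is flushed to zero when it is 'big or small'. This is not the code from
lopass~.c or from Pd itself. All names (sketch_lop, sketch_lop_perform,
sketch_big_or_small) are made up for illustration, and applying the
check once per block to the filter state is an assumption. The bit test
flags any 32-bit float whose top two exponent bits are equal, which
covers zero, denormals, inf and nan.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not the Pd macro: true when the top two exponent
   bits of a 32-bit float are both 0 (magnitude below 2^-63, about 1e-19,
   including 0 and denormals) or both 1 (magnitude of 2^65 or more,
   about 3.7e19, including inf and nan). */
static int sketch_big_or_small(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* read the float's bit pattern */
    bits &= 0x60000000u;              /* keep the top two exponent bits */
    return bits == 0 || bits == 0x60000000u;
}

/* Hypothetical filter state: coefficient and last output sample. */
typedef struct { float coef; float last; } sketch_lop;

/* One-pole lowpass over a block of n samples. */
static void sketch_lop_perform(sketch_lop *x, const float *in,
    float *out, int n)
{
    float coef = x->coef;
    float feedback = 1.0f - coef;
    float last = x->last;
    int i;
    for (i = 0; i < n; i++)
        last = out[i] = coef * in[i] + feedback * last;
    if (sketch_big_or_small(last))    /* assumed: one check per block */
        last = 0.0f;                  /* keep denormals out of the state */
    x->last = last;
}
```

With coef = 0.5 and silence as input, a state of 1.0 decays to 2^-64
after 64 samples, which the check flags, so the stored state is reset
to zero before it can reach the denormal range.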

While I was on the topic anyway, I also tried a big or small test with
a union instead of direct type aliasing through a pointer cast. The
advantage is that the code stays valid when the compiler applies
strict aliasing rules. This test with unions did not cause extra CPU
load on the Pi either. If you want to verify this result, enable the
call to bigorsmall() instead of PD_BIGORSMALL in lopass~.c and
recompile.
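
As an illustration of the union approach, here is a minimal sketch. It
is not the bigorsmall() function from lopass~.c, whose body is not
shown in this mail. The names sketch_pun32 and sketch_bigorsmall_union
are hypothetical, and the mask assumes a 32-bit IEEE 754 float.

```c
#include <stdint.h>

/* Hypothetical union for viewing a float's bits as an unsigned int. */
typedef union { float f; uint32_t ui; } sketch_pun32;

/* True when the top two exponent bits are both 0 or both 1, i.e. the
   magnitude is below 2^-63 or at least 2^65 (covers 0, denormals,
   inf and nan). */
static int sketch_bigorsmall_union(float f)
{
    sketch_pun32 pun;
    pun.f = f;                /* write through the float member ... */
    pun.ui &= 0x60000000u;    /* ... read the bits through the int member */
    return (pun.ui == 0) | (pun.ui == 0x60000000u);
}
```

Reading a union member other than the one last written is permitted in
C99 and later, and gcc documents it as supported, so this version does
not rely on casting a float pointer to an int pointer.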

The fact that these tests do not cause extra CPU load suggests that
they are done in parallel with other instructions. Float and int
registers are apparently strictly separated on armv6; there is nothing
like Intel's xmm registers or armv7's NEON. As it happens, the big or
small tests are done on ints, aliases of the floats that must be
tested. Initially I assumed that the transport of floats from the VFP
to the ARM integer core would be expensive, but if the instructions
are executed simultaneously it may be an advantage instead.
Another thing is that ARM offers branch predication (conditional
execution) in addition to branch prediction. Those terms look almost
the same but the mechanisms are very different. With predication there
is no branch at all: the instructions of the conditional path are
issued with a condition attached, and when the condition fails their
result is simply discarded.

Conclusions from the limited test with [lop~] and [lopass~] do not
mean that all sorts of conditional checks are cheap on the Pi, or on
ARM in general. If PD_BIGORSMALL is enabled for the RPi via the
compile-time definition __arm__, it will also be enabled on armv7,
where the results may be very different. At the moment I do not have
access to an armv7 device. Maybe someone can recompile the test class
[lopass~] and run the tests on a Beagleboard or Cubieboard? Otherwise I
may be able to do it on my friend's PengPod once that has arrived.
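
If the check should be limited to armv6 until armv7 has been measured,
the build could tell the two apart at compile time. The sketch below
uses gcc's predefined __ARM_ARCH_6*__ macros. SKETCH_IS_ARMV6 and
sketch_is_armv6() are hypothetical names, not part of Pd, and the list
of architecture macros is probably not exhaustive. Which macro is set
depends on the -march/-mcpu options, so it is worth inspecting the
output of 'gcc -dM -E - < /dev/null' on the target first.

```c
/* Hypothetical switch, not a Pd macro: 1 when building for an armv6
   target such as the Pi's ARM1176JZF-S, 0 otherwise. */
#if defined(__arm__) && (defined(__ARM_ARCH_6__) \
    || defined(__ARM_ARCH_6J__) || defined(__ARM_ARCH_6K__) \
    || defined(__ARM_ARCH_6Z__) || defined(__ARM_ARCH_6ZK__) \
    || defined(__ARM_ARCH_6KZ__))
#define SKETCH_IS_ARMV6 1
#else
#define SKETCH_IS_ARMV6 0   /* armv7 and everything else */
#endif

/* Lets other code (or a test) query the compile-time decision. */
static int sketch_is_armv6(void)
{
    return SKETCH_IS_ARMV6;
}
```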

Katja

On Tue, Jan 22, 2013 at 8:54 PM, Miller Puckette <msp at ucsd.edu> wrote:
> thanks - I'd better try this and find out what's going on :)
>
> M
>
> On Mon, Jan 21, 2013 at 11:54:29AM +0100, katja wrote:
>> Tried the 0.44.0 build from your website. It has the same issue with
>> subnormal values. My test patch is with [lop~]. If inf or nan is fed
>> into [lop~], these 'values' keep circulating in the object, it can no
>> longer process normal signal values.
>>
>> I also tried my reverb stuff with specific compiler options for Pi's processor:
>>
>> -march=armv6zk
>> -mcpu=arm1176jzf-s
>> -mtune=arm1176jzf-s
>>
>> With these options, gcc should be able to decide that RunFast mode is
>> permitted. But even in combination with -ffast-math (which in turn
>> sets -funsafe-math-optimizations and -fno-trapping-math amongst
>> others), denormals are still there. I'm literally out of options for
>> the moment. Sorry for not having better news.
>>
>> Katja
>>
>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bigorsmalltest.zip
Type: application/zip
Size: 47531 bytes
Desc: not available
URL: <http://lists.puredata.info/pipermail/pd-list/attachments/20130123/bcd22eaa/attachment-0001.zip>