<span name="katja" class="gD"></span>Excuse my ignorance:<br>I&#39;m not sure how to start the version of Pd below on the RPi.<br><br>I have the full path, but then what?<br><br>If I run (on the command line)<br>pd /place/where/new/pd/is/bin/pd<br>

It signals watchdog.<br><br>I also still have regular pd 0.44.0 installed btw.<br><br>Sorry if this is dumb dumb dumb dumb Duuummmbbb.<br><br>Jb<br><br><div class="gmail_quote">On 24 January 2013 09:14, katja <span dir="ltr">&lt;<a href="mailto:katjavetter@gmail.com" target="_blank">katjavetter@gmail.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">&#39;Undenormalized&#39; Pd build for Raspberry Pi is temporarily parked here<br>

for testing purposes (will be removed when Miller&#39;s release is fixed<br>

in this respect):<br>

<br>

<a href="http://www.katjaas.nl/temp/pd-0.44-0-normalized.tar.gz" target="_blank">www.katjaas.nl/temp/pd-0.44-0-normalized.tar.gz</a><br>

<br>

This is a locally installed Pd, like Miller&#39;s distribution. You can<br>

start it from command line with the full path to<br>

pd-0.44-0-normalized/bin/pd. It&#39;s not a .deb, so it can&#39;t be installed<br>

under supervision of the package manager.<br>

<span class="HOEnZb"><font color="#888888"><br>

Katja<br>

</font></span><div class="HOEnZb"><div class="h5"><br>

<br>

On Wed, Jan 23, 2013 at 9:15 PM, Julian Brooks &lt;<a href="mailto:jbeezez@gmail.com">jbeezez@gmail.com</a>&gt; wrote:<br>

&gt; Hey Katja,<br>

&gt;<br>

&gt; Would you mind sharing the &#39;normalised&#39; Pd-0.44.0 for RPi, please?<br>

&gt;<br>

&gt; Cheers,<br>

&gt;<br>

&gt; Julian<br>

&gt;<br>

&gt;<br>

&gt;<br>

&gt; On 23 January 2013 18:23, katja &lt;<a href="mailto:katjavetter@gmail.com">katjavetter@gmail.com</a>&gt; wrote:<br>

&gt;&gt;<br>

&gt;&gt; Now I recompiled the Pd-0.44.0 release on Raspberry Pi (took me a few<br>

&gt;&gt; hours, not only because Pi is so slow) with PD_BIGORSMALL enabled for<br>

&gt;&gt; arm in m_pd.h. Using bigorsmalltest.pd from my previous mail I<br>

&gt;&gt; verified that the macro is indeed active.<br>

&gt;&gt;<br>

&gt;&gt; Martin Brinkmann&#39;s patch chaosmonster1<br>

&gt;&gt; (<a href="http://www.martin-brinkmann.de" target="_blank">http://www.martin-brinkmann.de</a>) gives a beautiful illustration of the<br>

&gt;&gt; improvement. This patch is full of filters and delay lines. At its<br>

&gt;&gt; initial settings, there is no subnormals problem. But if you set the<br>

&gt;&gt; bottom slider to the right, it gets silent. With the Pd-0.44-0 release,<br>

&gt;&gt; CPU load explodes. With the &#39;normalized&#39; Pd, nothing special happens.<br>

&gt;&gt;<br>

&gt;&gt; And indeed, the PD_BIGORSMALL conditional checks come for free: with<br>

&gt;&gt; initial settings of the chaosmonster1, performance is equivalent in<br>

&gt;&gt; both Pd builds. Cool! Hopefully this is similar on armv7.<br>

&gt;&gt;<br>

&gt;&gt; Katja<br>

&gt;&gt;<br>

&gt;&gt;<br>

&gt;&gt;<br>

&gt;&gt; On Wed, Jan 23, 2013 at 5:01 PM, Hans-Christoph Steiner &lt;<a href="mailto:hans@at.or.at">hans@at.or.at</a>&gt;<br>

&gt;&gt; wrote:<br>

&gt;&gt; &gt;<br>

&gt;&gt; &gt; hey Katja,<br>

&gt;&gt; &gt;<br>

&gt;&gt; &gt; This also sounds like good evidence for your idea of writing C code that<br>

&gt;&gt; &gt; modern compilers optimize well.  Using unions for aliasing allows the<br>

&gt;&gt; &gt; compiler<br>

&gt;&gt; &gt; to do all the new tricks, then writing loops that auto-vectorize gives<br>

&gt;&gt; &gt; us the<br>

&gt;&gt; &gt; real benefits.  Also, I think we can see some gains by using memcpy()<br>

&gt;&gt; &gt; since on<br>

&gt;&gt; &gt; modern libc versions, those are highly optimized for the given CPU,<br>

&gt;&gt; &gt; dynamically<br>

&gt;&gt; &gt; choosing the routines based on what instructions are available. memcpy<br>

&gt;&gt; &gt; will<br>

&gt;&gt; &gt; use things like SSSE3 if it&#39;s available.<br>

&gt;&gt; &gt;<br>

&gt;&gt; &gt; .hc<br>

&gt;&gt; &gt;<br>

&gt;&gt; &gt; On 01/23/2013 07:47 AM, katja wrote:<br>

&gt;&gt; &gt;&gt; Finally some good news on this topic. Earlier I stated that &#39;big or<br>

&gt;&gt; &gt;&gt; small tests&#39; are expensive for the Pi, but that is not by definition<br>

&gt;&gt; &gt;&gt; the case. There must have been other conditions blurring my<br>

&gt;&gt; &gt;&gt; impression. I&#39;ve now done a systematic test where other influences are<br>

&gt;&gt; &gt;&gt; ruled out. A test class [lopass~] with exactly the same routine as<br>

&gt;&gt; &gt;&gt; [lop~] was made, but compiled with the PD_BIGORSMALL() macro enabled. It<br>

&gt;&gt; &gt;&gt; was verified that [lopass~] is not affected by denormals. Performance<br>

&gt;&gt; &gt;&gt; comparison of [lop~] and [lopass~] shows that both objects cause<br>

&gt;&gt; &gt;&gt; equivalent CPU load. Meaning, Raspberry Pi gives the &#39;big or small<br>

&gt;&gt; &gt;&gt; checks&#39; for free! At least in the case of this simple filter. Please<br>

&gt;&gt; &gt;&gt; try attached bigorsmalltest.zip on the Pi to see if I&#39;m not dreaming.<br>

&gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; While I was at the topic anyway, I also tried a big or small test with<br>

&gt;&gt; &gt;&gt; a union instead of direct type aliasing. It has the advantage that the<br>

&gt;&gt; &gt;&gt; compiler can apply strict aliasing rules. This test with unions did<br>

&gt;&gt; &gt;&gt; not cause extra CPU load either on the Pi. If you want to verify this<br>

&gt;&gt; &gt;&gt; result, enable the call to bigorsmall() instead of PD_BIGORSMALL in<br>

&gt;&gt; &gt;&gt; lopass~.c and recompile.<br>

&gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; The fact that these tests do not cause extra CPU load indicates that<br>

&gt;&gt; &gt;&gt; they are done in parallel with other instructions. Float and int<br>

&gt;&gt; &gt;&gt; registers are apparently strictly separated on armv6, there&#39;s no such<br>

&gt;&gt; &gt;&gt; thing as Intel&#39;s xmm registers or armv7&#39;s NEON. As it happens, the<br>

&gt;&gt; &gt;&gt; big or small tests are done on ints, aliases of the floats that must<br>

&gt;&gt; &gt;&gt; be tested. Initially I assumed that the transport of floats from vfp<br>

&gt;&gt; &gt;&gt; to the arm integer processor would be expensive, but if the<br>

&gt;&gt; &gt;&gt; instructions are done simultaneously it may be an advantage instead.<br>

&gt;&gt; &gt;&gt; Another thing is that ARM offers predication (conditional execution) in<br>

&gt;&gt; &gt;&gt; addition to branch prediction. Those terms look almost the same but the<br>

&gt;&gt; &gt;&gt; mechanisms are very different. With predication, instructions for both<br>

&gt;&gt; &gt;&gt; paths are issued, and those whose condition fails simply have no effect.<br>

&gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; Conclusions from the limited test with [lop~] and [lopass~] do not<br>

&gt;&gt; &gt;&gt; mean that all sorts of conditional checks are cheap on the Pi, or on<br>

&gt;&gt; &gt;&gt; ARM in general. If PD_BIGORSMALL is enabled for RPi using compile-time<br>

&gt;&gt; &gt;&gt; definition __arm__, it will also hold for armv7, but it may have very<br>

&gt;&gt; &gt;&gt; different results there. At the moment I have no access yet to an armv7<br>

&gt;&gt; &gt;&gt; device. Maybe someone can recompile test class [lopass~] and do the<br>

&gt;&gt; &gt;&gt; tests on Beagleboard or Cubieboard? Otherwise I may be able to do it<br>

&gt;&gt; &gt;&gt; on my friend&#39;s PengPod when that has arrived.<br>

&gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; Katja<br>

&gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt;<br>

&gt;&gt; &gt;&gt; On Tue, Jan 22, 2013 at 8:54 PM, Miller Puckette &lt;<a href="mailto:msp@ucsd.edu">msp@ucsd.edu</a>&gt; wrote:<br>

&gt;&gt; &gt;&gt;&gt; thanks - I&#39;d better try this and find out what&#39;s going on :)<br>

&gt;&gt; &gt;&gt;&gt;<br>

&gt;&gt; &gt;&gt;&gt; M<br>

&gt;&gt; &gt;&gt;&gt;<br>

&gt;&gt; &gt;&gt;&gt; On Mon, Jan 21, 2013 at 11:54:29AM +0100, katja wrote:<br>

&gt;&gt; &gt;&gt;&gt;&gt; Tried the 0.44.0 build from your website. It has the same issue with<br>

&gt;&gt; &gt;&gt;&gt;&gt; subnormal values. My test patch is with [lop~]. If inf or nan is fed<br>

&gt;&gt; &gt;&gt;&gt;&gt; into [lop~], these &#39;values&#39; keep circulating in the object, so it can no<br>

&gt;&gt; &gt;&gt;&gt;&gt; longer process normal signal values.<br>

&gt;&gt; &gt;&gt;&gt;&gt;<br>

&gt;&gt; &gt;&gt;&gt;&gt; I also tried my reverb stuff with specific compiler options for Pi&#39;s<br>

&gt;&gt; &gt;&gt;&gt;&gt; processor:<br>

&gt;&gt; &gt;&gt;&gt;&gt;<br>

&gt;&gt; &gt;&gt;&gt;&gt; -march=armv6zk<br>

&gt;&gt; &gt;&gt;&gt;&gt; -mcpu=arm1176jzf-s<br>

&gt;&gt; &gt;&gt;&gt;&gt; -mtune=arm1176jzf-s<br>

&gt;&gt; &gt;&gt;&gt;&gt;<br>

&gt;&gt; &gt;&gt;&gt;&gt; With these options, gcc should be able to decide that RunFast mode is<br>

&gt;&gt; &gt;&gt;&gt;&gt; permitted. But even in combination with -ffast-math (which in turn<br>

&gt;&gt; &gt;&gt;&gt;&gt; sets -funsafe-math-optimizations and -fno-trapping-math amongst<br>

&gt;&gt; &gt;&gt;&gt;&gt; others), denormals are still there. I&#39;m literally out of options for<br>

&gt;&gt; &gt;&gt;&gt;&gt; the moment. Sorry for not having better news.<br>

&gt;&gt; &gt;&gt;&gt;&gt;<br>

&gt;&gt; &gt;&gt;&gt;&gt; Katja<br>

&gt;&gt; &gt;&gt;&gt;&gt;<br>

&gt;&gt; &gt;&gt;&gt;&gt;<br>

&gt;&gt;<br>

&gt;&gt; _______________________________________________<br>

&gt;&gt; <a href="mailto:Pd-list@iem.at">Pd-list@iem.at</a> mailing list<br>

&gt;&gt; UNSUBSCRIBE and account-management -&gt;<br>

&gt;&gt; <a href="http://lists.puredata.info/listinfo/pd-list" target="_blank">http://lists.puredata.info/listinfo/pd-list</a><br>

&gt;<br>

&gt;<br>

</div></div></blockquote></div><br>
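<br>A minimal C sketch of the union-based big-or-small test that Katja describes in the quoted thread. The names t_pun_sketch, bigorsmall_sketch and lop_step_sketch are hypothetical, and the 0x60000000 mask on the two top exponent bits is an assumption about how such a test can be written, not a copy of m_pd.h or lopass~.c. The filter step is a generic one-pole lowpass, not necessarily the exact [lop~] routine. It assumes a 32-bit IEEE 754 float and a 32-bit unsigned int:<br>
<pre>
```c
/* Hypothetical union for type punning: reads the bits of a float as an
   unsigned int without casting pointers. Assumes 32-bit IEEE 754 float
   and 32-bit unsigned int. */
typedef union {
    float f;
    unsigned int ui;
} t_pun_sketch;

/* Returns 1 when the two top exponent bits are equal:
   both 0 means very small (including zero and subnormals),
   both 1 means very large (including inf and NaN). Returns 0 otherwise. */
static int bigorsmall_sketch(float f)
{
    t_pun_sketch pun;
    unsigned int top;
    pun.f = f;
    top = pun.ui & 0x60000000u;
    return (top == 0u) || (top == 0x60000000u);
}

/* One step of a generic one-pole lowpass. The new state is flushed to
   zero when it falls outside the normal working range, so subnormals,
   inf and NaN cannot keep circulating in the feedback path. */
static float lop_step_sketch(float last, float in, float coef)
{
    float out = coef * in + (1.0f - coef) * last;
    return bigorsmall_sketch(out) ? 0.0f : out;
}
```
</pre>
With this sketch, a normal value such as 1.0f is not flagged, while a filter state that has decayed to something like 1e-30f is flagged and reset to zero on the next step.<br>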