[PD] what makes Pd-extended 0.43 so CPU-hungry?

katja katjavetter at gmail.com
Fri Dec 7 22:36:15 CET 2012


Finally I have some clue what's wrong with Pd-E 0.43 for GNU/Linux, or
for Debian Squeeze at least. Sorry that it took me so long to sit down
and sort it out.

The problem is still there with version 0.43.4: my live performance
setups run with almost double the CPU load compared to 0.42. I have
now also tested some comprehensive patches which are known to be pure
vanilla, like Martin Brinkmann's 'chaosmonster'. Remarkably, these
patches do not show an increased CPU load. Therefore I guessed that
the cause must be in external classes.

I tried using callgrind and kcachegrind (thanks for the hint, Jamie).
Callgrind makes Pd choke completely, since it records the complete
call history of a process instead of taking samples, but the output
gave a clue. Freeverb~ was shown to make a couple hundred function
calls within the perform loop, to functions which are declared
'inline' in the C file. An isolated freeverb~ instance turned out to
take 10% CPU load. Admittedly, this computer (1.8 GHz Core Duo, 2006)
is not the latest, but freeverb~ normally takes some 1% per instance.
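
To illustrate the pattern, here is a minimal sketch of a perform loop
that calls a small 'static inline' helper once per sample. It is not
the freeverb~ source: comb_t, comb_tick() and perform() are
hypothetical names, and the 64-sample delay line is an arbitrary
choice. With optimization enabled a compiler can paste the helper's
body into the loop; without it, each sample typically costs a real
function call.

```c
#include <stddef.h>

#define COMB_LEN 64   /* arbitrary delay length for this sketch */

/* Hypothetical one-sample comb filter, written 'static inline' the
   way small DSP helpers often are. Not the freeverb~ source. */
typedef struct {
    float  buf[COMB_LEN];  /* delay line */
    size_t pos;            /* current read/write index */
    float  feedback;       /* feedback gain */
} comb_t;

static inline float comb_tick(comb_t *c, float in)
{
    float out = c->buf[c->pos];               /* read delayed sample */
    c->buf[c->pos] = in + out * c->feedback;  /* write input + feedback */
    c->pos = (c->pos + 1) % COMB_LEN;         /* advance circular index */
    return out;
}

/* Perform loop: one comb_tick() per sample. Whether this becomes an
   inlined body or a 'call' instruction depends on the compiler flags. */
void perform(comb_t *c, const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = comb_tick(c, in[i]);
}
```

A reverb built from several such filters multiplies the number of
calls per sample, which is how a missing inlining step can show up as
a large difference in CPU load.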

So, freeverb~ is the messenger; without it I might not have noticed
any problem. But what is the message? Is Pd-E 0.43 compiled without
optimization? I searched for more inline functions in external libs,
and found one in bsaylor/svf~. In this case again, the executable
implements it as a call. The core code however is almost certainly
compiled with properly inlined functions. There's one frequently
called inline function in the API (PD_BIGORSMALL, which used to be a
macro in the past). If this were compiled as a call, a patch like
'chaosmonster' would definitely show a performance loss.
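
For readers who have not seen this kind of helper, the following is a
hedged sketch in the spirit of PD_BIGORSMALL, not a copy of the Pd
header. It assumes the check works by looking at the exponent bits of
a 32-bit IEEE float; big_or_small() and flush_state() are hypothetical
names. A function this small is only cheap when it is inlined, since
it may run once per sample in many objects.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 when the exponent of f is far from zero (very large, very
   small, zero, inf or NaN), 0 for ordinary audio-range values.
   Hypothetical stand-in in the spirit of PD_BIGORSMALL. */
static inline int big_or_small(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* well-defined type pun */
    /* Compare the two top exponent bits: if they are equal, the biased
       exponent is either below 64 or at least 192. */
    return (bits & 0x20000000u) == ((bits >> 1) & 0x20000000u);
}

/* Typical use: reset a filter state that drifted to an extreme value,
   for example into the denormal range. */
float flush_state(float state)
{
    return big_or_small(state) ? 0.0f : state;
}
```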

Note that I'm talking about debian binaries so far, more precisely
Pd-E 0.43.4 for debian squeeze, as downloaded from puredata.info
downloads page. In contrast, I checked freeverb~ in the distribution
for OSX i386, and here the inlining was done properly.

Another difference between those distributions: SSE instructions are
used for OSX, not for debian. Simple operations like addition and
multiplication of floats are done on the FPU in debian, while xmm
registers are used with OSX. This also means that things like abs()
and isnan() are function calls for debian, while they could be simple
instructions on the xmm registers. (Instructions can be viewed by
disassembling executables with the command objdump -d <file>.)
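
A small test file makes this kind of comparison easy to repeat. The
sketch below is an assumption-laden example, not code from any
external: rectify_block() and the file name rectify.c are
hypothetical. The GCC flags in the comment (-O0, -O2, -msse2,
-mfpmath=sse) do exist, but which flags the Pd-extended builds
actually used is not known here.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical per-block kernel: rectify the signal and replace any
   NaN by zero. It uses isnan() and fabsf() on every sample, so the
   disassembly shows how those two are implemented in a given build. */
void rectify_block(const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float x = in[i];
        out[i] = isnan(x) ? 0.0f : fabsf(x);
    }
}

/* To compare the generated code (file name rectify.c is an assumption):
 *
 *   gcc -O0 -c rectify.c -o rectify_O0.o
 *   gcc -O2 -msse2 -mfpmath=sse -c rectify.c -o rectify_sse.o
 *   objdump -d rectify_O0.o
 *   objdump -d rectify_sse.o
 *
 * In each listing, check whether the loop uses xmm registers or x87
 * FPU instructions, and whether it contains 'call' instructions.
 */
```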

My conclusion from these observations: at least some Pd 0.43 externals
for debian squeeze are compiled with -O0 for some reason (don't know
about other Linuxes). How come? The template makefile (also used for
freeverb~) has optimization -O6. The root makefile for the packages
has certain optimization flags as well. Are they somehow conflicting,
producing an undefined result? Not for OSX, apparently. But for debian
something goes wrong. The build system stuff is really over my head,
hopefully someone else has a better overview to find the exact cause.
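
One way to narrow this down without reading makefiles is to let the
compiler report its own settings. GCC's documentation states that when
several -O options are given, the last one is the effective one, so a
flag appended late on the command line can silently override -O6. The
sketch below relies on macros that GCC predefines (__OPTIMIZE__,
__NO_INLINE__, __SSE_MATH__); the function names are hypothetical, and
other compilers may not define these macros at all.

```c
/* Report how this translation unit was compiled, based on macros that
   GCC predefines. Function names are hypothetical. */

int built_with_optimization(void)
{
#ifdef __OPTIMIZE__
    return 1;   /* an optimizing -O level was in effect */
#else
    return 0;   /* -O0, no -O flag, or a compiler without this macro */
#endif
}

int built_with_inlining(void)
{
#ifdef __NO_INLINE__
    return 0;   /* compiler will not attempt to inline functions */
#else
    return 1;   /* inlining not disabled (or macro not supported) */
#endif
}

int built_with_sse_math(void)
{
#ifdef __SSE_MATH__
    return 1;   /* float arithmetic is done in SSE registers */
#else
    return 0;   /* x87 FPU math, or a target without this macro */
#endif
}
```

Compiled into an external with the same flags as the rest of its
source, functions like these could post their results to the Pd
console at load time. Alternatively, running make with the command
echo enabled shows the full compiler line, including the order of the
-O flags.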

Katja




On 5/6/12, Jamie Bullock <jamie.b.bullock at gmail.com> wrote:
>
> Hi Katja,
>
>
> On 5 May 2012, at 20:43, katja <katjavetter at gmail.com> wrote:
>
>>
>>
>> I've tried to use Oprofile on Debian, but this gives me a kernel
>> failure soon as I start sampling. Does anyone know of a fine
>> performance profiler for GNU/Linux?
>>
>> Katja
>>
>>
>
> You might want to try callgrind + kcachegrind...
>
> http://www.slac.stanford.edu/BFROOT/www/Computing/Optimization/genprof.html
>
> best,
>
> Jamie
>
> --
> http://www.jamiebullock.com
>
>
>


