[PD] Findings regarding performance

Mathieu Bouchard matju at artengine.ca
Thu Dec 1 17:09:07 CET 2011


On 2011-12-01 at 15:24:00, Roman Haefeli wrote:

> reason, let's just use an invented arbitrary unit for expressing the CPU
> time (ct) consumed by an object. It turned out that [gate~] uses 0.52ct
> when it is on and 0.4ct when it is off. But how much does [*~ ] use? No
> matter whether turned on or off, [*~ ] uses a stable 0.39ct.

Even though [switch~] does not use tight conditionals, instead checking 
only once per block, it still takes some time copying stuff and switching 
contexts. I suppose that this is the kind of thing that Pd could do more 
efficiently than it does now, if someone is brave enough to edit 
d_ugen.c... and knows how to do it.

Float multiplications are often so fast nowadays that reading and writing 
RAM often costs more than the multiplication itself. A lot of machinery in 
the CPU is dedicated to multiplying floats as fast as possible.

> But the really interesting finding comes now. [*~ 0] has only 0.2ct!
> Almost the ct value of a plain [*~ ] halved!

I already explained that on IRC. What part of the explanation was 
missing? Here's a copy-paste from the chat.

« that's the difference between times_dsp and scalartimes_dsp... the 
latter is «obviously» faster... well, it does only ⅔ as much data 
transfer during the perform-function, so it would make sense. 
benchmarks might say otherwise, or not. _but_ note that sending a float 
message to [*~] also means copying that value N times (N=block size... 
64 or other) whereas in the scalartimes class such as [*~ 42], sending 
a float (in right inlet) copies that value only once. »
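
To make the ⅔ figure concrete, here is a rough C sketch of the two kinds of 
perform loop. It is not the actual Pd source: every name ending in _sketch 
and both *_transfers helpers are made up for illustration, and the count 
only covers block reads and writes, ignoring caches and vectorisation.

```c
#include <stddef.h>

/* Hypothetical stand-in for the signal*signal case, as in [*~ ]:
   each sample reads in1[i] and in2[i] and writes out[i]. */
void times_perform_sketch(const float *in1, const float *in2,
                          float *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in1[i] * in2[i];
}

/* Hypothetical stand-in for the signal*scalar case, as in [*~ 42]:
   each sample reads in[i] and writes out[i]; g is a single float
   that can stay in a register for the whole block. */
void scalartimes_perform_sketch(const float *in, float g,
                                float *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * g;
}

/* Assumed model of a float arriving at a signal inlet:
   the value is copied into all n samples of a block. */
void fill_block_sketch(float value, float *block, size_t n)
{
    for (size_t i = 0; i < n; i++)
        block[i] = value;
}

/* Float transfers per block under this simple model. */
size_t times_transfers(size_t n)       { return 3 * n; } /* 2 reads + 1 write */
size_t scalartimes_transfers(size_t n) { return 2 * n; } /* 1 read  + 1 write */
```

With N = 64 this model gives 192 transfers for the signal*signal loop 
against 128 for the scalar one, which is the ⅔ mentioned above. Counting 
inputs only, it is 128 reads against 64.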

and later:

« i did not verify it, but it does use twice as much input data. This 
predicts a possible 3/2 ratio, but there are other reasons why it could be 
a 2/1 ratio... it depends on the implementation of the cpu itself »

« when multiplications are fast, then the bottleneck is to get the data to 
move around, and if you have twice as much data, it can be twice as slow. 
In some other situations, it can be even worse than twice as slow. Those 
behaviours are harder to understand than ever because optimisation tricks 
pile up ever more. »

I mean optimisation tricks of CPUs and motherboards (caches, RAM chips, 
etc).

Are you receiving what I write on IRC?

  ______________________________________________________________________
| Mathieu BOUCHARD ----- téléphone : +1.514.383.3801 ----- Montréal, QC

