[PD] Findings regarding performance

Thu Dec 1 17:37:47 CET 2011

Hi Matju

> Are you receiving what I write on IRC ?

Not always, I think, but I remember the parts you posted below. 

My question ("Is there something more efficient to turn signals on and
off than [*~]?") and your replies are what prompted me to run those
performance tests. Somehow it is easier for me to deal with things I
experience empirically than with written words on IRC, where I am not
always sure whether I understand the full depth of their meaning. I'm
sorry for not having credited you for bringing me to the topic.

Anyway, the fact that you're explaining stuff is no reason for me not
to run some tests.

In any case, thanks for your explanations.

Roman

On Thu, 2011-12-01 at 11:09 -0500, Mathieu Bouchard wrote:
> Le 2011-12-01 à 15:24:00, Roman Haefeli a écrit :
> 
> > reason, let's just use an invented arbitrary unit for expressing the CPU
> > time (ct) consumed by an object. It turned out that [gate~] uses 0.52ct
> > when it is on and 0.4ct when it is off. But how much does [*~ ] use? No
> > matter whether turned on or off, [*~ ] uses a stable 0.39ct.
> 
> Even though [switch~] does not use tight conditionals, instead checking 
> only once per block, it still takes some time copying stuff and switching 
> contexts. I suppose that this is the kind of thing that Pd could do more 
> efficiently than it does now, if someone is brave enough to edit 
> d_ugen.c... and knows how to do it.
> 
> Float multiplications are often so fast nowadays that the cost of 
> reading and writing RAM is often bigger. A lot of machinery in the CPU is 
> dedicated to multiplying floats as fast as possible.
> 
> > But the really interesting finding comes now. [*~ 0] has only 0.2ct!
> > Almost half the ct value of a plain [*~ ]!
> 
> I already explained that on IRC. What part of the explanation was 
> missing ? Here's a copy+paste from the chat.
> 
> « that's the difference between times_dsp and scalartimes_dsp... the 
> latter is «obviously» faster... well, it does only ⅔ as much data 
> transfer during the perform-function, so it would make sense. benchmarks 
> might say otherwise, or not. _but_ note that sending a float message to 
> [*~] also means copying that value N times (N=block size... 64 or other) 
> whereas in the scalartimes class such as [*~ 42], sending a float (in 
> right inlet) copies that value only once. »
> 
> and later :
> 
> « i did not verify it, but it does use twice as much input data. This 
> predicts a possible 3/2 ratio, but there are other reasons why it could be 
> a 2/1 ratio... it depends on the implementation of the cpu itself »
> 
> « when multiplications are fast, then the bottleneck is to get the data to 
> move around, and if you have twice as much data, it can be twice slower. 
> In some other situations, it can be even worse than twice slower. Those 
> behaviours are harder to understand than ever because optimisation tricks 
> pile up ever more. »
> 
> I mean optimisation tricks of CPUs and motherboards (caches, RAM chips, 
> etc).
> 
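
As a rough illustration of the data-transfer argument quoted above, here is
a minimal C sketch of the two kinds of per-block loop. It is not copied from
Pd's sources: the function names (times_sketch, scalartimes_sketch,
float_to_signal_sketch, and the two *_traffic helpers) and the plain array
signatures are assumptions made for illustration only, and the traffic
counts simply tally the vectors each loop reads and writes.

```c
#include <stddef.h>

#define BLOCK 64  /* block size mentioned in the thread ("64 or other") */

/* signal * signal, as in a plain [*~ ]:
   reads two input vectors and writes one output vector. */
void times_sketch(const float *in1, const float *in2, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in1[i] * in2[i];
}

/* signal * scalar, as in [*~ 42]:
   reads one input vector and writes one output vector;
   the scalar f can stay in a register for the whole block. */
void scalartimes_sketch(const float *in, float f, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * f;
}

/* A float sent to a signal inlet has to become a whole block of samples:
   the value is copied n times, versus once for the scalar variant. */
void float_to_signal_sketch(float f, float *sig, size_t n)
{
    for (size_t i = 0; i < n; i++)
        sig[i] = f;
}

/* Floats moved per block: 3n for signal*signal, 2n for signal*scalar,
   i.e. the scalar variant moves 2/3 as much data. */
size_t times_traffic(size_t n)       { return 3 * n; }
size_t scalartimes_traffic(size_t n) { return 2 * n; }
```

With BLOCK = 64 this gives 192 floats moved per block for the signal*signal
loop against 128 for the scalar loop. Whether that 3/2 traffic ratio shows
up as a 3/2 or a 2/1 difference in measured time depends on the CPU and
memory system, as the quoted explanation says.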

>   ______________________________________________________________________
> | Mathieu BOUCHARD ----- téléphone : +1.514.383.3801 ----- Montréal, QC