[PD] pduino rewrite

Ingo ingo at miamiwave.com
Fri Sep 16 14:02:01 CEST 2011

Hi Roman,

> Frankly, I'm not yet convinced that those little improvements in
> [arduino] will significantly improve the overall Pd performance.

Here's why I started to really simplify every patch, no matter whether it
uses audio or control objects:

I have been programming for about 4 years on one single patch (fulltime -
only with breaks to get the hardware/OS going and sampling/editing sampled
You can imagine the amount of code that is in the patch by now.

When I started I thought it was very convenient to use wireless
[send/receive] objects to send midi data to the sample-voices (which it is).
At a certain point (about 2 years ago) the machine was completely
Then I measured that an EWI-USB wind controller can send up to 500 midi CC
messages per second. I had a function that could multiply the messages to
six different midi channels. That makes it 3,000 messages floating around.

The sample voices have at least 500 [receive] objects (there are close to
500 parameters per voice). There were 16 voices which adds up to 8,000
[receive] objects.

Sending 3,000 messages to 8,000 [receive] objects adds up to 24 million
times per second that the individual [receive] objects had to check whether
the message was meant to be for them or not.

Just checking the [receive] objects should therefore shift around about as
much data as it would take to move several hundred audio channels.
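As a rough sketch of this arithmetic (in Python rather than Pd; the function and parameter names are hypothetical, and it assumes, as described above, that every message is checked by every [receive] object):

```python
# Back-of-the-envelope model of the numbers above. All names are
# hypothetical; this only reproduces the arithmetic, not Pd's internals.

def receive_checks_per_second(cc_per_second, midi_channels,
                              receives_per_voice, voices):
    """Return (messages/s, [receive] objects, checks/s), assuming that
    every message is tested by every [receive] object."""
    messages = cc_per_second * midi_channels
    receivers = receives_per_voice * voices
    return messages, receivers, messages * receivers

messages, receivers, checks = receive_checks_per_second(
    cc_per_second=500, midi_channels=6, receives_per_voice=500, voices=16)
print(messages, receivers, checks)  # 3000 8000 24000000
```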

The first fix was easy: assigning the parameter to receive from midi Ch01 if
voices are stacked. That cut the message traffic by a factor of 6. The
second fix was to replace the wireless sends with hard-wired patch cords.
That took care of most of the rest. The machine was working again.
Unfortunately this second fix took 3-4 full months!!!

This is when I decided to think about runtime efficiency first.
If you have a piece of code that has to check between 10 different options
and in a certain case only two options are possible, then it is worth
copying the object and taking out all unnecessary options. It's more work
while programming but in this particular example it saves several hundred
percent cpu time when running.
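A minimal sketch of that idea, in Python rather than Pd and with purely hypothetical option values: a general handler walks through ten options, while a stripped-down copy only tests the two options that can occur at one particular place. How much is saved depends on where the relevant options sit in the chain; here they are deliberately placed last.

```python
# Hypothetical illustration: a general handler that tests ten options
# versus a copy stripped down to the two options one call site can hit.

GENERAL_OPTIONS = list(range(10))   # ten possible cases
SPECIAL_OPTIONS = [8, 9]            # only these two occur at this site

def comparisons_until_match(options, value):
    """Count the equality tests needed to find value in options."""
    count = 0
    for option in options:
        count += 1
        if option == value:
            break
    return count

# Inputs seen at the specialised site alternate between 8 and 9.
inputs = [8, 9] * 1000
general_cost = sum(comparisons_until_match(GENERAL_OPTIONS, v) for v in inputs)
special_cost = sum(comparisons_until_match(SPECIAL_OPTIONS, v) for v in inputs)
print(general_cost, special_cost)  # 19000 3000
```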

When such a programming style is used consistently I am sure you can get at
least double the performance out of a computer. Even with messages that you
would not expect to be heavy.


> -----Original Message-----
> From: Roman Haefeli [mailto:reduzent at gmail.com]
> Sent: Friday, 16 September 2011 11:32
> To: Ingo
> Cc: 'Hans-Christoph Steiner'; pd-list at iem.at
> Subject: Re: AW: AW: AW: [PD] pduino rewrite
> On Fri, 2011-09-16 at 05:57 +0200, Ingo wrote:
> > > The [change -1] is a great idea, I just committed that to bytemask.pd
> > > and debytemask.pd.  But the [pd resolve-bits_0-7] abstractions seem
> > > quite labor-intensive, but they work.  I think it would work better to
> > > use multiple instances of [debytemask].
> > >
> > > .hc
> >
> > Not sure what you mean by "labor-intensive", Hans. Are you talking about
> > manually changing 8 numbers per object (which took me less than 1 minute
> > for 56 channels) or are you talking about cpu processing?
> >
> > Which leads me to the next question: is the Boolean approach using [& 4]
> > and [>> 2] more cpu friendly than using [mod 8] and [div 4]?
> I was told that it is. Bit shifting and bit mask matching is supposed to
> be faster than integer division and modulo with an arbitrary divisor
> (including non-power-of-two integers).
> However, I can't tell you whether they are really faster in the real
> world. But you should be able to test it in your own setup with
> [realtime]. Start [realtime], let [mod 8]-[div 4] process 1 million
> numbers in 0 logical time, stop [realtime]. Do the same with a
> [& 4]-[>> 2] chain and compare the results.
> > I don't know how Pd handles such calculations and how it talks to the
> > cpu. I'd be really very interested to find out if there is a difference.
> >
> >
> > Since the pin numbers are predefined when you are using a [route] object
> > to sort out the groups I don't see the point why the pin number should be
> > calculated again (in this case of multiple instances). This is why I
> > hardcoded them into the message boxes.
> >
> > I put the two approaches next to each other to see how much simpler my
> > approach is object wise and calculation wise. Still with the question
> > mark which calculation method is more cpu friendly. Anyway changing
> > [mod 8] and [div 4] to [& 4] and [>> 2] shouldn't take more than a minute.
> You could also test the whole [pd digital message] subpatch with the
> above mentioned approach.
> Frankly, I'm not yet convinced that those little improvements in
> [arduino] will significantly improve the overall Pd performance. Using
> one less tilde object somewhere in your patch would save some orders of
> magnitude more CPU power than you will ever be able to squeeze
> out of the [arduino]. Message processing is usually so cheap compared to
> signal processing that most often it's hardly worth focusing on the
> message processing part, unless you deal with message rates of several
> thousand per second. This is certainly not always true, but in my own
> experience it most often is.
> Roman
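
The [mod 8]-[div 4] versus [& 4]-[>> 2] comparison suggested in the quoted message can be sketched outside Pd as follows. This is only a structural analogue of the [realtime] test, written in Python with hypothetical function names: it checks that both routes extract the same bit for non-negative integers and times one million values each way. The timings say nothing about how Pd itself performs.

```python
import timeit

def bit_via_mod_div(x):
    # Arithmetic route, as in [mod 8] followed by [div 4].
    return (x % 8) // 4

def bit_via_mask_shift(x):
    # Bitwise route, as in [& 4] followed by [>> 2].
    return (x & 4) >> 2

# Both routes should agree on every non-negative byte value.
same = all(bit_via_mod_div(x) == bit_via_mask_shift(x) for x in range(256))

# Time one million numbers through each route, one pass each.
values = range(1_000_000)
t_mod = timeit.timeit(lambda: [bit_via_mod_div(x) for x in values], number=1)
t_bit = timeit.timeit(lambda: [bit_via_mask_shift(x) for x in values], number=1)
print(same, t_mod, t_bit)
```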
