[PD] SMP Questions

Wed Aug 22 01:45:29 CEST 2007

On 8/21/07, Tim Blechmann <tim at klingt.org> wrote:
> > > to make use of a multicore machine the only way to utilize all cores is
> > > to run several instances of pd, that are connected via jackdmp.
> > >
> >
> > Now *there's* an idea.  Would that really work?  What would be the
> > downside -- aside from the memory needed to run multiple copies of
> > PD?
>
> the problems are:
> - scalability: you need (at least) as many pd instances as cpu cores...
> it is always the question, if you can manually split your dsp graph in a
> reasonable way ...
> - performance: jackdmp's dsp graph scheduling is less efficient than
> pd's (which is less efficient than nova's :) ... so using _many_ pd
> instances is probably a bad idea
> - communication overhead: you need to synchronize the instances ... easy
> for simple controls (OSC or netsend/receive) difficult for shared
> resources (buffers, busses)

One additional problem: some algorithms are inherently serial.
Some scientists run into this when they move their MATLAB code onto a
cluster.  They write the algorithm serially, then expect it to run
faster just because more processors are available.  The programmer has
to know both serial and parallel programming techniques, and has to
recognize when parallelism is possible and when it is not.
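
To make the serial/parallel distinction concrete, here is a minimal
Python sketch (not Pd code; the function names are made up for
illustration).  A gain stage treats every sample independently, so a
block can be split across workers.  A one-pole lowpass feeds each output
back into the next one, so splitting the block the same way gives a
different, wrong result.

```python
def gain(block, g):
    """Each output depends only on its own input sample: trivially parallel."""
    return [g * x for x in block]


def one_pole_lowpass(block, a, y_prev=0.0):
    """y[n] = (1 - a) * x[n] + a * y[n-1]: each output needs the previous one."""
    out = []
    for x in block:
        y_prev = (1.0 - a) * x + a * y_prev
        out.append(y_prev)
    return out


def split(block, n_chunks):
    """Cut a block into at most n_chunks consecutive pieces."""
    size = max(1, (len(block) + n_chunks - 1) // n_chunks)
    return [block[i:i + size] for i in range(0, len(block), size)]


def gain_in_chunks(block, g, n_chunks):
    """Chunks are independent, so each one could run on its own core."""
    return [y for chunk in split(block, n_chunks) for y in gain(chunk, g)]


def lowpass_in_chunks_naive(block, a, n_chunks):
    """Same split for the filter; every chunk wrongly restarts from y_prev = 0."""
    return [y for chunk in split(block, n_chunks)
            for y in one_pole_lowpass(chunk, a)]


block = [1.0] * 8
same = gain_in_chunks(block, 0.5, 4) == gain(block, 0.5)
differs = lowpass_in_chunks_naive(block, 0.5, 4) != one_pole_lowpass(block, 0.5)
print(same, differs)
```

The chunked filter could be fixed by passing the last output of each
chunk on to the next one, but then the chunks have to run one after the
other again, which is exactly the serial dependency.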

>
> > I can imagine a very powerful modular system built on this model.

> however, i was thinking about ways to implement a hybrid system with
> automatic segmentation of the dsp graph into parallel dsp chains that
> can be scheduled with a dataflow algorithm ... but it would require lots
> of performance tests to tweak the heuristics of the graph
> segmentation ... for now, i had neither time nor funding ... (but maybe
> it is an interesting topic for my master thesis?)

Agreed, it is an interesting topic.
But maybe a generic solution (one that applies to all multi-processor
systems) is not the best way to go.  How about concerning yourself with
just one case, one specific set of hardware (for example, the Storm-1
DSP from Stream Processors, or, cheaper, a quad-core Intel)?
That would be significant by itself.

One of the limitations of the Pd DSP chain *is* its style of
modularity.  The stream is broken down into indivisible blocks.  The
tree is parallel at the top, but as you go down the tree, it becomes
more and more serial.  That creates a bottleneck where the extra
processors sit unused.
To get a generic speedup, those "indivisible blocks" would have to
be divisible, and this is not always possible.
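
A rough way to see that bottleneck, as a Python sketch under some
assumptions: the patch below is hypothetical (the node names are not
real Pd objects), each node is treated as one indivisible unit of work,
and a node can only run once everything it reads from has finished.
Grouping the nodes by depth then shows how many of them could run at the
same time at each step.

```python
from collections import defaultdict


def levels(inputs):
    """Group the nodes of a DAG by depth.

    `inputs` maps each node to the list of nodes it reads from.
    Nodes in the same level do not depend on each other.
    """
    depth = {}

    def visit(node):
        if node not in depth:
            # Sources have no inputs, so they land on level 0.
            depth[node] = 1 + max((visit(src) for src in inputs[node]),
                                  default=-1)
        return depth[node]

    for node in inputs:
        visit(node)

    by_level = defaultdict(list)
    for node, d in depth.items():
        by_level[d].append(node)
    return [by_level[d] for d in sorted(by_level)]


# Hypothetical patch: four oscillators, two mixers, one filter, one output.
patch = {
    "osc1": [], "osc2": [], "osc3": [], "osc4": [],
    "mix_a": ["osc1", "osc2"],
    "mix_b": ["osc3", "osc4"],
    "filter": ["mix_a", "mix_b"],
    "out": ["filter"],
}

widths = [len(level) for level in levels(patch)]
print(widths)  # [4, 2, 1, 1]
```

On a quad core, all four cores would only be busy for the first level of
this patch; from the filter on, one core works and three wait.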

Note: not complainin' (hhh), I'd just like people to be aware that
trying to parallelize software built for serial calculation is a lot
more work than it's worth.  You'd have to start almost from scratch to
design an ideal parallelized Pd.

Chuck

> tim
>