[PD] parallelism in pd

Karl MacMillan karlmac at peabody.jhu.edu
Wed Apr 17 16:38:51 CEST 2002

On Tue, 2002-04-16 at 20:10, Miller Puckette wrote:
> My understanding of this is all anecdotal (although I was heavily involved
> at the hardware level of all this a few years ago).  The reason Intel
> processors are more likely to outrun available memory is that they have
> higher CPU "front side bus" bandwidth than AMD processors, hence the same
> memory system will be harder put to keep them happy.  I believe the
> current P4 has 3.2 GB/sec FSB bandwidth whereas DDR2100 memory has 2.1 GB/sec,
> so unless you can keep your FSB idle 70% of the time your dual-processor-
> plus-DDR2100 system, for example, will be limited by memory bandwidth, and
> even in that case memory latency will be greater than for a uniprocessor
> since the two CPUs will have to queue for memory accesses.
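
As a rough check on the 70% figure quoted above, the arithmetic can be sketched as follows. This is only a back-of-the-envelope model, not a measurement: it assumes each CPU can demand the full quoted FSB rate and ignores caches and latency. The function name `fsb_idle_fraction` is made up for illustration.

```python
def fsb_idle_fraction(fsb_gb_per_s, memory_gb_per_s, num_cpus):
    """Fraction of time each CPU's front side bus must sit idle so that
    the combined demand does not exceed what memory can deliver.

    Assumption (not from the original message): every CPU can demand the
    full FSB rate, and caches and latency are ignored.
    """
    peak_demand = fsb_gb_per_s * num_cpus
    if peak_demand <= memory_gb_per_s:
        return 0.0  # memory can keep up even with every bus saturated
    return 1.0 - memory_gb_per_s / peak_demand


# Figures quoted in the message: 3.2 GB/sec per P4 FSB, 2.1 GB/sec for DDR2100.
dual = fsb_idle_fraction(3.2, 2.1, num_cpus=2)
single = fsb_idle_fraction(3.2, 2.1, num_cpus=1)
print(f"dual: {dual:.0%} idle, single: {single:.0%} idle")
```

Under these assumptions the dual-processor case needs each bus idle roughly two thirds of the time, which is in line with the "70%" quoted, while a single processor needs about half that.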

One advantage of dual-processor systems, however, is that you
get twice the cache. Depending on how much cache helps your particular
application and how well the OS keeps the same process on the same CPU,
you might get a boost from that. For some apps this might outweigh the
increased memory latency. Also, do SSE or SSE2 have any cache control
instructions like AltiVec's? AltiVec allows you to tell the CPU to
fill the cache with data from main memory, and the streaming continues
during interrupts and context switches (I think I remember that
correctly). Also, you can mark the data in the cache as least recently
used so that it gets flushed from the cache first. Pretty cool!

Miller, do you have any papers on how the DSP graphs were spread across
the processors in the ISPW? Was the parallelism automatic or did the
user have to do it explicitly?


> These considerations probably hold equally for threaded applications and
> for multiple process models.
> cheers
> Miller
> On Tue, Apr 16, 2002 at 03:32:09PM -0700, Andrew (Andy) W.  Schmeder wrote:
> > On Tue, 2002-04-16 at 10:24, Miller Puckette wrote:
> > > I concur, with a slight spin: Pd is often memory-bound, and most
> > > dual-processor systems, especially Intel based ones but also AMD,
> > > have their speed limited by memory bandwidth (which is shared between
> > > processors.)
> > 
> > In the interest of maintaining rigor, I'd like to know if there is
> > any hard data to back up this claim, i.e. profiling which demonstrates
> > the memory bandwidth versus CPU restrictions, etc. on one or more
> > systems.
> > 
> > (To be fair I suppose this means that we need a standard regression
> > test/benchmark for a real-time audio system, PDSpec?)
> > 
> > The reason I ask is that I'm interested in massively multichannel
> > systems.
> > 
> > Also I am curious as to why Intel would have less memory bandwidth...
> > intuition suggests it would depend on the type of system memory used...
> > Rambus versus SDRAM, etc.?
> > 
> > 
> > -andy
Karl W. MacMillan                                 
Computer Music Department                        
Peabody Institute of the Johns Hopkins University
karlmac at peabody.jhu.edu                         
