[PD] [PD-dev] what's the maximum list size?

Jonathan Wilkes jancsika at yahoo.com
Sat Jul 6 00:48:19 CEST 2019


 > On Thursday, July 4, 2019, 07:52:05 PM EDT, Christof Ressi <christof.ressi at gmx.at> wrote:
 
 >>  > how big of a list can we have in Pd?
>> 100

> That's only the internal limit for stack allocation in the [list] methods.

There's no easy way to measure the worst-case performance of malloc/free, much less predict the worst cases on the other platforms Pd supports.
So if low-latency realtime scheduling is desired, the sanest way to achieve it is to keep malloc/free calls to as few as possible during DSP computation (or, if it's not a DSP-based patch, during the normal runtime of the patch after all loadbangs have fired).
That means always taking the stack-allocation path in the list classes. The way to do that is to limit list output sizes to a maximum of 100 elements.
Also notice that any copying of lists requires a loop to copy each element to the new list (regardless of how the target list got allocated). So for maximally predictable runtime performance, try not to use any list objects that require copying lists. (This is why the crazy recursive list-abs/list-drip abstraction is performant.)
>> > How many elements can a t_atom have?
>> 1000

> What does this even mean?

It means that evaluating a Pd message containing more than 1000 elements in a row which are neither commas nor semis will trigger a malloc/free pair each time the message gets evaluated. So for low-latency realtime scheduling, don't do that during normal patch runtime.
>> For best results, no bigger than:
>> 5

> Everything below 100 elements is equally fine (no heap allocation)

Five elements and below avoid the alloca call entirely by using a fixed 5-element array ("smallstack") in binbuf_eval. I haven't measured, so I have no idea whether that's a premature optimization, though.
Also note: binbuf_eval already has an #ifdef for simply allocating for the entire message size, and the comment mentions decreased performance for whatever tests were run. I'm sure it would perform worse when loading patches, as those are nearly the worst possible case: a large number of args where semis break the message up into fairly regular small chunks that typically fit into an alloca or even the smallstack (e.g., "#X obj 20 20 bang;").
Random digression-- why doesn't t_binbuf just cache some annotations about itself, like:
* count for its longest nested message (e.g., contiguous region with no semis or commas)
* gpointer count
* whether it's a constant message (e.g., do we have dollar signs)
It appears every way to populate a t_binbuf already loops through the elements, *and* that they allocate memory there. It seems users would see a performance increase from caching at least the maxnargs in the t_binbuf itself, avoiding a full loop through the binbuf in binbuf_eval.
Constant message annotation could be nice, too. If a message box were just [1 2 3 4 5( then the message could just do an end run around message evaluation. (Of course that would add a branch for every message that isn't constant, so it'd need to be measured...)
-Jonathan  

