[PD-dev] how to dynamically allocate t_atom & t_float size?

Jonathan Wilkes jancsika at yahoo.com
Sat Dec 5 22:54:23 CET 2020

> Don't use alloca() unless you have a good reason, know how it works internally and are aware of all its problems :-).
I was thinking of a good Pd inside joke: never optimize anything:
"When interacting with your patch, Pd is designed to walk linked lists as much as possible. The spikes that result train the patch author to design the program with maximum audio processing efficiency, squeezing every last bit of performance out of the little CPU time that remains for the user." :)

 On 05.12.2020 12:54, Christof Ressi wrote:

Your concerns are certainly warranted.
But if the function is 
a) never called recursively,
b) never inlined(!)* and
c) the buffer size is guaranteed not to exceed some reasonable size for stack allocation (say a few hundred bytes),
then alloca() is IMO the best tool for the job. It is not only simpler but also faster than pre-allocation, because the stack memory is much more likely to be in cache.
Again, I wouldn't recommend using alloca() for general purpose programming, but when used cautiously it can be a great tool for real-time code.
BTW, here's an example of how alloca() might be used in a production-grade library: https://github.com/gcp/opus/blob/master/celt/stack_alloc.h
It uses alloca() or VLAs, but can fall back to a pseudo stack on platforms without alloca()/VLA support or with small stack sizes (e.g. embedded devices).
*) IIRC, GCC refuses to inline functions if they contain alloca() statements. What can go wrong with inlined functions containing alloca(): https://stackoverflow.com/a/3410689/6063908
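
Condition c) above can be enforced in code rather than by convention. The following is a minimal sketch, not code from Pd or Opus: atom_t is a hypothetical stand-in for t_atom, and the macro names and the cutoff of 64 atoms are assumptions chosen for illustration. Small requests go to the stack, larger ones fall back to the heap.

```c
#include <stdlib.h>
#include <string.h>
#if defined(_WIN32)
#include <malloc.h>  /* declares alloca on common Windows toolchains */
#else
#include <alloca.h>
#endif

/* Hypothetical stand-in for Pd's t_atom, so the sketch builds without m_pd.h. */
typedef struct { int type; double value; } atom_t;

/* Assumed cutoff: at most this many atoms are ever placed on the stack. */
#define STACK_ATOM_LIMIT 64u

/* These must be macros. Memory from alloca() is released when the function
   that called alloca() returns, so a helper function cannot hand it out. */
#define ATOMBUF_ALLOC(n) \
    ((atom_t *)((n) <= STACK_ATOM_LIMIT ? alloca((n) * sizeof(atom_t)) \
                                        : malloc((n) * sizeof(atom_t))))
#define ATOMBUF_FREE(p, n) \
    do { if ((n) > STACK_ATOM_LIMIT) free(p); } while (0)

/* Writes argv reversed into out. The temporary copy lets argv and out alias.
   Returns 0 on success, -1 if the heap fallback could not allocate. */
int reverse_atoms(int argc, const atom_t *argv, atom_t *out)
{
    atom_t *tmp;
    size_t i, n;

    if (argc <= 0)
        return 0;
    n = (size_t)argc;
    tmp = ATOMBUF_ALLOC(n);
    if (!tmp)
        return -1;
    memcpy(tmp, argv, n * sizeof(atom_t));
    for (i = 0; i < n; i++)
        out[i] = tmp[n - 1 - i];
    ATOMBUF_FREE(tmp, n);
    return 0;
}
```

The cutoff only bounds a single call. If the function can be re-entered, the per-call bound still has to be multiplied by the worst-case nesting depth.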
 On 04.12.2020 16:57, Jonathan Wilkes wrote:
 > On Friday, December 4, 2020, 9:43:20 AM EST, Christof Ressi <info at christofressi.com> wrote:  
  > alloca() "allocates" memory on the stack. This is done by simply incrementing the stack pointer. So it's extremely fast and - more importantly - equally fast for all sizes. 
  It also requires you, the programmer, to add up the total number of possible allocations in a recursive call to the object on every supported platform with a small default stack size, and to do the math in your head to ensure your algorithm never exceeds that stack limit in any possible case. Since that stack limit is many orders of magnitude smaller than the RAM on the most popular Pd platform, you're far more likely to accidentally cause crashes for your users by relying on alloca.
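
  The arithmetic described here can at least be written down instead of done in one's head. This is a rough sketch under stated assumptions: the function name is hypothetical, 1 MiB is taken as the usual default stack size on Windows, 16 bytes is assumed for one t_atom on a 64-bit build, and the 256 bytes of per-call overhead is a guess that would have to be measured.

```c
#include <stddef.h>

/* Estimates how many nested calls fit into a stack budget when each call
   places atoms_per_call atoms on the stack (for example via alloca).
   frame_overhead stands for locals, saved registers and return addresses. */
size_t max_safe_depth(size_t stack_limit, size_t atoms_per_call,
                      size_t atom_size, size_t frame_overhead)
{
    size_t per_call = atoms_per_call * atom_size + frame_overhead;

    if (per_call == 0)
        return (size_t)-1;  /* no stack use, so this budget sets no limit */
    return stack_limit / per_call;
}
```

  With the assumed numbers, max_safe_depth(1048576, 100, 16, 256) returns 564. The estimate ignores whatever stack the rest of the call chain already uses, so the real margin is smaller.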
  Again, the ATOMS_ALLOCA macro in x_list.c is the careful, thoughtful reference for use of alloca. And even in that case there are almost certainly multiple ways to cause a crasher from it. In other words, it's nearly impossible to use alloca safely. 
  Don't use alloca *unless* you have made worst case measurements on every other algorithm you can think of, and none of them are satisfactory. Even then, *measure* alloca to be worst-case safe and write regression tests for the recursive edge cases that could blow the stack. Chances are when you consider doing that extra work, you'll quickly think up a different algorithm that doesn't rely on alloca.    
> malloc(), on the other hand, actually uses the system memory allocator which can take arbitrarily long and might even block!
 > Generally, you should avoid using any malloc() in real-time code paths. Instead, pre-allocate temporary buffers (e.g. in the "dsp" method) or allocate on the stack (but note the caveats mentioned in the other mails). 
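
  A minimal sketch of the pre-allocation approach, with hypothetical names (scratch_t, scratch_reserve, process_block) that are not part of the Pd API: the buffer is sized in non-realtime setup code, for example wherever the "dsp" method runs, and the realtime routine only uses it.

```c
#include <stdlib.h>

/* Scratch buffer owned by the object; capacity is counted in floats. */
typedef struct {
    float *buf;
    size_t capacity;
} scratch_t;

/* Non-realtime: grows the buffer so that it holds at least n floats.
   Returns 0 on success, -1 if the allocation failed. */
int scratch_reserve(scratch_t *s, size_t n)
{
    float *p;

    if (n <= s->capacity)
        return 0;
    p = (float *)realloc(s->buf, n * sizeof(float));
    if (!p)
        return -1;
    s->buf = p;
    s->capacity = n;
    return 0;
}

/* Non-realtime: releases the buffer, for example in the object's free method. */
void scratch_release(scratch_t *s)
{
    free(s->buf);
    s->buf = NULL;
    s->capacity = 0;
}

/* Realtime: reverses a block in place using the scratch buffer.
   Never allocates; returns -1 if the block does not fit. */
int process_block(scratch_t *s, float *io, size_t n)
{
    size_t i;

    if (n > s->capacity)
        return -1;
    for (i = 0; i < n; i++)
        s->buf[i] = io[n - 1 - i];
    for (i = 0; i < n; i++)
        io[i] = s->buf[i];
    return 0;
}
```

  The realtime routine refuses oversized blocks instead of growing the buffer, which keeps the allocator out of the audio path at the cost of having to choose the capacity up front.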
  If you can't, ask on the list how to use alloca without blowing the stack. Once you crowdsource a truly safe algorithm, write tests so you catch the crashers that the crowd missed the first time around.
  To be clear-- I basically grepped for "alloca" in the current codebase, opened up x_list.c, and *assumed* because alloca is tricky that there is somehow a crasher bug. It took about 5 minutes to come up with a case that should crash on Windows. 
  I also see it in m_binbuf.c, and I'd make the same bet it can blow the Windows stack if someone spends five minutes looking at the code. For a common building block of realtime safe algos, I shouldn't be able to make claims like these for any use of alloca I happen to find. 
  I keep harping on this because the default description of alloca makes it sound like the quintessential building block of realtime algos. Please weigh that alluring set of seemingly realtime-safe benefits against the history of realtime-unsafe crashers, which will likely include your use of alloca.
> Christof
  On 04.12.2020 03:28, Alexandre Torres Porres wrote:
     I'm using getbytes and freebytes for now, any disadvantages over alloca? 
  On Thu, 3 Dec 2020 at 20:59, David Rush <kumoyuki at gmail.com> wrote:
  On Thu, 3 Dec 2020 at 23:15, Alexandre Torres Porres <porres at gmail.com> wrote:
 Hi, when compiling ELSE for Camomile on Windows, Esteban and I are getting some errors. The offending pieces of code try to do things like  
   t_atom at[ac];   
  If you want to maintain straight C compiler compatibility 
          t_atom* at = (t_atom*)malloc(ac * sizeof(t_atom)); 
  but you have to remember to free(at), etc. You can avoid the free() if you HAVE_ALLOCA with 
           t_atom* at = (t_atom*)alloca(ac * sizeof(t_atom));  
  if you want to do it the C++ way without a std::vector<t_atom> 
          t_atom* at = new t_atom[ac]; 
  but again you will have to 
          delete[] at; 
  For my own externals, I write them all in C++ and use STL. Making the change from the C-world allocation of PD to the C++ world is not so hard, but it does involve a tiny bit of trickery which I only justify through expediency. 
  - d   
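
  For the plain C route, a fuller sketch of replacing the variable-length array (which MSVC does not support) with a checked heap allocation might look as follows. atom_t is a hypothetical stand-in for t_atom and scaled_sum is an illustrative function, not anything from ELSE or Pd.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for Pd's t_atom, so the sketch builds without m_pd.h. */
typedef struct { int type; double value; } atom_t;

/* Portable replacement for the VLA `atom_t at[ac];`.
   Copies av into a heap buffer, scales each value by k and stores the
   sum of the scaled values in *sum_out. Returns 0 on success, -1 on error. */
int scaled_sum(int ac, const atom_t *av, double k, double *sum_out)
{
    atom_t *at;
    size_t i, n;
    double sum = 0.0;

    if (ac < 0)
        return -1;
    n = (size_t)ac;
    if (n > SIZE_MAX / sizeof(atom_t))
        return -1;  /* the byte count would overflow size_t */
    /* Ask for at least one element so an empty list still gets a valid pointer. */
    at = (atom_t *)malloc((n ? n : 1) * sizeof(atom_t));
    if (!at)
        return -1;
    for (i = 0; i < n; i++) {
        at[i] = av[i];
        at[i].value *= k;
        sum += at[i].value;
    }
    free(at);  /* every path after a successful malloc ends here */
    *sum_out = sum;
    return 0;
}
```

  In an actual external, the getbytes()/freebytes() pair mentioned earlier in the thread would presumably take the place of malloc()/free(); note that freebytes() also needs the size of the block being released.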
