[PD] data structure question restated as speculation

Jonathan Wilkes jancsika at yahoo.com
Thu Sep 10 05:39:46 CEST 2015


Hi list,

I never got a response on my previous request for a lesson on realloc, so I'm going to rankly speculate:

* people who build patches that get/set/resize ds arrays using gpointers are going to encounter (and perhaps have already encountered) rare and difficult-to-reproduce crashes.

Why?
For ds arrays, [pointer] can store a pointer to an array element.  The user can retrieve the value of the array element using [get], and set it using [set].  But the user can also resize the array using [setsize], and that can move the data to a different location than the one [pointer] has stored.  (You can also change the ds array size with the mouse and the "Alt" key, but that's beside the point.)

So what's wrong with that?
When you use [setsize] to resize a ds array, Pd calls "realloc" to do the resizing.  "realloc" is allowed to move the data to a new block of memory.  This is more likely to happen when choosing a large size for the array, but the allocator could move the data if you shrink the array, too.  This is completely opaque to the user/programmer; that is, there is no way to predict when or why realloc might relocate the data to another block.
For most purposes this is no big deal, because "realloc" returns a pointer to the beginning of the (possibly new) block, and the caller assigns that back to its variable.  But recall above that we let our gpointer store a pointer to an array element, which may be anywhere in the original block of memory.  If realloc relocated the chunk of memory for our ds array, then when we try to [get] or [set] our element, we'll be getting/setting data using a pointer that points at the _old_ memory location.  In other words, we're getting/setting values through a pointer to garbage!
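
To make the failure concrete, here is a minimal C sketch of the situation described above. It is not Pd's code: `toy_array`, `toy_resize` and `toy_element_moved` are made-up names, and the only assumption is that the array is resized with plain realloc. The element address is saved as an integer before the resize, so the old pointer is never touched again afterwards.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a ds array: a resizable block of floats.
   This is NOT Pd's real struct. */
typedef struct {
    float *data;
    size_t n;
} toy_array;

/* Resize via realloc, as the post describes [setsize] doing.
   Returns 0 on success, -1 if realloc fails (array left untouched). */
static int toy_resize(toy_array *a, size_t newn)
{
    float *p = realloc(a->data, newn * sizeof *p);
    if (!p) return -1;
    a->data = p;
    a->n = newn;
    return 0;
}

/* Record the address of element `index`, resize, and report whether that
   element now lives at a different address.
   Returns 1 if it moved, 0 if it stayed put, -1 on allocation failure.
   Addresses are compared as integers, so the old pointer is never
   dereferenced after the realloc. */
int toy_element_moved(size_t oldn, size_t newn, size_t index)
{
    assert(oldn > 0 && newn > 0 && index < oldn && index < newn);
    toy_array a = { malloc(oldn * sizeof(float)), oldn };
    if (!a.data) return -1;

    /* This is the value a raw element pointer would be holding. */
    uintptr_t saved = (uintptr_t)&a.data[index];
    if (toy_resize(&a, newn) < 0) { free(a.data); return -1; }
    uintptr_t current = (uintptr_t)&a.data[index];

    free(a.data);
    return saved != current;
}
```

Calling something like `toy_element_moved(8, 1u << 20, 3)` may return 1 or 0 depending on the allocator and the state of the heap. Whenever it returns 1, any code still holding the saved address would be reading or writing freed memory.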

What about "stale pointers"?
When using [setsize], the refcounts on gpointers don't change, so the relevant pointers won't be flagged as stale.  But even if they were, that wouldn't do: it would be too conservative, because it would invalidate the [setsize] object's own pointer before the next size change (e.g., if you're feeding it with a number box).

So why aren't there tons of crashes from this?
Allocators (well, at least on Linux) apparently try extremely hard not to relocate the program's data when using realloc, because copying it can slow things down.  But "try extremely hard" != "can never happen".
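
One rough way to see how often relocation actually happens on a given system is to grow a block repeatedly and count the moves. The function below is only a sketch with a made-up name (`count_realloc_moves`); its result depends entirely on the C library, the sizes involved, and whatever else is on the heap, so it says nothing about any particular Pd patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Double a block `steps` times and count how many realloc calls handed
   back a different address. Returns -1 on allocation failure.
   Hypothetical helper for experimentation, not part of Pd. */
int count_realloc_moves(size_t start_bytes, int steps)
{
    assert(start_bytes > 0 && steps >= 0);
    size_t size = start_bytes;
    char *block = malloc(size);
    if (!block) return -1;

    int moves = 0;
    for (int i = 0; i < steps; i++) {
        /* Save the address as an integer before realloc may free it. */
        uintptr_t before = (uintptr_t)block;
        size *= 2;
        char *p = realloc(block, size);
        if (!p) { free(block); return -1; }
        if ((uintptr_t)p != before) moves++;
        block = p;
    }
    free(block);
    return moves;
}
```

A result of 0 on one run does not show that moves can't happen; interleaving other allocations between the resizes (as a running Pd patch would) changes the picture.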

So what now?
I don't know.  Most stuff I've read would suggest storing offsets into arrays instead of pointing at the elements.  But... this would get extremely complex with nested ds arrays: you'd have to create some kind of offset map for navigating from the gpointer back to the data you want to get at.  That seems like a lot of work, esp. for one of the most obscure parts of Pd.
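
For the simple, non-nested case, the "store an offset" idea could look roughly like the sketch below. All names (`idx_array`, `idx_ref`, `idx_resize`, `idx_ref_resolve`) are invented for illustration and none of this is how Pd's gpointers are actually laid out. The reference keeps the owning array plus an index, and the element address is computed only at the moment of use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical resizable array of floats (not Pd's struct). */
typedef struct {
    float *data;
    size_t n;
} idx_array;

/* A reference that stores owner + index instead of an element address. */
typedef struct {
    idx_array *owner;
    size_t index;
} idx_ref;

/* Resize with realloc. Returns 0 on success, -1 on failure. */
int idx_resize(idx_array *a, size_t newn)
{
    assert(newn > 0);
    float *p = realloc(a->data, newn * sizeof *p);
    if (!p) return -1;
    a->data = p;   /* the base may have moved; only the owner tracks it */
    a->n = newn;
    return 0;
}

/* Compute the element address from the current base at the time of use.
   Returns NULL if the index no longer exists (e.g. after shrinking). */
float *idx_ref_resolve(const idx_ref *r)
{
    if (!r->owner || r->index >= r->owner->n) return NULL;
    return &r->owner->data[r->index];
}
```

This only works as long as the `idx_array` itself stays at a fixed address. With nested arrays the owner is itself an element of another resizable block, so the reference would need a chain of (owner, index) pairs back to something stable, which is the "offset map" complexity mentioned above.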

Are you sure this is a problem?
If it isn't, then there is some serious C programming voodoo going on that has zero coverage on StackExchange.  But aside from Duff's Device I've seen about every kind of C voodoo in Pd, so I honestly have no idea.

-Jonathan