[PD] data structure question restated as speculation

Miller Puckette msp at ucsd.edu
Thu Sep 10 06:40:04 CEST 2015


Ouch - this is a crasher bug - thanks for pointing it out.  As you say,
in linux it's hard to hit this but I think it would be very easy to hit on
a macintosh - I'm surprised it hasn't surfaced earlier.

My original intention was simply to invalidate all pointers into the array
when it is resized (in the same way as all pointers into a list are invalidated
if anything is deleted from the list).  I apparently simply forgot to put
that in so will do so now, and hope that it doesn't break anything that can't
be easily fixed.
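
For readers following along, here is a minimal sketch of what "invalidate all pointers into the array when it is resized" could look like. It assumes a stamp that the array bumps on every resize and that each pointer copies when it is taken. All names (demo_array, demo_pointer, demo_get and so on) are hypothetical and made up for this illustration; this is not Pd's actual source.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical types for illustration only; not Pd's real structs. */
typedef struct {
    float *data;     /* element storage; realloc may move it */
    size_t n;        /* number of elements */
    unsigned stamp;  /* bumped on every successful resize */
} demo_array;

typedef struct {
    demo_array *owner;  /* array the pointer refers into */
    size_t index;       /* which element */
    unsigned stamp;     /* owner's stamp at the time the pointer was taken */
} demo_pointer;

static int demo_array_init(demo_array *a, size_t n) {
    a->data = calloc(n, sizeof *a->data);
    a->n = a->data ? n : 0;
    a->stamp = 1;
    return a->data != NULL;
}

static demo_pointer demo_point_at(demo_array *a, size_t index) {
    demo_pointer p = { a, index, a->stamp };
    return p;
}

/* Resize the storage and invalidate every outstanding pointer at once. */
static int demo_array_resize(demo_array *a, size_t n) {
    if (n == 0) return 0;  /* keep the sketch away from realloc(ptr, 0) */
    float *fresh = realloc(a->data, n * sizeof *fresh);
    if (!fresh) return 0;  /* old block is still valid on failure */
    for (size_t i = a->n; i < n; i++) fresh[i] = 0.0f;
    a->data = fresh;
    a->n = n;
    a->stamp++;            /* all pointers taken earlier are now stale */
    return 1;
}

/* Read through a pointer; returns 0 instead of touching stale memory. */
static int demo_get(const demo_pointer *p, float *out) {
    if (!p->owner || p->stamp != p->owner->stamp || p->index >= p->owner->n)
        return 0;
    *out = p->owner->data[p->index];
    return 1;
}

/* Returns 1 if the old pointer is refused after a resize and a freshly
 * taken pointer still sees the stored value; -1 on allocation failure. */
static int demo_stale_after_resize(void) {
    demo_array a;
    float v = 0.0f;
    if (!demo_array_init(&a, 4)) return -1;
    a.data[2] = 0.5f;
    demo_pointer p = demo_point_at(&a, 2);
    int ok_before = demo_get(&p, &v);
    if (!demo_array_resize(&a, 4096)) { free(a.data); return -1; }
    int ok_after = demo_get(&p, &v);       /* refused: stamp mismatch */
    demo_pointer q = demo_point_at(&a, 2); /* re-acquired after the resize */
    int ok_fresh = demo_get(&q, &v) && v == 0.5f;
    free(a.data);
    return ok_before && !ok_after && ok_fresh;
}
```

The point of the stamp is that a resize costs one increment no matter how many pointers exist, and a stale pointer is caught at the moment it is used rather than being dereferenced.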

I was just at the point of releasing a batch of bugfixes to 0.46 so this
comes at an opportune time.

cheers
Miller

On Thu, Sep 10, 2015 at 03:39:46AM +0000, Jonathan Wilkes via Pd-list wrote:
> Hi list,
> I never got a response on my previous request for a lesson on realloc, so I'm going to rankly speculate:
> * people who build patches that get/set/resize ds arrays using gpointers are going to encounter (and perhaps have already encountered) rare and difficult to reproduce crashes.
>
> Why?
> For ds arrays, [pointer] can store a pointer to an array element.  The user can retrieve the value of the array element using [get], and set it using [set].  But the user can also resize the array using [setsize], and that can move the data to a different location than what [pointer] has stored.  (You can also change the ds array size with the mouse and the "Alt" key, but that's beside the point.)
> 
> So what's wrong with that?
> When you use [setsize] to resize a ds array, Pd calls "realloc" to do the resizing.  "realloc" is allowed to move the data to a new block of memory.  This is more likely to happen when choosing a large size for the array, but the OS could move the data if you shrink the array, too.  This is completely opaque to the user/programmer-- that is, there is no way to tell when or why realloc might relocate the data to another block.
>
> For most purposes this is no big deal because "realloc" returns a pointer to the beginning of the block which can be assigned to a variable.  But recall above that we let our gpointer store a pointer to an array element which may be anywhere in the original block of memory.  If realloc relocated the chunk of memory for our ds array, when we try to [get] or [set] our element, we'll be getting/setting data using a pointer that points at the _old_ memory location.  In other words-- we're getting/setting values for a pointer to garbage!
>
> What about "stale pointers"?
> When using [setsize], the refcounts on gpointers don't change, so the relevant pointers won't be flagged as stale.  But even if they were, that wouldn't do as it would be too conservative and invalidate the [setsize] object's pointer for the next size change.  (E.g., if you're feeding it with a number box.)
>
> So why aren't there tons of crashes from this?
> OSes (well, at least Linux) apparently try extremely hard not to relocate the program's data when using realloc because it can slow things down.  But "try extremely hard" != "can never happen".
> 
> So what now?
> I don't know.  Most stuff I've read would suggest storing offsets into arrays instead of pointing at the elements.  But... this would get extremely complex with nested ds arrays-- you'd have to create some kind of offset map for navigating from the gpointer back to the data you want to get at.  That seems like a lot of work, esp. for one of the most obscure parts of Pd.
> 
> Are you sure this is a problem?
> If it isn't then there is some serious C programming voodoo going on that has zero coverage on StackExchange.  But aside from Duff's Device I've seen about every kind of C voodoo in Pd so I honestly have no idea.
> -Jonathan
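
The "store offsets instead of pointers" suggestion in the quoted message can be sketched for the simple, non-nested case as below. This is only an illustration under stated assumptions: the demo_elem_ref handle and the function names are hypothetical, and nothing here is taken from Pd's source. The handle remembers where the current block pointer is kept plus an element index, so it stays usable whether or not realloc moves the block.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical handle: remembers which slot, not where the slot lives. */
typedef struct {
    float **base;  /* address of the variable holding the current block */
    size_t index;  /* element offset within that block */
} demo_elem_ref;

/* Recompute the element address from the current block on every access. */
static float *demo_resolve(const demo_elem_ref *r) {
    return *r->base + r->index;
}

/* Returns 1 if the element is still readable through the handle after the
 * block has been grown, 0 on allocation failure or mismatch. */
static int demo_offset_survives_resize(void) {
    float *block = malloc(4 * sizeof *block);
    if (!block) return 0;
    for (size_t i = 0; i < 4; i++) block[i] = (float)i * 1.5f;

    demo_elem_ref ref = { &block, 2 };     /* refers to element 2 */
    uintptr_t before = (uintptr_t)block;   /* saved only for reporting */

    float *grown = realloc(block, 1000000 * sizeof *grown);
    if (!grown) { free(block); return 0; } /* old block still valid here */
    block = grown;                         /* the handle sees this update */

    /* Whether the block moved depends on the allocator; either is legal. */
    printf("block %s\n",
           (uintptr_t)block != before ? "moved" : "stayed in place");

    float v = *demo_resolve(&ref);         /* valid in both cases */
    free(block);
    return v == 3.0f;
}
```

A raw `float *` saved before the realloc call would be safe to use afterwards only if the block happened to stay in place, which the caller cannot rely on. As the quoted message notes, nested arrays would need a chain of such offsets rather than a single index.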
