[PD] Concatenating two atoms into one?

Maximiliano Estudies maxiestudies at gmail.com
Sun Oct 4 12:16:19 CEST 2020


To add to the great explanations already given by Johannes and Christoph,
Miller gave a seminar last spring quarter called "Inside Pd" where he
explains the inner workings of Pd in great detail.
Here is the course site:
http://msp.ucsd.edu/syllabi/206.20s/index.htm
and here the videos:
http://msp.ucsd.edu/syllabi/206.20s/movies/
In the video 2b.apr10.mp4
<http://msp.ucsd.edu/syllabi/206.20s/movies/2b.apr10.mp4> he explains the
symbol storage strategy.

I'm very grateful to be able to access this content, thank you Miller!

El jue., 1 oct. 2020 a las 13:21, Christof Ressi (<info at christofressi.com>)
escribió:

> > your computer memory will define the time when it will crash Pd (it will
> > crash, when all the strings in the symboltable eat up all the memory
> > available)
> I think eating up all available memory is not a likely scenario on
> modern computers with a 64-bit address space and virtual memory.
>
> The actual problem is that Pd's symbol table is implemented as a hash
> table with seperate chaining
> (https://en.wikipedia.org/wiki/Hash_table#Separate_chaining). This means
> that symbols which have the same hash value (= hash collision) are
> connected as a linked list. If you look up a such a symbol, you have to
> walk the linked list and do a string comparison for each element until
> you find a match.
>
> The more symbols you add, the more hash collisions you get and the more
> symbols end up in the same bucket. In practice, this means that for a
> large number of elements, insertion and lookup becomes more and more
> expensive because the linked lists for each bucket grow larger and
> larger. While the hash lookup takes constant time "O(1)", the linked
> list traversal takes linear time "O(N)" (and is not very cache friendly).
>
> Good hash table implementations mitigate this problems by a technique
> called "rehashing" where the hash table array is resized if the "load
> factor" (number of elements / number of buckets) exceed a certain
> threshold (= "load factor"). Then every item is hashed again and
> inserted back into the new hash table.
>
> Unfortunately, Pd doesn't do any rehashing and it is certainly possible
> to significantly slow down Pd by flooding the symbol table. A common
> load factor threshold is 0.75, but if you create 10 000 unique symbols
> in Pd, the load factor would be ~10! Not very good...
>
> In fact, I've already been thinking about improving the symbol table.
>
> Christof
>
> On 01.10.2020 10:28, IOhannes m zmoelnig wrote:
> > On 2020-10-01 09:54, oliver wrote:
> >> IOhannes m zmoelnig wrote:
> >>> On 2020-10-01 09:22, hans w. koch wrote:
> >>>> but be aware of the risks of invoking makefilname all too often.
> >>> note that if you use dollsyms (as in `[$1$2(`) you are filling up the
> >>> symbol table just as well.
> >> i was just about to ask, if the attached modified patch would avoid that
> >> problem, but you replied already.
> >>
> >> could you please clarify the used term "invoke" a bit ?
> >> i guess the number of [makefilename] objects isn't the problem, but how
> >> much/often its conversion mechanism is used, right ?
> > yes (the latter).
> >
> >> does that mean that everytime a number->symbol conversion happens
> >> (regardless how it is done) the symboltable is filled and will
> > no.
> > everytime a *new* symbol is created.
> > the point of symbols (vs ordinary strings) is, that a single literal
> > only needs to be stored once.
> > so if you first create a string "rubadub" (however you do this), a new
> > entry for the symbol 'rubadub' is created.
> > now, if you concatenate the symbols 'rubad' and 'ub', the result is
> > "rubadub", which already happens to be in the symbol table (and thus no
> > new entry needs to be generated).
> > for Pd these strings are *identical*.
> > this is cool as we can really easily compare the two strings. if they
> > occupy the same entry in the symbol table (which basically means, that
> > Pd gets the same pointer for when turning the literal into a symbol),
> > then the two strings are the same.
> > so rather than having to compare each character of the string
> > "sjfdjdasjfsfjrueincru057894_curtrfenr3ewf8354j3wp57jp3" with each
> > character of "sjfdjdasjfsfjrueincru057894_curtrfenr3ewfB354j3wp57jp3" ,
> > Pd only needs to compare two pointers - and this can be done in a single
> > step on your CPU.
> >
> > the problem with generating symbol programmatically (e.g. by sending
> > numbers to [makefilename %d]) is, that it is so super easy to generate
> > lots and lots of (different) symbols.
> >
> >
> >> eventually slow down or crash PD ?
> >>
> >> so, as a live example: writing number values to GUI labels dynamically
> >> is a potentially dangerous thing ? what's the threshold there ?
> > your computer memory will define the time when it will crash Pd (it will
> > crash, when all the strings in the symboltable eat up all the memory
> > available)
> >
> > as for the slow-down, why not simply create a patch that tests this for
> you?
> >
> > create labels with [makefilename label%08d] with the input ranging from
> > 0...2000000 (or so; you'll notice when it gets slow).
> > measure the time it takes to generate the symbols (well, measure the
> > time it takes to generate 10000 symbols or)
> >
> >
> >> or is there any way to clear the symboltable ?
> > i think i covered this in another ("the" other) post quite recently.
> >
> >
> > gfsdm
> > IOhannes
> >
> >
> > _______________________________________________
> > Pd-list at lists.iem.at mailing list
> > UNSUBSCRIBE and account-management ->
> https://lists.puredata.info/listinfo/pd-list
>
>
>
> _______________________________________________
> Pd-list at lists.iem.at mailing list
> UNSUBSCRIBE and account-management ->
> https://lists.puredata.info/listinfo/pd-list
>


-- 
Maximiliano Estudies
*VDT Referat Beschallung*
+49 176 36784771
omslo.com
maxiestudies.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.puredata.info/pipermail/pd-list/attachments/20201004/fff0a325/attachment.html>


More information about the Pd-list mailing list