[PD-dev] Work on "help" menu and help-patch searching spec

Mon Apr 25 15:15:50 CEST 2005

morning folks,

Just to add my € 0.02, I think Ben's idea of an integrated help-file
structuring and searching mechanism would be quite useful -- I'm
currently using a perl script and grep to index my own abstractions
along similar lines (topmost comment in the file together with
some rather arbitrary string matching conventions which probably
only apply for me), but a more comprehensive catalog, especially
for builtin objects and externals, would be very helpful indeed.
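
To make the "topmost comment" idea concrete, here is a rough Python
sketch (not the perl script mentioned above, just an illustration).
It assumes that a patch file stores each comment as a record of the
form "#X text <x> <y> <words...>;" and treats the comment with the
smallest y coordinate as the description.  The names COMMENT_RE,
topmost_comment and search are made up for this example, and comments
inside subpatches are not treated specially.

```python
import re

# Assumption: comments are saved as "#X text <x> <y> <words...>;" records,
# possibly wrapped over several lines, with literal semicolons escaped as "\;".
COMMENT_RE = re.compile(r"#X text (-?\d+) (-?\d+) (.*?)(?<!\\);", re.DOTALL)


def topmost_comment(patch_text):
    """Return the comment with the smallest y coordinate, or None."""
    comments = [
        (int(y), int(x), " ".join(body.split()))  # collapse wrapped whitespace
        for x, y, body in COMMENT_RE.findall(patch_text)
    ]
    return min(comments)[2] if comments else None


def search(index, keyword):
    """Return sorted names of patches whose indexed comment mentions keyword."""
    keyword = keyword.lower()
    return sorted(
        name for name, text in index.items()
        if text and keyword in text.lower()
    )
```

An index would then just be a dict from file name to
topmost_comment(contents), which search() scans with a plain
case-insensitive substring match, roughly what grep does.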

On 25 April 2005 at 14:05:18, IOhannes m zmoelnig wrote:
 > yes, i still think that full-object-browsing will lead to 
 > information-overdose if we don't limit ourselves to some structuring 
 > mechanism, which in turn i think will be problematic (given the 
 > different characters of pd-developers)

Agreed.  Still, I think the option of adding one's own (arbitrary)
keywords / structuring conventions is important too -- pre-defined
hard categorization schemes have a disturbing tendency to go all
goopy at the edges when they're not dictatorially enforced, which
I don't think anyone here really wants to do or to be done; maybe
the solution is just as simple as differentiating between "browsing"
and "searching"?

 > probably we should ping bryan, as he might be the one with most 
 > knowledge on auto-clustering words.

Consider me ping'd [icmp_seq=1 ttl=64 time~=3600000 ms] ;-)

As it turns out, I happen currently to be working on an unsupervised
word clustering system, although not directly along semantic lines.
Still, there are methods for generating full-blown hierarchies and/or
"flat" clusters based only on, say, word co-occurrences.  There's
even a nifty technique to find the most salient dimensions (highest
variance) in a feature data space.  Problems include (as you might
expect): unsupervised clustering doesn't necessarily get you
meaningful groupings; and perhaps more importantly from a user
standpoint, it doesn't get you meaningful cluster labels (there are
ways to work around this too, but they don't solve the basic problem);
lastly, auto-clustering is computationally very expensive -- I might
try running documentation comments through a co-occurrence clustering
algorithm, but I don't see any of these techniques becoming really
useful at runtime -- at most we could use them to help induce an
initial breakdown for existing objects...
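
As a toy illustration of "flat" clusters from word co-occurrences,
here is a minimal Python sketch.  It is not the system I'm working on:
it assumes each documentation comment is one whitespace-separated
string, represents a word by the counts of the other words it shares
a comment with, and greedily groups words whose count vectors have a
cosine similarity above an arbitrary threshold.  All function names
and the 0.5 threshold are made up for the example.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cooccurrence_vectors(docs):
    """Map each word to a Counter of the words it shares a document with."""
    vectors = {}
    for doc in docs:
        words = sorted(set(doc.lower().split()))
        for a, b in combinations(words, 2):
            vectors.setdefault(a, Counter())[b] += 1
            vectors.setdefault(b, Counter())[a] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def flat_clusters(vectors, threshold=0.5):
    """Greedy single pass: a word joins the first cluster whose seed is similar enough."""
    clusters = []  # list of (seed_word, members)
    for word in sorted(vectors):
        for seed, members in clusters:
            if cosine(vectors[word], vectors[seed]) >= threshold:
                members.append(word)
                break
        else:
            clusters.append((word, [word]))
    return [members for _, members in clusters]
```

Note that the clusters come back as bare word lists without labels,
and that the pairwise comparisons grow quickly with vocabulary size,
which is the kind of cost that makes this unattractive at runtime.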

marmosets,
	Bryan