[PD] ordering arrays based in similarity

Mathieu Bouchard matju at artengine.ca
Mon Mar 5 17:31:53 CET 2012


On 2012-03-05 at 09:09:00, William Brent wrote:

> If you do end up treating each of your messages like a vector, the 
> [choice] extern is good for this kind of thing.  But you have to 
> implement the weighting on your own before sending anything to it.

«Vector» is quite a polysemous word... it's worth noting that here it
means an element of a Euclidean space (a finite-dimensional Hilbert space)
with the plain dot-product as its inner product. So the concept doesn't
have much to do with std::vector (C++) or java.util.Vector (Java).

One other thing to note is that this approach doesn't compute the
euclidean distance between points, which is the other easiest possible
comparison.

If you have three numbers a b c that you wish to compare with x y z, the
dot-product used in your solution is a*x+b*y+c*z, which is then divided by
sqrt((a²+b²+c²)*(x²+y²+z²)), a scale factor that brings the similarity to
something between -1 and +1 (this is usually called the cosine
similarity). In that case, +1 is the best match, -1 is the best backwards
match (if you use negatives), and 0 is the most unrelated thing possible.
The result is undefined if either triple is all zeros, because the scale
factor is then 0.
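
To make the formula concrete, here is a minimal sketch in Python (not Pd,
and not the code of [choice] itself). The function name cosine_similarity
and the sample numbers are only assumptions for illustration; it works for
any number of elements, not just three.

```python
import math

def cosine_similarity(u, v):
    """Dot-product of u and v, divided by the scale factor sqrt(|u|² * |v|²)."""
    dot = sum(a * x for a, x in zip(u, v))
    scale = math.sqrt(sum(a * a for a in u) * sum(x * x for x in v))
    if scale == 0:
        # An all-zero input has no direction, so there is nothing to compare.
        raise ValueError("similarity is undefined when either input is all zeros")
    return dot / scale

same_direction = cosine_similarity([1, 2, 3], [2, 4, 6])    # best match
opposite = cosine_similarity([1, 2, 3], [-1, -2, -3])       # best backwards match
unrelated = cosine_similarity([1, 0, 0], [0, 1, 0])         # perpendicular
```

Note that [2, 4, 6] counts as a perfect match for [1, 2, 3]: only the
direction matters, not the overall size.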

The euclidean distance, however, is sqrt((a-x)²+(b-y)²+(c-z)²).
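
The same formula as a small Python sketch, for comparison. The name
euclidean_distance and the sample points are assumptions made up for the
example; on Python 3.8 or later, math.dist(p, q) from the standard library
should give the same result.

```python
import math

def euclidean_distance(p, q):
    """Square root of the summed squared differences, element by element."""
    return math.sqrt(sum((a - x) ** 2 for a, x in zip(p, q)))

# The differences are 3, 4 and 0, so this is the familiar 3-4-5 triangle.
d = euclidean_distance([1, 2, 3], [4, 6, 3])
```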

The dot-product thing is like comparing angles between arrows, to find
which arrow points in the most similar direction; the euclidean distance
is like comparing distances between points, to find which is closest.
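
The two comparisons can pick different winners. The following Python
sketch is only an illustration: the helper names, the query and the two
candidates are all made up to show one arrow that points the same way but
is far off, and one point that is nearby but tilted.

```python
import math

def cosine_similarity(u, v):
    # Compares directions (angles between arrows); assumes no all-zero input.
    dot = sum(a * x for a, x in zip(u, v))
    return dot / math.sqrt(sum(a * a for a in u) * sum(x * x for x in v))

def euclidean_distance(p, q):
    # Compares positions (distances between points).
    return math.sqrt(sum((a - x) ** 2 for a, x in zip(p, q)))

query = [1, 1, 1]
candidates = {
    "far_but_aligned": [10, 10, 10],  # same direction as query, much longer
    "near_but_tilted": [1, 1, 0],     # close to query, different direction
}

# Highest similarity wins for angles; lowest distance wins for points.
best_by_angle = max(
    candidates, key=lambda name: cosine_similarity(query, candidates[name]))
closest_by_distance = min(
    candidates, key=lambda name: euclidean_distance(query, candidates[name]))
```

Here best_by_angle is "far_but_aligned" while closest_by_distance is
"near_but_tilted", so which one to use depends on whether overall
magnitude should count.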

Both are extremely common and useful.

  ______________________________________________________________________
| Mathieu BOUCHARD ----- téléphone : +1.514.383.3801 ----- Montréal, QC

