[PD] ordering arrays based in similarity
Mathieu Bouchard
matju at artengine.ca
Mon Mar 5 17:31:53 CET 2012
On 2012-03-05 at 09:09:00, William Brent wrote:
> If you do end up treating each of your messages like a vector, the
> [choice] extern is good for this kind of thing. But you have to
> implement the weighting on your own before sending anything to it.
«Vector» is quite a polysemic word... it's worth noting that here it means
an element of a Hilbert space (in practice, plain R^n) using the ordinary
dot-product. So the concept doesn't have much to do with std::vector (C++)
or java.util.Vector (Java).
One other thing to note is that it doesn't compute the Euclidean
distance between points, which is the other easiest possible comparison.
If you have three numbers a b c that you wish to compare with x y z, the
dot-product used in your solution is a*x+b*y+c*z, which is then divided by
sqrt((a²+b²+c²)*(x²+y²+z²)), a scale factor that brings back the
similarity to something between -1 and +1. In that case, +1 is the best
match, -1 is the best backwards match (if you use negatives), and 0 is the
most unrelated thing possible.
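To make the formula above concrete, here is a minimal Python sketch of that normalised dot-product. The function name cosine_similarity is just an illustrative choice, not something taken from [choice] or any other extern, and the zero-vector check is my own assumption about how to handle the case where the scale factor is 0.

```python
import math

def cosine_similarity(u, v):
    """Dot product of u and v, divided by sqrt(sum(u^2) * sum(v^2))."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    # a*x + b*y + c*z, for any number of components
    dot = sum(a * x for a, x in zip(u, v))
    # sqrt((a²+b²+c²) * (x²+y²+z²)), the scale factor
    scale = math.sqrt(sum(a * a for a in u) * sum(x * x for x in v))
    if scale == 0:
        # Assumption: treat an all-zero vector as an error,
        # because the division is undefined in that case.
        raise ValueError("undefined for an all-zero vector")
    return dot / scale
```

With this, two arrows pointing the same way give +1 whatever their lengths, opposite arrows give -1, and perpendicular ones give 0.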
The Euclidean distance, however, is sqrt((a-x)²+(b-y)²+(c-z)²).
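The same distance formula as a small Python sketch, generalised to any number of components. The name euclidean_distance is only illustrative.

```python
import math

def euclidean_distance(u, v):
    """sqrt((a-x)² + (b-y)² + (c-z)²), for any number of components."""
    if len(u) != len(v):
        raise ValueError("points must have the same number of coordinates")
    return math.sqrt(sum((a - x) ** 2 for a, x in zip(u, v)))
```

Here 0 means the two points coincide, and there is no upper bound: the result grows with how far apart the points are.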
The dot-product thing is like comparing angles between arrows, to find
which arrow points in the most similar direction; the Euclidean distance
is like comparing distances between points, to find which is closest.
Both are extremely common and useful.
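Since the thread is about ordering arrays by similarity, here is a hedged Python sketch showing that the two comparisons can rank the same candidates differently. All names and the sample data (target, candidates, by_angle, by_distance) are made up for illustration, and this is not how any particular Pd extern does it.

```python
import math

def cosine_similarity(u, v):
    # Normalised dot-product: +1 same direction, -1 opposite direction.
    dot = sum(a * x for a, x in zip(u, v))
    scale = math.sqrt(sum(a * a for a in u) * sum(x * x for x in v))
    return dot / scale  # assumes neither vector is all zeros

def euclidean_distance(u, v):
    # Straight-line distance between two points.
    return math.sqrt(sum((a - x) ** 2 for a, x in zip(u, v)))

target = [1, 1]
candidates = [[10, 10], [1, 0], [-1, -1]]

# Best match first: highest similarity, or smallest distance.
by_angle = sorted(candidates,
                  key=lambda c: cosine_similarity(target, c),
                  reverse=True)
by_distance = sorted(candidates,
                     key=lambda c: euclidean_distance(target, c))

print(by_angle)     # [[10, 10], [1, 0], [-1, -1]]
print(by_distance)  # [[1, 0], [-1, -1], [10, 10]]
```

[10, 10] points in exactly the same direction as the target, so it wins on angle, yet it is the farthest point away, so it comes last on distance. Which ordering you want depends on whether the overall magnitude of your messages should count.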
______________________________________________________________________
| Mathieu BOUCHARD ----- phone: +1.514.383.3801 ----- Montréal, QC