Open softwares maintained by PRML lab
Fast PC k-NN algorithm
We developed a probably correct k-nearest neighbor search algorithm that works even for very large dimensionality. It receives a torelance error and outputs a correct answer with an error below the specified error. The key idea is very simple, if a pair is close enough to each other in the original high-dimensional space, then they must be close even in a few dimensions. It is clearly true if we have a sufficient number of samples. However, we have to be careful that the order cannot be presereved anymore, that is, 1-NN in the original space cannot stay as 1-NN in the projection space. So, we evaluate in probability the possible distance of 1-NN’s in the selected projection space of a small number of (PCA) features. For a given torelance error, we narrow the search area in the projection space so as to keep the possiblity of outreach the area is less than the specified value. For the detail, see reference [1]
[1] J. Toyama, M. Kudo and H. Imai, “Probably Correct k-Nearest Neighbor Search in High Dimensions”, Pattern Recognition, in press, (DOI information: 10.1016/j.patcog.2009.09.026).
- kNN_MDS(Ver1.3):mds_rb1.0.tgz