Technical Report TRCS98-37, Computer Science Department, University of California, Santa
Barbara, December 1998.
Ömer Egecioglu
How to Approximate the Inner-product: Fast Dynamic Algorithms for Euclidean Similarity
Abstract.
We develop dynamic dimensionality reduction based on the approximation of the standard
inner-product.
This results in a family of fast algorithms for checking similarity of objects whose
feature representations are large
dimensional real vectors, a common situtiton in various multimedia databases. The method
uses the power symmetric
functions of the components of the vectors, which are powers of the p-norms of the vectors
for p = 1, 2,.., m. The
number m of such norms used is a parameter of the algorithm whose simplest instance gives
a first-order
approximation implied by the Cauchy-Schwarz inequality. We show how to compute fixed
coefficients that work as
universal weights based on the moments of the probability density function assumed for the
distribution of the
components of the input vectors in the data set. If the distribution of the components of
the vectors is not known we
show how the method can be adapted to work dynamically by incremental adjustment of the
parameters.
Keywords: Distance approximation, inner-product, similarity search, dimensionality
reduction, least-squares.
omer@cs.ucsb.edu