Int. J. of Foundations of Computer Sci. 10 (1), 1999, pp. 81-102.
Ömer Egecioglu, Amr El Abbadi, and Kamil Sarac
DFT Techniques for Size Estimation of Database Join Operations
Abstract.
Novel algorithms based on the Discrete Fourier Transform (DFT)
are proposed to estimate
the size of relations resulting from join operations.
We start with an approach in which the frequency distribution
values are transformed using the DFT
and the Fourier coefficients are used to
construct histograms.
Our main contribution is a direct approach which uses
the amplitudes of the DFT coefficients iteratively.
The proposed algorithm gives
the exact join size using logarithmic space
for the special case of self join. A generalization to compute
the join of arbitrary relations is
then used to develop two tree-based techniques that
provide a spectrum
of algorithms which interpolate storage requirements versus accuracy of
the estimation obtained.
Finally, we present experimental results to exhibit the
effectiveness of our approach.
omer@cs.ucsb.edu