Proc. 11-th Int. Conf. on Parallel and Distributed Computing and Systems
(PDCS'99), Nov. 1999, Boston, pp. 194-200.
Ömer Egecioglu and Hakan Ferhatosmanoglu
Circular Data-Space Partitioning for
Similarity Queries and Parallel Disk Allocation
Abstract.
In a multiple disk environment it is desirable to
have techniques for efficient parallel execution of similarity queries.
Usually many buckets that may have the query result are needed to
be retrieved from secondary storage, which is a costly operation.
To achieve efficiency,
there are two major factors that need to be considered. These are
the number of buckets retrieved by the query, and the
degree of parallelism provided by the disk allocation method.
In this paper, we develop efficient techniques for parallel similarity
searching by optimizing these two factors defined for
data-sets that are circular in nature, and similarity queries consisting of
query spheres centered at the query point.
Our partitioning technique
minimizes the expected number of buckets retrieved by a random query
among a spectrum of partitioning schemes which have
equi-area concentric rings and equi-area central wedges as its two extremes.
A simple disk allocation technique for the proposed partitioning
method that maximizes
the degree of parallelism obtained is also described.
omer@cs.ucsb.edu