Parallel Algorithms and Applications, Vol. I (1993), pp. 191-209.
Ömer Egecioglu and Ashok Srinivasan
Optimal Parallel Prefix on Mesh Architectures
Abstract.
Algorithms for efficient implementation of computation of prefix products on
mesh-connected processor arrays are presented. Assuming that an arithmetic
operation takes unit time and communication/computation ratio for a single
input item is $\tau$, we show that the prefixes of $n$ items can be computed
in time $2\tau\sqrt{n} + O(\log n)$ on a square mesh with $n$ processors.
If $n$ processors are configured as a disc with respect to the Manhattan metric,
then the parallel time for the problem becomes
$\sqrt{2}\tau \sqrt{n} + O(\sqrt{\tau}\sqrt[4]{n})$. We show that both
of these algorithms are asymptotically optimal.
omer@cs.ucsb.edu