in ``Parallel, Neural, Systolic Algorithms and Applications,''
(Michael P. Bekakos, Ed.), Advances in High Performance Computing,
Volume 5, Computational Mechanics Publications (to appear).
Ömer Eğecioğlu, Peter Cappello and Chris Scheiman
Introduction to Processor-Time-Optimal Systolic Arrays
Abstract.
We consider computations suitable for systolic arrays,
often called regular array computations or systems of
uniform recurrence relations. In such computations, the
tasks to be computed are viewed as the nodes of a
directed acyclic graph (dag), where the data dependencies
are represented as arcs. A processor-time-minimal
schedule completes in the minimum time permitted by the
dag, while using the fewest processors needed to achieve
that completion time. We present a
technique for finding a lower bound on the number of
processors needed to achieve a given schedule of an
algorithm represented as a dag. The application of this
technique is illustrated with a tensor product computation.
We then consider the free schedule of algorithms
for matrix product, Gaussian elimination, and transitive
closure. For each problem, we provide a time-minimal
processor schedule that meets the computed processor lower
bounds, including the one for tensor product.
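To make the dag model concrete, the following sketch computes a free schedule (each task fires as soon as all of its predecessors have fired) and the resulting lower bound on processors: at any time step, concurrent tasks need distinct processors. This is an illustrative reconstruction, not the authors' code; the function names and the assumption that the node list is topologically sorted are ours.

```python
from collections import defaultdict

def free_schedule(nodes, arcs):
    """Free (earliest-possible) schedule of a dag.

    `nodes` is assumed to be in topological order; `arcs` is a list
    of (u, v) pairs meaning task v depends on task u.  Each task is
    scheduled one step after the latest of its predecessors.
    """
    preds = defaultdict(list)
    for u, v in arcs:
        preds[v].append(u)
    time = {}
    for v in nodes:
        time[v] = 1 + max((time[u] for u in preds[v]), default=0)
    return time

def processor_lower_bound(time):
    """A schedule needs at least max_t |{v : time(v) = t}| processors,
    since tasks scheduled at the same step must run on distinct processors."""
    counts = defaultdict(int)
    for t in time.values():
        counts[t] += 1
    return max(counts.values())

# A diamond-shaped dag: b and c both depend on a; d depends on both.
t = free_schedule(['a', 'b', 'c', 'd'],
                  [('a', 'b'), ('a', 'c'), ('b', 'd'), ('c', 'd')])
# b and c run concurrently at step 2, so at least 2 processors are needed.
```

This per-step concurrency count is only a simple lower bound for the given schedule; the paper's technique establishes lower bounds that hold over all time-minimal schedules.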
omer@cs.ucsb.edu