Faculty Research Seminar
Winter 2000
Description:
This seminar is an introduction to research for graduate students.
For the first two weeks we will have talks by established researchers
on how to conduct research. After that, each week
one of the faculty members will talk about their
research. This is a great way to learn about the
ongoing research projects in the department.
Course number: 595J
Enrollment Code: 68932
Time: Fridays 1-2pm
Place: CS conference room
This week's talk
Friday, March 16th, at 1pm:
"Large and Long-Lived Parallel Computation using Java on the Internet"
by Peter Cappello
The research concerns a Java-based infrastructure intended to harness the
Internet's vast, growing computational capacity for ultra-large,
coarse-grained parallel applications.
The purpose of this research is to:
1) transform large heterogeneous computer networks, even the Internet
itself,
into a monolithic, multi-user, always-available multiprocessor;
2) solve some world-record-size computational problems;
3) via a simple API, allow designers to focus on a recursive
decomposition/composition of the parallelizable part of the computation
(see the sketch below).
In summary, the application programmer will get the performance benefits
of massive parallelism without the costs that typically attend it:
adulterating the application logic with
an interprocessor communication protocol,
topology-specific (e.g., hypercube) interprocessor communication, and
fault tolerance schemes.
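To illustrate item 3, here is a minimal sketch of what such a
decomposition/composition API might look like. The Task interface below is
hypothetical, invented for illustration; it is not the project's actual API.

    import java.io.Serializable;
    import java.util.List;

    // Hypothetical divide-and-conquer task interface: a task either
    // computes its result directly or decomposes into subtasks whose
    // results are later composed.
    public interface Task extends Serializable {

        /** True when this task is small enough to run on a single host. */
        boolean isAtomic();

        /** Compute the result directly; called only when isAtomic() is true. */
        Object execute();

        /** Decompose this task into independent subtasks (each a Task). */
        List decompose();

        /** Compose the subtasks' results into this task's result. */
        Object compose(List subResults);
    }

A host would repeatedly take a task and either execute it or decompose it
into further tasks, with results passed back up for composition;
communication, scheduling, and fault tolerance remain inside the
infrastructure.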
The research includes implementing several widely applicable algorithms
and deploying parallel implementations on well
over a thousand geographically dispersed processors.
This sets the stage for using tens of thousands of processors.
These computations include the optimization versions of some NP-hard
problems (e.g., the traveling salesman problem and integer linear programming)
and several scientific computations
(e.g., the conjugate gradient method for solving linear systems iteratively
and the N-body problem).
Perhaps most challenging, and most revealing of the architecture's limits,
is the N-body problem.
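As a concrete (and again hypothetical) example of this style, a
traveling-salesman search can express its branching as decomposition and
its minimum-taking as composition. The sketch below uses the hypothetical
Task interface above; for brevity it enumerates tours exhaustively, with no
lower-bound pruning, whereas a real branch-and-bound implementation would
use coarser atomic tasks and prune.

    import java.util.ArrayList;
    import java.util.List;

    // Hedged sketch of a traveling-salesman task expressed as
    // decomposition (branch on the next city) and composition (take the
    // minimum tour cost over all branches).
    public class TspTask implements Task {
        private final double[][] dist;   // symmetric distance matrix
        private final List path;         // Integer city indices visited so far
        private final boolean[] visited; // visited[c] is true if city c is on the path
        private final double cost;       // cost of the partial path

        public TspTask(double[][] dist, List path, boolean[] visited, double cost) {
            this.dist = dist; this.path = path; this.visited = visited; this.cost = cost;
        }

        /** The root task: start the tour at city 0. */
        public static TspTask root(double[][] dist) {
            List path = new ArrayList();
            path.add(new Integer(0));
            boolean[] visited = new boolean[dist.length];
            visited[0] = true;
            return new TspTask(dist, path, visited, 0.0);
        }

        /** Atomic once every city has been placed on the path. */
        public boolean isAtomic() { return path.size() == dist.length; }

        /** Close the tour back to the start city and report its total cost. */
        public Object execute() {
            int last = ((Integer) path.get(path.size() - 1)).intValue();
            int first = ((Integer) path.get(0)).intValue();
            return new Double(cost + dist[last][first]);
        }

        /** Branch: one subtask per unvisited next city. */
        public List decompose() {
            int last = ((Integer) path.get(path.size() - 1)).intValue();
            List subtasks = new ArrayList();
            for (int city = 0; city < dist.length; city++) {
                if (!visited[city]) {
                    List nextPath = new ArrayList(path);
                    nextPath.add(new Integer(city));
                    boolean[] nextVisited = (boolean[]) visited.clone();
                    nextVisited[city] = true;
                    subtasks.add(new TspTask(dist, nextPath, nextVisited,
                                             cost + dist[last][city]));
                }
            }
            return subtasks;
        }

        /** Compose: the best tour over all branches is the minimum cost. */
        public Object compose(List subResults) {
            double best = Double.MAX_VALUE;
            for (int i = 0; i < subResults.size(); i++) {
                best = Math.min(best, ((Double) subResults.get(i)).doubleValue());
            }
            return new Double(best);
        }
    }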
An application is appropriate
if an execution time of a few minutes, say 10, is acceptable.
For example, a branch-and-bound problem that takes 100,000 minutes
(more than 2 months) on one processor
should be solvable in 10 minutes on 10,000 internetworked processors.
However, if a problem takes 10,000 seconds on a single processor,
it cannot be solved in 1 second with 10,000 processors;
Internet latencies preclude parallelism that fine-grained.
Thus, virtual reality applications, for example,
would not be appropriate for this architecture.
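A back-of-the-envelope model makes the granularity argument concrete. The
per-task overhead figure below is an illustrative assumption, not a
measurement from this project.

    // Rough granularity estimate: parallel time is about
    // (serial work / number of hosts) plus per-task Internet overhead.
    // The 1-second overhead figure is an illustrative assumption.
    public class GranularityEstimate {
        public static void main(String[] args) {
            double hosts = 10000.0;
            double overheadSec = 1.0;  // assumed scheduling + latency cost per task

            // Coarse grain: 100,000 minutes of serial work,
            // roughly one 10-minute task per host.
            double coarseSec = (100000.0 * 60.0) / hosts + overheadSec;
            System.out.println("coarse: about " + (coarseSec / 60.0) + " minutes"); // ~10 min

            // Fine grain: 10,000 seconds of serial work, about 1 second of
            // work per host; the assumed overhead is as large as the useful
            // work, so the 1-second target cannot be met.
            double fineSec = 10000.0 / hosts + overheadSec;
            System.out.println("fine: about " + fineSec + " seconds"); // ~2 s, not 1 s
        }
    }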
The software for hosting or brokering such computations
will be downloadable from a web site, as will
the software for developing and deploying applications
on the host/broker network.
The web site will contain tutorials, demonstrations,
and a repository for users to share their work.
A mailing list will facilitate communication among the user
community.
Network statistics
(e.g., how many hosts are available and
the average amount of time a host is available) will be gathered,
aggregated, and displayed in quasi-real time via a web interface.
Another visualization tool will gather/display
interprocessor communication for actual computations.
Seeing these communications "in action"
will give insight into the application's decomposition/composition
process,
and visually reveal the communication patterns associated with the task
scheduling
and fault tolerance mechanisms.
The use of multicast and JavaSpaces will be investigated in connection
with these research goals.
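One natural way to use JavaSpaces here is as a shared "bag of tasks": a
broker writes task entries into a space, and each host repeatedly takes an
entry, computes, and writes a result back. The sketch below assumes a
JavaSpace service has already been located via Jini lookup; the
SpaceLocator.findSpace() helper and the entry fields are hypothetical, not
part of this project.

    import net.jini.core.entry.Entry;
    import net.jini.core.lease.Lease;
    import net.jini.space.JavaSpace;

    // Entry classes must have public fields and a public no-argument
    // constructor so the space can match and reconstruct them.
    public class TaskEntry implements Entry {
        public Integer taskId;     // which piece of the computation this is
        public byte[] payload;     // serialized task description (hypothetical)
        public TaskEntry() {}      // required no-arg constructor
    }

    class ResultEntry implements Entry {
        public Integer taskId;
        public byte[] result;
        public ResultEntry() {}
    }

    class Worker {
        public static void main(String[] args) throws Exception {
            // Hypothetical helper that performs the Jini lookup (omitted).
            JavaSpace space = SpaceLocator.findSpace();
            TaskEntry template = new TaskEntry();  // null fields match any task
            while (true) {
                // Block until a task is available and remove it from the space...
                TaskEntry task = (TaskEntry) space.take(template, null, Long.MAX_VALUE);
                byte[] answer = compute(task.payload);
                ResultEntry r = new ResultEntry();
                r.taskId = task.taskId;
                r.result = answer;
                // ...then write the result back for the broker to collect.
                space.write(r, null, Lease.FOREVER);
            }
        }

        // Placeholder for the application-specific computation.
        private static byte[] compute(byte[] payload) { return payload; }
    }

Whether such a space-based task bag scales to thousands of Internet hosts,
and whether multicast can help disseminate tasks and results, is part of
what would be investigated.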
Previous Talks