Neptune: Programming and Runtime Support for Cluster-based Network Services.

University of California at Santa Barbara

This project studies infrastructural software that provides programming and runtime support for service clustering, replication, and aggregation on large server clusters with incremental scalability, 24x7 availability, and high manageability. The research issues addressed in Neptune include replication consistency, fault tolerance, load balancing, service differentiation, data aggregation, and cluster resource management.

Research and System Development

The Neptune Clustering Software: The software provides a programming API and runtime support that allow a network service to be programmed quickly to run on a large-scale cluster and handle high-volume user traffic. The system shields application programmers from the complexities of replication, service discovery, failure detection and recovery, load balancing, and resource monitoring and management.

Neptune software has been deployed at www.ask.com and www.teoma.com since 2001 for programming and managing their large service clusters.

Service Replication: Neptune studies the clustering of replicated network services when the persistent service data is frequently updated. It provides a flexible interface to glue and replicate existing service modules and accommodate a variety of underlying storage mechanisms. Neptune maintains dynamic and location-transparent mapping to isolate faulty service modules and enforce replica consistency. The paper in USITS'01 presents Neptune's initial overall architecture and data replication support, and illustrates its performance using three network services.
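The idea of a location-transparent mapping that isolates faulty modules can be illustrated with a small sketch. It is not Neptune's API: the class `ServiceDirectory` and its methods are hypothetical names, and replica consistency is deliberately left out. Clients ask for a service partition by name and only receive replicas that are currently believed healthy.

```python
class ServiceDirectory:
    """Hypothetical directory: (service, partition) -> replica nodes."""

    def __init__(self):
        self._replicas = {}   # (service, partition) -> list of node names
        self._failed = set()  # nodes currently considered faulty

    def register(self, service, partition, node):
        # Record that `node` hosts a replica of this service partition.
        self._replicas.setdefault((service, partition), []).append(node)

    def mark_failed(self, node):
        # Isolate a faulty node; lookups stop returning it.
        self._failed.add(node)

    def mark_recovered(self, node):
        # Make a repaired node visible to clients again.
        self._failed.discard(node)

    def lookup(self, service, partition):
        # Clients never name a machine; they get the healthy replicas.
        nodes = self._replicas.get((service, partition), [])
        return [n for n in nodes if n not in self._failed]
```

Because callers address a partition rather than a machine, a node can fail or rejoin without any change on the client side.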

Fine-grain Service Load-balancing: Load balancing on a cluster of machines has been studied extensively in the literature, mainly focusing on coarse-grain distributed computation. Fine-grain services introduce additional challenges because their system states fluctuate rapidly and system performance is highly sensitive to various overheads. Through simulations and a prototype system implementation on a Linux cluster, our study concludes that: 1) Random-polling-based load-balancing policies are well-suited for fine-grain network services; 2) A small poll size provides sufficient information for load balancing, while an excessively large poll size may in fact degrade performance due to polling overhead; 3) Discarding slow-responding polls can further improve system performance.
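The three conclusions above can be made concrete with a minimal sketch of random polling. The function name `pick_server`, the use of queue length as the load index, and the `latencies`/`deadline` parameters are assumptions for illustration, not the prototype's actual interface.

```python
import random


def pick_server(loads, poll_size, rng, latencies=None, deadline=None):
    """Poll a few random servers and return the least-loaded one.

    loads      -- dict: server name -> queue length (assumed load index)
    poll_size  -- number of servers to poll; kept small on purpose
    latencies  -- optional dict: server name -> poll reply time
    deadline   -- optional cutoff; slower poll replies are discarded
    """
    polled = rng.sample(sorted(loads), min(poll_size, len(loads)))
    if latencies is not None and deadline is not None:
        fast = [s for s in polled if latencies[s] <= deadline]
        if not fast:
            # Every reply was slow: pick a polled server at random
            # instead of waiting for stale load information.
            return rng.choice(polled)
        polled = fast
    return min(polled, key=lambda s: loads[s])
```

With a fixed `poll_size` of two or three, the per-request polling cost stays constant regardless of cluster size, which is the trade-off the second conclusion points at.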

Service Differentiation: Quality of service (QoS) support that provides customized service qualities to multiple classes of client requests can effectively utilize available system resources. Our resource management framework strives to achieve four goals: 1) The framework provides a flexible mechanism for service providers to specify a variety of desired service quality metrics through QoS yield functions. 2) The framework provisions service differentiation and admission control for multiple classes of service accesses by employing both the yield functions and resource allocation guarantees. 3) The framework achieves efficient resource utilization by producing high aggregate QoS yield through greedy scheduling and speculative admission control that drops zero- or low-yield requests. 4) The design of this resource management framework takes into consideration service scalability and availability by deploying a decentralized two-level request distribution and scheduling scheme.
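A rough sketch of how yield functions, greedy scheduling, and speculative admission control might fit together follows. The linear-decay yield shape, the names `make_yield_fn` and `schedule_next`, and the request fields are all assumptions chosen for illustration; the framework's actual policies are described in the OSDI '02 paper and may differ.

```python
def make_yield_fn(full_yield, soft_deadline, hard_deadline):
    """Assumed yield shape: full yield up to the soft deadline,
    then linear decay to zero at the hard deadline."""
    def yield_fn(response_time):
        if response_time <= soft_deadline:
            return full_yield
        if response_time >= hard_deadline:
            return 0.0
        remaining = hard_deadline - response_time
        return full_yield * remaining / (hard_deadline - soft_deadline)
    return yield_fn


def schedule_next(queue, now, min_yield=0.0):
    """Greedy pick with speculative dropping.

    queue -- list of dicts with 'arrival', 'service_time', 'yield_fn'
    Returns (chosen request or None, requests kept, requests dropped).
    """
    admitted, dropped = [], []
    for req in queue:
        # Yield this request would earn if it were served right now.
        response_time = now + req["service_time"] - req["arrival"]
        expected = req["yield_fn"](response_time)
        if expected <= min_yield:
            dropped.append(req)          # zero- or low-yield: drop early
        else:
            admitted.append((expected, req))
    if not admitted:
        return None, [], dropped
    best = max(admitted, key=lambda pair: pair[0])[1]
    kept = [req for _, req in admitted if req is not best]
    return best, kept, dropped
```

Each service class would get its own yield function, for example a high `full_yield` with a loose hard deadline for a premium class and a low one with a tight deadline for a best-effort class.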

Scalable Data Aggregation: Large-scale cluster-based Internet services often host partitioned datasets to provide incremental scalability. The aggregation of results produced from multiple partitions is a fundamental building block for the delivery of these services. We design and implement a programming primitive -- Data Aggregation Call (DAC) -- to exploit partition parallelism for cluster-based Internet services in Neptune. Our architecture design aims at improving interactive responses with sustained throughput for typical cluster environments where platform heterogeneity and software/hardware failures are common. At the cluster level, our load-adaptive reduction tree construction algorithm balances processing and aggregation load across servers while exploiting partition parallelism. Inside each node, we employ an event-driven thread pool design that prevents slow nodes from adversely affecting system throughput under highly concurrent workloads.
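To illustrate aggregation over a reduction tree, here is a simplified sketch. It is not the DAC primitive or the load-adaptive construction algorithm from the PPoPP paper: the heuristic of placing the least-loaded servers nearest the root, and the names `build_tree` and `aggregate`, are assumptions made for this example. The recursion runs sequentially here, whereas a real system would evaluate subtrees in parallel.

```python
def build_tree(loads, fanout):
    """Return {server: parent}, least-loaded servers nearest the root.

    Assumes at least one server. Lightly loaded servers become interior
    nodes, since interior nodes do the extra aggregation work.
    """
    order = sorted(loads, key=lambda s: (loads[s], s))
    return {
        s: (order[(i - 1) // fanout] if i > 0 else None)
        for i, s in enumerate(order)
    }


def aggregate(parent, local_results, reduce_fn):
    """Combine per-partition results bottom-up along the tree."""
    children = {s: [] for s in parent}
    root = None
    for server, p in parent.items():
        if p is None:
            root = server
        else:
            children[p].append(server)

    def visit(node):
        # Merge this node's own partition result with its subtrees.
        result = local_results[node]
        for child in children[node]:
            result = reduce_fn(result, visit(child))
        return result

    return visit(root)
```

A search-style use would pass per-partition hit lists as `local_results` and a top-k merge as `reduce_fn`, so each interior node forwards at most k hits upward.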

Download

Neptune source code with a sample service (version 1.0)

Selected Publications

Jingyu Zhou and Tao Yang. Selective Early Request Termination for Busy Internet Services. In 15th International World Wide Web Conference (WWW2006), Edinburgh, Scotland, May 2006.
Jingyu Zhou, Lingkun Chu, and Tao Yang. An Efficient Topology-Adaptive Membership Protocol for Large-Scale Cluster-Based Services. In Proc. of International Parallel and Distributed Processing Symposium (IPDPS'05), Denver CO, April 2005.
Lingkun Chu, Kai Shen, Hong Tang, Tao Yang, and Jingyu Zhou. Dependency Isolation for Thread-based Multi-tier Internet Services. In Proc. of IEEE INFOCOM, Miami FL, March 2005.
Kai Shen, Lingkun Chu, and Tao Yang. Supporting Cluster-based Network Services on Functionally Symmetric Software Architecture. In Proc. of SC2004: High Performance Computing, Networking and Storage Conference, Pittsburgh PA, November 2004.
Kai Shen, Tao Yang, and Lingkun Chu. Clustering Support and Replication Management for Scalable Network Services. In IEEE Transactions on Parallel and Distributed Systems - Special Issue on Middleware, Volume 14, Number 11, Pages 1168-1179, November 2003.
Lingkun Chu, Hong Tang, Tao Yang, and Kai Shen. Optimizing Data Aggregation for Cluster-based Internet Services. In Proc. of ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), Pages 119-130, San Diego CA, June 2003.
Kai Shen, Hong Tang, Tao Yang, and Lingkun Chu. Integrated Resource Management for Cluster-based Internet Services. In Proc. of the Fifth Symposium on Operating Systems Design and Implementation (OSDI '02), Pages 225-238, Boston MA, December 2002.
Kai Shen, Tao Yang, and Lingkun Chu. Cluster Load Balancing for Fine-grain Network Services. In Proc. of International Parallel and Distributed Processing Symposium (IPDPS'02), Fort Lauderdale FL, April 2002.
Kai Shen, Tao Yang, Lingkun Chu, JoAnne L. Holliday, Doug Kuschner, and Huican Zhu. Neptune: Scalable Replica Management and Programming Support for Cluster-based Network Services. In Proc. of the 3rd USENIX Symposium on Internet Technologies and Systems (USITS'01), Pages 197-208, San Francisco CA, March 2001.
Huican Zhu, Hong Tang, and Tao Yang. Demand-driven Service Differentiation for Cluster-based Network Servers. In Proc. of IEEE INFOCOM'2001.
Huican Zhu, Ben Smith, and Tao Yang. Scheduling Optimization for Resource-Intensive Web Requests on Server Clusters. In Proc. of the Eleventh Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA'99).

PhD Dissertations

Kai Shen, Clustering, Resource Management, and Replication Support for Scalable Network Services. PhD dissertation, Dept. of Computer Science, UCSB, 2002.
Huican Zhu, Resource Management for Scalable Web Services. PhD dissertation, Dept. of Computer Science, UCSB, 2000.

Slides on the use of Neptune for query processing in search engines

Faculty

Students

Lingkun Chu
Hong Tang
Jingyu Zhou
Aziz Gulbeden

Past Members

Huican Zhu (Google Inc.)
JoAnne L. Holliday (Santa Clara University)
Doug Kuschner

Support

This material is based upon work supported by the National Science Foundation under Grant Nos. EIA-0080134, CCR-9702640, ACIR-0082666, and 0086061. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

In The News:

Other

Our research on web caching and service differentiation.