![]()
Sandstorm Tutorial and User's Guide
Matt Welsh,
mdw@cs.berkeley.edu |
This is an installation and user's guide for Sandstorm, a Java-based platform for highly-concurrent server applications. Sandstorm is being built as part of my Ph.D. thesis work. See the Sandstorm Web pages for background and publications about the system.
Sandstorm's goal is to support server applications which demand massive concurrency (on the order of tens of thousands of simultaneous client connections). It is based on a design called the staged event-driven architecture (SEDA), which decomposes a complex, event-driven application into a set of stages connected by queues. This design avoids the high overhead associated with thread-based concurrency models, and decouples event and thread scheduling from application logic. SEDA enables services to be well-conditioned to load, preventing resources from being overcommitted when demand exceeds service capacity. Decomposing services into a set of stages also enables modularity and code reuse, as well as the development of debugging tools for complex event-driven applications.
Sandstorm is implemented in Java and has been tested primarily on Linux 2.2.x systems running the IBM JDK 1.1.8 and 1.3. Sandstorm should run on any Java platform which supports native threads and the NBIO library (which provides nonblocking I/O primitives). Currently, Linux and Solaris are known to work. Windows NT/2000 support is forthcoming.
Sandstorm requires NBIO to enable nonblocking, event-driven I/O in Java. NBIO requires native threads --- not "green threads" --- because of interactions between nonblocking I/O and the JVM's thread management. When using Sandstorm and NBIO, you should be able to write applications with a very small number of threads, so the scalability limits of some native threads packages (e.g., under Linux) should not be a problem.
If you are a member of the Berkeley Ninja project, NBIO is already available to you as the package mdw.nbio. Otherwise, NBIO is packaged with Sandstorm releases and is found in the nbio subdirectory of the distribution.
The use of nonblocking I/O is vital for obtaining good performance and high concurrency. Sandstorm does not support "threadpool" emulations of nonblocking socket I/O, and I have no plans to do so. I have demonstrated that emulation of nonblocking I/O using threads does not scale beyond a few hundred socket connections. Sandstorm is designed for tens of thousands of connections; if you don't need this degree of concurrency then you probably don't need to be using Sandstorm. If NBIO is not yet supported on your operating system, it is much better to port NBIO rather than build and maintain threadpool emulation. The Sandstorm disk layer currently uses a threadpool model, because nonblocking file I/O is not supported by most operating systems.
Sandstorm is in the package mdw.sandStorm, so the directory containing the mdw directory needs to be on your CLASSPATH. To use the NBIO native libraries you need to place the directory mdw/lib directory on your LD_LIBRARY_PATH. Finally, place mdw/sandStorm/bin on your PATH; this directory contains some scripts which are used for running Sandstorm applications.
To compile NBIO, Sandstorm, and related utility code, simply cd to the toplevel mdw directory and type make. If there are any problems with compilation, be absolutely sure that the directory containing mdw is in fact on your CLASSPATH. This is the most common problem when building the code.
You can now test the installation by changing to the sandStorm/test/basic directory and running:
sandstorm sandstorm.cfgYou should see output resembling the following:
Sandstorm v1.4 <mdw@cs.berkeley.edu> Starting at Fri May 04 11:17:26 PDT 2001 Sandstorm: Starting aSocket layer SelectSet: Using poll(2) SelectSource created, do_balance = true TP <aSocket ReadStage>: Adding 1 threads to pool, count 1 TP <aSocket ReadStage>: Starting 1 threads SelectSource created, do_balance = true TP <aSocket ListenStage>: Adding 1 threads to pool, count 1 TP <aSocket ListenStage>: Starting 1 threads SelectSource created, do_balance = true TP <aSocket WriteStage>: Adding 1 threads to pool, count 1 TP <aSocket WriteStage>: Starting 1 threads Sandstorm: Starting aDisk layer TP <AFileTPTM Stage>: Adding 1 threads to pool, count 1 TP <AFileTPTM Stage>: Starting 1 threads Sandstorm: Loaded TimerStage from TimerHandler Sandstorm: Loaded SinkStage from DevNullHandler Sandstorm: Loaded GenStage1 from GenericHandler Sandstorm: Initializing stages -- Initializing <TimerStage> delay=500 TP <TimerStage>: Adding 1 threads to pool, count 1 TP <TimerStage>: Starting 1 threads -- Initializing <SinkStage> Started TP <SinkStage>: Adding 1 threads to pool, count 1 TP <SinkStage>: Starting 1 threads -- Initializing <GenStage1> sleep=100 TP <GenStage1>: Adding 1 threads to pool, count 1 TP <GenStage1>: Starting 1 threads Sandstorm: Waiting for all components to start... TimerHandler: GOT QEL: mdw.sandStorm.core.BufferElement@807502e GenericHandler: GOT QEL: mdw.sandStorm.core.BufferElement@807502e Sandstorm: Ready. DevNullHandler: GOT QEL: mdw.sandStorm.core.BufferElement@807502e TimerHandler: GOT QEL: mdw.sandStorm.core.BufferElement@807502e GenericHandler: GOT QEL: mdw.sandStorm.core.BufferElement@807502e DevNullHandler: GOT QEL: mdw.sandStorm.core.BufferElement@807502e |
If any exceptions are thrown or other error messages are displayed, then something is wrong; you should try to track these down yourself, or contact me.
The sandstorm script uses the following syntax:
sandstorm [-profile] <configfile> [initargs ...]where configfile is the name of the Sandstorm configuration file (described in detail below). Use of the -profile option turns on the Sandstorm profiler (also described below), and overrides the profile option (if any) in the configuration file. initargs is a list of name=value pairs which overrides any initialization argument settings in the configuration file; see below.
This section discusses the basic programming interfaces used by Sandstorm applications. Consult the Javadoc API documentation for Sandstorm for more details on the specific APIs. (This documentation is generated as part of the build process for Sandstorm; if you have not yet compiled the code, you can get it from the Web at this link.) Most of the application-level APIs are in the package mdw.sandStorm.api.
Stages and Event Handlers
Sandstorm applications are constructed as a set of stages connected by event queues. Threads are used to drive stage execution. However, the programming model hides the details of stages, queues, and threads from the application. The only component that the application implements is a set of event handlers. An event handler implements the core event-handling logic of a stage. Therefore an event handler is the application-level code which is "embedded" in a stage.
An event handler (represented by the interface EventHandlerIF) implements the following set of methods:
public void handleEvent(QueueElementIF elem) throws EventHandlerException; public void handleEvents(QueueElementIF elemarr[]) throws EventHandlerException; public void init(ConfigDataIF config) throws Exception; public void destroy() throws Exception;
A QueueElementIF is the basic representation of an event; it is an empty interface which the application provides implementations of to represent different events in the system.
handleEvent and handleEvents are the basic event-handling code. These two methods are invoked by the runtime when events are pending for a stage. init and destroy are used for initialization and cleanup, respectively.
Note that event handlers do not directly manage threads or consume events from their incoming event queue. This is a deliberate design decision, which decouples the details of thread and queue management from the application logic. The runtime is responsible for allocating and scheduling threads, performing queue operations, and so forth.
Event queues, SinkIF, and StageIF
Each stage has one or more associated event queues, but only one event handler which processes events from those queues. As stated above, an event handler does not directly dequeue events from its incoming event queues; rather, the runtime system manages this and invokes the handler's handleEvent or handleEvents method with the pending events.
An event queue is represented by two ends -- a "source" and a "sink". The source allows dequeue operations and is only visible to the Sandstorm internals. The sink allows enqueue operations, and is visible to event handlers which wish to deliver events to a given stage. The event sink is represented by the class SinkIF, which supports various enqueue operations: enqueue, enqueue_many, and enqueue_lossy The first two methods allow one or more events to be pushed onto the queue. These methods may throw an exception if the sink is full, or if it is no longer being serviced (i.e. the sink is "closed"). enqueue_lossy allows events to be enqueued, but does not throw an exception; rather, the events are dropped if the sink is full or closed. SinkIF also supports operations which return the number of pending events on the queue, as well as a "split-phase" enqueue operation. See the Javadocs for details.
An event handler communicates with another stage by obtaining a handle to that stage's StageIF. This is accomplished through the ManagerIF interface, which is passed in as part of the ConfigDataIF to the event handler's init method. The StageIF allows an event handler to get a handle to that stage's SinkIF. Note that a stage may have more than one SinkIF associated with it; this is to support differentiated event queues of various kinds. For the most part applications make use of the "main" SinkIF associated with a stage.
Event Handler Example
Here is the code for a very simple event handler which prints a message for every received event, and passes the event along to another stage:
import mdw.sandStorm.api.*; public class printEventHandler implements EventHandlerIF { private SinkIF nextStageSink; public void init(ConfigDataIF config) throws Exception { // Get system manager ManagerIF mgr = config.getManager(); // Get stage with name "nextStage" StageIF nextStage = mgr.getStage("nextStage"); // Get nextStage's event sink nextStageSink = nextStage.getSink(); System.err.println("printEventHandler initialized"); } public void destroy() { System.err.println("printEventHandler shutting down"); } public void handleEvent(QueueElementIF elem) throws EventHandlerException { System.err.println("printEventHandler: Got event "+elem); // Pass the event along nextStageSink.enqueue(elem); } public void handleEvents(QueueElementIF elemarr[]) throws EventHandlerException { // Just call handleEvent in FIFO order for (int i = 0; i < elemarr.length; i++) { handleEvent(elemarr[i]); } } } |
The package mdw.sandStorm.lib.aSocket provides an asynchronous socket communication facility for Sandstorm applications. This is very similar to the original ninja2.core.io_core.aSocket API, if you are familiar with that. You should use the Sandstorm version of this library, not the original "io_core" version (the former is integrated into Sandstorm's runtime system and has better performance). This library currently supports TCP (stream-based) sockets only; support for UDP is forthcoming.
An application creates an outgoing socket connection by creating an ATcpClientSocket, providing the hostname, port number, and SinkIF for the stage requesting the connection. When the connection is established a ATcpConnection object is pushed to the given SinkIF. To send new packets over the connection, the application enqueues a BufferElement to the ATcpConnection object. (A BufferElement is just a wrapper for a byte array.) When new packets are received, an ATcpInPacket object is pushed to the application; this contains a BufferElement with the contents of the packet, as well as a pointer to the ATcpConnection from which the packet was received.
Server sockets are created by instantiating an ATcpServerSocket, which pushes incoming connections (as ATcpConnections objects) to the application.
The package mdw.sandStorm.lib.aDisk provides an asynchronous file library. The primary way to access this mechanism is to create an object of type AFile; you can enqueue I/O requests to the file through its read, write, and seek methods. When the I/O operations complete, AFileIOCompleted events are pushed to the client's SinkIF.
The current implementation makes use of blocking file I/O and a small thread pool; however, we are currently working on a true asynchronous implementation using the POSIX.4 AIO mechanism.
See the Javadoc documentation for this library for more detail.
An application can request that events be delivered to a SinkIF at some time in the future using the mdw.sandStorm.core.ssTimer library. This class supports registerEvent and cancelEvent calls and is fairly self-explanatory.
Other useful libraries are provided in the sandStorm/lib directory. The mdw.sandStorm.lib.http package provides an asynchronous HTTP protocol implementation based on aSocket. Likewise, mdw.sandStorm.lib.Gnutella provides a Gnutella protocol implementation. In the test directory within these two packages you will find simple HTTP and Gnutella servers implemented using these libraries; the Javadoc documentation also makes it clear how to use them.
The Sandstorm configuration file (usually called sandstorm.cfg) uses an XML-like format, consisting of nested sections delimited by tags of the form
<sectionName> ... </sectionName>The exact contents of this file are bound to change in future Sandstorm releases, but the basic format should remain the same.
The file test/basic/sandstorm.cfg provides an example Sandstorm configuration file. Note that this file contains the complete set of options in order to document their use, but in general a config file should only contain those options which you need to specify. A simple configuration file is shown below:
# Example Sandstorm configuration file <sandstorm> <stages> <firstStage> class mdw.test.firstStageHandler <initargs> somearg1 somevalue1 somearg2 somevalue2 </initargs> </firstStage> <secondStage> class mdw.test.secondStageHandler </secondStage> </stages> <!include options.cfg> </sandstorm> |
Comments are started by # characters and extend to the end of the line. Files may be included (nested includes are supported) using the <!include filename> directive.
The main section of the file must be named <sandstorm>, and consists of a <stages> which contains one subsection per stage. There is also a <global> section used for defining global options; see below. The contents of each section consist of key-value pairs of the format
key1 value1 key2 value2 ... keyN valueNwhere key is a single (whitespace-delimted) word and value is an arbitrary string which may contain whitespace. The value field extends to the end of the line.
<stage> sections
Each subsection of the <stage> section is formatted as follows:
The initialization arguments for each stage are derived from three different sources. In order of precedence, they are:
<global> section
The config file may also contain a <global> section which defines global options. This section is optional and in general should not be used by applications; the default values should be acceptable. If you wish to use it, however, here is the format:
The programmatic interface for configuring the Sandstorm system is given by the class mdw.sandStorm.main.SandstormConfig. This class allows the system to manipulate arbitrary key/value pairs (where both key and value are specified as strings, but may be converted to and from booleans, ints, and doubles as well). See the source code for this class to understand how configuration options are specifed and used by the runtime.
Command-line arguments
Any setting in the configuration file can be overridden using command-line arguments when you run sandstorm. These are of the form name=value and are specified on the command line after the configuration file name. For example,
sandstorm myconfiguration.cfg global.threadPool.minThreads=2sets the global.threadPool.minThreads parameter to 2, regardless of its setting in the configuration file.
If you specify a command-line argument name that does not contain a dot (.), it will be passed as an initialization argument to every stage, as if it were specified in the <initargs> subsection of every <stage> section. For example,
sandstorm myconfiguration.cfg foo=barsets the initialization parameter named foo to the value bar for every stage; stages can retrieve this value by calling ConfigDataIF.getString().
The Sandstorm profiler is extremely valuable for understanding the performance of applications, and for identifiying bottlenecks. When enabled (by setting profile true in the <global> section of the configuration file, or by using the -profile option on the sandstorm command line), the profiler samples queue lengths and the Java heap size every 100 ms and appends a record to the file sandstorm-profile.txt in the current directory. (These values can be changed in the configuration file.)
If you have Gnuplot installed, you can visualize the Sandstorm profiler output by running the script ssprofile-graph in the Sandstorm bin directory; this script takes a single argument which is the name of the profile logfile. ssprofile-graph draws a graph of the queue lengths and heap size over time. You can edit this script to customize the output if you like. Here is an example of the display produced by ssprofile-graph:
At runtime you can get a handle to the system profiler using the ManagerIF.getProfiler() method. ProfilerIF defines the profiler API. The add() method allows you to add an object to the profiler's queue-length trace; anything which implements ProfilableIF can be profiled in this way.,
The addGraphEdge() and dumpGraph() methods allow the profiler to generate a graph depicting the connectivity between stages. Calling addGraphEdge() adds a pair of nodes to the graph with a directed edge between them. dumpGraph dumps the graph to an output file which you can visualize using the graphiviz tool from AT&T Research. Here is an example graph generated using this approach:
This graph has been beautified somewhat by hand, but you get the idea. Soon I will be packaging up the tools to make this process somewhat more automatic.
Sandstorm is designed to be embedded into other applications. In this way you can create a Sandstorm instance within some other application and control it programmatically, creating and destroying stages, and so forth.
To do this, simply create an object of type mdw.sandStorm.main.Sandstorm. In general you should only create one Sandstorm per Java Virtual Machine, since multiple instances may interfere with one another in terms of sockets, thread scheduling, and so forth.
There are three constructors for the Sandstorm class:
The Sandstorm class exports two other methods which you can use to control the Sandstorm instance from your application:
A great deal of programming in Sandstorm is a matter of style. I have tried to design the APIs to encourage a particular way of doing things, and to discourage "abuses" of Java functionality which can lead to poor performance or badly-structured applications. Still, I think it's a good idea to provide a list of "do's and don'ts" about programming in Sandstorm.
Programming in Sandstorm requires a fundamental shift in thought -- you are no longer using a single thread to process a single request through the system; rather, you are breaking the request into multiple stages through which events flow. Anything you can do with threads you can do with this model, although it may seem a bit strange at first.
The problem with applications managing their own threads is that they will inevitably interfere with the thread allocation and scheduling policies performed by the Sandstorm runtime system. If you are unhappy with the performance or behavior of Sandstorm's two standard thread managers, then consider implementing your own. The appropriate interface is mdw.sandStorm.api.ThreadManagerIF and there are two implementations in the internal directory which you can use as a guide. In general I do not recommend taking this approach, but it is an option.
I strongly recommend that stages pass data by value rather than by reference. It is fine for an event passed between stages to contain a pointer to an object, as long as the pointer is not used concurrently by more than one stage. If you can adopt a "pass by value" programming style then you will avoid these problems.
Sandstorm provides nonblocking, asynchronous network I/O in the form of the aSocket library. For TCP sockets you should never have to resort to using the original blocking Socket and ServerSocket APIs from the JDK. Soon I will add support for UDP sockets. Asynchronous disk access is also forthcoming; in the meantime you may have to resort to using blocking file access, or use the (possibly outdated) ninja2.core.io_core.fs_disk API, which provides a threadpool emulation of nonblocking file I/O. I have not experimented with this latter option so your mileage may vary.
Note that the thread pool controller automatically allocate new threads (up to some limit) to a stage which appears to be saturated. The idea here is that if a stage is sleeping for some reason then the system automatically doles out new threads to it. This can be used to avoid bottlenecks due to sleeping, but only up to a certain point (that is, only until we allocate so many threads that performance starts to suffer). This feature is still experimental and not meant for production use; it is much better to avoid blocking than to rely on this mechanism.
Many of the classes making up the Sandstorm implementation support a simple debugging feature: at the top of the class there is a definition of the form
private static final boolean DEBUG = false;By changing the value of DEBUG to true, and recompiling the corresponding class, that class will emit a lot of debugging information to stderr at runtime. This is useful if you suspect a bug inside of Sandstorm somewhere (although I am sure it is 100% bug-free :-)). The same can be done for the NBIO library. Be sure to use the Sandstorm profiler as well, since this can be used to find a number bugs including performance bottlenecks and memory leaks.
If you do have a problem with Sandstorm please don't hesitate to get in touch with me. It would be helpful if you can isolate the problem you are seeing to a simple reproducable case.