This is the mail archive of the java-discuss@sourceware.cygnus.com mailing list for the GCJ project. See the GCJ home page for more information.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: Experiences using GCJ as an embedded compiler


     Instead, I propose that GCJ layout vtables for each interface and, at
     compile time, determine the interface and vtable index being invoked.
     Thus, the above run-time interface would change to:

As Tom mentioned, I do have a design for interface calls that would be
constant-time, and does not require any searcing.  This means means the
lookup could be inlined (unless optimizing for space).  It does
require a little calculation at class load/link time.  If you feel
inspired to implement it, that would be great.  As Tom also mentioned,
we do need copyright assignments before we can accept any non-trivial
code.

The gcj routine libraries will also need to be changed;  we are
working on getting the Net release ready, and it should happen very soon.

This is the first public discussion of the idea, but I wrote it down
about a year ago.

This is a design for how to do class and interface membership tests
(i.e. dynamic casts and the instanceof operator), as well as
virtual and interface method calls in Java, all in contant time,
and a single vtable/class pointer per object, and a reasonable
amount to per-class space.

Java only has single inheritance, but has a restricted form of
multiple inheritance using interfaces.  An interface is basically
an abstract class with only virtual methods, no instance fields,
and is only used like a C++ virtual base class.  These restrictions
cover many idioms of multiple inheritance, while being much easier
to implement.  I haven't thought much whether it might make sense
to use a similar implementation scheme for C++ field-less virtual
abstract base classes.  The big advantage compared to the standard
C++ techniques is that you don't need any per-object vbase pointers,
or extra per-object vtables, and hence no object offsets.

I first explain how to do it for just class membership testing and
simple virtual function calls.  This is well-known in the literature,
but helps set the stage.  The real discussion is implementing
membership tests for interfaces, and interface methods calls.

/*** CLASS MEMBERHSIP TESTING ***/

/* The depth of a class is how many "extends" it is removed from Object.
   Thus the depth of java.lang.Object is 0, but the depth of
   java.io.FilterOutputStream is 2.
   Depth is defined for regular classes and array classes,
   but not primitive types or interfaces. */
#define CLASS_DEPTH(CL) ((CL)->ui.cls.depth)

// We can do class member testing in constant time adding in each class
// using a small extra table of all the ancestor classes.
// (This trick is relatively old.)
// Note also we do not need to include the java.lang.Object in the table
// of ancestors, since it is redundant.
// If we order this table from current classes down by decreasing depth,
// we can combine the ancestors table with the pointer to the actual classs.

typedef void *methodptr;

struct _dispatchTable {
  Class ancestors[CLASS_DEPTH(THIS_CLASS)-1];
  methodptr method[...];
};

#define OBJECT_CLASS(OBJ) ((OBJ)->dtable->ancestors[0])

#define OBJECT_INSTANCEOF_CLASS(OBJ, CLS) \
  CLASS_SUBCLASS_OF(OBJECT_CLASS(OBJ), CLS)
#define CLASS_SUBCLASS_OF(OCLS, CLS) \
  ((CLS) == &ObjectClass \
   || ((CLASS_DEPTH(OCLS) - CLASS_DEPTH(CLS) >= 0 \
       && (OCLS)->dtable->ancestors[CLASS_DEPTH(OCLS) - CLASS_DEPTH(CLS)] \
           == (CLS))))

// Notice that the two instances CLASS_DEPTH(OCLS) - CLASS_DEPTH(CLS) can be
// combined to a single subtraction.
// Note that CLS is normally constant, such the first test against ObjectClass
// can normally be eliminated at compile-time.

If we assume that CLASS_DEPTH is seldom more than 4, we can make an
optimization, at the cost of a few extra words per class:

struct _dispatchTable {
  Class clas;
  // If the CLASS_DEPTH is less than 4, allocate 4 words anyway.
  // The unused ones are NULLs.
  Class ancestors[MIN(CLASS_DEPTH(THIS_CLASS)-1, 4)];
  methodptr method[...];
};

#define OBJECT_CLASS(OBJ) ((OBJ)->dtable->clas)
#define CLASS_SUBCLASS_OF_FAST(OCLS, CLS) \
  (OCLS)->dtable->ancestors[CLASS_DEPTH(CLS)-1] == (CLS))))
#define CLASS_SUBCLASS_OF(OCLS, CLS) \
  ((CLS) == &ObjectClass \
   || (CLASS_DEPTH(CLS) <= 4 ?  CLASS_SUBCLASS_OF_FAST(OCLS, CLS) \
     : (CLASS_DEPTH(OCLS) >= CLASS_DEPTH(CLS) \
        && (OCLS)->dtable->ancestors[CLASS_DEPTH(CLS)-1] == (CLS))))

// This wins, because we normally know (at compile-time) the identify
// of CLS and its CLASS_DEPTH.  Hence CLASS_SUBCLASS_OF reduces (at
// compile-time) to  CLASS_SUBCLASS_OF_FAST.

// Reference:  N. Wirth:  Reply to "type-extension type tests can be
// performed in constant time".  ACM TOPLAS 13(4):630, 1991.

/*** INTERFACES TESTS AND CALLS ***/

// Doing (x instanceof I) and calls like (((I)x).method(...)) are more
// difficult.  This is because interfaces may be "multiple inherited".
// Looked an tests can be done in constant time with single inheritance,
// because we can associate a unique depth to each class and a unique
// method index to each virtual index, and in a way that the tables are
// compact.  This is more difficult with multiple inheritance.

// Here I propose a design which I think is fairly efficient
// in both time (constant-time) and space.

// We associate an index with each method declared in an interface.
// For each interface, we ignore any inherited interfaces, sort the methods
// by some well-defined citerium (which could be declaration order),
// and assign indexes starting with 1 (zero is used for type tests).
// We call this index the Interface Methods Index of the interface method.

// Each superinterface of a class (i.e. each interface that the class
// directly or indirectly implements) has a corresponding "Partial
// Interface Dispatch Table" whose size is (number of methods + 1) words.
// The first word is a pointer to the interface (i.e. the java.lang.Class
// instance for that interface).  The remaining words are pointers to the
// actual methods that implement the methods declared in the interface,
// in the order specified by the Interface Method Index of the previous
// peragraph.

// For each non-abstract class, we create a list of its superinterfaces
// These are sorted by some criterium (alphabetical seems as good as any).
// We then create the "Interface Dispatch Table" by concatenating all the
// Partial Interface Dispatch Tables for each superinterface, in the
// order given by the sort.

// The problem now is to find the correct offset in the Interface
// Dispatch Table so we select the correct Partial Interface Dispatch Table
// that we are intersted in.  Once we have that, invoking an interface
// method just requires indexing with the Interface Method Index (known at
// compile time) to get the correct method.  Doing a type test (cast or
// instanceof) is the same problem: Once we have a possible Partial
// Interface Dispatch Table, we just compare the first element to see if
// it matches the desired interface.

// So how can we calculate the correct offset?  Unfortunateky, we can't
// do it at compile time.  It depends on both the interface, and the
// actual class of the receiver.  Our solution is to keep a vector of
// candiate offsets in each interface (the ioffsets table), and
// in each class we have an index (the iindex) used to select the
// correct offset from ioffsets.

struct Class {
  ...;
  union {
    struct {
      /* Index into interface's imap. */
      short iindex;
      short depth;
      /* pointer to vector of pointer to dispatch tables. */
      methodptr **itable;
    } cls;
    struct {
      /* Index into actual class's itable. */
      short *imap;
    } iface;
  } ui;

};

#define CLASS_IINDEX(CL) ((CL)->ui.cls.iindex)
#define CLASS_ITABLE(CL) ((CL)->ui.cls.itable)
#define CLASS_IMAP(CL) ((CL)->ui.iface.imap)

#define LOOKUP_INTERFACE(CL, IFACE, IMETODINDEX) \
  (CLASS_ITABLE(CL)[CLASS_IMAP(IFACE)[CLASS_IINDEX(cl)]][IMETHODINDEX])

struct Class {
  ...;
  union {
    struct {
      /* Index into interface's imap. */
      short iindex;
      short itable_length;
      /* Pointer to Interface Dispatch Table. */
      methodptr *itable;
    } cls;
    struct {
      /* Offsets into actual class's itable. */
      /* First element is the length. */
      /* Negative values are unused. */
      short *ioffsets;
    } iface;
  } ui;

};

#define CLASS_IINDEX(CL) ((CL)->ui.cls.iindex)
#define CLASS_ITABLE(CL) ((CL)->ui.cls.itable)
#define CLASS_ITABLE_LENGTH(CL) ((CL)->ui.cls.itable_length)
#define CLASS_IOFFSETS(CL) ((CL)->ui.iface.ioffsets)
#define CLASS_IOFFSETS_LENGTH(CL) ((CL)->ui.iface.ioffsets[0])

The CLASS_ITABLE can be computed at compile-time, and statically
allocated.  The CLASS_IINDEX of a class, and the CLASS_IOFFSETS
of an interface, have to be computed at link time.  In the case
of dynamic loading, this means run-time.  As more classes are
loaded, CLASS_IINDEX does not change, but the CLASS_IOFFSETS
vector of an interface may need to get extended if new classes
are loaded.  The function find_iiface does this calculation.

#define LOOKUP_INTERFACE(CL, IFACE, IMETODINDEX) \
  (CLASS_ITABLE(CL)[CLASS_IOFFSETS(IFACE)[CLASS_IINDEX(CL)] + IMETHODINDEX])

static short null_offset[2] = { 1, 0 };

/* Calculate and return the CLASS_IINDEX for a new class.
 * IFACES is a vector of NUM interfaces that the class implements.
 * OFFSETS[J] (for 0<=J<NUM) is the offset in the Interface Dispatch Table
 * for the Partial Interface Dispatch table corresponding to IFACE[J].
 * May extend the CLASS_IOFFSETS of the IFACES if need be.
 */

int
find_iindex(Class* ifaces, short* offsets, int num)
{
  int i;
  /* Look for a CLASS_IINDEX (in variable i) that does not conflict
   * with the ones already in use. */
  for (i = 1;  i++; )
    {
      for (j = 0;  ;  j++)
	{
	  if (j >= num)
	    goto found;
	  if (i >= CLASS_IOFFSETS_LENGTH(ifaces[j]))
	    continue;
	  int ioffset = CLASS_IOFFSETS(ifaces[j])[i];
	  if (ioffset >= 0 && ioffset != offsets[j])
            break;  /* Nope - try next i. */
	}
    }
 found:
  for (j = 0;  ;  j++)
    {
      int len = CLASS_IOFFSETS_LENGTH(ifaces[j]);
      if (i >= len)
	{
	  /* FIXME - check off-by-one erors.
	     Should len field be counted in len? */
	  int newlen = 2 * len;
	  if (i >= newlen)
	    newlen = i + 3;
	  short *old_ioffsets = CLASS_IOFFSETS(ifaces[j]);
	  short *new_ioffsets;
	  if (old_ioffsets == null_offset)
	    {
	      new_ioffsets = [gc]malloc(newlen * sizeof(short));
	      new_ioffsets[0] = 1;
	      new_ioffsets[1] = 0;
	    }
	  else
	    {
	      new_ioffsets = [gc]realloc(old_ioffsets, newlen*sizeof(short));
	    }
	  new_ioffsets[0] = newlen;
	  while (len < newlen)
	    new_ioffsets[len++] = -1;
	  CLASS_IOFFSETS(ifaces[j]) = new_ioffsets;
	}
      CLASS_IOFFSETS(ifaces[j])[i] = offsets[j];
    }
  return i;
}

Checking that an object is an instance of a class that implements a given
interface (i.e. OBJ instanceof IFACE) uses the same data structures,
and is also constant time:

#define OBJECT_INSTANCEOF_INTERFACE(OBJ, IFACE) \
  CLASS_INSTANCEOF_INTERFACE(OBJECT_CLASS(OBJ), IFACE)

#define CLASS_INSTANCEOF_INTERFACE(CL, IFACE) \
  (CLASS_IINDEX(CL) <= CLASS_IOFFSETS_LENGTH(IFACE) \
   && ({unsigned short __offset = CLASS_IOFFSETS(IFACE)[CLASS_IINDEX(CL)]; \
        __offset < CLASS_ITABLE_LENGTH(CL) \
        && CLASS_ITABLE(CL)[__offset]] == IFACE}))

This requries three tests.  The first two are just bounds tests.
We can remove the first test if we ensure that there is no CLASS_IINDEX
longer than any CLASS_IOFFSETS_LENGTH.  That imples that all the ioffsets
table have to have the same length:  That of the largest CLASS_IINDEX.
Unused elements of ioffsets could be marked with -1 - that would be safe
as long as the comparsion against CLASS_ITABLE_LENGTH(CL) is unsigned.

Using that data from experiements with classes.zip, that would require a
maximum of 14 bytes for 105 interfaces that do not share null_offset.
In other words:  fairly minor.  More space can be saved by sharing ioffsets
vectors that are identical; by a more clever allocation of iindex values;
and by a more clever re-arrangement of the Partial Interface Dispatch Tables
with the Interface Dispatch Tables.

One complication is that if a new classes is added that requires longer
ioffsets vectors, then the ioffsets of all interfaces have to be relocated.
But this should happen infrequently, and only when a new class is loaded.

CLASS_INSTANCEOF_INTERFACE is may be a trifle long to be inlined.

// Related work:

// Vitek, Horspool and Krall: ""Efficient Type Inclusion Tests".
// OOPSLA 97.  ACM SIGPLAN Notices 32(10:142.  1997.

// Some of the same people are also behind Cacao, a fast JIT for Alpha.
// See http://www.complang.tuwien.ac.at/java/cacao/.
// Cacao uses "coloring":  Each interface or interface is assign a unique
// global index, rather like perfect hashing.  This makes dispatch fast,
// but the per-class lookup tables are not dense.  (In my proposed design,
// the per-class lookup tables are dense, but the per-interface
// ioffset vectors may be sparse.  However, the ioffsets vectors are
// still small and there are fewer interfaces than classes (that implement
// interfaces).  How much space is saved, and if it is worth it,
// is open.)

// It seem slikely that there will be more interfaces and more classes
// implemeting interfaces in JDK 1.1, and code based on it.  For example,
// JDK 1.2 has new collector interfaces (such as List) that may become
// popular.

// Some relevant papers (including overview) are available from University
// of Geneva's Objects Systems Group
// http://cuiwww.unige.ch/OSG/publications/index.html

// Numbers:
JDK 1.1.4 classes.zip
153 interfaces
1458 classes total
1145 classes implement 0 interfaces
 153 classes implement 1 interface (directly or indirectly)
 105 classes implement 2 interfaces (directly or indirectly)
  39 classes implement 3 interfaces (directly or indirectly)
  10 classes implement 4 interfaces (directly or indirectly)
   5 classes implement 5 interfaces (directly or indirectly)
   1 class (java.awt.AWTEventMulticaster) implements 12 interfaces
160 classes implement more than 1 interface
A very simple-minded assignment of ioffsets yields a maximum length of
7 2-byte shorts.  (We could use 1-byte elements of each ioffset table
to save some space, but that might prevent classes that implement many
interfaces with many methods.  There are none such in classes.zip.)

	--Per Bothner
Cygnus Solutions     bothner@cygnus.com     http://www.cygnus.com/~bothner