Titanium: A High-Performance Parallel Java Dialect

Dan Bonachea, Kaushik Datta, Ed Givelberg¹, Sabrina Merchant, Geoff Pike², and Jimmy Su
(Professors Susan L. Graham, Paul N. Hilfinger, and Katherine A. Yelick)
(ASCI-3 LLNL) W-7405-ENG-48 and (NSF) PACI ACI-9619020

Titanium is an explicitly parallel dialect of Java developed at UC Berkeley [1] to support high-performance scientific computing on large-scale multiprocessors, including massively parallel supercomputers and distributed-memory clusters with one or more processors per node. Other language goals include safety, portability, and support for building complex data structures.

The main additions [2] to Java are:

Titanium provides a global memory space abstraction (similar to other languages such as Split-C and UPC) whereby all data has a user-controllable processor affinity, but parallel processes may directly reference each other's memory to read and write values or arrange for bulk data transfers. A specific portability result is that Titanium programs can run unmodified on uniprocessors, shared memory machines, and distributed memory machines. Performance tuning may be necessary to arrange an application's data structures for distributed memory, but the functional portability allows for development on shared memory machines and uniprocessors.
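The affinity idea above can be pictured with a small plain-Java simulation. This is only a sketch, not Titanium syntax and not the Titanium runtime: the class name `GlobalArraySketch`, the block distribution, and the remote-access counter are all assumptions introduced for illustration. Each simulated process owns one block of a logically global array, any process may read or write any element, and accesses that cross an ownership boundary are counted so the cost of poor data placement becomes visible.

```java
import java.util.Arrays;

/** Plain-Java simulation of a global address space (hypothetical; not Titanium code). */
class GlobalArraySketch {
    private final double[][] blocks; // blocks[p] is the memory with affinity to process p
    private final int blockSize;
    private int remoteAccesses = 0;  // operations that touched another process's block

    GlobalArraySketch(int numProcs, int blockSize) {
        this.blocks = new double[numProcs][blockSize];
        this.blockSize = blockSize;
    }

    /** Process that owns a given global index under a simple block distribution. */
    int ownerOf(int globalIndex) {
        return globalIndex / blockSize;
    }

    /** Read one element on behalf of callerProc; remote reads are counted. */
    double read(int callerProc, int globalIndex) {
        int owner = ownerOf(globalIndex);
        if (owner != callerProc) remoteAccesses++;
        return blocks[owner][globalIndex % blockSize];
    }

    /** Write one element on behalf of callerProc; remote writes are counted. */
    void write(int callerProc, int globalIndex, double value) {
        int owner = ownerOf(globalIndex);
        if (owner != callerProc) remoteAccesses++;
        blocks[owner][globalIndex % blockSize] = value;
    }

    /** Bulk transfer: copy a whole block in one operation instead of element by element. */
    double[] bulkGet(int callerProc, int ownerProc) {
        if (ownerProc != callerProc) remoteAccesses++;
        return Arrays.copyOf(blocks[ownerProc], blockSize);
    }

    int remoteAccesses() {
        return remoteAccesses;
    }

    public static void main(String[] args) {
        GlobalArraySketch g = new GlobalArraySketch(4, 8);
        g.write(0, 20, 2.5);             // process 0 writes into process 2's block
        double[] copy = g.bulkGet(0, 2); // one bulk transfer instead of eight reads
        System.out.println(copy[4] + " after " + g.remoteAccesses() + " remote operations");
    }
}
```

The same calls work whether an element is local or remote, which mirrors the functional portability described above; only the remote-access count changes with the data layout.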

Titanium is a superset of Java and inherits all the expressiveness, usability, and safety properties of that language. Titanium augments Java's safety features by providing checked synchronization that prevents certain classes of synchronization bugs. To support complex data structures, Titanium uses the object-oriented class mechanism of Java along with the global address space to allow for large shared structures. Titanium's multidimensional array facility adds support for high-performance hierarchical and adaptive grid-based computations.
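To give a feel for the kind of index-space operation that grid-based codes rely on, here is a minimal plain-Java sketch. It does not use Titanium's array facility or its syntax: the names `GridSketch`, `Rect`, `Grid`, `grow`, and `copyFrom` are hypothetical, chosen only for this illustration. The sketch stores a 2D grid over an arbitrary rectangular index range and copies data between two grids over the intersection of their ranges, which is the basic step in filling ghost cells between neighboring patches.

```java
/** Plain-Java illustration of rectangular index domains (hypothetical; not Titanium code). */
class GridSketch {
    /** Inclusive index range [loX..hiX] x [loY..hiY]. */
    static final class Rect {
        final int loX, loY, hiX, hiY;

        Rect(int loX, int loY, int hiX, int hiY) {
            this.loX = loX; this.loY = loY; this.hiX = hiX; this.hiY = hiY;
        }

        boolean isEmpty() { return hiX < loX || hiY < loY; }

        int size() { return isEmpty() ? 0 : (hiX - loX + 1) * (hiY - loY + 1); }

        /** Largest rectangle contained in both this and other (possibly empty). */
        Rect intersect(Rect other) {
            return new Rect(Math.max(loX, other.loX), Math.max(loY, other.loY),
                            Math.min(hiX, other.hiX), Math.min(hiY, other.hiY));
        }

        /** Same rectangle extended by g cells on every side (ghost region). */
        Rect grow(int g) { return new Rect(loX - g, loY - g, hiX + g, hiY + g); }
    }

    /** Dense grid addressed by the coordinates of its domain, not by 0-based offsets. */
    static final class Grid {
        final Rect domain;
        private final double[] data;

        Grid(Rect domain) {
            this.domain = domain;
            this.data = new double[domain.size()];
        }

        private int offset(int x, int y) {
            int width = domain.hiY - domain.loY + 1;
            return (x - domain.loX) * width + (y - domain.loY);
        }

        double get(int x, int y) { return data[offset(x, y)]; }

        void set(int x, int y, double v) { data[offset(x, y)] = v; }

        /** Copy src into this grid wherever the two domains overlap; returns cells copied. */
        int copyFrom(Grid src) {
            Rect r = domain.intersect(src.domain);
            if (r.isEmpty()) return 0;
            for (int x = r.loX; x <= r.hiX; x++)
                for (int y = r.loY; y <= r.hiY; y++)
                    set(x, y, src.get(x, y));
            return r.size();
        }
    }

    public static void main(String[] args) {
        Grid left = new Grid(new Rect(0, 0, 3, 3).grow(1)); // interior plus one ghost layer
        Grid right = new Grid(new Rect(4, 0, 7, 3));        // neighboring patch
        right.set(4, 2, 9.0);
        System.out.println("ghost cells filled: " + left.copyFrom(right));
    }
}
```

Because both grids are addressed in the same global coordinates, the exchange needs no index translation; the intersection alone determines which cells move.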

Our compiler research focuses on the design of program analysis techniques and optimizing transformations for Titanium programs, and on developing a compiler and run-time system that exploit these techniques. Because Titanium is an explicitly parallel language, new analyses are needed even for standard code motion transformations. The compiler analyzes both synchronization constructs and shared variable accesses. Transformations include cache optimizations, overlapping communication, identifying references to objects on the local processor, and replacing runtime memory management overhead with static checking. Our current implementation translates Titanium programs entirely into C; a C compiler then builds native binaries, which are linked against the Titanium runtime libraries (there is no JVM).
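As a rough, hand-written analogue of the communication-overlap transformation mentioned above, the following plain-Java sketch contrasts a blocking remote read with one that is started early and awaited late. It is not what the Titanium compiler emits (the compiler works on Titanium source and generates C); `OverlapSketch`, `fetchRemote`, and the use of `CompletableFuture` as a stand-in for a split-phase remote read are assumptions made purely for illustration.

```java
import java.util.concurrent.CompletableFuture;

/** Plain-Java analogue of overlapping communication with local work (hypothetical). */
class OverlapSketch {
    /** Hypothetical stand-in for a bulk read from another process's memory. */
    static double[] fetchRemote(int n) {
        double[] buf = new double[n];
        for (int i = 0; i < n; i++) buf[i] = i;
        return buf;
    }

    static double sum(double[] a) {
        double s = 0.0;
        for (double v : a) s += v;
        return s;
    }

    /** Blocking form: the local work cannot start until the transfer has finished. */
    static double blocking(double[] local, int n) {
        double[] remote = fetchRemote(n);
        return sum(local) + sum(remote);
    }

    /** Overlapped form: start the transfer, do independent local work, then wait. */
    static double overlapped(double[] local, int n) {
        CompletableFuture<double[]> pending =
                CompletableFuture.supplyAsync(() -> fetchRemote(n));
        double localSum = sum(local);          // runs while the transfer is in flight
        return localSum + sum(pending.join()); // block only when the data is needed
    }

    public static void main(String[] args) {
        double[] local = {1.0, 2.0, 3.0};
        System.out.println(blocking(local, 4) == overlapped(local, 4));
    }
}
```

The reordering is only legal because the local sum does not depend on the fetched data; establishing that kind of independence in an explicitly parallel program is what the synchronization and shared-variable analyses are for.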

The current implementation runs on a wide range of platforms, including uniprocessors, shared memory multiprocessors, distributed-memory clusters of uniprocessors or SMPs (CLUMPs), and a number of specific supercomputer architectures (Cray T3E, IBM SP, SGI Origin 2000). The distributed memory back-ends can communicate over a wide variety of high-performance network layers, including Active Messages, MPI, IBM LAPI, shmem, and UDP.

Titanium is especially well suited to writing grid-based parallel scientific applications, and several major applications of this kind have been written and remain under active development ([3, 4], among many others).

[1] Titanium Project home page: http://titanium.cs.berkeley.edu.
[2] P. Hilfinger et al., Titanium Language Reference Manual, UC Berkeley Computer Science Division, Report No. UCB/CSD 01/1163, November 2001.
[3] G. Balls and P. Colella, A Finite Difference Domain Decomposition Method Using Local Corrections for the Solution of Poisson's Equation, Lawrence Berkeley National Laboratory Report No. LBNL-45035, 2001.
[4] G. Pike, L. Semenzato, P. Colella, and P. Hilfinger, "Parallel 3D Adaptive Mesh Refinement in Titanium," Proc. SIAM Conf. Parallel Processing for Scientific Computing, San Antonio, TX, March 1999.
¹ Postdoctoral Researcher
² Postdoctoral Researcher

More information: http://titanium.cs.berkeley.edu/, or send mail to the author (bonachea@eecs.berkeley.edu).