An Integrated Compiler/Run-Time System for Global Data Distribution

Students

Greg Howard (now at SGI)

Don Morris (now at HP)

Other affiliated people

Frank Lowenthal (CSU Hayward; no relation).

OK, I guess I'd better snap out of my teenage denial; he's my dad.

Papers

David K. Lowenthal, Vincent W. Freeh, and David W. Miller. Efficient Support for Two-Dimensional Data Distributions in Distributed Shared Memory Systems. International Parallel and Distributed Processing Symposium, April 2002. PS, PDF

Donald G. Morris III and David K. Lowenthal. Accurate Data Redistribution Cost Estimation in Software Distributed Shared Memory Systems. Principles and Practice of Parallel Programming, June 2001. PS

Gregory M.S. Howard and David K. Lowenthal. An Integrated Compiler/Run-Time System for Global Data Distribution in Distributed Shared Memory Systems. Second Workshop on Software Distributed Shared Memory Systems, 2000. PS

David K. Lowenthal. Local and Global Data Distribution in the Filaments Package. Proceedings of the 4th International Conference on Parallel and Distributed Processing Techniques and Applications, p. 33-41, July 1998. PS, PDF

Project Overview

National Science Foundation CAREER Award, "An Integrated Compiler/Run-Time System for Global Data Distribution," NSF Grant CCR-9733063, funded by the Operating Systems and Compilers program, July 1998-June 2002, $199,829.

While significant strides have been made in exploiting distributed-memory multicomputers for high-performance scientific computing, the emergence of large-scale and adaptive applications requires advances in automatic methods to distribute data among the processors. Poor data distributions lead to reduced speedups due to excess communication, an imbalanced computational workload, or both. Although compile-time analysis has been developed to distribute data automatically, the increasing complexity of scientific programs limits its effectiveness. For example, all compile-time algorithms for data distribution depend on an accurate determination of the computational workload. However, the workload of many scientific applications is dependent on run-time parameters, which makes the use of run-time information necessary to find a good data distribution.

The focus of this research project is to develop an integrated compiler and run-time system for data distribution. Research will progress along two main fronts. First, we will develop the necessary compiler analysis that will be used by the run-time system. For example, large-scale applications are generally composed of several phases, each of which has unique characteristics that may require a different data distribution. However, because program behavior is generally uniform within a phase, redistribution is typically only appropriate between phases. Therefore, we will implement compiler analysis to divide a program into its component phases; these will be used by the run-time system as the only points at which data redistribution will be considered. The compiler will also be modified to pass useful hints to the run-time system about intra-phase behavior, such as the communication pattern, which will be used to choose initial distributions and reduce run-time overhead.

Second, we will develop the necessary run-time analysis. This work will build on an existing run-time data distribution system, Adapt, that can monitor workload and communication patterns of single-phase applications and, when appropriate, change the data distribution at run time. We will modify the run-time system to handle multi-phase applications and make use of compiler annotations. More importantly, we will develop a framework for determining at run time an optimal global data distribution over a reasonable set of distributions. This means that we will find a distribution for each array in each program phase, with possible redistribution between phases, that minimizes application completion time. Furthermore, if application characteristics change during execution, our system will change the distribution accordingly.

For evaluation, we will test the performance of applications developed using our integrated system against both hand-coded and compiler-generated versions. This will include testing large-scale applications being developed at the University of Georgia. We expect our integrated system to perform competitively on applications where compilers can effectively infer a data distribution. Furthermore, we expect our system to perform better than a compiler when an application either is too complex to analyze statically or depends on run-time parameters.
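
To make the global distribution problem concrete, here is a minimal sketch of one way to pick a distribution per phase for a single array when redistribution is allowed only at phase boundaries. It is not the method used by Adapt or by this project. It assumes the per-phase execution times and the redistribution costs have already been estimated, and the names choose_distributions, phase_costs, and redist_cost are hypothetical. Under those assumptions, a simple dynamic program over phases finds the sequence with the lowest total estimated time.

```python
from typing import Dict, List, Tuple


def choose_distributions(
    phase_costs: List[Dict[str, float]],
    redist_cost: Dict[Tuple[str, str], float],
) -> Tuple[List[str], float]:
    """Pick one distribution per phase so that total estimated time is minimal.

    phase_costs[p][d]     -- assumed estimate of phase p's time under distribution d
    redist_cost[(d1, d2)] -- assumed estimate of redistributing from d1 to d2
    Keeping the same distribution across a phase boundary costs nothing.
    """
    # best[d] = (cheapest total so far that ends in distribution d, choices made)
    best = {d: (cost, [d]) for d, cost in phase_costs[0].items()}

    for costs in phase_costs[1:]:
        new_best = {}
        for d, cost in costs.items():
            candidates = []
            for prev_d, (prev_total, prev_path) in best.items():
                move = 0.0 if prev_d == d else redist_cost[(prev_d, d)]
                candidates.append((prev_total + move + cost, prev_path + [d]))
            # Keep only the cheapest way of ending this phase in distribution d.
            new_best[d] = min(candidates, key=lambda c: c[0])
        best = new_best

    total, path = min(best.values(), key=lambda c: c[0])
    return path, total


# Hypothetical three-phase program with two candidate distributions.
example_phases = [
    {"row": 10.0, "col": 14.0},
    {"row": 20.0, "col": 8.0},
    {"row": 5.0, "col": 12.0},
]
example_redist = {("row", "col"): 3.0, ("col", "row"): 3.0}
example_path, example_total = choose_distributions(example_phases, example_redist)
```

In this made-up example the cheapest plan redistributes twice (row, then col, then row), because each move costs less than running a phase under a poorly matched distribution. A run-time system of the kind described above would additionally have to measure these costs during execution and handle several arrays at once, which this sketch does not attempt.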



David Lowenthal