Scal-A: Detecting and Alleviating Sources of Scalability Problems


The focus of this project is to develop tools that let scientific programmers inquire about scalability problems and correlate this information back to source code. Furthermore, we believe that tools should be able to suggest and evaluate optimizing transformations that alleviate these problems. This would constitute a significant improvement over current performance analysis practice.

The key intellectual merit is in providing an automatic framework for detecting scalability problems and correlating them back to source code. We will experiment with our framework on the ASCI codes, which are intended to stress high-performance clusters.

In addition, we have investigated how to predict scalability on large numbers of nodes from several experimental runs on smaller numbers of nodes. The basic idea is to use statistical as well as machine learning methods to extrapolate from smaller node configurations to larger ones. We have completed important work on this topic, which has led to a whole new set of problems in handling variance and designing experiments (with help from collaborators in Statistics).
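The extrapolation idea can be illustrated with a minimal sketch. This is not the project's actual model: it assumes, purely for illustration, that execution time follows a power law T(p) = c * p^b in the node count p, fits that law by least squares in log-log space, and extrapolates to a larger configuration. The function names and the timing data are hypothetical, and the data are synthetic (generated from a known power law without noise), so the sketch does not address the variance and experiment-design problems mentioned above.

```python
import math


def fit_power_law(node_counts, run_times):
    """Least-squares fit of log(T) = a + b * log(p); returns (a, b)."""
    xs = [math.log(p) for p in node_counts]
    ys = [math.log(t) for t in run_times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx              # slope: scaling exponent
    a = mean_y - b * mean_x    # intercept: log of the constant c
    return a, b


def predict_time(a, b, p):
    """Extrapolate the fitted model to p nodes."""
    return math.exp(a + b * math.log(p))


# Synthetic measurements (assumption): T(p) = 120 * p^-0.9 on small runs.
small_nodes = [2, 4, 8, 16]
small_times = [120.0 * p ** -0.9 for p in small_nodes]

a, b = fit_power_law(small_nodes, small_times)
predicted_64 = predict_time(a, b, 64)
print(f"fitted exponent b = {b:.3f}, predicted T(64) = {predicted_64:.3f} s")
```

With real measurements, run-to-run variance means each small configuration would typically be measured several times, and the fitted exponent would carry an uncertainty that grows as the prediction moves further from the measured range.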

The broader impact of this work is in three main areas. First, both PIs are working to create an interdisciplinary educational and research program. Second, students will be educated in high-performance computing. Finally, the proposed work allows for technology transfer to a wide range of fields, including emerging ones such as cluster computing as well as the established areas of SMPs and massively parallel computing. The developed framework and tools will be made generally available to the research community and high-performance computing labs.

Publications:

Vincent W. Freeh, Nandani Kappiah, David K. Lowenthal, and Tyler Bletsch.
Just In Time Dynamic Voltage Scaling: Exploiting Inter-Node Slack to Save Energy in MPI Programs.
Journal of Parallel and Distributed Computing, 2008.

Vincent W. Freeh, David K. Lowenthal, Feng Pan, Robert Springer, Nandani Kappiah, Barry Rountree, and Mark Femal.
Analyzing the Energy-Time Tradeoff in High Performance Computing Applications.
IEEE Transactions on Parallel and Distributed Systems, 5(11): 1575--1590 (2006).

Brad Barnes, Barry Rountree, David K. Lowenthal, Jaxk Reeves, Bronis de Supinski, and Martin Schulz.
A Regression-Based Approach to Scalability Prediction.
International Conference on Supercomputing (ICS), June 2008.

Barry Rountree, David K. Lowenthal, Shelby H. Funk, Vincent W. Freeh, Bronis R. de Supinski, and Martin Schulz.
Bounding Energy Consumption in Large-Scale MPI Programs.
IEEE/ACM Supercomputing 2007 (SC '07), November 2007.

Min Yeol Lim, Vincent W. Freeh, and David K. Lowenthal.
Adaptive, Transparent Frequency and Voltage Scaling of Communication Phases in MPI Programs.
IEEE/ACM Supercomputing 2006 (SC '06), November 2006.

Rob Springer, David K. Lowenthal, Barry Rountree, and Vincent W. Freeh.
Minimizing Execution Time in MPI Programs on an Energy-Constrained, Power-Scalable Cluster.
11th ACM Symposium on Principles and Practice of Parallel Programming (PPOPP), March 2006.

Nandani Kappiah, Vincent W. Freeh, and David K. Lowenthal.
Just In Time Dynamic Voltage Scaling: Exploiting Inter-Node Slack to Save Energy in MPI Programs.
IEEE/ACM Supercomputing 2005 (SC '05), November 2005.

Vincent W. Freeh, Feng Pan, David K. Lowenthal, and Nandani Kappiah.
Using Multiple Energy Gears in MPI Programs on a Power-Scalable Cluster.
10th ACM Symposium on Principles and Practice of Parallel Programming (PPOPP), June 2005.

Vincent W. Freeh, David K. Lowenthal, Robert Springer, Feng Pan, and Nandani Kappiah.
Exploring the Energy-Time Tradeoff in MPI Programs.
19th IEEE/ACM International Parallel and Distributed Processing Symposium (IPDPS), April 2005.


"This material is based upon work supported by the National Science Foundation under Grant No. 0429285."

"Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation."