Parallel I/O- and Communication-Sensitive Scheduling on High-Performance Parallel Computers

Jens Mache

Abstract:
A serious bottleneck to high-performance parallel computing is the high cost of data transfer. As communication and I/O traffic on the links in the interconnection network increases, network contention becomes a critical problem, drastically reducing effective data throughput. Minimizing network contention is crucial in order to achieve fast job response times and high system throughput.

One factor that affects network contention is the resource management issue of processor allocation, the assignment of a set of processors to each scheduled job. Previous processor allocation strategies have essentially ignored the I/O and communication demands of parallel applications and the resulting network contention. Our analysis shows that the spatial layout of the compute nodes and the I/O nodes in relation to each other within the interconnection network topology is a key factor that affects network contention. Based on the results of this analysis, we design and test new processor allocation strategies that minimize network contention by being sensitive to spatial layout and its effect on communication and parallel I/O.
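To make the role of spatial layout concrete, the following minimal sketch counts how many compute-to-I/O routes share the busiest link for two placements of a four-node job. It is an illustration only, not the dissertation's model: the 2D mesh, the X-then-Y (dimension-ordered) routing, the single I/O node, and the names `xy_route` and `max_link_load` are all assumptions introduced here.

```python
from collections import Counter

def xy_route(src, dst):
    """Directed links on an X-then-Y route in a 2D mesh (assumed routing)."""
    (x, y), (dx, dy) = src, dst
    links = []
    while x != dx:                      # travel along the X dimension first
        nx = x + (1 if dx > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dy:                      # then along the Y dimension
        ny = y + (1 if dy > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links

def max_link_load(compute_nodes, io_node):
    """Number of compute-to-I/O routes that cross the busiest link."""
    load = Counter()
    for node in compute_nodes:
        for link in xy_route(node, io_node):
            load[link] += 1
    return max(load.values()) if load else 0

io_node = (2, 2)                                  # hypothetical I/O node position
line_job = [(x, 2) for x in range(3, 7)]          # four nodes in a row beside it
around_job = [(1, 2), (3, 2), (2, 1), (2, 3)]     # four nodes surrounding it

print(max_link_load(line_job, io_node))    # all four routes funnel into one link
print(max_link_load(around_job, io_node))  # each route enters on its own link
```

Under these assumptions the row-shaped job sends all four routes over the single link entering the I/O node, while the job placed around the I/O node uses four distinct links, so the same job size yields very different hotspot loads.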

Our analysis is based on analytic modeling and on simulations driven by synthetic workloads and realistic workload traces captured at supercomputing sites. We analyze and minimize network contention in three different situations. First, we concentrate on communication-intensive jobs. We analyze inter-job link contention due to communication among compute nodes, and we design a new strategy that allocates each job as compactly as possible. Second, we concentrate on parallel-I/O-intensive jobs. We analyze traffic hotspots due to data transfer between compute nodes and I/O nodes, and we design new strategies that optimize the shape and location of jobs relative to the I/O nodes. Strategies that are optimal for parallel I/O often conflict with strategies that are optimal for communication. Thus, as a final step, we design an integrated allocation strategy that accommodates workloads that are both communication intensive and parallel I/O intensive. Our new strategies improve both average job response times and system throughput, and thus contribute to efficient resource management of teraflops-scale computing systems.
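One simple way to quantify "as compactly as possible" is the average pairwise Manhattan distance among a job's allocated nodes. The abstract does not state which compactness measure the dissertation uses, so the metric, the 2D mesh coordinates, and the name `avg_pairwise_distance` below are assumptions made for illustration.

```python
from itertools import combinations

def avg_pairwise_distance(nodes):
    """Average Manhattan distance over all pairs of allocated nodes.

    Lower values mean a more compact allocation (assumed metric).
    """
    pairs = list(combinations(nodes, 2))
    if not pairs:
        return 0.0
    total = sum(abs(ax - bx) + abs(ay - by) for (ax, ay), (bx, by) in pairs)
    return total / len(pairs)

# Three hypothetical four-node allocations on a 4x4 mesh.
block = [(0, 0), (0, 1), (1, 0), (1, 1)]        # 2x2 square
line = [(0, 0), (1, 0), (2, 0), (3, 0)]         # 1x4 row
scattered = [(0, 0), (3, 0), (0, 3), (3, 3)]    # mesh corners

for name, alloc in [("block", block), ("line", line), ("scattered", scattered)]:
    print(name, round(avg_pairwise_distance(alloc), 3))
```

With this measure the square block scores lowest and the corner allocation highest. Messages within a scattered job must cross more links, which raises the chance that they share links with other jobs.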

Dissertation Committee: Virginia M. Lo (chair), University of Oregon; Allen Malony, University of Oregon; Sharad Garg, Intel Corp.; Marilynn Livingston, Southern Illinois University; Andrzej Proskurowski, University of Oregon; Richard Koch, University of Oregon

Successfully defended January 6, 1999.