A Programming Model for Massive Data Parallelism with Data Dependencies...

by Xiaohui Cui, Frank Mueller, Thomas E. Potok, Yongpeng Zhang
Publication Type
Conference Paper
Conference Name
Parallel Architectures and Compilation Techniques (PACT)
Conference Location
Raleigh, North Carolina, United States of America
Accelerator processors can often be more cost- and energy-effective than general-purpose processors for a wide range of data-parallel computing problems. For graphics processing units (GPUs), this is particularly the case when program development is aided by environments such as NVIDIA's Compute Unified Device Architecture (CUDA), which dramatically narrows the gap between domain-specific architectures and general-purpose programming. Nonetheless, general-purpose GPU (GPGPU) programming remains subject to several restrictions. Most significantly, the separation of host (CPU) and accelerator (GPU) address spaces requires explicit management of GPU memory resources, especially for massive data parallelism that well exceeds the memory capacity of GPUs.
One solution to this problem is to transfer data between GPU and host memories frequently. In this work, we investigate another approach: we run massively data-parallel applications on GPU clusters. We further propose a programming model for massive data parallelism with data dependencies for this scenario. Experience from microbenchmarks and real-world applications shows that our model provides not only ease of programming but also significant performance gains.
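To make the baseline concrete, the following is a minimal, hypothetical sketch (not the paper's model) of the frequent-transfer approach the abstract contrasts against: a host array larger than device memory is streamed through a single reusable device buffer, with every chunk staged by an explicit copy across the separate address spaces. The names `process_chunk`, `N`, and `CHUNK` are illustrative assumptions.

```cuda
// Sketch only: streaming a large host array through a small device buffer.
// Assumes N exceeds available GPU memory while CHUNK fits comfortably.
#include <cuda_runtime.h>
#include <stdlib.h>

// Placeholder per-element computation; the real application kernel
// would go here.
__global__ void process_chunk(float *d, size_t n) {
    size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] *= 2.0f;
}

int main(void) {
    const size_t N = (size_t)1 << 28;   // total elements (assumed > GPU capacity)
    const size_t CHUNK = (size_t)1 << 22;   // elements staged per transfer
    float *h = (float *)malloc(N * sizeof(float));
    for (size_t i = 0; i < N; i++) h[i] = 1.0f;

    float *d;
    cudaMalloc(&d, CHUNK * sizeof(float));  // one reusable device buffer

    for (size_t off = 0; off < N; off += CHUNK) {
        size_t n = (N - off < CHUNK) ? N - off : CHUNK;
        // Explicit host-to-device copy: the split address spaces force
        // the programmer to stage each chunk manually.
        cudaMemcpy(d, h + off, n * sizeof(float), cudaMemcpyHostToDevice);
        process_chunk<<<(unsigned)((n + 255) / 256), 256>>>(d, n);
        // ...and copy results back before the buffer is reused.
        cudaMemcpy(h + off, d, n * sizeof(float), cudaMemcpyDeviceToHost);
    }

    cudaFree(d);
    free(h);
    return 0;
}
```

Each iteration serializes transfer and compute on the same buffer, which is the overhead that motivates running such workloads across a GPU cluster instead.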