A Programming Model for Massive Data Parallelism with Data Dependencies...

by Xiaohui Cui, Frank Mueller, Thomas E. Potok, Yongpeng Zhang
Publication Type
Conference Paper
Conference Name
Parallel Architectures and Compilation Techniques (PACT)
Conference Location
Raleigh, North Carolina, United States of America
Accelerator processors can often be more cost- and energy-effective than general-purpose processors for a wide range of data-parallel computing problems. For graphics processing units (GPUs), this is particularly the case when program development is aided by environments such as NVIDIA's Compute Unified Device Architecture (CUDA), which dramatically narrows the gap between domain-specific architectures and general-purpose programming. Nonetheless, general-purpose GPU (GPGPU) programming remains subject to several restrictions. Most significantly, the separation of host (CPU) and accelerator (GPU) address spaces requires explicit management of GPU memory resources, especially for massive data parallelism that well exceeds the memory capacity of GPUs.
One solution to this problem is to transfer data between GPU and host memories frequently. In this work, we investigate another approach: we run massively data-parallel applications on GPU clusters. We further propose a programming model for massive data parallelism with data dependencies for this scenario. Experience from microbenchmarks and real-world applications shows that our model provides not only ease of programming but also significant performance gains.
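To make the baseline concrete, the following is a minimal, hypothetical sketch (not the paper's model) of the frequent-transfer approach the abstract contrasts against: a host array larger than device memory is streamed through a single reusable device buffer, with every chunk staged by an explicit copy across the separate address spaces. The names `process_chunk`, `N`, and `CHUNK` are illustrative assumptions.

```cuda
// Sketch only: streaming a large host array through a small device buffer.
// Assumes N exceeds available GPU memory while CHUNK fits comfortably.
#include <cuda_runtime.h>
#include <stdlib.h>

// Placeholder per-element computation; the real application kernel
// would go here.
__global__ void process_chunk(float *d, size_t n) {
    size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] *= 2.0f;
}

int main(void) {
    const size_t N = (size_t)1 << 28;   // total elements (assumed > GPU capacity)
    const size_t CHUNK = (size_t)1 << 22;   // elements staged per transfer
    float *h = (float *)malloc(N * sizeof(float));
    for (size_t i = 0; i < N; i++) h[i] = 1.0f;

    float *d;
    cudaMalloc(&d, CHUNK * sizeof(float));  // one reusable device buffer

    for (size_t off = 0; off < N; off += CHUNK) {
        size_t n = (N - off < CHUNK) ? N - off : CHUNK;
        // Explicit host-to-device copy: the split address spaces force
        // the programmer to stage each chunk manually.
        cudaMemcpy(d, h + off, n * sizeof(float), cudaMemcpyHostToDevice);
        process_chunk<<<(unsigned)((n + 255) / 256), 256>>>(d, n);
        // ...and copy results back before the buffer is reused.
        cudaMemcpy(h + off, d, n * sizeof(float), cudaMemcpyDeviceToHost);
    }

    cudaFree(d);
    free(h);
    return 0;
}
```

Each iteration serializes transfer and compute on the same buffer, which is the overhead that motivates running such workloads across a GPU cluster instead.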