Research Highlight

Enabling Portability and GPU Direct on DCA++

The hardware layout of half of a single Summit node. This work focused on exploiting the node's two-CPU, six-GPU architecture to its maximum capability by also using NVLink (the fast interconnect between GPUs on Summit).

Science

A team of ORNL researchers has used the DCA++ application, a popular code for predicting the behavior of quantum materials, to verify two performance-enhancing strategies. First, the team improved the Quantum Monte Carlo (QMC) solver, a tool common across the DOE application landscape, and built a portability layer targeting Frontier, the OLCF’s first exascale system, which was scheduled to launch in 2021. Second, the team demonstrated that GPU Direct, a feature available on the NVIDIA Volta GPUs that populate ORNL’s Summit, can recover the performance lost on “fat node” systems, which relay communications between GPUs through CPUs.
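To make the relay overhead concrete, here is a minimal back-of-the-envelope sketch in Python. It is not taken from DCA++ or from the team's measurements: the function names, the message size, and the bandwidth value are all hypothetical placeholders, and the model ignores latency and any overlap between copies. It only counts the hops a message makes when it is staged through the CPUs versus sent directly from GPU to GPU.

```python
def staged_transfer_time(nbytes, gpu_cpu_bw, cpu_cpu_bw):
    """Seconds to move nbytes along GPU -> CPU -> CPU -> GPU.

    Simplified model (an assumption): the three copies run one after
    another, and each copy costs nbytes / bandwidth with no latency term.
    """
    device_to_host = nbytes / gpu_cpu_bw   # source GPU to its host CPU
    host_to_host = nbytes / cpu_cpu_bw     # relay between the two CPUs
    host_to_device = nbytes / gpu_cpu_bw   # destination CPU to its GPU
    return device_to_host + host_to_host + host_to_device


def direct_transfer_time(nbytes, gpu_gpu_bw):
    """Seconds to move nbytes over a single direct GPU-to-GPU hop."""
    return nbytes / gpu_gpu_bw


if __name__ == "__main__":
    message_bytes = 1e9      # hypothetical 1 GB message
    link_bw = 10e9           # placeholder bandwidth in bytes/s, not a Summit figure

    staged = staged_transfer_time(message_bytes, link_bw, link_bw)
    direct = direct_transfer_time(message_bytes, link_bw)
    print(f"staged: {staged:.3f} s, direct: {direct:.3f} s")
```

Even when every link is given the same bandwidth, the staged path costs three copies against one for the direct path. Real systems differ in link speeds and can pipeline the copies, so the actual gain has to be measured rather than read off this model.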

Impact

The scientific impact of the research is threefold:

  • The portability layer will help DOE researchers who rely on QMC solvers port their codes more easily to Frontier and improve their performance at the exascale.
  • The demonstration of direct GPU-to-GPU communication showed that GPU Direct can be used to enhance code performance while maintaining the accuracy of the science. The performance benefit outweighs the resources required (developer time and code modifications).
  • The increased efficiency enables more accurate modeling of quantum materials.

PI(s)/Facility Lead(s): Thomas Maier
ASCR Program/Facility: OLCF
Funding: OLCF, SciDAC
Publication(s) for this work: W. Wei, A. Chatterjee, E. D’Azevedo, O. Hernandez, H. Kaiser. “Enabling GPU Direct RDMA on Quantum Monte Carlo applications”.