Publication

Density-fitted singles and doubles coupled cluster on graphics processing units...

by David Sherrill, Bobby G. Sumpter, A. Eugene DePrince III
Publication Type
Journal
Journal Name
Molecular Physics
Publication Date
Page Numbers
844 to 852
Volume
112
Issue
5-6

We adapt an algorithm for singles and doubles coupled cluster (CCSD) that uses density fitting
(DF) or Cholesky decomposition (CD) in the construction and contraction of all electron
repulsion integrals (ERIs) for use on heterogeneous compute nodes consisting of a multicore
CPU and at least one graphics processing unit (GPU). The use of approximate 3-index ERIs
ameliorates two of the major difficulties in designing scientific algorithms for GPUs: (i) the
extremely limited global memory on the devices and (ii) the overhead associated with data
motion across the PCI bus. For the benzene trimer described by an aug-cc-pVDZ basis set,
the use of a single NVIDIA Tesla C2070 (Fermi) GPU accelerates a CD-CCSD computation
by a factor of 2.1 relative to the multicore CPU-only algorithm that uses six highly efficient
Intel Core i7-3930K CPU cores. The use of two Fermis provides an acceleration of 2.89, which
is comparable to that observed when using a single NVIDIA Kepler K20c GPU (2.73).
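
To illustrate why 3-index ERIs ease the memory and data-transfer constraints mentioned above, the following is a minimal Python sketch, not the authors' implementation. It assumes the standard DF/CD factorization (pq|rs) ≈ Σ_Q B[Q,p,q] B[Q,r,s]; the names `eri_storage_gb` and `rebuild_eri`, the random toy factors, and all sizes are hypothetical and chosen only for illustration.

```python
import numpy as np


def eri_storage_gb(n_basis, n_aux):
    """Return (four_index_gb, three_index_gb) for double-precision storage.

    Permutational symmetry is ignored, so both figures are upper bounds.
    """
    bytes_per_double = 8
    four_index = n_basis**4 * bytes_per_double           # full (pq|rs) tensor
    three_index = n_aux * n_basis**2 * bytes_per_double  # factors B[Q, p, q]
    return four_index / 1e9, three_index / 1e9


def rebuild_eri(B):
    """Contract 3-index factors B[Q, p, q] into an approximate (pq|rs)."""
    return np.einsum("Qpq,Qrs->pqrs", B, B)


# Toy factors (random numbers, not real integrals), symmetrized in p and q.
rng = np.random.default_rng(0)
n_basis, n_aux = 6, 20  # hypothetical sizes
B = rng.standard_normal((n_aux, n_basis, n_basis))
B = 0.5 * (B + B.transpose(0, 2, 1))

eri = rebuild_eri(B)
four_gb, three_gb = eri_storage_gb(100, 300)  # hypothetical sizes
```

Under these assumptions the 4-index tensor grows as N^4 while the 3-index factors grow as N_aux × N^2, so only the much smaller factors need to be held in GPU memory or moved across the PCI bus.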