Publication Type
Journal
Journal Name
Engineering Analysis with Boundary Elements
Publication Date
Page Numbers
1246 to 1255
Volume
36
Issue
8
Abstract
A collocation boundary element code for solving
the three-dimensional Laplace equation, publicly available from
http://www.intetec.org, has been adapted to run on an Nvidia Tesla
general-purpose graphics processing unit (GPU). Global matrix assembly
and LU factorization of the resulting dense matrix were performed on
the GPU. Out-of-core techniques were used to solve problems larger than
the available GPU memory. The code achieved a speedup of more than eight
times in matrix assembly and about 56 Gflops in the LU factorization using
only 512 Mbytes of GPU memory. Details of the GPU implementation and
comparisons with the standard sequential algorithm are included to
illustrate the performance of the GPU code.
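
The abstract does not describe the out-of-core scheme itself. As a rough illustration of why a blocked LU factorization suits limited device memory, the following is a minimal CPU-only sketch in plain Python, not the paper's GPU code. The function names `blocked_lu` and `multiply_lu`, the block size, and the test matrix are all hypothetical, and pivoting is omitted for brevity, so the sketch assumes a matrix (here diagonally dominant) that can be factored safely without it. Each outer step touches one column panel and then updates the trailing submatrix, which is the access pattern that lets a panel be staged in and out of a small memory.

```python
def blocked_lu(a, block):
    """Factor a square matrix in place as L*U, one column panel at a time.

    Illustrative assumption: no pivoting, so `a` must be safely factorable
    without row exchanges. L (unit lower) and U are packed into `a`.
    """
    n = len(a)
    for k0 in range(0, n, block):
        k1 = min(k0 + block, n)
        # Step 1: unblocked factorization of the panel (columns k0..k1-1).
        for k in range(k0, k1):
            for i in range(k + 1, n):
                a[i][k] /= a[k][k]
                for j in range(k + 1, k1):
                    a[i][j] -= a[i][k] * a[k][j]
        # Step 2: forward substitution to form the U block row to the right.
        for k in range(k0, k1):
            for i in range(k + 1, k1):
                for j in range(k1, n):
                    a[i][j] -= a[i][k] * a[k][j]
        # Step 3: update the trailing submatrix with the panel's L and U.
        for i in range(k1, n):
            for j in range(k1, n):
                s = 0.0
                for k in range(k0, k1):
                    s += a[i][k] * a[k][j]
                a[i][j] -= s
    return a


def multiply_lu(packed):
    """Rebuild L*U from the packed factors, for checking the result."""
    n = len(packed)
    prod = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(min(i, j) + 1):
                l_ik = 1.0 if k == i else packed[i][k]
                prod[i][j] += l_ik * packed[k][j]
    return prod
```

In a GPU setting, steps 2 and 3 are dense matrix operations that map naturally to device kernels, while only the current panel and the tile being updated need to be resident at once. How the paper actually partitions and transfers the matrix is not stated in the abstract.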