FatMan vs. LittleBoy: Scaling up Linear Algebraic Operations in Scale-out Data Platforms

by Seung-hwan Lim, Luna Xu, Ali Butt, Sreenivas R Sukumar, Ramakrishnan Kannan
Publication Type
Conference Paper
Conference Name
1st Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems
Conference Location
Salt Lake City, Utah, United States of America

Linear algebraic operations such as matrix manipulations form the kernel of many machine learning and other crucial algorithms.
Scaling such algorithms both up and out is highly desirable to enable efficient processing of millions of data points.
To this end, we present a matrix manipulation approach to effectively scale up each node in a scale-out data-parallel platform such as Apache Spark. Specifically, we enable hardware acceleration for matrix multiplications in a distributed Spark setup without user intervention. Our approach supports both dense and sparse distributed matrices, and provides flexible control over acceleration based on matrix density.
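As a rough illustration of that idea (not the paper's actual code), the Scala sketch below density-gates a per-block multiply. `DensityCutoff`, `density`, `acceleratedGemm`, and `multiplyBlock` are hypothetical names, the cutoff value is an assumption, and the fallback path is the stock `Matrix.multiply` from Spark MLlib.

```scala
import org.apache.spark.mllib.linalg.{DenseMatrix, Matrix, SparseMatrix}

// Illustrative density cutoff; the actual policy is configurable in the
// paper, and this value is an assumption for the sketch.
val DensityCutoff = 0.5

// Fraction of non-zero entries in a local matrix block.
def density(m: Matrix): Double =
  m.numNonzeros.toDouble / (m.numRows.toLong * m.numCols)

// Placeholder for a hardware-accelerated GEMM (e.g., a GPU BLAS reached via
// JNI); here it falls back to the stock JVM multiply so the sketch compiles.
def acceleratedGemm(a: Matrix, b: DenseMatrix): DenseMatrix = a.multiply(b)

// Hypothetical dispatcher: blocks dense enough to amortize the offload cost
// take the accelerated path, while sparse blocks stay on the CPU path.
def multiplyBlock(a: Matrix, b: Matrix): DenseMatrix = {
  val bDense = b match {
    case d: DenseMatrix  => d
    case s: SparseMatrix => s.toDense
  }
  if (density(a) >= DensityCutoff && density(b) >= DensityCutoff)
    acceleratedGemm(a, bDense)
  else
    a.multiply(bDense)
}
```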
We demonstrate the benefit of our approach for generalized matrix multiplication operations over large matrices with up to four billion elements.
To connect the effectiveness of our approach to machine learning applications, we computed Gramian matrices via generalized matrix multiplications.
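For reference, the Gramian of a matrix A is G = AᵀA, so it reduces to a single generalized matrix multiplication. A minimal version of the Spark MLlib baseline path can be sketched with `BlockMatrix`; the function name and block size below are illustrative, not the paper's implementation.

```scala
import org.apache.spark.mllib.linalg.distributed.{BlockMatrix, CoordinateMatrix, MatrixEntry}
import org.apache.spark.rdd.RDD

// Build a distributed BlockMatrix from (row, col, value) entries and compute
// its Gramian G = Aᵀ A with one generalized matrix multiplication.
def gramian(entries: RDD[MatrixEntry], nRows: Long, nCols: Long): BlockMatrix = {
  val a = new CoordinateMatrix(entries, nRows, nCols)
    .toBlockMatrix(1024, 1024) // illustrative block size
    .cache()
  a.transpose.multiply(a)
}
```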
Our experiments show that our approach achieves more than 2x speedup, and up to 96.1% computation improvement, compared to the state-of-the-art Spark MLlib for dense matrices.