FatMan vs. LittleBoy: Scaling up Linear Algebraic Operations in Scale-out Data Platforms

by Seung-hwan Lim, Luna Xu, Ali Butt, Sreenivas R Sukumar, Ramakrishnan Kannan
Publication Type
Conference Paper
Conference Name
1st Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems
Conference Location
Salt Lake City, Utah, United States of America

Linear algebraic operations such as matrix manipulations form the kernel of many machine learning and other crucial algorithms.
Scaling such algorithms both up and out is highly desirable to enable efficient processing of millions of data points.
To this end, we present a matrix manipulation approach to effectively scale up each node in a scale-out data-parallel platform such as Apache Spark. Specifically, we enable hardware acceleration for matrix multiplications in a distributed Spark setup without user intervention. Our approach supports both dense and sparse distributed matrices, and provides flexible control over acceleration based on matrix density.
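As a rough illustration of that idea (not the paper's actual code), the Scala sketch below density-gates a per-block multiply. `DensityCutoff`, `density`, `acceleratedGemm`, and `multiplyBlock` are hypothetical names, the cutoff value is an assumption, and the fallback path is the stock `Matrix.multiply` from Spark MLlib.

```scala
import org.apache.spark.mllib.linalg.{DenseMatrix, Matrix, SparseMatrix}

// Illustrative density cutoff; the actual policy is configurable in the
// paper, and this value is an assumption for the sketch.
val DensityCutoff = 0.5

// Fraction of non-zero entries in a local matrix block.
def density(m: Matrix): Double =
  m.numNonzeros.toDouble / (m.numRows.toLong * m.numCols)

// Placeholder for a hardware-accelerated GEMM (e.g., a GPU BLAS reached via
// JNI); here it falls back to the stock JVM multiply so the sketch compiles.
def acceleratedGemm(a: Matrix, b: DenseMatrix): DenseMatrix = a.multiply(b)

// Hypothetical dispatcher: blocks dense enough to amortize the offload cost
// take the accelerated path, while sparse blocks stay on the CPU path.
def multiplyBlock(a: Matrix, b: Matrix): DenseMatrix = {
  val bDense = b match {
    case d: DenseMatrix  => d
    case s: SparseMatrix => s.toDense
  }
  if (density(a) >= DensityCutoff && density(b) >= DensityCutoff)
    acceleratedGemm(a, bDense)
  else
    a.multiply(bDense)
}
```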
We demonstrate the benefit of our approach for generalized matrix multiplication operations over large matrices with up to four billion elements.
To connect the effectiveness of our approach to machine learning applications, we computed Gramian matrices via generalized matrix multiplications.
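For reference, the Gramian of a matrix A is G = AᵀA, so it reduces to a single generalized matrix multiplication. A minimal version of the Spark MLlib baseline path can be sketched with `BlockMatrix`; the function name and block size below are illustrative, not the paper's implementation.

```scala
import org.apache.spark.mllib.linalg.distributed.{BlockMatrix, CoordinateMatrix, MatrixEntry}
import org.apache.spark.rdd.RDD

// Build a distributed BlockMatrix from (row, col, value) entries and compute
// its Gramian G = Aᵀ A with one generalized matrix multiplication.
def gramian(entries: RDD[MatrixEntry], nRows: Long, nCols: Long): BlockMatrix = {
  val a = new CoordinateMatrix(entries, nRows, nCols)
    .toBlockMatrix(1024, 1024) // illustrative block size
    .cache()
  a.transpose.multiply(a)
}
```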
Our experiments show that our approach achieves more than 2x speedup, and up to 96.1% computation improvement, compared to the state-of-the-art Spark MLlib for dense matrices.