MatrixMap: Programming abstraction and implementation of matrix computation for big data analytics

Yaguang Huangfu, Guanqing Liang, Jiannong Cao

*Corresponding author: Jiannong Cao csjcao@comp.polyu.edu.hk


The computation core of many big data applications can be expressed as general matrix computations, including linear algebra operations and irregular matrix operations. However, existing parallel programming systems such as Spark lack both a programming abstraction and an efficient implementation for general matrix computations. In this paper, we present MatrixMap, a unified and efficient data-parallel programming framework for general matrix computations. MatrixMap provides a powerful yet simple abstraction consisting of a distributed in-memory data structure called the bulk key matrix and a programming interface defined by matrix patterns. Users can easily load data into bulk key matrices and express algorithms as parallel matrix patterns. MatrixMap outperforms current state-of-the-art systems by employing three key techniques: matrix patterns with lambda functions for irregular and linear algebra matrix operations, an asynchronous computation pipeline with context-aware data shuffling strategies for specific matrix patterns, and an in-memory data structure that reuses data across iterations. Moreover, it automatically handles the parallelization and distributed execution of programs on a large cluster. The experimental results show that MatrixMap is 12 times faster than Spark.
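The idea of a matrix pattern parameterized by lambda functions can be illustrated with a small single-machine sketch. This is not MatrixMap's actual interface: the function name `multiply_pattern` and its parameters `combine` and `accumulate` are hypothetical and chosen only for illustration, and plain Python lists stand in for the distributed bulk key matrix. The sketch shows how one multiplication-shaped pattern might cover both a linear algebra operation (an ordinary matrix product) and an irregular one (a min-plus step used in shortest-path computation) simply by swapping the lambdas.

```python
from functools import reduce as fold


def multiply_pattern(a, b, combine, accumulate):
    """Hypothetical multiplication-shaped pattern (illustrative only).

    out[i][j] is obtained by applying `combine` to a[i][k] and b[k][j]
    for every k, then folding the results together with `accumulate`.
    """
    inner, cols = len(b), len(b[0])
    assert all(len(row) == inner for row in a), "inner dimensions must match"
    return [
        [
            fold(accumulate, (combine(row[k], b[k][j]) for k in range(inner)))
            for j in range(cols)
        ]
        for row in a
    ]


# Linear algebra: ordinary matrix product (combine = multiply, accumulate = add).
a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
product = multiply_pattern(a, b, lambda x, y: x * y, lambda x, y: x + y)

# Irregular operation: min-plus product of a distance matrix with itself,
# giving the shortest paths that use at most two edges.
INF = float("inf")
dist = [[0, 4, INF], [INF, 0, 1], [2, INF, 0]]
two_hop = multiply_pattern(dist, dist, lambda x, y: x + y, min)

print(product)  # [[19, 22], [43, 50]]
print(two_hop)  # [[0, 4, 5], [3, 0, 1], [2, 6, 0]]
```

In a distributed setting, the framework rather than the user would decide how rows and columns are partitioned and shuffled; the sketch only conveys why a user-supplied lambda lets one pattern serve several algorithms.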

Article URL: http://www.aimspress.com/BDIA/article/1961.html
Article ID: 2380-6966_2016_4_349

Copyright © AIMS Press All Rights Reserved