
Big Data and Information Analytics, 2016, 1(4): 349-376. doi: 10.3934/bdia.2016015.
Article
MatrixMap: Programming abstraction and implementation of matrix computation for big data analytics
Department of Computing, The Hong Kong Polytechnic University, Hong Kong, China
Keywords: Big data; parallel programming; matrix computation; machine learning; graph processing
Citation: Yaguang Huangfu, Guanqing Liang, Jiannong Cao. MatrixMap: Programming abstraction and implementation of matrix computation for big data analytics. Big Data and Information Analytics, 2016, 1(4): 349-376. doi: 10.3934/bdia.2016015
Copyright Info: 2016, Jiannong Cao, et al., licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)