How do I choose the right NoSQL solution? A comprehensive theoretical and experimental survey 

  • With the advent of the Internet of Things (IoT) and cloud computing, the need for data stores that can store and process big data efficiently and cost-effectively has increased dramatically. Traditional data stores seem to have numerous limitations in addressing such requirements. NoSQL data stores have been designed and implemented to address the shortcomings of relational databases by compromising on ACID and transactional properties to achieve high scalability and availability. In contrast to traditional RDBMSs and data warehouses, these systems are designed to scale to thousands or millions of users performing updates as well as reads. Although there is a plethora of NoSQL implementations, there is no one-size-fits-all solution that satisfies even the main requirements. In this paper, we explore popular and commonly used NoSQL technologies and elaborate on their documentation, the existing literature and their performance evaluation. More specifically, we describe the background, characteristics, classification, data model and evaluation of NoSQL solutions that aim to provide capabilities for big data analytics. This work is intended to help users, whether individuals or organizations, obtain a clear view of the strengths and weaknesses of well-known NoSQL data stores and select the right technology for their applications and use cases. To do so, we first present a systematic approach for narrowing down the suitable NoSQL candidates, and then adopt an experimental methodology, repeatable by anyone, for finding the best of the shortlisted candidates given their specific requirements.

    Citation: Hamzeh Khazaei, Marios Fokaefs, Saeed Zareian, Nasim Beigi-Mohammadi, Brian Ramprasad, Mark Shtern, Purwa Gaikwad, Marin Litoiu.  How do I choose the right NoSQL solution? A comprehensive theoretical and experimental survey [J]. Big Data and Information Analytics, 2016, 1(2): 185-216. doi: 10.3934/bdia.2016004

  • © 2016 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)