Parallel Aggregation on Sharded Clusters
Keywords:
NoSQL, MongoDB, Sharding, Parallelism, MapReduce
Abstract
Databases organize data with the help of database management systems (DBMS). A database is a collection of schemas, queries, and other objects managed by the DBMS. To aggregate data, a relational DBMS computes Cartesian products (joins) between two or more tables and returns the result as a logical table. Because data volumes grow rapidly from day to day, writing joins over large tables is difficult for data analysts, and managing complex queries on large tables is difficult for the DBMS. Schemaless databases were introduced to reduce the complexity of manipulating large data sets. MongoDB processes schemaless data and offers several ways to process that data in parallel. Aggregation is one of the functions applied to the data. To obtain faster aggregation results, a MongoDB sharded cluster is used together with MapReduce.
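The idea of aggregating over shards in parallel can be illustrated with a minimal sketch. It does not use MongoDB or its API: the shards are simulated as in-memory lists, and the field names `city` and `amount`, the sample documents, and all function names are hypothetical and chosen only for illustration. Each simulated shard runs a local map and reduce step, and the partial results are then merged, which is roughly how a sharded MapReduce aggregation can be pictured.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each inner list stands in for the documents on one shard.
SHARDS = [
    [{"city": "Pune", "amount": 10}, {"city": "Delhi", "amount": 5}],
    [{"city": "Pune", "amount": 7}],
    [{"city": "Delhi", "amount": 3}, {"city": "Agra", "amount": 4}],
]


def map_reduce_on_shard(docs):
    """Map each document to (city, amount) and sum the amounts per city locally."""
    partial = defaultdict(int)
    for doc in docs:
        partial[doc["city"]] += doc["amount"]
    return dict(partial)


def merge_partials(partials):
    """Final reduce step: combine the per-shard partial sums into one result."""
    total = defaultdict(int)
    for partial in partials:
        for key, value in partial.items():
            total[key] += value
    return dict(total)


def parallel_aggregate(shards):
    """Run the per-shard step concurrently, then merge the partial results."""
    workers = max(1, len(shards))  # ThreadPoolExecutor requires at least one worker
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_reduce_on_shard, shards))
    return merge_partials(partials)


if __name__ == "__main__":
    print(parallel_aggregate(SHARDS))
```

In this sketch, threads stand in for shard servers. In an actual sharded cluster the per-shard work would run on separate machines and the merge would be handled by the database itself, so the code shows only the shape of the computation, not its performance.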
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
