Parallel Aggregation on Sharded Clusters

Authors

  • T Mothilal, Department of Computer Science and Engineering, Prasad V Potluri Siddhartha Institute of Technology, Kanuru, India
  • Anil Kumar P, Department of Computer Science and Engineering, Prasad V Potluri Siddhartha Institute of Technology, Kanuru, India

Keywords:

NoSQL, MongoDB, Sharding, Parallelism, MapReduce

Abstract

Databases organize data with the help of database management systems (DBMS). A DBMS is a collection of schemas, queries, and other objects. To aggregate data, a DBMS computes Cartesian products (joins) between two or more tables and returns the result as a logical table. Because data volumes grow rapidly day by day, writing joins over large tables is difficult for data analysts, and managing complex queries over large-scale tables is difficult for the DBMS. Schemaless databases were introduced to reduce the complexity of manipulating large data sets. MongoDB processes schemaless data and offers several ways to process that data in parallel. Aggregation is one of the functions commonly applied to such data. To obtain faster aggregation results, aggregation can be run on a MongoDB sharded cluster using MapReduce.
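The idea of aggregating over shards with MapReduce can be illustrated with a minimal sketch. It is not the authors' implementation and it does not talk to MongoDB: the shards are plain in-memory lists, and every name in it (shard_for, partition, map_reduce_on_shard, merge_partials, parallel_sum, and the sample orders data) is hypothetical and chosen only for illustration. Each simulated shard computes partial sums for its own documents in parallel, and the partial results are then merged, which is roughly the role the mongos router plays in a real sharded cluster.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def shard_for(key, num_shards):
    """Pick a shard with a simple deterministic hash (assumption: string keys)."""
    return sum(key.encode("utf-8")) % num_shards


def partition(docs, shard_key, num_shards):
    """Distribute documents across simulated shards by their shard key."""
    shards = [[] for _ in range(num_shards)]
    for doc in docs:
        shards[shard_for(doc[shard_key], num_shards)].append(doc)
    return shards


def map_reduce_on_shard(docs, group_field, value_field):
    """Map each document to (group, value) and reduce locally by summing."""
    partial = defaultdict(int)
    for doc in docs:
        partial[doc[group_field]] += doc[value_field]
    return dict(partial)


def merge_partials(partials):
    """Final reduce: combine the per-shard partial sums into one result."""
    total = defaultdict(int)
    for partial in partials:
        for group, value in partial.items():
            total[group] += value
    return dict(total)


def parallel_sum(docs, shard_key, group_field, value_field, num_shards=3):
    """Run the per-shard step concurrently, then merge the partial results."""
    shards = partition(docs, shard_key, num_shards)
    with ThreadPoolExecutor(max_workers=num_shards) as pool:
        partials = list(
            pool.map(
                lambda shard: map_reduce_on_shard(shard, group_field, value_field),
                shards,
            )
        )
    return merge_partials(partials)


# Hypothetical sample data: order documents sharded by customer.
orders = [
    {"customer": "alice", "region": "east", "amount": 30},
    {"customer": "bob", "region": "west", "amount": 20},
    {"customer": "carol", "region": "east", "amount": 50},
    {"customer": "alice", "region": "east", "amount": 10},
    {"customer": "dave", "region": "west", "amount": 5},
]

totals = parallel_sum(orders, "customer", "region", "amount")
print(totals)
```

Because summation is associative and commutative, the merged totals do not depend on how the documents were distributed across shards. That property is what allows the per-shard work to proceed independently.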

References

J. M. Hellerstein, “The case for online aggregation”, Technical Report UCB/CSD-96-908, EECS Computer Science Division, University of California, Berkeley, CA, 1996.

Jeffrey Dean and Sanjay Ghemawat, “MapReduce: Simplified Data Processing on Large Clusters”, OSDI 2004.

Anju Abraha, “A Dynamic Query Form System for Mongodb”, SSRG-IJCSE, volume-1 issue-9, Nov 2014.

MongoDB, “http://docs.mongodb.org/manual/”, Thursday, April 30, 2015.

MongoDB, http://www.tutorialspoint.com/mongodb/, Monday, July 6, 2015

Replication, http://stackoverflow.com, Tuesday, August 11, 2015

Sharding, http://gist.github.com, Monday, August 17, 2015.

Published

2025-11-10

How to Cite

[1]
T. Mothilal and A. K. P, “Parallel Aggregation on Sharded Clusters”, Int. J. Comp. Sci. Eng., vol. 3, no. 9, pp. 39–43, Nov. 2025.

Section

Research Article