Improved Analysis of Unstructured Datasets using Thesaurus Model
DOI:
https://doi.org/10.26438/ijcse/v7i2.10331037
Keywords:
Hadoop, MapReduce, HDFS, NoSQL
Abstract
Humanity has stored more than 295 billion gigabytes (295 exabytes) of data since 1986, according to a report by the University of Southern California. Storing and monitoring this data around the clock in widely distributed environments is an enormous task for global service organizations. Because these datasets are stored in an unstructured format, they require high processing power that traditional databases cannot provide. Although the MapReduce paradigm, as implemented in Java-based Hadoop, can address this problem, it does not offer maximum functionality. These drawbacks can be overcome with Hadoop Streaming techniques, which allow users to define non-Java executables for processing the dataset. This paper proposes a THESAURUS model that enables faster and simpler business analysis.
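To illustrate what a non-Java executable for Hadoop Streaming might look like, the following is a minimal Python sketch of a mapper and a reducer that count tokens in unstructured text. It is not the THESAURUS model, which the abstract does not specify. The function names map_lines and reduce_lines, the token-count task, and the "map"/"reduce" command-line argument are all assumptions made for this example. The sketch relies only on the general streaming convention: records arrive on standard input, leave on standard output as tab-separated key/value lines, and reach the reducer sorted by key.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper (hypothetical): emit one 'token<TAB>1' record per token."""
    for line in lines:
        for token in line.split():
            yield f"{token.lower()}\t1"


def reduce_lines(sorted_lines):
    """Reducer (hypothetical): sum counts of consecutive records sharing a key.

    Assumes the input is already sorted by key, as the reducer's input
    is in a streaming job.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{key}\t{sum(int(count) for _, count in group)}"


# Assumed convention for this sketch: run the script with 'map' or 'reduce'
# as its first argument to select the stage.
if __name__ == "__main__" and len(sys.argv) > 1 and sys.argv[1] in ("map", "reduce"):
    stage = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in stage(sys.stdin):
        print(record)
```

Under these assumptions, the same script could be passed to the streaming jar's -mapper and -reducer options with the appropriate argument. Locally, the pipeline can be approximated by sorting the mapper output before handing it to the reducer.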
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
