Large Scale Deduplication Analysis Using Multigraph Pattern Matching Algorithm

Authors

  • Doss SAAN, Department of Computing, Muscat College, Muscat, Oman
  • Jeevitha P, Department of Computing, Muscat College, Muscat, Oman

DOI:

https://doi.org/10.26438/ijcse/v6i5.654658

Keywords:

deduplication, slicing, overlapping clustering, multigraph pattern matching algorithm (MGPMA), Bloom filter

Abstract

As data grows every day, managing storage devices for this explosive growth of digital information has become a very difficult task, and data reduction has become a very important problem. Deduplication approaches play an important role in removing redundancy in large-scale clustered big data storage. Existing deduplication methods do not work effectively in many situations: the overlapping and slicing algorithms used for deduplication in existing systems consume a large amount of memory and take more time. Recently, deduplication clustering has become a significant requirement of most commercial and research backup systems, and it is now accepted in storage systems for data backup and archiving. Many researchers focus on deduplication clustering as a way to reduce redundant data, and pattern-matching deduplication clustering in particular has become popular. We propose a multigraph pattern matching algorithm (MGPMA) for deduplication in big data with higher efficiency. The technique combines similarity with neighborhood and applies it to deduplication clustering with a Bloom filter. As an efficient data removal approach, it exploits data redundancy. As a result, the deduplication system improves storage utilization while reducing time delay. Finally, experiments show that the system performs well.
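
The role a Bloom filter typically plays in a deduplication pipeline can be illustrated with a minimal sketch. This is not the authors' MGPMA, whose details are not given in this abstract; it only shows the generic idea of using a Bloom filter as a cheap membership pre-check before consulting an exact fingerprint index. The names BloomFilter, might_contain, and deduplicate, and the parameters num_bits and num_hashes, are hypothetical choices made for illustration.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter; sizes below are arbitrary assumptions."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item: bytes):
        # Double hashing: derive all bit positions from two halves
        # of a single SHA-256 digest.
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: bytes) -> bool:
        # False means "definitely not seen"; True means "possibly seen".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def deduplicate(records):
    """Return records with exact duplicates removed, keeping first occurrences."""
    bloom = BloomFilter()
    seen = set()  # exact fingerprints, used to confirm "possibly seen" answers
    unique = []
    for record in records:
        data = record.encode("utf-8")
        fingerprint = hashlib.sha256(data).hexdigest()
        # The Bloom filter rules out most new records cheaply; the exact
        # set is consulted only when the filter reports a possible match.
        if bloom.might_contain(data) and fingerprint in seen:
            continue
        bloom.add(data)
        seen.add(fingerprint)
        unique.append(record)
    return unique


if __name__ == "__main__":
    print(deduplicate(["a", "b", "a", "c", "b"]))
```

Because a Bloom filter can report false positives but never false negatives, the sketch confirms every possible match against the exact fingerprint set, so no unique record is dropped. In a real system the exact index would usually live on slower storage, which is where the pre-check saves time.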

Published

2025-11-13

How to Cite

[1]
S. A. N. Doss and P. Jeevitha, “Large Scale Deduplication Analysis Using Multigraph Pattern Matching Algorithm”, Int. J. Comp. Sci. Eng., vol. 6, no. 5, pp. 654–659, Nov. 2025.

Section

Research Article