Extractive Incremental Multi-Document Summarization by Ranking Sentences Relevant to Key Phrase
DOI:
https://doi.org/10.26438/ijcse/v6i12.250253
Keywords:
Multi Document Summarization, Extraction, Sentence Ranking
Abstract
Summarization conveys the key concepts of a text concisely. Multi-document summarization condenses the content of multiple documents into a single summary document. Here we summarize each document individually by extracting key phrases with the RAKE algorithm, which performs well on a single document and does not depend on a corpus. The TextRank algorithm then ranks sentences based on the key phrases selected from each document, which lets readers identify the documents most closely related to a given document and grasp their content without reading every document in full. The work produces summaries of the given documents and ranks them; the highest-ranked documents are then used as input to the documents at the next level. The information gained from the previous level (i.e., the summaries of the documents) serves as input to the next phase, which yields more information.
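The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stopword list, the word-overlap similarity, the key-phrase bias on the ranking, and the function names (`rake_key_phrases`, `rank_sentences`, `summarize`, `incremental_summary`) are all assumptions made for illustration. The sketch scores key phrases RAKE-style (word degree divided by word frequency), ranks sentences with a TextRank-style power iteration biased toward sentences containing key-phrase words, and feeds each level's summary into the next document.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption; RAKE normally uses a much larger one).
STOPWORDS = {"a", "an", "the", "of", "and", "or", "in", "on", "to", "is", "are",
             "for", "with", "by", "that", "this", "it", "as", "from", "be", "was"}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def words(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def rake_key_phrases(text, top_k=3):
    """RAKE-style scoring: candidate phrases are runs of non-stopwords;
    each word scores degree/frequency and a phrase sums its word scores."""
    phrases = []
    for fragment in re.split(r"[^\w\s]", text.lower()):
        current = []
        for w in fragment.split():
            if w in STOPWORDS:
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(w)
        if current:
            phrases.append(tuple(current))
    freq, degree = defaultdict(int), defaultdict(int)
    for p in phrases:
        for w in p:
            freq[w] += 1
            degree[w] += len(p)  # co-occurrence degree, counting the word itself
    scores = {p: sum(degree[w] / freq[w] for w in p) for p in set(phrases)}
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    return [" ".join(p) for p in ranked[:top_k]]


def rank_sentences(sentences, key_phrases, damping=0.85, iterations=50):
    """TextRank-style power iteration over a word-overlap graph. The teleport
    term is biased toward sentences sharing words with the key phrases
    (this biasing is an assumption of the sketch)."""
    n = len(sentences)
    if n == 0:
        return []
    tokens = [set(words(s)) - STOPWORDS for s in sentences]
    key_words = set(words(" ".join(key_phrases)))
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and tokens[i] and tokens[j]:
                shared = len(tokens[i] & tokens[j])
                sim[i][j] = shared / (len(tokens[i]) + len(tokens[j]))
    out_weight = [sum(row) for row in sim]
    bias = [1.0 + len(t & key_words) for t in tokens]
    bias = [b / sum(bias) for b in bias]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) * bias[i]
            + damping * sum(sim[j][i] / out_weight[j] * scores[j]
                            for j in range(n) if out_weight[j] > 0)
            for i in range(n)
        ]
    return scores


def summarize(text, num_sentences=2):
    """Extract the top-ranked sentences, kept in their original order."""
    sentences = split_sentences(text)
    if not sentences:
        return ""
    scores = rank_sentences(sentences, rake_key_phrases(text))
    top = sorted(range(len(sentences)), key=lambda i: -scores[i])[:num_sentences]
    return " ".join(sentences[i] for i in sorted(top))


def incremental_summary(documents, num_sentences=2):
    """Carry the previous level's summary into the next document's input."""
    summary = ""
    for doc in documents:
        summary = summarize((summary + " " + doc).strip(), num_sentences)
    return summary
```

Calling `incremental_summary(["first document text ...", "second document text ..."])` returns a short extract in which sentences from earlier documents compete with those of later ones. How the paper itself selects the documents passed to the next level is not specified in the abstract, so the simple left-to-right chaining here is only one possible reading.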
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
