Information Retrieval System Using Vector Space Model for Document Summarization
Keywords:
Vector space model, Document frequency, Term frequency, Document context
Abstract
Document summarization is the process of reducing the size of a text document while retaining its most important content in the reduced document (the summary). A large body of work on document summarization has appeared in recent years. Various techniques are available, but most of them rely on sentence similarity to extract sentences. Because the context of the document is also important for summarization, our method uses a term indexing model to assign index weights to the document as well as to the sentences in it. The proposed system uses context-based document indexing built on the vector space model. This indexing model works with document frequency (DF) and term frequency (TF); together, DF and TF yield the document indexing weights that are used for summarization. We compare our system with a traditional term-based indexing model and show that our system gives better results than that model.
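The abstract does not give the weighting formula, so the following is only a minimal sketch of TF and DF based extractive summarization in general, not the authors' context-based method. It assumes that each sentence is the unit for counting DF, that a term's weight is its document-level TF multiplied by log(N/DF), and that a sentence's score is the sum of the weights of its distinct terms. The function names `tokenize` and `summarize` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def summarize(sentences, top_k=1):
    """Return the top_k sentences ranked by summed TF * log(N / DF) weight.

    Assumption: each sentence is one unit when counting document
    frequency. This is an illustrative choice, not the paper's scheme.
    """
    tokenized = [tokenize(s) for s in sentences]
    n = len(tokenized)

    # DF: number of sentences that contain each term at least once.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    # TF: how often each term occurs in the whole document.
    tf = Counter(t for tokens in tokenized for t in tokens)

    def weight(term):
        # A term that occurs in every sentence gets weight 0.
        return tf[term] * math.log(n / df[term])

    # Score each sentence by the weights of its distinct terms.
    scores = [sum(weight(t) for t in set(tokens)) for tokens in tokenized]

    # Pick the highest-scoring sentences, then restore document order.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)[:top_k]
    return [sentences[i] for i in sorted(ranked)]
```

Because the score is a plain sum, this sketch favors longer sentences; dividing by sentence length would be one way to offset that, at the cost of favoring very short ones.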
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
