Using Proximity and Semantic Similarity in Question Answering
DOI:
https://doi.org/10.26438/ijcse/v6si10.917
Keywords:
Question-Answering, Proximity, Semantic Similarity, Natural Language Processing and Synonyms
Abstract
This paper addresses Question Answering over news articles crawled from the 2017 archive of the ‘The Hindu’ newspaper website. We use a corpus of close to 10,000 articles/documents, crawled by category: Sports, Science and Tech., Business, and Entertainment. We have implemented a system that retrieves documents according to their relevance to the user's question, using tf-idf ranking. For the processing phase, we adapted methods originally developed for simpler tasks, such as document extraction and measuring the similarity between two short sentences. We extract coherent answers by first selecting the passages most likely to contain the answer and then processing these passages for the answer based on their similarity to the question. To implement these steps, we used various Natural Language Processing (NLP) techniques along with the WordNet knowledge base. We tested the system with different corpus sizes and different cosine similarity coefficients to explore this technique.
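The retrieval step described in the abstract (tf-idf ranking combined with cosine similarity) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank`), the smoothed idf formula, the `threshold` parameter, and the three toy documents are all assumptions made for illustration, and the sketch omits the passage extraction and WordNet-based similarity stages.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse tf-idf vector (dict) per document, plus the idf table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))  # count each term once per document
    # Smoothed idf (an assumed variant; the paper does not state its formula).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: tf[t] * idf[t] for t in tf})
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(question, docs, threshold=0.0):
    """Rank documents by cosine similarity to the question.

    `threshold` is a hypothetical cut-off: documents scoring at or below it
    are dropped. Returns (score, document_index) pairs, best first.
    """
    vectors, idf = tfidf_vectors(docs)
    q_tf = Counter(tokenize(question))
    # Question terms that never occur in the corpus carry no weight.
    q_vec = {t: c * idf[t] for t, c in q_tf.items() if t in idf}
    scored = [(cosine(q_vec, v), i) for i, v in enumerate(vectors)]
    return sorted([(s, i) for s, i in scored if s > threshold], reverse=True)


# Toy corpus (invented for illustration, not taken from the paper's data).
docs = [
    "India won the cricket match in Chennai",
    "The stock market fell sharply on Monday",
    "A new smartphone was launched by the company",
]
ranked = rank("Who won the cricket match?", docs)
best_score, best_index = ranked[0]
print(docs[best_index])
```

Raising `threshold` trades recall for precision: fewer documents are passed on to the later stages, but those that remain are closer to the question.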
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
