A Study of Metrics for Evaluation of Machine Translation

Authors

  • Sourabh K, Dept. of Computer Science, GGM Science College, Jammu, India
  • Aaqib SM, Dept. of Computer Science, Amar Singh Science College, Srinagar, India
  • Mansotra V, Dept. of Computer Science and IT, University of Jammu, Jammu, India

Keywords:

Machine Translation, Corpus, BLEU, NIST, METEOR, WER, TER, GTM

Abstract

Machine Translation has gained popularity over the years and has become one of the most promising areas of research in computer science. Due to consistent growth in the number of internet users across the world, information is now more versatile and dynamic, available in almost all popular spoken languages. From an Indian perspective, the importance of machine translation is obvious, because Hindi is widely used across India and around the world. Many initiatives have been taken to facilitate Indian users so that information may be accessed in Hindi by converting it from one language to another. In this paper we study the available automatic metrics that evaluate the quality of translation and examine their correlation with human judgments.
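As a rough illustration of how such automatic metrics are computed, the sketch below scores a candidate translation against a reference using sentence-level BLEU (via NLTK) and a hand-rolled word error rate (WER). The example sentences and the smoothing choice are our own assumptions for demonstration, not data or methods from the paper itself.

```python
# Illustrative sketch only: the paper surveys BLEU, NIST, METEOR, WER, TER and GTM;
# here we compute just two of them (BLEU via NLTK, WER by hand) on made-up sentences.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    r, h = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words.
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])  # substitution (or match)
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)  # deletion, insertion
    return d[len(r)][len(h)] / len(r)


reference = "the cat is sitting on the mat"   # assumed example reference
candidate = "the cat sat on the mat"          # assumed example MT output

# Smoothing avoids a zero score when a higher-order n-gram has no match.
smooth = SmoothingFunction().method1
bleu = sentence_bleu([reference.split()], candidate.split(),
                     smoothing_function=smooth)
print(f"BLEU: {bleu:.3f}")
print(f"WER:  {wer(reference, candidate):.3f}")
```

The other metrics studied follow the same pattern of comparing system output against one or more reference translations; the question the paper examines is how well each score correlates with human judgments of translation quality.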



Published

2025-11-13

How to Cite

[1] K. Sourabh, S. Aaqib, and V. Mansotra, "A Study of Metrics for Evaluation of Machine Translation", Int. J. Comp. Sci. Eng., vol. 6, no. 5, pp. 1–4, Nov. 2025.