VAARTALAP: Embedding Whisper-AI-like Model into a Video-Conferencing System to Aid Real-Time Translation and Transcription

Authors

DOI:

https://doi.org/10.26438/ijcse/v11i12.2125

Keywords:

Sound transcription, Sound translation, AI, Deep learning, Real-time, Language barrier

Abstract

In an era of digital communication, effective communication crosses regional boundaries and linguistic obstacles, making the world more connected. As corporations, institutions, and individuals engage on a worldwide scale, the demand for seamless multilingual communication has never been greater. This article explores an initiative that uses the real-time translation and transcription services offered by Whisper-AI to transform video conferencing. The goal of the research is to create an AI model that integrates translation and transcription into a real-time video-conferencing system. By utilizing state-of-the-art voice recognition and translation technologies, participants can converse in real time without language barriers.

References

[1] A. Radford, J.W. Kim, T. Xu, G. Brockman, C. McLeavey, and I. Sutskever, “Robust Speech Recognition via Large-Scale Weak Supervision”, arXiv preprint, arXiv:2212.04356, 2022. doi: 10.48550/arXiv.2212.04356

[2] E. Cho, C. Fügen, T. Herrmann, K. Kilgour, M. Mediani, C. Mohr, J. Niehues, K. Rottmann, C. Saam, S. Stüker, and A. Waibel, “A real-world system for simultaneous translation of German lectures”, In the Proceedings of INTERSPEECH 2013, Lyon, France, pp.3473-3477, 2013.

[3] N. Arivazhagan, C. Cherry, I. Te, W. Macherey, P. Baljekar, and G. Foster, “Re-Translation Strategies for Long Form, Simultaneous, Spoken Language Translation”, In the Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, pp.7919-7923, 2020. doi: 10.1109/ICASSP40776.2020.9054585

[4] D. Rothman and A. Gulli, “Transformers for Natural Language Processing”, Second Edition, Packt Publishing, UK, ch. 2, 2022. ISBN 9781803247335

[5] T. Chen, W. Wang, W. Wei, X. Shi, X. Li, J. Ye, and K. Knight, “DiDi’s Machine Translation System for WMT 2020”, In the Proceedings of the 2020 Workshop on Statistical Machine Translation (WMT), pp.105–112, 2020.

Published

2023-12-31

How to Cite

[1]
K. Kashyap, P. Singh, A. Verma, S. Mishra, and P. Goel, “VAARTALAP: Embedding Whisper-AI-like Model into a Video-Conferencing System to Aid Real-Time Translation and Transcription”, Int. J. Comp. Sci. Eng., vol. 11, no. 12, pp. 21–25, Dec. 2023.