VAARTALAP: Embedding Whisper-AI-like Model into a Video-Conferencing System to Aid Real-Time Translation and Transcription

Authors

Kunal Kashyap, Undergraduate student, CSE Department, ADGIPS, New Delhi, India, https://orcid.org/0009-0001-1475-6832
Prashant Singh, Undergraduate student, CSE Department, ADGIPS, New Delhi, India, https://orcid.org/0009-0008-1153-1276
Anjali Verma, Undergraduate student, CSE Department, ADGIPS, New Delhi, India, https://orcid.org/0009-0004-9174-8731
Satya Mishra, Undergraduate student, CSE Department, ADGIPS, New Delhi, India, https://orcid.org/0009-0007-6398-7921
Prachi Goel, CSE Department, ADGIPS, New Delhi, India, https://orcid.org/0009-0007-7576-9112

DOI:

https://doi.org/10.26438/ijcse/v11i12.2125

Keywords:

Sound transcription, Sound translation, AI, Deep learning, Real-time, Language barrier

Abstract

In an era of digital communication, effective communication crosses regional boundaries and linguistic obstacles, making the world more connected. The demand for seamless multilingual communication has never been greater, as corporations, institutions, and individuals engage on a worldwide scale. This article presents an initiative that uses the real-time translation and transcription capabilities offered by Whisper-AI to improve video conferencing. The goal of the research is to create an AI model that integrates translation and transcription into a real-time video-conferencing system. By using voice recognition and translation technologies, participants can converse in real time without language barriers.

Published

2023-12-31

How to Cite

K. Kashyap, P. Singh, A. Verma, S. Mishra, and P. Goel, “VAARTALAP: Embedding Whisper-AI-like Model into a Video-Conferencing System to Aid Real-Time Translation and Transcription”, Int. J. Comp. Sci. Eng., vol. 11, no. 12, pp. 21–25, Dec. 2023.

Issue

Vol. 11 No. 12 (2023): IJCSE December Edition

Section

Research Article

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.
