Transcripter: Generation of the Transcript from Audio to Text Using Deep Learning
DOI: https://doi.org/10.26438/ijcse/v7i1.770773

Keywords: Neural Network, Audio extraction, Speech recognition, Time synchronization, Automatic Transcript generation, Natural language processing, Connectionist Temporal Classification (CTC), Hidden Markov Model (HMM)

Abstract
Video is one of the most powerful media for propagating information, and the audio track carries most of the message; it is used across fields such as teaching, entertainment, conference meetings, and news broadcasts. Converting the audio into a documented text form makes the content easy to reference, since searching for a spoken word in a video is far harder than searching a transcript. The main objective of this system is to provide an automated way to generate a transcript from audio or video. Because it is not practical to produce the same informative video in every language, the system fills this gap: it extracts the audio from the given video and generates a transcript, which can then be translated into any desired language. This is especially useful for speakers of languages that are not used by the majority of the population, and it applies to every field where information is exchanged through video.
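As a rough illustration of the pipeline described above, the sketch below extracts the audio track from a video with ffmpeg and passes it to an off-the-shelf speech recognizer. The file names, the 16 kHz mono WAV settings, and the use of the SpeechRecognition package are assumptions made for illustration only; they are not the implementation described in this work, which would instead rely on a deep-learning acoustic model trained with CTC.

# Minimal sketch (assumed tooling): extract audio with ffmpeg, then transcribe it.
# Requires ffmpeg on the PATH and the SpeechRecognition package
# (pip install SpeechRecognition).
import subprocess
import speech_recognition as sr

def extract_audio(video_path: str, wav_path: str) -> None:
    # Strip the video stream (-vn) and write 16 kHz mono 16-bit PCM,
    # a common input format for speech recognizers.
    subprocess.run(
        ["ffmpeg", "-y", "-i", video_path, "-vn",
         "-acodec", "pcm_s16le", "-ar", "16000", "-ac", "1", wav_path],
        check=True,
    )

def transcribe(wav_path: str) -> str:
    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)  # read the whole file
    # recognize_google() calls a hosted recognizer; a self-contained system
    # would substitute its own CTC-trained acoustic model at this step.
    return recognizer.recognize_google(audio)

if __name__ == "__main__":
    extract_audio("lecture.mp4", "lecture.wav")  # hypothetical file names
    print(transcribe("lecture.wav"))

The resulting transcript text could then be passed to any translation service or model to produce the multilingual versions discussed above.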