Twee: A Novel Text-To-Speech Engine

Authors

  • Das D, Dept. of Computer Science & Engineering, University Institute of Technology, The University of Burdwan, Golapbag (North), Burdwan-713104, West Bengal, India
  • Hassan H, Dept. of Computer Science & Engineering, University Institute of Technology, The University of Burdwan, Golapbag (North), Burdwan-713104, West Bengal, India
  • Gupta S, Dept. of Computer Science & Engineering, University Institute of Technology, The University of Burdwan, Golapbag (North), Burdwan-713104, West Bengal, India

Keywords:

Artificial Intelligence, Natural Language Processing, Digital Signal Processing, Phoneme, Emotion

Abstract

With the advancement of technology and the widespread use of smart devices, the horizon of networking and connectivity has broadened to an unprecedented level. One prominent line of research in this digital era is the development of Text-to-Speech (TTS) engines, which offer greater interactivity with prevalent smart devices. Various TTS engines are currently available in the market, but they lack the expressive qualities of the human voice: they fail to provide credible indications of the speaker's sentiment, mood, or emotional state. Moreover, no complete TTS engine presently exists that can replicate human behaviour and mannerisms with high precision and accuracy. This paper proposes a novel Text-to-Speech engine named 'Twee' whose pronunciation works in sync with real-world human intelligence. The proposed system is an application of interdisciplinary research in which Natural Language Processing, Artificial Intelligence, and Digital Signal Processing are combined to perform sentiment analysis on text through the processing of phonemes. The system works in both mono-channel and stereo modes and is capable of generating varied voice effects depending on the type of communication.
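The pipeline the abstract describes, sentiment analysis on the input text driving the voice effects applied at synthesis time, can be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's implementation: the toy word lexicon, the function names, and the specific sentiment-to-prosody mappings (pitch shift, speaking rate) are all hypothetical stand-ins for the NLP/AI and DSP stages the paper combines.

```python
# Toy lexicon standing in for the paper's NLP/AI sentiment-analysis stage.
POSITIVE = {"good", "great", "happy", "love"}
NEGATIVE = {"bad", "sad", "angry", "hate"}

def sentiment_score(text: str) -> float:
    """Return a score in [-1, 1]: negative = sad, positive = happy."""
    words = text.lower().split()
    if not words:
        return 0.0
    score = sum((w in POSITIVE) - (w in NEGATIVE) for w in words)
    return max(-1.0, min(1.0, score / len(words)))

def prosody_for(text: str) -> dict:
    """Map sentiment to prosody parameters a synthesiser back end
    could apply per phoneme: happier text -> higher pitch, faster rate."""
    s = sentiment_score(text)
    return {
        "pitch_shift_semitones": round(2.0 * s, 2),  # up to +/-2 semitones
        "rate_factor": round(1.0 + 0.15 * s, 3),     # up to +/-15% rate
        "channels": 2,                               # stereo mode
    }

if __name__ == "__main__":
    print(prosody_for("what a great happy day"))
```

The point of the sketch is only the shape of the data flow: a scalar sentiment estimate is computed first, then translated into signal-level parameters, so the same text can be rendered with different emotional colouring.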

References

[1] A. Drahota, A. Costall, V. Reddy, “The Vocal Communication of Different Kinds of Smile”, Speech Communication, Vol. 50, Issue.4, pp.278-287, 2007. doi: 10.1016/j.specom.2007.10.001

[2] W.Y. Wang, K. Georgila, “Automatic Detection of Unnatural Word-Level Segments in Unit-Selection Speech Synthesis”, In the Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, Waikoloa, HI, USA, pp.289-294, 2011.

[3] R.E. Remez, P.E. Rubin, D.B. Pisoni, T.D. Carrell, “Speech Perception without Traditional Speech Cues”, Science, New Series, Vol.212, Issue.4497, pp. 947-950, 1981. doi:10.1126/science.7233191

[4] J. Zhang, “Language Generation and Speech Synthesis in Dialogues for Language Learning”, Massachusetts Institute of Technology, pp.1-68, 2004.

[5] S. Lemmetty, “Review of Speech Synthesis Technology”, Helsinki University of Technology, pp.1-113, 1999.

[6] I.G. Mattingly, “Speech Synthesis for Phonetic and Phonological Models”, Current Trends in Linguistics, Mouton, The Hague, Vol.12, pp.2451–2487, 1974.

[7] FFmpeg Git, “FFmpeg 4.0 ‘Wu’”, last accessed 2018-07-18.

[8] Takanishi Lab Webpage, “Anthropomorphic Talking Robot Waseda Talker Series”, Retrieved from http://www.takanishi.mech.waseda.ac.jp/top/research/voice/index.htm, last accessed 2018-10-10.

[9] DeepMind Webpage, “WaveNet: A Generative Model for Raw Audio”, Retrieved from https://deepmind.com/blog/wavenet-generative-model-raw-audio/, last accessed 2018-09-08.

Published

2025-11-24

How to Cite

[1]
D. Das, H. Hassan, and S. Gupta, “Twee: A Novel Text-To-Speech Engine”, Int. J. Comp. Sci. Eng., vol. 7, no. 1, pp. 67–70, Nov. 2025.