How To Create Custom Text-to-speech Engine
Solution 1:
I am also interested in building my own TTS engine. Here is some information I've found. On this link you can find a brief description of what you have to do to build a TTS engine for Android. Since API level 14 there is an abstract class for implementing a TTS engine (android.speech.tts.TextToSpeechService). More details are at the link.
But converting text to speech isn't easy. Some basic information on what a TTS engine should implement can be found on Wikipedia.
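To give a feel for the stages an engine has to cover, here is a minimal sketch in Python, not a real engine. It assumes the usual split into a front end (text normalization, then conversion to sound units) and a back end (waveform generation). Everything in it is hypothetical and for illustration only: the DIGIT_WORDS table, the function names, and especially the "voice", which just plays one sine tone per letter instead of real phonemes.

```python
import math
import re
import struct
import wave

SAMPLE_RATE = 16000  # Hz, 16-bit mono PCM (an assumed output format)

# Hypothetical lookup table used by the toy normalizer.
DIGIT_WORDS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}


def normalize(text):
    """Front end, step 1: lowercase, spell out digits, drop punctuation."""
    text = text.lower()
    text = re.sub(r"\d", lambda m: f" {DIGIT_WORDS[m.group()]} ", text)
    text = re.sub(r"[^a-z ]+", " ", text)
    return " ".join(text.split())


def to_units(normalized):
    """Front end, step 2: a real engine maps words to phonemes here.
    This toy version simply treats every character as one unit."""
    return list(normalized)


def synthesize(units, unit_ms=80):
    """Back end: one sine tone per letter, silence for a space.
    This is a stand-in for real synthesis, not a usable voice."""
    n = SAMPLE_RATE * unit_ms // 1000  # samples per unit
    samples = []
    for unit in units:
        if unit == " ":
            samples.extend([0] * n)
            continue
        freq = 200.0 + 20.0 * (ord(unit) - ord("a"))
        for i in range(n):
            angle = 2.0 * math.pi * freq * i / SAMPLE_RATE
            samples.append(int(12000 * math.sin(angle)))
    return samples


def write_wav(path, samples):
    """Store the samples as a 16-bit mono WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))
```

Calling write_wav("out.wav", synthesize(to_units(normalize("Room 42!")))) produces a short file of beeps. In a real engine each of these three functions is where the hard work goes, and on Android the resulting PCM buffers would be handed back from your engine service rather than written to a file.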
Solution 2:
As far as my research goes, the best current architecture for building a TTS engine is Tacotron 2 [Paper here], a neural network architecture that synthesizes speech directly from text (the text can come from any source, for example OCR output). It achieved a MOS (mean opinion score) of 4.53, comparable to the MOS of 4.58 for professionally recorded speech. The official implementation of Tacotron 2 is not public, but there is a TensorFlow implementation built on TensorFlow 1.15.0 here. There is also a PyTorch implementation by NVIDIA here, which is more actively maintained. Both implementations can be retrained on a dataset for a new language (for example, a language with no TTS implementation yet), which makes building a TTS engine for that language much easier. You can also use these architectures as a stepping stone to build your own architecture.
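In case the MOS figure is unfamiliar: a mean opinion score is the average of listener ratings on a 1 to 5 scale, so you can compute it for your own engine once you have collected ratings. Below is a minimal sketch. The function name and the sample ratings are made up for illustration, and the 95% interval uses a simple normal approximation, which is my assumption and not necessarily how the paper computed its intervals.

```python
import math
import statistics


def mean_opinion_score(ratings):
    """Return (MOS, half-width of an approximate 95% confidence interval)."""
    if not ratings:
        raise ValueError("need at least one rating")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie on the 1 to 5 scale")
    mos = statistics.mean(ratings)
    if len(ratings) < 2:
        return mos, 0.0
    # Normal approximation: 1.96 standard errors on either side of the mean.
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


# Hypothetical ratings from eight listeners for one synthesized utterance.
example_ratings = [5, 4, 5, 4, 5, 5, 4, 5]
mos, half_width = mean_opinion_score(example_ratings)
print(f"MOS = {mos:.3f} +/- {half_width:.3f}")
```

With only a handful of listeners the interval is wide, so a difference as small as 4.53 versus 4.58 only means something with a large number of ratings.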