Speech Synthesis of Emotions in a Sentence using Vowel Features

Authors
Rintaro Makino¹, Yasunari Yoshitomi²,*, Taro Asada², Masayoshi Tabuse²
¹ UX Solution Department UX Division, Customer Platform Promotion Division, SoftBank Corp., 1-9-1 Higashi-shimbashi, Minato-ku, Tokyo 105-7317, Japan
² Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, 1-5 Nakaragi-cho, Shimogamo, Sakyo-ku, Kyoto 606-8522, Japan
* Corresponding author. Email: [email protected]
Corresponding Author
Yasunari Yoshitomi
Received 14 November 2019, Accepted 28 April 2020, Available Online 2 June 2020.
DOI
https://doi.org/10.2991/jrnal.k.200528.007
Keywords
Emotional speech; feature parameter; emotional synthetic speech; vowel; sentence
Abstract
We previously proposed a method for adding emotions to synthetic speech using the vowel features of a speaker. In that initial investigation, we demonstrated the method on utterances of Japanese names. In the present study, we apply the method to a sentence, constructing emotional synthetic speech from the emotional speech of a single male subject. The emotions in the resulting synthetic speech are discriminable with good accuracy.
Copyright
© 2020 The Authors. Published by ALife Robotics Corp. Ltd.
Open Access
This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).
