A Rule Based Prosody Model for Turkish Text-To-Speech Synthesis

Uslu, Ibrahim Baran; Ilk, Hakki Gokhan; Yilmaz, Asim Egemen

Date

2013

Authors

Uslu, Ibrahim Baran

Ilk, Hakki Gokhan

Yilmaz, Asim Egemen

Publisher

Univ Osijek, Tech Fac

Organizational Units

Organizational Unit

Department of Electrical & Electronics Engineering

The Department of Electrical and Electronics Engineering (EE) offers a solid graduate education and research program. The Department is known for its student-centered and practice-oriented education. We are devoted to providing an exceptional educational experience to our students and to preparing them for the highest personal and professional accomplishments. The advanced teaching and research laboratories are designed to educate the future workforce and to meet the challenges of current technologies. The faculty's research areas include high voltage, electrical machinery, power systems, signal and image processing, and photonics. Our students have opportunities to participate in the department's research projects as well as in various activities sponsored by TUBİTAK and other professional societies. The European Remote Radio Laboratory project, which provides internet access to our laboratories, was completed under the leadership of our department with contributions from several European institutions.

Abstract

This paper presents a novel prosody model for a Turkish text-to-speech (TTS) synthesis system. After developing a TTS system driven by parametric features consisting of duration, pitch and energy modifications, we derive prosody rules to increase the naturalness of the synthesizer. Since inflected verbs in Turkish can stand alone as sentences by virtue of the suffixes they take, we build a perceptual prosody model by defining rules on the stress patterns of verb inflections. Affirmative, negative and interrogative (both positive and negative) forms of many verbs were examined systematically, and some phrases were examined in the same way to obtain proper prosody. According to the results of listening tests, the defined rules, based on duration, pitch and energy modification weights, yield perceptually better synthesized speech: an average improvement of about 1.78 out of 5.0 in the CMOS (Comparative Mean Opinion Score) test. This improvement supports the effectiveness of the proposed prosody model.
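
As a rough illustration of what a rule expressed as duration, pitch and energy modification weights could look like in practice, the following minimal Python sketch scales the parameters of one stressed syllable and leaves the others unchanged. It is not the authors' implementation: the `Syllable` and `Weights` types, the `apply_stress` function, the choice of stressed syllable and every numeric value are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Syllable:
    # Hypothetical per-syllable synthesis parameters.
    text: str
    duration_ms: float
    pitch_hz: float
    energy: float


@dataclass(frozen=True)
class Weights:
    # Multiplicative modification weights; 1.0 means "leave unchanged".
    duration: float = 1.0
    pitch: float = 1.0
    energy: float = 1.0


def apply_stress(syllables, stressed_index, weights):
    """Scale the stressed syllable's parameters; return a new list."""
    if not 0 <= stressed_index < len(syllables):
        raise IndexError("stressed_index out of range")
    return [
        replace(
            s,
            duration_ms=s.duration_ms * weights.duration,
            pitch_hz=s.pitch_hz * weights.pitch,
            energy=s.energy * weights.energy,
        )
        if i == stressed_index
        else s
        for i, s in enumerate(syllables)
    ]


# Illustrative values only (assumptions, not measurements from the paper).
word = [
    Syllable("gel", 180.0, 120.0, 1.0),
    Syllable("di", 150.0, 120.0, 1.0),
]
stressed = apply_stress(word, 1, Weights(duration=1.2, pitch=1.1, energy=1.5))
```

In a rule-based model of this kind, a separate lookup would choose `stressed_index` and the weights from the inflection pattern (affirmative, negative, interrogative); that lookup is omitted here because the abstract does not state the rules themselves.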

Description

ILK, Hakki Gokhan/0000-0003-4365-8286

ORCID

ILK, Hakki Gokhan

Keywords

CMOS test, diphone, natural speech, prosody, PSOLA, text-to-speech synthesis (TTS), verb inflection

WoS Q

Q4

Scopus Q

Q3

Source

Tehnicki Vjesnik

Volume

20

Issue

2

Start Page

217

End Page

223

URI

https://hdl.handle.net/20.500.14411/8607

Collections

Scopus
WoS
