Modelling of Lithuanian Speech Diphthongs

Pyž, Gražina; Šimonytė, Virginija; Slivinskas, Vytautas

Article type: Research Article

Authors: Pyž, Gražina | Šimonytė, Virginija | Slivinskas, Vytautas

Affiliations: Vilnius University Institute of Mathematics and Informatics, Akademijos 4, LT-08663 Vilnius, Lithuania, e-mail: grazinute123@gmail.com | Vilnius Pedagogical University, Faculty of Mathematics and Informatics, Studentų 39, LT-08106 Vilnius, Lithuania, e-mail: virginija.simonyte@vpu.lt, vytautas.slivinskas@vpu.lt

Abstract: The goal of this paper is to develop a method for modelling Lithuanian speech diphthongs. We use a formant-based synthesizer for this modelling. A second-order quasipolynomial is chosen as the formant model in the time domain. The general diphthong model is a multi-input, single-output (MISO) system consisting of two parts: the first corresponds to the first vowel of the diphthong and the second to the other vowel. The system is excited by semi-periodic impulses with a smooth transition from one vowel to the other. We derive the parametric input-output equations for quasipolynomial formants, define a new notion of the convoluted basic signal matrix, and derive parametric minimization functional formulas for the convoluted output data. A new formant parameter estimation algorithm for convoluted data, based on the Levenberg–Marquardt approach, is derived and presented in stepwise form. The Lithuanian diphthong /ai/ was selected as an example. It was recorded in PCM format at 48 kHz, 16 bit, stereo. Two characteristic pitches of the vowels /a/ and /i/ were chosen, and equidistant samples of these pitches were used to estimate the parameters of the MISO formant models of the vowels. The transition from the vowel /a/ to the vowel /i/ was achieved by changing the excitation impulse amplitudes according to the arctangent law. The method was tested by listening, and the Fourier transforms of the real data and of the MISO model output were compared. In the audio test it was impossible to distinguish between the real and simulated diphthongs, and the magnitude and phase responses showed only small differences.

Keywords: Lithuanian diphthongs, modelling, MISO system, Levenberg–Marquardt approach, formant, quasipolynomial model, parameter estimation, speech synthesis

Journal: Informatica, vol. 22, no. 3, pp. 411-434, 2011

Received January 2011

Accepted May 2011

Published: 2011
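
The vowel-to-vowel transition described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a formant of the form (c0 + c1·t + c2·t²)·exp(−damping·t)·sin(2π·f·t + phase) as one reading of "second-order quasipolynomial", drives all formants of a vowel with a single strictly periodic impulse train (the paper uses semi-periodic impulses and a multi-input system), and uses hypothetical names (`quasipolynomial`, `arctan_weights`, `synthesize`, `steepness`) and placeholder formant values. Only the 48 kHz sampling rate and the arctangent law come from the abstract.

```python
import numpy as np

def quasipolynomial(t, coeffs, damping, freq, phase):
    """Assumed formant shape: (c0 + c1*t + c2*t^2) * exp(-damping*t) * sin(2*pi*freq*t + phase)."""
    poly = np.polyval(list(coeffs)[::-1], t)  # polyval expects the highest power first
    return poly * np.exp(-damping * t) * np.sin(2 * np.pi * freq * t + phase)

def arctan_weights(n_periods, steepness=1.0):
    """Impulse amplitudes for both vowels; the second rises by an arctangent law."""
    k = np.arange(n_periods) - (n_periods - 1) / 2  # centre the transition
    w_second = 0.5 + np.arctan(steepness * k) / np.pi
    return 1.0 - w_second, w_second

def synthesize(fs, pitch_hz, n_periods, formants_a, formants_b, steepness=1.0):
    """Sum the two vowel responses, each excited by its weighted impulse train."""
    period = int(round(fs / pitch_hz))
    t = np.arange(2 * period) / fs  # response length of two periods is an assumption
    h_a = sum(quasipolynomial(t, *f) for f in formants_a)
    h_b = sum(quasipolynomial(t, *f) for f in formants_b)
    w_a, w_b = arctan_weights(n_periods, steepness)
    out = np.zeros(period * n_periods + len(t))
    for k in range(n_periods):
        start = k * period
        out[start:start + len(t)] += w_a[k] * h_a + w_b[k] * h_b
    return out

# Placeholder single-formant vowels: (coeffs, damping, frequency in Hz, phase)
vowel_a = [([1.0, 0.0, 0.0], 300.0, 700.0, 0.0)]
vowel_i = [([1.0, 0.0, 0.0], 300.0, 300.0, 0.0)]
signal = synthesize(48000, 120.0, 10, vowel_a, vowel_i)
```

In this sketch the two weights always sum to one, so the overall excitation level stays constant while the balance shifts from the first vowel to the second. The formant parameters would in practice come from an estimation step such as the Levenberg–Marquardt procedure the abstract mentions, which is not reproduced here.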
