Estimation of vocal tract as two tubes model or three tubes model


This is a trial estimation of vocal tract as simple tube model, two tubes model or three tubes model, using several peak and drop-peak frequencies of vowel voice, by precomputed grid search and downhill simplex method.

Two tubes model

Two tubes model is a very simplified model of vocal tract.
As following figure, two tubes model is consisted of two tubes: one is length L1 cross-section area A1 and another is length L2 cross-section area A2. This simplified model can't reproduce human voice completely, however, it's useful to understand the feature roughly.


Reflection coefficient is defined by two cross-section area A1 and A2. Estimation of two tubes model is identified by 3 parameters, L1, L2, and r1, matching with the observed signal.
Three tubes model is a model in which one tube is added to two tubes model. Three tubes model is identified by 5 parameters, L1, L2, L3, r1, and r2.

Estimation method


Following figure is an example of frequency response of vowel voice based on LPC analysis.
Red dot is peak (this is sometime called formant frequency) and blue dot is local bottom (here, it's called as drop-peak).
Estimation is to find suitable parameters whose peak and drop-peak frequencies value are as close as possible, comparing frequency response of real voice and two tubes model or three tubes model.



In general, it's difficult to uniquely determine the shape of vocal tract from voice.
So, here, following constraint conditions are placed on.

Estimation is performed in two steps: a grid search that computes representative grid points using precomputed data in advance, and then detailed search that uses the candidate of grid search as an initial value.

Estimation result

Following figures are estimation result for vowel voice when uttered alone.
Upper is comparison of frequency response by tubes model (blue color) with frequency response by real voice (green color). Lower shows length and cross-section area of the tube model.

Example of estimation two tubes model for vowel voice /a/

Example of estimation two tubes model for vowel voice /i/

Example of estimation two tubes model for vowel voice /u/

Example of estimation two tubes model for vowel voice /e/

Example of estimation three tubes model for vowel voice /o/




For reference, there is Python program to compute above by Python. Please see README.txt in the zip file about usage.


No.1b, 26 March 2019

Home page