Department of Electrical & Computer Engineering Signal and Image Laboratory (SaIL) The University of Arizona®

Past Research

Variable-Length Vocal Tract Modeling for Speech Synthesis

Student: Siddharth Mathur

Modeling of the human vocal tract is an essential element in many speech synthesis systems. The Kelly-Lochbaum model approximates the vocal tract with fixed-length tubes of different cross-sectional areas. Because the length of each tube is closely tied to the sampling frequency, the total length of the tract cannot be changed dynamically without changing the sampling frequency. A fractional-delay filter performs bandlimited interpolation between samples. In conjunction with the digital waveguide model of the vocal tract, such filters can be used to effectively lengthen individual tubes while keeping the sampling frequency constant. In this project, various extensions to the Kelly-Lochbaum model were investigated, with the goal of obtaining more realistic speech synthesis.
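The ideas above can be illustrated with a minimal Python sketch. It is not the method from the publication listed below; it only shows the general building blocks under stated assumptions. The function names are hypothetical, the sign convention for the reflection coefficients is one common choice, the segment length assumes a half-sample one-way delay per tube and a sound speed of 350 m/s, and Lagrange interpolation stands in as one common fractional-delay design.

```python
def section_length_cm(fs_hz, c_cm_per_s=35000.0):
    """Length of one tube section, assuming a half-sample one-way delay
    per section, so that length = c / (2 * fs). The default sound speed
    (350 m/s) is an assumed value."""
    return c_cm_per_s / (2.0 * fs_hz)


def reflection_coefficients(areas):
    """Reflection coefficient at each junction between adjacent tubes.
    Assumed convention: r_k = (A_k - A_{k+1}) / (A_k + A_{k+1})."""
    return [(a - b) / (a + b) for a, b in zip(areas, areas[1:])]


def lagrange_fractional_delay(delay, order=3):
    """FIR coefficients approximating a delay of `delay` samples,
    using Lagrange interpolation:
    h[n] = product over k != n of (delay - k) / (n - k)."""
    coeffs = []
    for n in range(order + 1):
        h = 1.0
        for k in range(order + 1):
            if k != n:
                h *= (delay - k) / (n - k)
        coeffs.append(h)
    return coeffs


if __name__ == "__main__":
    # With a fixed sampling rate, the section length is fixed as well.
    print(section_length_cm(44100.0))
    # Junction reflections for a hypothetical three-tube area function (cm^2).
    print(reflection_coefficients([1.0, 1.0, 3.0]))
    # A filter that delays by 1.3 samples, i.e. 0.3 samples beyond one sample.
    print(lagrange_fractional_delay(1.3, order=3))
```

Because the section length depends only on the sound speed and the sampling rate, the only way to stretch a tube at a constant sampling rate in this sketch is to add a non-integer delay, which is the role the fractional-delay filter plays.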

Publications:

  1. Siddharth Mathur, Brad H. Story, and Jeffrey J. Rodriguez, "Vocal-Tract Modeling: Fractional Elongation of Segment Lengths in a Waveguide Model with Half-Sample Delays," IEEE Trans. on Speech and Audio Processing, vol. 14, no. 5, Sept. 2006, pp. 1754-1762.
