Enzinger, E. (2010): Characterizing Formant Tracks in Viennese Diphthongs for Forensic Speaker Comparison, in: Proceedings of the AES 39th International Conference - Audio Forensics. Hillerød, Denmark, 47-52. (inproceedings)


    This study evaluates methods that capture time-dynamic properties of diphthongs produced by speakers of Viennese German for application in a forensic setting. Polynomials, the discrete cosine transform, and B-splines, along with experimental features based on bent-cable regression models, were used to characterise the first three formant tracks of two /aE/ diphthong segments. The resulting coefficients were in turn used as parameters in a speaker discrimination procedure based on likelihood ratios, which were calculated using a multivariate kernel density (MVKD) formula. The performance achieved under cross-validation is compared in terms of equal error rate (EER) and the log-likelihood-ratio cost (Cllr), as well as DET and Tippett plots.
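
    As an illustration of the parametrisation step only, the sketch below reduces one formant track to a few discrete cosine transform (DCT-II) coefficients. It is a minimal example under stated assumptions, not the procedure used in the paper: the function name `dct_coefficients`, the 1/N scaling, the number of coefficients, and the track values are all hypothetical.

```python
import math


def dct_coefficients(track, n_coeffs):
    """Return the first n_coeffs DCT-II coefficients of a formant track (Hz).

    Assumption: coefficients are scaled by 1/N, so coefficient 0 equals the
    mean of the track. Other scaling conventions are equally common.
    """
    n = len(track)
    if n == 0:
        raise ValueError("track must contain at least one sample")
    coeffs = []
    for k in range(n_coeffs):
        # Project the track onto the k-th cosine basis function.
        total = sum(
            x * math.cos(math.pi / n * (i + 0.5) * k)
            for i, x in enumerate(track)
        )
        coeffs.append(total / n)
    return coeffs


# Hypothetical F2 track (Hz) sampled at equal time steps; values are invented.
f2_track = [1300.0, 1350.0, 1450.0, 1600.0, 1750.0, 1850.0, 1900.0]

# Four coefficients per track is an arbitrary choice for this sketch.
features = dct_coefficients(f2_track, 4)
flat = dct_coefficients([500.0] * 10, 3)
```

    With this scaling, the zeroth coefficient is the mean formant frequency and the first reflects the overall slope of the track (negative for a rising track), so a short coefficient vector summarises the trajectory shape that a likelihood-ratio system could then take as input.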