Virtual Acoustics: Localization Model & Numeric Simulations (LocaPhoto)
Localization of sound sources is an important task of the human auditory system, and much research effort has been put into the development of audio devices for virtual acoustics, i.e., the reproduction of spatial sounds via headphones. Even though the process of sound localization is not yet completely understood, it is possible to simulate spatial sounds via headphones by using head-related transfer functions (HRTFs). HRTFs describe the filtering of the incoming sound by the head, the torso, and particularly the pinna, and thus they depend strongly on the details of the listener's geometry. In general, realistic spatial-sound reproduction via headphones requires the listener's individual HRTFs to be measured. The currently available acoustic measurement is a technically complex process that involves placing microphones in the listener's ears and takes tens of minutes.
As a first step towards an easily accessible method for virtual acoustics, we visually scanned listeners' heads, obtained a geometrical representation of the heads (3D mesh), and calculated HRTFs using numerical algorithms (part of the FWF project P18401B15). While our results showed that the numerical method is generally capable of calculating HRTFs, several questions remained open, especially regarding the interpretation of the results from the perceptual perspective. The perceptual evaluation of the effect of systematic parameter variation turned out to be a challenging task because of the large number of test conditions and the lengthy psychoacoustic tests. It became clear that a sound-localization model that predicts localization performance is required to reduce the number of conditions for further behavioral evaluation in experiments. In LocaPhoto, we propose to develop such a sound-localization model and to continue our work on numerical HRTF calculations.
Thus, the first goal of LocaPhoto is to create a functional localization model that is able to predict localization performance for sound sources positioned in the 3D free field. The model will include recent neurophysiological findings and is intended as a tool for further research on modeling the mechanisms behind spatial hearing, also by the scientific community outside of LocaPhoto.
In LocaPhoto, the model will be used to evaluate HRTFs from the perceptual perspective. This will allow us to achieve the second goal of LocaPhoto: the development of a method to numerically calculate perceptually valid HRTFs based on the visually acquired geometry of a listener. Based on the results from our previous project, we plan to simplify the geometry acquisition and to increase the mesh quality by generating 3D meshes from 2D photos with photogrammetric reconstruction algorithms. Given the large number of parameters in the numerical calculations, we expect hundreds of calculated HRTF sets. The localization model will be used to select the promising HRTF candidates, all of which will be subjectively tested in localization experiments.
Predictions based on the localization model from Langendijk and Bronkhorst (2002). Left panel: incoming sound with frequencies up to 16 kHz. Middle panel: incoming sound with frequencies up to 8 kHz. Right panel: same incoming sound as in the middle panel, with the prediction improved by modifying the model to consider only the bandwidth of the incoming signal. Bright colors denote regions where the model predicts sound localization; dots denote the actual sound-localization performance of the subject measured in a psychoacoustic experiment.
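The idea behind such predictions can be illustrated with a strongly simplified template-comparison sketch. It is not the published model of Langendijk and Bronkhorst (2002): the function name, the Gaussian mapping from spectral distance to similarity, the value of sigma, and the optional band mask are all assumptions made here for illustration only.

```python
import numpy as np

def predict_response_probabilities(target_db, templates_db, sigma=2.0, band_mask=None):
    """Compare a target spectrum with stored template spectra (hypothetical sketch).

    target_db    : (n_freq,) log-magnitude spectrum of the incoming sound, in dB
    templates_db : (n_angles, n_freq) log-magnitude template spectra, one per angle
    sigma        : assumed width (dB) of the distance-to-similarity mapping
    band_mask    : optional boolean (n_freq,) mask restricting the comparison
                   to the bandwidth of the incoming signal
    Returns an (n_angles,) array of response probabilities that sums to 1.
    """
    target_db = np.asarray(target_db, dtype=float)
    templates_db = np.asarray(templates_db, dtype=float)
    if band_mask is not None:
        target_db = target_db[band_mask]
        templates_db = templates_db[:, band_mask]

    # Inter-spectral difference between each template and the target.
    diff = templates_db - target_db
    # A small spread of the difference across frequency means similar spectral shape.
    distance = diff.std(axis=1)
    # Assumed Gaussian mapping from distance to similarity.
    similarity = np.exp(-0.5 * (distance / sigma) ** 2)
    return similarity / similarity.sum()
```

Passing a band mask that covers only frequencies up to 8 kHz would correspond, in spirit, to the modification described for the right panel, where the comparison is restricted to the bandwidth of the incoming signal.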

Mesh from Kreuzer et al. (2009) 

LEFT: The amplitude spectra of the measured (top) and calculated (bottom) HRTFs as a function of elevation angle. TOP: Results of a preliminary sound-localization experiment with the corresponding listener. Measured: acoustically measured directional transfer functions (DTFs). Smoothed: measured and spectrally smoothed DTFs (only the first 32 cepstral coefficients used; compare Kulkarni and Colburn (1998)). Strongly smoothed: only the first 8 cepstral coefficients used. Calculated: calculated DTFs (from the left panel). KEMAR: DTFs of a dummy head (Gardner and Martin, 1995).
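Spectral smoothing by truncating the cepstrum can be sketched as follows. This is a minimal illustration, assuming a one-sided log-magnitude spectrum sampled on a uniform FFT grid; the function name and the exact truncation convention are assumptions and may differ from the processing used in the experiment.

```python
import numpy as np

def cepstral_smooth(log_mag, n_coeffs):
    """Smooth a one-sided log-magnitude spectrum by cepstral truncation.

    log_mag  : (n_fft // 2 + 1,) log-magnitude spectrum (e.g. in dB)
    n_coeffs : number of low-quefrency cepstral coefficients to keep
               (e.g. 32 for "smoothed", 8 for "strongly smoothed")
    Returns the smoothed log-magnitude spectrum with the same length.
    """
    log_mag = np.asarray(log_mag, dtype=float)
    n_fft = 2 * (len(log_mag) - 1)
    # The real cepstrum is the inverse FFT of the log-magnitude spectrum.
    cepstrum = np.fft.irfft(log_mag, n=n_fft)
    # Keep the first n_coeffs coefficients and their mirrored counterparts.
    lifter = np.zeros(n_fft)
    lifter[:n_coeffs] = 1.0
    if n_coeffs > 1:
        lifter[-(n_coeffs - 1):] = 1.0
    # Transform back to the frequency domain; the result is real-valued.
    return np.fft.rfft(cepstrum * lifter).real
```

Keeping fewer coefficients removes fine spectral detail while preserving the broad spectral shape, which is why the 8-coefficient condition is labeled as strongly smoothed.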

LEFT & TOP: Photogrammetric reconstruction of a person (preliminary). 