Normal-hearing (NH) listeners use two binaural cues, the interaural time difference (ITD) and the interaural level difference (ILD), for sound localization in the horizontal plane. They apply frequency-dependent weights when combining them to determine the perceived azimuth of a sound source. Cochlear implant (CI) listeners, however, rely almost entirely on ILDs. This is partly due to the properties of current envelope-based CI-systems, which do not explicitly encode carrier ITDs. However, even if they are artificially conveyed via a research system, CI listeners perform worse on average than NH listeners. Since current CI-systems do not reliably convey ITD information, CI listeners might learn to ignore ITDs and focus on ILDs instead. A recent study in our lab provided first evidence that such reweighting of binaural cues is possible in NH listeners.
This project aims to further investigate the phenomenon. First, we will test whether a changed ITD/ILD weighting generalizes to different frequency regions. Second, we will investigate the effect of ITD/ILD reweighting on spatial release from speech-on-speech masking, as listeners benefit particularly from ITDs in such tasks. Third, we will test whether CI listeners can also be trained to weight ITDs more strongly and whether that translates to an increase in ITD sensitivity. Additionally, we will explore and evaluate different training methods to induce ITD/ILD reweighting.
The results are expected to shed further light on the plasticity of the binaural auditory system in acoustic and electric hearing.
Start: October 2018
Duration: 3 years
Funding: uni:docs fellowship program for doctoral candidates of the University of Vienna
Current cochlear implants (CIs) are very successful in restoring speech understanding in individuals with profound or complete hearing loss by electrically stimulating the auditory nerve. However, the ability of CI users to localize sound sources and to understand speech in complex listening situations, e.g. with interfering speakers, is dramatically reduced compared to normal-hearing (acoustically hearing) listeners. From acoustic hearing studies it is known that interaural time difference (ITD) cues are essential for sound localization and speech understanding in noise. Users of current bilateral CI systems are, however, rather limited in their ability to perceive salient ITD cues. One particular problem is that their ITD sensitivity is especially low when stimulating at the relatively high pulse rates that are required for proper encoding of speech signals.
In this project we combine psychophysical studies in human bilaterally implanted listeners and physiological studies in bilaterally implanted animals to find ways to improve ITD sensitivity in electric hearing. We build on previous findings that ITD sensitivity can be enhanced by introducing temporal jitter (Laback and Majdak, 2008) or short inter-pulse intervals (Hancock et al., 2012) in high-rate pulse sequences. Physiological experiments, performed at the Eaton-Peabody Laboratories Neural Coding Group (Massachusetts Eye and Ear Infirmary, Harvard Medical School, PI: Bertrand Delgutte), are combined with matched psychoacoustic experiments, performed at the EAP group of ARI (PI: Bernhard Laback). The main project milestones are the following:
· Aim 1: Effects of auditory deprivation and electric stimulation through CI on neural ITD sensitivity. Physiological experiments examine whether chronic CI stimulation can reverse the effect of neonatal deafness on neural ITD sensitivity.
· Aim 2: Improving the delivery of ITD information with high-rate strategies for CI processors.
◦ A. Improving ITD sensitivity at high pulse rates by introducing short inter-pulse intervals
◦ B. Using short inter-pulse intervals to enhance ITD sensitivity with “pseudo-syllable” stimuli.
· External: Eaton-Peabody Laboratories Neural Coding Group of the Massachusetts Eye and Ear Infirmary at Harvard Medical School (PI: Bertrand Delgutte)
· Internal: Mathematics and Signal Processing in Acoustics
· It is planned to run from 2014 to 2019.
· Article in DER STANDARD: http://derstandard.at/2000006635467/OeAW-und-Harvard-Medical-School-forschenCochleaimplantaten
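The temporal-jitter idea named in the project description above can be illustrated with a small sketch. It generates pulse times for a high-rate pulse train whose inter-pulse intervals are randomly varied, applies the same jittered sequence to both ears, and delays one ear by the ITD. The function name, the jitter parameter `k`, and the uniform distribution of the intervals are assumptions made for illustration; they are not taken from the cited studies.

```python
import random


def jittered_pulse_times(rate_pps, duration_s, k, itd_s, seed=0):
    """Return (left, right) pulse onset times in seconds.

    Hypothetical parameterization: each inter-pulse interval is drawn
    uniformly from [(1 - k) * T, (1 + k) * T], where T = 1 / rate_pps
    and 0 <= k < 1. k = 0 gives a periodic pulse train.
    """
    rng = random.Random(seed)
    nominal = 1.0 / rate_pps
    left = []
    t = 0.0
    while t < duration_s:
        left.append(t)
        t += rng.uniform((1 - k) * nominal, (1 + k) * nominal)
    # The jitter is identical at both ears; only the ITD differs.
    right = [t + itd_s for t in left]
    return left, right


# Example: 1000 pulses per second, 300 ms, k = 0.5, ITD of 400 microseconds
left, right = jittered_pulse_times(1000, 0.3, 0.5, 400e-6)
```

Because both ears receive the same jittered sequence, the interaural delay stays constant from pulse to pulse while the monaural timing becomes irregular.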
Binaural hearing is extremely important in everyday life, most notably for sound localization and for understanding speech embedded in competing sound sources (e.g., other speech sources). While bilateral implantation has been shown to provide cochlear implant (CI) listeners with some basic left/right localization ability, the performance with current CI systems is clearly reduced compared to normal hearing. Moreover, the binaural advantage in speech understanding in noise has been shown to be mediated mainly by the better-ear effect, while there is only very little binaural unmasking.
There is now a body of literature on the binaural sensitivity of CI listeners stimulated at a single interaural electrode pair. However, CI listeners' sensitivity to binaural cues under more realistic conditions, i.e., with stimulation at multiple electrodes, has not yet been systematically addressed in depth.
This project attempts to fill this gap. In particular, given the high perceptual importance of ITDs, this project focuses on the systematic investigation of the sensitivity to ITD under various conditions of multi-electrode stimulation, including interference from neighboring channels, integration of ITD information across channels, and the perceptually tolerable room for degradations of binaural timing information.
Start: January 2013
Duration: 3 years
Localization of sound sources is an important task of the human auditory system, and much research effort has been put into the development of audio devices for virtual acoustics, i.e. the reproduction of spatial sounds via headphones. Even though the process of sound localization is not yet completely understood, it is possible to simulate spatial sounds via headphones by using head-related transfer functions (HRTFs). HRTFs describe the filtering of the incoming sound by the head, the torso, and particularly the pinna, and thus they strongly depend on the details of the listener's geometry. In general, for realistic spatial-sound reproduction via headphones, the individual HRTFs must be measured. As of 2012, the available HRTF acquisition methods were acoustic measurements: a technically complex process that involves placing microphones into the listener's ears and lasts for tens of minutes.
In LocaPhoto, we were working on an easily accessible method to acquire and evaluate listener-specific HRTFs. The idea was to numerically calculate HRTFs based on a geometrical representation of the listener (3-D mesh) obtained from 2-D photos by means of photogrammetric reconstruction.
As a result, we have developed a software package for numerical HRTF calculations, a method for geometry acquisition, and models able to evaluate HRTFs in terms of broadband ITDs and sagittal-plane sound localization performance.
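As an illustration of what evaluating an HRTF pair "in terms of broadband ITDs" can mean, the following minimal sketch estimates the ITD as the lag that maximizes the cross-correlation between the left-ear and right-ear impulse responses. Cross-correlation is one common estimator and is used here as an assumption; it is not necessarily the model developed in LocaPhoto. The function name, the toy impulse responses, and the sign convention (positive when the right ear lags) are hypothetical.

```python
def broadband_itd(hrir_left, hrir_right, fs, max_lag):
    """Estimate the ITD in seconds from a pair of impulse responses.

    The estimate is the lag (within +/- max_lag samples) that maximizes
    the cross-correlation. Assumed sign convention: positive values mean
    that the right-ear response lags the left-ear response.
    """
    def xcorr(lag):
        # sum over n of left[n] * right[n + lag]
        return sum(
            x * hrir_right[n + lag]
            for n, x in enumerate(hrir_left)
            if 0 <= n + lag < len(hrir_right)
        )

    best_lag = max(range(-max_lag, max_lag + 1), key=xcorr)
    return best_lag / fs


# Toy impulse responses: the right ear receives the same pulse
# 24 samples later and attenuated.
fs = 48000
pulse = [1.0, 0.5, -0.3]
hrir_left = [0.0] * 10 + pulse + [0.0] * 51
hrir_right = [0.0] * 34 + [0.7 * v for v in pulse] + [0.0] * 27

itd = broadband_itd(hrir_left, hrir_right, fs, max_lag=48)  # 24 / 48000 s
```

The resolution of this estimate is limited to one sample period; finer estimates would require interpolation or up-sampling, which the sketch leaves out.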
The spatially oriented format for acoustics (SOFA) is dedicated to storing all kinds of acoustic information related to a specified geometrical setup. The main task is to describe simple HRTF measurements, but SOFA also aims to provide the functionality to store more elaborate measurements, such as BRIRs recorded with a 64-channel mic-array in a multi-source excitation situation, or directivity measurements of a loudspeaker. The format is intended to be easily extendable, highly portable, and the greatest common denominator of all publicly available HRTF databases at the moment of writing.
SOFA defines the structure of data and metadata and stores them in a numerical container. The data description will be hierarchical, ranging from free-field HRTFs (simple setup) to more complex setups such as mic-array measurements in reverberant spaces excited by a loudspeaker array (complex setup). We will use a global geometry description (related to the room) and a local geometry description (related to the listener/source) without limiting the number of acoustic transmitters and receivers. Room descriptions will be available by linking a CAD file within SOFA. Networking support will be provided as well, allowing remote access to HRTFs and BRIRs from client computers.
SOFA is being developed by many contributors worldwide. The development is coordinated at ARI by Piotr Majdak.
While it is often assumed that our auditory system is phase-deaf, there is a body of literature showing that listeners are very sensitive to phase differences between spectral components of a sound. In particular, for spectral components falling into the same perceptual filter, the so-called auditory filter, a change in relative phase across components causes a change in the temporal pattern at the output of the filter. The phase response of the auditory filter is thus important for any auditory task that relies on within-channel temporal envelope information, most notably temporal pitch or interaural time differences.
Within-channel phase sensitivity has been used to derive a psychophysical measure of the phase response of auditory filters (Kohlrausch and Sanders, 1995). The basic idea of the widely used masking paradigm is that a harmonic complex whose phase curvature roughly mirrors the phase response of the auditory filter spectrally centered on the complex causes a maximally modulated (peaked) internal representation and, thus, elicits minimal masking of a pure-tone target at the same center frequency. Therefore, systematic variation of the phase curvature of the harmonic complex (the masker) allows the auditory filter’s phase response to be estimated: the masker phase curvature causing minimal masking reflects the mirrored phase response of the auditory filter.
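To make "systematic variation of the phase curvature" concrete, the sketch below generates an equal-amplitude harmonic complex whose component phases follow the scaled Schroeder-phase rule, theta_n = C * pi * n * (n + 1) / N, which is often used for such maskers. The text above does not state the phase rule, so this rule, the function names, and all parameter values are assumptions for illustration. The crest factor (peak divided by RMS) serves as a simple measure of how peaked the waveform is before any auditory filtering.

```python
import math


def schroeder_complex(f0, n_low, n_high, c, fs, duration_s):
    """Equal-amplitude harmonics n_low..n_high of f0 with phases
    theta_n = c * pi * n * (n + 1) / N, where N is the number of
    components and c scales the phase curvature (assumed rule)."""
    harmonics = range(n_low, n_high + 1)
    n_comp = len(harmonics)
    phases = {n: c * math.pi * n * (n + 1) / n_comp for n in harmonics}
    n_samples = int(round(fs * duration_s))
    return [
        sum(math.cos(2 * math.pi * n * f0 * i / fs + phases[n])
            for n in harmonics)
        for i in range(n_samples)
    ]


def crest_factor(x):
    """Peak magnitude divided by RMS value."""
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    return max(abs(v) for v in x) / rms


# Two periods of a 100-Hz complex with components from 2 to 4 kHz
peaked = schroeder_complex(100.0, 20, 40, 0.0, 16000, 0.02)  # cosine phase
flat = schroeder_complex(100.0, 20, 40, 1.0, 16000, 0.02)    # curved phase

print(crest_factor(peaked), crest_factor(flat))
```

With C = 0 all components are in cosine phase and the acoustic waveform is maximally peaked. In the masking paradigm, the relevant peakedness is that of the internal representation after the auditory filter has added its own phase response, so the curvature that yields the most peaked internal waveform is generally not zero.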
Besides the obvious importance of detecting the target in the temporal dips of the masker, particularly if the target is short compared to the modulation period of the masker (Kohlrausch and Sanders, 1995), there are several indications that fast compression in the cochlea is important to obtain the masker-phase effect (e.g., Carlyon and Datta, 1997; Oxenham and Dau, 2004). One indication is that listeners with sensorineural hearing impairment (HI), characterized by reduced or absent cochlear compression due to loss of outer hair cells, show only a very weak masker-phase effect, making it difficult to estimate the cochlear phase response.
In the BiPhase project we propose a new paradigm for measuring the cochlear phase response that does not rely on cochlear compression and thus should be applicable in HI listeners. It relies on the idea that the amount of modulation (peakedness) in the internal representation of a harmonic complex, as given by its phase curvature, determines the listener’s sensitivity to an envelope interaural time difference (ITD) imposed on the stimulus. Assuming that the listener’s sensitivity to envelope ITD does not rely on compression, systematic variation of the stimulus phase curvature should allow the cochlear phase response to be estimated in both normal-hearing (NH) and HI listeners. The main goals of BiPhase are the following:
This project is funded by the Austrian Science Fund (FWF, Project # P24183-N24, awarded to Bernhard Laback). It ran from 2013 to 2017.
In the context of binaural virtual acoustics, a sound source is positioned in a free-field 3-D space around the listener by filtering it via head-related transfer functions (HRTFs). In a real-time application, numerous HRTFs need to be processed. The long impulse responses of the HRTFs require a high computational power, which is difficult to directly implement on current processors in situations involving more than a few simultaneous sources.
Technically speaking, an HRTF is a linear time-invariant (LTI) system. An LTI system can be implemented in the time domain by direct convolution or recursive filtering. This approach is computationally inefficient. A computationally efficient approach consists of implementing the system in the frequency domain; however, this approach is not suitable for real-time applications since a very large delay is introduced. A compromise between the two approaches is provided by a family of segmented-FFT methods, which permit a trade-off between latency and computational complexity. As an alternative, the sub-band method can be applied as a technique to represent linear systems in the time-frequency domain. Recent work has shown that the sub-band method offers an even better trade-off between latency and computational complexity than segmented-FFT methods. However, the sub-band analysis is still mathematically challenging and its optimum configuration is dependent on the application under consideration.
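The trade-off offered by segmented-FFT methods can be seen in a minimal overlap-add sketch: the input is cut into short blocks, each block is filtered in the frequency domain, and the overlapping block outputs are summed. The latency is then on the order of one block rather than the whole signal. This is only the simplest member of that family (the input is segmented, the filter is not), all function names are hypothetical, and the small radix-2 FFT is included solely to keep the example self-contained.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (length must be a power of two).
    The inverse transform is returned without the 1/N scaling."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def overlap_add(x, h, block):
    """Filter x with impulse response h, processing `block` samples at a time."""
    n_fft = 1
    while n_fft < block + len(h) - 1:  # long enough to avoid circular wrap-around
        n_fft *= 2
    h_spec = fft(list(h) + [0.0] * (n_fft - len(h)))
    y = [0.0] * (len(x) + len(h) - 1)
    for start in range(0, len(x), block):
        seg = list(x[start:start + block])
        seg_spec = fft(seg + [0.0] * (n_fft - len(seg)))
        out = fft([a * b for a, b in zip(seg_spec, h_spec)], inverse=True)
        # Add the block output into the result; tails of adjacent blocks overlap.
        for i in range(min(n_fft, len(y) - start)):
            y[start + i] += out[i].real / n_fft
    return y


def direct_convolve(x, h):
    """Time-domain reference implementation."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y
```

Both functions return the same output up to rounding error. A smaller `block` lowers the latency but requires more transforms per second of audio, which is the trade-off described above.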
TF-VA involves developing and investigating new techniques for configuring the sub-band method by using advanced optimization methods in a functional analysis context. As a result, an optimization technique that minimizes the computational complexity of the sub-band method will be obtained.
Two approaches will be considered: The first approach designs the time-frequency transform for minimizing the complexity of each HRTF. In the second approach, we will design a unique time-frequency transform, which will be used for a joint implementation of all HRTFs of a listener. This will permit an efficient implementation of interpolation techniques while moving sources spatially in real-time. The results will be evaluated in subjective localization experiments and in terms of localization models.
ExpSuite is a software framework for implementing psychoacoustic experiments. The framework serves as the basis for an application and can be extended with customized, experiment-dependent methods (applications). It consists of a user interface (experimenter and subject interface), signal processing modules (off-line and real-time), and input-output modules.
The user interface is implemented in Visual Basic.NET and benefits from the "Rapid Application Development" environment, which allows experiments to be developed quickly. To compensate for the sometimes slow processing performance of VB, the stimulation signals can be processed in a vector-oriented way using a direct link to MATLAB. Because of this direct link, numerous built-in MATLAB functions are available to ExpSuite applications.
The interface for the people administering the tests contains several templates that can be chosen for a specific experiment. A keyboard, mouse, joypad, or joystick can be chosen as the input device. The user interface is designed for dual-screen setups and allows permanent monitoring of the experiment status on the same computer. Additionally, the current experiment status can be transmitted to another computer via a network connection. The framework supports two types of stimulation:
The aim of this project is to maintain the experimental facilities in our institute's laboratory.
The lab consists of four testing places:
The rooms are used not only for measurements and experiments; the Acoustics Phonetics group also makes speech recordings there for dialect research and speaker identification, for example for survey reports. The facilities are also used for psychoacoustic validation.
During the breaks in experiments, the subjects can use an Internet terminal or relax on a couch while sipping hot coffee...
Head-related transfer functions (HRTFs) describe sound transmission from the free field to a place in the ear canal in terms of linear time-invariant systems. They contain spectral and temporal features that vary according to the sound direction. Differences among subjects require that subjects' individual HRTFs be measured for studies on localization in virtual environments. In this project, a system for HRTF measurement was developed and installed in the semi-anechoic room at the Austrian Academy of Sciences.
Measurement of an HRTF was treated as a system identification of the electro-acoustic chain: sound source-room-HRTF-microphone. The sounds in the ear canals were captured using in-ear microphones. The direction of the sound source was varied horizontally by rotating the subject on a turntable, and vertically by selecting one of the 22 loudspeakers positioned in the median plane. An optimized form of system identification with sweeps, the multiple exponential sweep method (MESM), was used to measure transfer functions with satisfactory signal-to-noise ratios within a reasonable amount of time. Subjects' positions were tracked during the measurement to ensure sufficient measurement accuracy. Measurement of headphone transfer functions was included in the HRTF measurement procedure. This allows equalization of the headphone influence during the presentation of virtual stimuli.
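The principle of system identification with an exponential sweep can be sketched for a single source. Note that this is the basic single-sweep technique, not MESM itself, which additionally interleaves and overlaps the sweeps of several loudspeakers. The sweep is played through the system, and the recording is convolved with an amplitude-weighted, time-reversed copy of the sweep, which compresses it into the impulse response. The toy "system" (a 5-sample delay with a gain of 0.5), the function names, and all parameter values are assumptions for illustration.

```python
import math


def exp_sweep(f1, f2, duration_s, fs):
    """Sine sweep whose frequency rises exponentially from f1 to f2."""
    n = int(round(duration_s * fs))
    rate = math.log(f2 / f1)
    k = 2 * math.pi * f1 * duration_s / rate
    return [math.sin(k * (math.exp(rate * i / n) - 1.0)) for i in range(n)]


def inverse_filter(sweep, f1, f2):
    """Time-reversed sweep with an exponentially decaying envelope that
    compensates for the extra energy the sweep puts into low frequencies.
    Scaled so that sweep convolved with the filter peaks at 1."""
    n = len(sweep)
    rate = math.log(f2 / f1)
    inv = [sweep[n - 1 - m] * math.exp(-rate * m / n) for m in range(n)]
    scale = sum(s * v for s, v in zip(sweep, reversed(inv)))
    return [v / scale for v in inv]


def convolve(x, h):
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


fs, f1, f2 = 8000, 200.0, 3000.0
sweep = exp_sweep(f1, f2, 0.125, fs)
inv = inverse_filter(sweep, f1, f2)

# Toy system under test: 5 samples of delay and a gain of 0.5
recorded = [0.0] * 5 + [0.5 * s for s in sweep]

ir = convolve(recorded, inv)
peak = max(range(len(ir)), key=lambda i: abs(ir[i]))
delay = peak - (len(sweep) - 1)  # the filter itself adds len(sweep) - 1 samples
```

The recovered impulse response is band-limited to the sweep range, so it appears as a narrow pulse around the system delay rather than a single sample.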
Multi-channel audio equipment has been installed in the semi-anechoic room, giving access to recording and stimuli presentation via 24 channels simultaneously.
The multiple exponential sweep method was developed, allowing fast transfer function measurement of weakly nonlinear time-invariant systems for multiple sources.
The measurement procedure was developed and a database of HRTFs was created. To date, HRTFs of over 200 subjects have been published, see http://sofacoustics.org/data/database/ari/. The HRTFs can be used to create virtual stimuli and present them binaurally via headphones.
To virtually position sounds in space, free-field sounds are filtered with the HRTFs. This results in virtual acoustic stimuli (VAS). To create VAS and present them via headphones, applications called Virtual Sound Positioning (VSP) and Loca (part of our ExpSuite software project) have been implemented. They allow virtual sound positioning in a free-field environment using both stationary and moving sound sources.
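For a stationary source, the filtering step amounts to convolving the mono signal with the left-ear and right-ear impulse responses (the time-domain form of the HRTFs) for the desired direction. The sketch below shows this with toy impulse responses. It is not the VSP or Loca implementation, and the function names and values are hypothetical.

```python
def convolve(x, h):
    """Direct time-domain convolution."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def render_binaural(mono, hrir_left, hrir_right):
    """Return the (left, right) headphone signals for one source direction."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy impulse responses for a source on the left: the right-ear signal
# arrives 3 samples later and is attenuated by half.
hrir_left = [1.0, 0.0, 0.0, 0.0]
hrir_right = [0.0, 0.0, 0.0, 0.5]

left, right = render_binaural([1.0, 2.0, 3.0], hrir_left, hrir_right)
# left  -> [1.0, 2.0, 3.0, 0.0, 0.0, 0.0]
# right -> [0.0, 0.0, 0.0, 0.5, 1.0, 1.5]
```

Measured impulse responses additionally carry the direction-dependent spectral features; moving sources require switching or interpolating between impulse responses over time, which this sketch does not cover.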
The Acoustic Measurement Tool at the Acoustics Research Institute (AMTatARI) has been developed for the automatic measurement of system properties of electro-acoustic systems like loudspeakers and microphones. As a special function, this tool allows the automatic measurement of head-related transfer functions (HRTFs).
Measurement of the following features has been implemented so far:
The impulse responses can be measured with maximum length sequences (MLS) or with exponential sweeps. For sweeps, the new multiple exponential sweep method (MESM) is also available. This method is also used to measure HRTFs with AMTatARI.