EAP

  • Localization of Sound Sources with Behind-the-Ear Microphones (Loca-BtE-CI)

    Objective and Method:

    Current cochlear implant (CI) systems are not designed for sound localization in the sagittal planes (the front/back and up/down dimensions). Nevertheless, some of the spectral cues that are important for sagittal-plane localization in normal-hearing (NH) listeners might be audible to CI listeners. Here, we studied 3-D localization in bilateral CI listeners using "clinical" CI systems and in NH listeners. Noise sources were filtered with subject-specific head-related transfer functions, and a structured virtual environment was presented via a head-mounted display to provide feedback for learning.

    Results:

    The CI listeners generally performed worse than the NH listeners in both the horizontal and vertical dimensions. The localization error decreased as the duration of training increased. The front/back confusion rate of trained CI listeners was comparable to that of untrained (naive) NH listeners and twice as high as that of trained NH listeners.

    Application:

    The results indicate that some spectral localization cues are available to bilateral CI listeners, even though their localization performance is much worse than that of NH listeners. These results clearly show the need for new strategies to encode spectral localization cues for CI listeners and thus improve sagittal-plane localization. Front/back discrimination is particularly important in traffic situations.

    Funding:

    FWF (Austrian Science Fund): Project # P18401-B15

    Publications:

    • Majdak, P., Goupell, M., and Laback, B. (2011). Two-Dimensional Localization of Virtual Sound Sources in Cochlear-Implant Listeners, Ear & Hearing.
    • Majdak, P., Laback, B., and Goupell, M. (2008). 3D-localization of virtual sound sources in normal-hearing and cochlear-implant listeners, presented at Acoustics '08  (ASA-EAA joint) conference, Paris
  • LocaMethods: Localization of Virtual Sound Sources

    Objective:

    We tested the ability of humans to localize sound sources in 3-D space.

    Method:

    The subjects listened to noises filtered with subject-specific head-related transfer functions (HRTFs). In the first experiment, with new subjects, the conditions included the type of visual environment (darkness or a structured virtual world), presented via a head-mounted display (HMD), and the pointing method (head pointing or finger/shooter pointing).

    Results:

    The results show that the errors in the horizontal dimension were smaller when head pointing was used, whereas finger/shooter pointing yielded smaller errors in the vertical dimension. Overall, the difference between the effects of the two pointing methods was significant but small. The presence of a structured virtual visual environment significantly improved localization accuracy in all conditions. This supports the idea that using a virtual visual environment in acoustic tasks, such as sound localization, is beneficial. In Experiment II, the subjects were trained before performing the acoustic tasks used for data collection. The performance of all subjects improved over time, which indicates that training is necessary to obtain stable results in localization experiments.

    Funding:

    FWF (Austrian Science Fund): Project # P18401-B15

    Publications:

    • Majdak, P., Goupell, M., and Laback, B. (2010). 3-D localization of virtual sound sources: effects of visual environment, pointing method, and training, Attention, Perception, & Psychophysics 72, 454-469.
    • Majdak, P., Laback, B., Goupell, M., and Mihocic M. (2008). "The Accuracy of Localizing Virtual Sound Sources: Effects of Pointing Method and Visual Environment", presented at AES convention, Amsterdam.
  • LocaPhoto: Localization Model & Numeric Simulations

    Localization of sound sources is an important task of the human auditory system, and much research effort has gone into the development of audio devices for virtual acoustics, i.e., the reproduction of spatial sounds via headphones. Even though the process of sound localization is not yet completely understood, spatial sounds can be simulated via headphones by using head-related transfer functions (HRTFs). HRTFs describe the filtering of the incoming sound by the head, torso, and particularly the pinna; thus, they strongly depend on the particular details of the listener's geometry. In general, realistic spatial-sound reproduction via headphones requires the listener's individual HRTFs to be measured. As of 2012, the only available HRTF acquisition method was acoustic measurement: a technically complex process that involves placing microphones into the listener's ears and lasts for tens of minutes.

    In LocaPhoto, we worked on an easily accessible method to acquire and evaluate listener-specific HRTFs. The idea was to numerically calculate HRTFs based on a geometrical representation of the listener (a 3-D mesh) obtained from 2-D photos by means of photogrammetric reconstruction.

    As a result, we developed a software package for numerical HRTF calculations, a method for geometry acquisition, and models able to evaluate HRTFs in terms of broadband ITDs and sagittal-plane sound-localization performance.
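
    For illustration, here is a minimal sketch of one such evaluation step: estimating the broadband ITD from a pair of head-related impulse responses via cross-correlation. It assumes numpy and toy HRIRs; it is not the project's actual model code.

    ```python
    import numpy as np

    def broadband_itd(hrir_left, hrir_right, fs):
        """Estimate the broadband ITD (seconds) of an HRIR pair from the
        lag of the maximum of their cross-correlation.
        Negative values mean that the left ear leads."""
        xcorr = np.correlate(hrir_left, hrir_right, mode="full")
        lag = np.argmax(np.abs(xcorr)) - (len(hrir_right) - 1)
        return lag / fs

    # Toy HRIRs: a decaying tone burst; the right ear lags by 24 samples.
    fs = 48000
    t = np.arange(256) / fs
    hrir_l = np.sin(2 * np.pi * 2000 * t) * np.exp(-4000 * t)
    hrir_r = np.roll(hrir_l, 24)
    print(broadband_itd(hrir_l, hrir_r, fs))  # -0.0005 s, i.e., left leads
    ```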


    Further information:

    http://www.kfs.oeaw.ac.at/LocaPhoto


  • Measurement of Head-Related Transfer Functions (HRTFs)

    Objective:

    Head-related transfer functions (HRTFs) describe sound transmission from the free field to a place in the ear canal in terms of linear time-invariant systems. They contain spectral and temporal features that vary with the sound direction. Because of differences among subjects, each subject's individual HRTFs must be measured for studies on localization in virtual environments. In this project, a system for HRTF measurement was developed and installed in the semi-anechoic room at the Austrian Academy of Sciences.

    Method:

    Measurement of an HRTF was treated as a system identification of the electro-acoustic chain: sound source-room-HRTF-microphone. The sounds in the ear canals were captured using in-ear microphones. The direction of the sound source was varied horizontally by rotating the subject on a turntable and vertically by selecting one of the 22 loudspeakers positioned in the median plane. An optimized form of system identification with sweeps, the multiple exponential sweep method (MESM), was used to measure the transfer functions with satisfactory signal-to-noise ratios within a reasonable amount of time. The subjects' positions were tracked during the measurement to ensure sufficient measurement accuracy. Measurement of headphone transfer functions was included in the HRTF measurement procedure, which allows the influence of the headphones to be equalized during the presentation of virtual stimuli.
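
    To illustrate the principle of sweep-based system identification, here is a minimal single-source sketch (plain exponential-sweep deconvolution with an inverse filter, not the actual MESM implementation; all parameter values are illustrative):

    ```python
    import numpy as np

    fs = 48000
    T = 2.0                       # sweep duration in seconds
    f1, f2 = 50.0, 20000.0        # start and stop frequencies
    t = np.arange(int(T * fs)) / fs
    L = T / np.log(f2 / f1)
    sweep = np.sin(2 * np.pi * f1 * L * (np.exp(t / L) - 1.0))

    # Inverse filter: time-reversed sweep, attenuated by 6 dB/octave, so that
    # sweep convolved with inverse approximates a band-limited impulse.
    inverse = sweep[::-1] * np.exp(-t / L)

    # Simulated measurement: the "system" is a toy impulse response.
    h_true = np.zeros(512)
    h_true[40], h_true[90] = 1.0, -0.4
    recording = np.convolve(sweep, h_true)

    # Deconvolution: the impulse response appears right after the sweep length.
    ir = np.convolve(recording, inverse)
    ir = ir[len(sweep) - 1 : len(sweep) - 1 + 512]
    ir /= np.max(np.abs(ir))      # normalized estimate of h_true
    ```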

    Results:

    Multi-channel audio equipment has been installed in the semi-anechoic room, giving access to recording and stimulus presentation via 24 channels simultaneously.

    The multiple exponential sweep method was developed, allowing fast transfer-function measurement of weakly nonlinear, time-invariant systems with multiple sources.

    The measurement procedure was developed and a database of HRTFs was created. So far, the HRTFs of more than 200 subjects have been published; see http://sofacoustics.org/data/database/ari/. The HRTFs can be used to create virtual stimuli and present them binaurally via headphones.

    To position sounds virtually in space, the HRTFs are used to filter free-field sounds. This results in virtual acoustic stimuli (VAS). To create VAS and present them via headphones, the applications Virtual Sound Positioning (VSP) and Loca (part of our ExpSuite software project) have been implemented. They allow virtual sound positioning in a free-field environment using both stationary and moving sound sources.
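
    A minimal sketch of this filtering step, assuming numpy/scipy and a hypothetical HRIR pair for the desired direction (not the actual VSP/Loca implementation):

    ```python
    import numpy as np
    from scipy.signal import fftconvolve

    def spatialize(mono, hrir_l, hrir_r):
        """Render a free-field (mono) sound at the direction of the given
        HRIR pair by convolving it with the left/right ear responses."""
        left = fftconvolve(mono, hrir_l)
        right = fftconvolve(mono, hrir_r)
        stereo = np.stack([left, right], axis=1)
        return stereo / np.max(np.abs(stereo))  # normalize to avoid clipping

    # Hypothetical usage: hrir_l/hrir_r are taken from a measured HRTF set
    # for the desired direction; the result is played back via headphones.
    ```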

  • Measurement of HRTFs for the Project CI-HRTF

    Objective:

    In this project, head-related transfer functions (HRTFs) are measured and prepared for localization tests with cochlear implant listeners. The method and apparatus used for the measurement are the same as those used for the general HRTF measurement (see project HRTF-System); however, the place where the sound is acquired differs. In this project, the microphones built into the behind-the-ear (BtE) processors of cochlear implantees are used. The processors are located on the pinna, and the unprocessed microphone signals are used to calculate the BtE-HRTFs for different spatial positions.

    The BtE-HRTFs are then used in localization tests like Loca BtE-CI.

  • MissiSIPI: Towards Improving Selective Hearing in Cochlear Implant Listeners

    Selective hearing refers to the ability of the human auditory system to selectively attend to a desired speaker while ignoring undesired, concurrent speakers, a situation often referred to as the cocktail-party problem. In normal hearing, selective hearing is remarkably powerful. In so-called electric hearing, i.e., hearing with cochlear implants (CIs), however, selective hearing is severely degraded, to the point of being almost absent. CIs are commonly used to treat severe-to-profound hearing loss or deafness because they provide good speech understanding in quiet. The deficits in selective hearing have two main causes. First, they arise from structural limitations of current CI electrode designs, which severely limit the spectral resolution. Second, they arise from a lack of salient timing cues, most importantly interaural time differences (ITDs) and temporal pitch. The second limitation is assumed to be partly “software”-sided and conquerable with perception-driven signal processing. Yet, the success achieved so far is at best moderate.

    A recently proposed approach to providing precise ITD and temporal-pitch cues in addition to speech understanding is to insert extra pulses with short inter-pulse intervals (so-called SIPI pulses) into periodic high-rate pulse trains. The results gathered in our previous project ITD PsyPhy with single-electrode configurations are encouraging in that both ITD and temporal-pitch sensitivity improved when SIPI pulses were inserted at the signals’ temporal-envelope peaks. Building on those results, this project aims to answer the most urgent research questions in determining whether the SIPI approach improves selective hearing in CI listeners: Does the SIPI benefit translate to multi-electrode configurations? Does the multi-electrode SIPI approach harm speech understanding? Does it improve speech-in-speech understanding?
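
    The core idea can be sketched in a few lines of Python (a conceptual illustration with hypothetical names and parameter values, not the actual stimulation software):

    ```python
    import numpy as np
    from scipy.signal import find_peaks

    def insert_sipi(pulse_times, envelope, fs, sipi_ipi=1e-4):
        """Sketch of the SIPI idea: after each temporal-envelope peak, insert
        one extra pulse a short inter-pulse interval (sipi_ipi, here 0.1 ms)
        after the nearest pulse of the high-rate train.
        pulse_times: pulse onsets in seconds; envelope: sampled at fs."""
        peaks, _ = find_peaks(envelope)
        extra = [pulse_times[np.argmin(np.abs(pulse_times - p / fs))] + sipi_ipi
                 for p in peaks]
        return np.sort(np.concatenate([pulse_times, np.asarray(extra)]))
    ```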

    Psychophysical experiments with CI listeners are planned to examine the research questions. To ensure high temporal precision and stimulus control, clinical CI signal processors will be bypassed by using a laboratory stimulation system directly connecting the CIs with a laboratory computer. The results are expected to shed light on parts of both electric and acoustic hearing that are still not fully understood to date, such as the role and the potential of temporal cues in selective hearing.



    Duration: May 2020 - April 2022

    Funding: DOC Fellowship Program of the Austrian Academy of Sciences (A-25606)

    PI: Martin Lindenbeck

    Supervisors: Bernhard Laback and Ulrich Ansorge (University of Vienna)


  • Number of Channels Required for Vertical-Plane Localization (Loca#Channels)

    Objective and Methods:

    This study investigates the effect of the number of frequency channels on vertical-plane sound localization, especially front/back discrimination. This is important for determining how many of the basal-most channels/electrodes of a cochlear implant (CI) are needed to encode spectral localization cues. Normal-hearing subjects listening to a CI simulation (the newly developed GET vocoder) will perform the experiment using the localization method developed in the subproject "Loca Methods". Learning effects will be studied by providing visual feedback.
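
    To make the simulation concept concrete, here is a generic noise-band channel vocoder; note that the GET vocoder itself uses Gaussian-enveloped tones rather than noise carriers, and the channel count and corner frequencies below are illustrative:

    ```python
    import numpy as np
    from scipy.signal import butter, sosfiltfilt, hilbert

    def noise_vocoder(x, fs, n_channels=12, f_lo=300.0, f_hi=12000.0):
        """Split x into n_channels bands, extract each band's envelope, and
        re-impose it on a band-limited noise carrier (a common CI simulation)."""
        edges = np.geomspace(f_lo, f_hi, n_channels + 1)  # log-spaced bands
        rng = np.random.default_rng(1)
        out = np.zeros_like(x)
        for lo, hi in zip(edges[:-1], edges[1:]):
            sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
            band = sosfiltfilt(sos, x)
            env = np.abs(hilbert(band))                   # band envelope
            carrier = sosfiltfilt(sos, rng.standard_normal(len(x)))
            out += env * carrier
        return out

    # Hypothetical usage: y = noise_vocoder(x, fs=44100, n_channels=12)
    ```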

    Results:

    Experiments are underway.

    Application:

    Knowing the number of channels required to encode spectral cues for localization in the vertical planes is an important step in the development of a 3-D localization strategy for CIs. 

    Funding:

    FWF (Austrian Science Fund): Project #P18401-B15

    Publications:

    • Goupell, M., Majdak, P., and Laback, B. (2010). Median-plane sound localization as a function of the number of spectral channels using a channel vocoder, J. Acoust. Soc. Am. 127, 990-1001.
  • Perception of Interaural Intensity Differences by Cochlear Implant Listeners (IID-CI)

    Objective:

    This project investigated the perception of interaural intensity differences (IIDs) in cochlear implant (CI) listeners in relation to the spectral composition and the temporal structure of the signal.

    Method:

    The perception thresholds (just-noticeable differences, JNDs) of CI listeners were examined using differently structured signals. The stimuli were applied directly to the clinical signal processing units, while the parameters of the ongoing stimulation were closely monitored.

    Results:

    The JNDs of IIDs in CI listeners ranged from 1.5 to 2.5 dB for a detection level of 80 percent. The type of stimulus had little effect on detection performance, with the exception of a single signal type: a pulse train with a rate of 20 Hz. This means that the JNDs of CI listeners are only marginally higher than those of normal-hearing listeners. CI implantees are thus sensitive to IIDs, and the JNDs correspond to differences in arrival angle of 5 to 10 degrees. Since the JNDs are within the minimal level widths with which the CI system transfers amplitudes, a reduction of the level widths in future systems seems advisable.

    Publication:

    • Laback, B., Pok, S. M., Baumgartner, W. D., Deutsch, W. A., and Schmid, K. (2004). “Sensitivity to interaural level and envelope time differences of two bilateral cochlear implant listeners using clinical sound processors,” Ear and Hearing 25, 5, 488-500.
  • Perception of Interaural Time Differences (ITD)

    Objective and Methods:

    This project cluster includes several studies on the perception of interaural time differences (ITD) in cochlear implant (CI), hearing impaired (HI), and normal hearing (NH) listeners. Studying different groups of listeners allows for identification of the factors that are most important to ITD perception. Furthermore, the comparison between the groups allows for the development of strategies to improve ITD sensitivity in CI and HI listeners.

    Subprojects:

    • FsGd: Effects of ITD in Ongoing, Onset, and Offset in Cochlear Implant Listeners
    • ITD Sync: Effects of interaural time difference in fine structure and envelope on lateral discrimination in electric hearing
    • ITD Jitter CI: Recovery from binaural adaptation with cochlear implants
    • ITD Jitter NH: Recovery from binaural adaptation in normal hearing
    • ITD Jitter HI: Recovery from binaural adaptation with sensorineural hearing impairment
    • ITD CF: Effect of center frequency and rate on the sensitivity to interaural delay in high-frequency click trains
    • IID-CI: Perception of Interaural Intensity Differences by Cochlear Implant Listeners


  • POTION: Perceptual Optimization of Audio Time-Frequency Representations and Coding.

    French-Austrian bilateral research project funded by the French National Agency of Research (ANR) and the Austrian Science Fund (FWF, project no. I 1362-N30). The project involves two academic partners, namely the Laboratory of Mechanics and Acoustics (LMA - CNRS UPR 7051, France) and the Acoustics Research Institute (ARI). At the ARI, two research groups are involved in the project: Mathematics and Signal Processing in Acoustics, and Psychoacoustics and Experimental Audiology.

    Principal investigators: Thibaud Necciari (ARI), Piotr Majdak (ARI) and Olivier Derrien (LMA).

    Running period: 2014-2017 (project started on March 1, 2014).

    Abstract:

    One of the greatest challenges in signal processing is to develop efficient signal representations. An efficient representation extracts relevant information and describes it with a minimal amount of data. In the specific context of sound processing, and especially in audio coding, where the goal is to minimize the size of binary data required for storage or transmission, it is desirable that the representation take human auditory perception into account and allow reconstruction with a controlled amount of perceived distortion. Over the last decades, many psychoacoustical studies investigated auditory masking, an important property of auditory perception. Masking refers to the degradation of the detection threshold of a sound in the presence of another sound. The results were used to develop models of either spectral or temporal masking, and attempts were made to simply combine these models to account for time-frequency (t-f) masking effects in perceptual audio codecs. We recently conducted psychoacoustical studies on t-f masking that revealed the inaccuracy of such simple combinations. These new data on t-f masking represent a crucial basis for accounting for masking effects in t-f representations of sounds. Although t-f representations are standard tools in audio processing, the development of a t-f representation of audio signals that is mathematically founded, perception-based, perfectly invertible, and possibly minimally redundant remains a challenge. POTION thus addresses the following questions:

    1. To what extent is it possible to obtain a perception-based (i.e., as close as possible to “what we see is what we hear”), perfectly invertible, and possibly minimally redundant t-f representation of sound signals? Such a representation is essential for modeling complex masking interactions in the t-f domain and is expected to improve our understanding of auditory processing of real-world sounds. Moreover, it is of fundamental interest for many audio applications involving sound analysis-synthesis (a minimal invertibility sketch follows this list).
    2. Is it possible to improve current perceptual audio codecs by considering a joint t-f approach? To reduce the size of digital audio files, perceptual audio codecs like MP3 decompose sounds into variable-length time segments, apply a frequency transform, and use masking models to control the sub-quantization of transform coefficients within each segment. Thus, current codecs follow mainly a spectral approach, although temporal masking effects are taken into account in some implementations. By combining an efficient perception-based t-f transform with a joint t-f masking model in an audio codec, we expect to achieve significant performance improvements.
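
    As a minimal illustration of the invertibility requirement in question 1, the following snippet shows a perfectly invertible t-f analysis-synthesis chain; the plain STFT used here merely stands in for the perception-based non-stationary Gabor transform targeted by POTION:

    ```python
    import numpy as np
    from scipy.signal import stft, istft

    fs = 44100
    x = np.random.default_rng(0).standard_normal(fs)        # 1 s of noise

    f, t, X = stft(x, fs=fs, nperseg=1024, noverlap=768)    # analysis
    # ... a t-f masking model would prune the coefficients in X here ...
    _, x_rec = istft(X, fs=fs, nperseg=1024, noverlap=768)  # synthesis

    print(np.max(np.abs(x - x_rec[:len(x)])))  # ~1e-15: perfect reconstruction
    ```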

    Working program:

    POTION is structured in three main tasks:

    1. Perception-based t-f representation of audio signals with perfect reconstruction: A linear and perfectly invertible t-f representation will be created by exploiting the recently developed non-stationary Gabor theory as a mathematical background. The transform will be designed so that its t-f resolution mimics the t-f analysis properties of the auditory system and introduces as little redundancy as possible, to maximize the coding efficiency.
    2. Development and implementation of a t-f masking model: Based on psychoacoustical data on t-f masking collected by the partners in previous projects and on literature data, a new, complex model of t-f masking will be developed and implemented in the computationally efficient representation built in task 1. Additional psychoacoustical data required for the development of the model, involving frequency, level, and duration effects in masking for either single or multiple maskers, will be collected. The resulting signal-processing algorithm should represent and re-synthesize only the perceptually relevant components of the signal. It will be calibrated and validated by conducting listening tests with synthetic and real-world sounds.
    3. Optimization of perceptual audio codecs: This task represents the main application of POTION. It will consist of combining the new efficient representation built in task 1 with the new t-f masking model built in task 2 for implementation in a perceptual audio codec.

    More information on the project can be found on the POTION web page.

    Publications:

    • Chardon, G., Necciari, Th., Balazs, P. (2014): Perceptual matching pursuit with Gabor dictionaries and time-frequency masking, in: Proceedings of the 39th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2014). Florence, Italy, 3126-3130. (proceedings)


  • QWeight

    Reweighting of Binaural Cues: Generalizability and Applications in Cochlear Implant Listening

    Normal-hearing (NH) listeners use two binaural cues, the interaural time difference (ITD) and the interaural level difference (ILD), for sound localization in the horizontal plane. They apply frequency-dependent weights when combining them to determine the perceived azimuth of a sound source. Cochlear implant (CI) listeners, however, rely almost entirely on ILDs. This is partly due to the properties of current envelope-based CI systems, which do not explicitly encode carrier ITDs. However, even if ITDs are artificially conveyed via a research system, CI listeners perform worse on average than NH listeners. Since current CI systems do not reliably convey ITD information, CI listeners might learn to ignore ITDs and focus on ILDs instead. A recent study in our lab provided first evidence that such a reweighting of binaural cues is possible in NH listeners.
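
    The weighting concept can be illustrated with a toy linear cue-combination rule (the weights below are purely illustrative, not measured values from the project):

    ```python
    def perceived_azimuth(az_itd, az_ild, w_itd):
        """Toy model: the perceived azimuth (degrees) is a weighted average of
        the azimuths indicated by the ITD and the ILD alone; w_itd in [0, 1]."""
        return w_itd * az_itd + (1.0 - w_itd) * az_ild

    # Conflicting cues: the ITD points to 20 deg, the ILD to 10 deg.
    print(perceived_azimuth(20.0, 10.0, w_itd=0.8))  # NH-like weighting: 18.0
    print(perceived_azimuth(20.0, 10.0, w_itd=0.1))  # CI-like weighting: 11.0
    ```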

    This project aims to investigate this phenomenon further. First, we will test whether a changed ITD/ILD weighting generalizes to different frequency regions. Second, we will investigate the effect of ITD/ILD reweighting on spatial release from speech-on-speech masking, as listeners benefit particularly from ITDs in such tasks. Third, we will test whether CI listeners can also be trained to weight ITDs more strongly and whether this translates into an increase in ITD sensitivity. Additionally, we will explore and evaluate different training methods to induce ITD/ILD reweighting.

    The results are expected to shed further light on the plasticity of the binaural auditory system in acoustic and electric hearing.

    Start: October 2018

    Duration: 3 years

    Funding: uni:docs fellowship program for doctoral candidates of the University of Vienna

  • Recovery from Binaural Adaptation in Cochlear Implant Listeners (ITD Jitter CI)

    Objective:

    The sensitivity of normal hearing listeners to interaural time differences (ITD) in the envelope of high-frequency carriers is limited with respect to the envelope modulation rate. Increasing the envelope rate reduces the sensitivity, an effect that has been termed binaural adaptation (Hafter and Dye, 1983). Cochlear implant (CI) listeners show a similar limitation in ITD sensitivity with respect to the rate of unmodulated pulse trains containing ITD. Unfortunately, such high rates are needed to appropriately sample the modulation information of the acoustic signal. This study tests the ideas that (1) similar "binaural adaptation" mechanisms are limiting the performance in both subject groups, (2) the effect is related to the periodicity of pulse trains, and (3) introducing jitter (randomness) into the pulse timing causes a recovery from binaural adaptation and thus improves ITD sensitivity at higher pulse rates.

    Method and Results:

    These ideas were studied by testing the ITD sensitivity of five CI listeners. The pulse rate, the amount of jitter (whose minimum represents the periodic condition), and the ITD were varied. We showed that introducing binaurally synchronized jitter into the stimulation timing causes large improvements in ITD sensitivity at higher pulse rates (≥ 800 pps). Our experimental results demonstrate that a purely temporal trigger can cause a recovery from binaural adaptation.
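
    For illustration, binaurally synchronized jittered pulse trains of this kind can be generated as follows (a sketch with an illustrative jitter definition, not the exact experimental implementation):

    ```python
    import numpy as np

    def jittered_pulse_train(rate, duration, jitter, itd, fs=96000, seed=0):
        """Inter-pulse intervals are drawn randomly around 1/rate, identically
        for both ears (binaurally synchronized); the right channel is the left
        delayed by itd. jitter=0 gives the periodic condition; jitter=1 lets
        the intervals vary between 0 and 2/rate."""
        rng = np.random.default_rng(seed)
        times, t = [], 0.0
        while t < duration:
            times.append(t)
            t += (1.0 + jitter * rng.uniform(-1.0, 1.0)) / rate
        left = np.zeros(int(duration * fs))
        right = np.zeros_like(left)
        for tp in times:
            i, j = int(round(tp * fs)), int(round((tp + itd) * fs))
            if i < len(left):
                left[i] = 1.0
            if j < len(right):
                right[j] = 1.0
        return left, right

    # e.g., 800 pps, 300 ms, strong jitter, 400-us ITD:
    l, r = jittered_pulse_train(800, 0.3, 0.75, 400e-6)
    ```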

    Application:

    Applying binaurally jittered stimulation strategies may improve several aspects of binaural hearing in bilateral recipients of CIs, including the localization of sound sources and speech segregation in noise.

    Funding:

    Internal

    Publications:

    • Laback, B., and Majdak, P. (2008). Binaural jitter improves interaural time-difference sensitivity of cochlear implantees at high pulse rates, Proc Natl Acad Sci USA (PNAS) 105, 2, 814-817.
    • Laback, B., and Majdak, P. (2008). Reply to van Hoesel: Binaural jitter with cochlear implants, improved interaural time-delay sensitivity, and normal hearing, letter to Proc Natl Acad Sci USA 12, 105, 32.
    • Laback, B., and Majdak, P. (2007). Binaural stimulation in neural auditory prostheses or hearing aids, provisional US and EP patent application (submitted 20.06.07).
  • Recovery from Binaural Adaptation in Normal Hearing Listeners (ITD Jitter NH)

    Objective:

    The sensitivity of normal-hearing (NH) listeners to interaural time differences (ITD) in the envelope of high-frequency carriers is limited with respect to the envelope modulation rate. Increasing the envelope rate reduces the sensitivity, an effect that has been termed binaural adaptation (Hafter and Dye, 1983). In another study (Laback and Majdak, 2008), it was hypothesized that introducing binaural jitter may improve ITD sensitivity in bilateral cochlear implant (CI) listeners by avoiding periodicity. Indeed, the results showed large improvements at high rates (≥ 800 pps). This was interpreted as an indication of a recovery from binaural adaptation.

    In this study, we further investigated this effect using NH subjects. We attempted to understand the underlying mechanisms by applying a well-established model of peripheral auditory processing. 

    Method and Results:

    Bandpass-filtered clicks (centered at 4 kHz) were used at a nominal pulse rate of 600 pulses per second (pps). It was found that randomly jittering the timing of the pulses significantly increases the detectability of the ITD. A second experiment examined the effects of place and rate for pulse trains. It showed that ITD sensitivity for jittered pulse trains at 1200 pps was significantly higher than for periodic pulse trains at 600 pps. Therefore, with the addition of jitter, listeners were not solely benefiting from the longest interpulse intervals and instances of reduced rate. A third experiment, using a 900-pps pulse train, confirmed the improvement in ITD sensitivity even when random amplitude modulation, a side effect of large amounts of jitter, was ruled out. A model of peripheral auditory processing up to the brain stem (the cochlear nucleus) was applied to study the mechanisms underlying the improvements in ITD sensitivity. It was found that the irregular timing of the jittered pulses increases the synchrony of firing in the cochlear nucleus. These results suggest that a recovery from binaural adaptation, triggered by temporal irregularity, possibly occurs at the level of the cochlear nucleus.

    Application:

    Together with the results of Laback and Majdak (2008) on the effect of binaural jitter in CI listeners, these results suggest that the binaural adaptation effect first observed by Hafter and Dye (1983) is related to the synchrony of neural firings across auditory nerve (AN) fibers, which in turn innervate cochlear nucleus cells. At higher rates, periodic pulse trains result in little synchrony of the response to the ongoing signal. Jittering the pulse timing increases the probability of synchronous firing across AN fibers at certain instants in time. Further studies are required to determine whether other aspects of binaural adaptation can also be attributed to this explanation.

    Funding:

    Internal

    Publications:

    • Goupell, M. J., Laback, B., Majdak, P. (2009): Enhancing sensitivity to interaural time differences at high modulation rates by introducing temporal jitter, in: J. Acoust. Soc. Am. 126, 2511-2521.
    • Laback, B., and Majdak, P. (2008): Binaural jitter improves interaural time-difference sensitivity of cochlear implantees at high pulse rates, in: Proc. Natl. Acad. Sci. USA (PNAS) 105, 2, 814-817.
    • Laback, B., and Majdak, P. (2008): Reply to van Hoesel: Binaural jitter with cochlear implants, improved interaural time-delay sensitivity, and normal hearing, letter to Proc. Natl. Acad. Sci. USA 12, 105, 32.
  • Recovery from Binaural Adaptation in Sensorineural Hearing Impairment (ITD Jitter HI)

    Objective:

    The sensitivity of normal-hearing (NH) listeners to interaural time differences (ITD) in the envelope of high-frequency carriers is limited with respect to the envelope modulation rate. Increasing the envelope rate reduces the sensitivity, an effect that has been termed binaural adaptation (Hafter and Dye, 1983). In other studies (Laback and Majdak, 2008; Goupell et al., 2008), it has been shown that introducing binaural jitter improves ITD sensitivity at higher rates in bilateral cochlear implant (CI) listeners as well as in NH listeners. The results were interpreted in terms of a recovery from binaural adaptation. Sensorineural hearing impairment often results in reduced ITD sensitivity (e.g., Hawkins and Wightman, 1980). The present study investigates whether a similar recovery from binaural adaptation, and thus an improvement in ITD sensitivity, can be achieved in hearing-impaired listeners.

    Method and Results:

    Bandpass-filtered clicks (4 kHz) with pulse rates of 400 and 600 pulses per second (pps) are used. Different amounts of jitter (the minimum representing the periodic condition) and different ITDs are tested. Listeners with a moderate cochlear hearing loss are selected. Additional stimuli tested are bandpass-filtered noise bands at 4 kHz and low-frequency stimuli at 500 Hz (sinusoids, SAM tones, noise bands, and jittered pulse trains). The levels of the stimuli are adjusted in pretests to achieve a centered auditory image at a comfortable loudness.

    Data collected so far show improvements in ITD sensitivity in some individuals but not in others.

    Application:

    The results may lead to the design of a new hearing aid processing algorithm that attempts to improve ITD sensitivity.

    Funding:

    Internal

  • Sensitivity to Spectral Peaks and Notches (SpecSens)

    Objective and Methods:

    Spectral peaks and notches are important cues that normal-hearing listeners use to localize sounds in the vertical planes (the front/back and up/down dimensions). This study investigates to what extent cochlear implant (CI) listeners are sensitive to spectral peaks and notches imposed upon a constant-loudness background.
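
    The stimulus concept can be sketched as follows (electrode counts and dB values are illustrative only, not the levels used in the experiments):

    ```python
    import numpy as np

    def profile_with_peak(n_electrodes=12, center=6, width=2.0, height_db=6.0):
        """Toy spectral profile: a flat multi-electrode background with a
        Gaussian-shaped peak (height_db > 0) or notch (height_db < 0), in dB,
        imposed at electrode `center`."""
        e = np.arange(n_electrodes)
        return height_db * np.exp(-0.5 * ((e - center) / width) ** 2)

    print(np.round(profile_with_peak(), 2))                # a 6-dB peak
    print(np.round(profile_with_peak(height_db=-6.0), 2))  # a 6-dB notch
    ```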

    Results:

    Listeners could always detect peaks, but not always notches. Increasing the bandwidth beyond two electrodes showed no improvement in thresholds. Performance at the high-frequency place was significantly worse than at the low and middle places, although listeners showed highly individual tendencies. Thresholds decreased with increasing peak height. Thresholds for detecting a change in the frequency of a peak or notch were approximately one electrode. Level roving significantly increased thresholds. Thus, there is currently no indication that CI listeners can perform a "true" profile analysis. Future studies will explore whether adding temporal cues or roving the level in equal-loudness steps, instead of equal-current steps (as in the present study), is relevant for profile analysis.

    Application:

    Data on the sensitivity to spectral peaks and notches are required to encode spectral localization cues in future CI stimulation strategies. 

    Funding:

    FWF (Austrian Science Fund): Project #P18401-B15

    Publications:

    • Goupell, M., Laback, B., Majdak, P., and Baumgartner, W. D. (2008). Current-level discrimination and spectral profile analysis in multi-channel electrical stimulation, J. Acoust. Soc. Am. 124, 3142-57.
    • Goupell, M. J., Laback, B., Majdak, P., and Baumgartner, W-D. (2007). Sensitivity to spectral peaks and notches in cochlear implant listeners, presented at Conference on Implantable Auditory Prostheses (CIAP), Lake Tahoe.
  • SOFA: Spatially Oriented Format for Acoustics

    The Spatially Oriented Format for Acoustics (SOFA) is dedicated to storing all kinds of acoustic information related to a specified geometrical setup. Its main task is to describe simple HRTF measurements, but SOFA also aims to provide the functionality to store more elaborate measurements, such as BRIRs recorded with a 64-channel microphone array in a multi-source excitation situation, or directivity measurements of a loudspeaker. The format is intended to be easily extendable and highly portable, and it is effectively the greatest common denominator of all HRTF databases publicly available at the time of writing.

    SOFA defines the structure of data and metadata and stores them in a numerical container. The data description is hierarchical, ranging from free-field HRTFs (simple setup) to more complex setups such as microphone-array measurements in reverberant spaces excited by a loudspeaker array (complex setup). A global geometry description (related to the room) and a local geometry description (related to the listener/source) are used, without limiting the number of acoustic transmitters and receivers. Room descriptions will be available by linking a CAD file within SOFA. Networking support will be provided as well, allowing remote access to HRTFs and BRIRs from client computers.
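
    Since SOFA files are netCDF-4 containers, a generic netCDF reader is sufficient to access the data. A minimal Python sketch for a file following the SimpleFreeFieldHRIR convention (the filename is a placeholder):

    ```python
    from netCDF4 import Dataset

    with Dataset("hrtfs.sofa", "r") as sofa:
        ir = sofa["Data.IR"][:]          # [measurements, receivers, samples]
        fs = float(sofa["Data.SamplingRate"][:][0])  # 1-element array in typical files
        src = sofa["SourcePosition"][:]  # typically [azimuth, elevation, radius]
        print(ir.shape, fs, src.shape)
    ```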

    SOFA is being developed by many contributors worldwide. The development is coordinated at ARI by Piotr Majdak.

    Further information:

    www.sofaconventions.org.
  • softpinna: Non-Rigid Registration for the Calculation of HRTFs

    Millions of people use headphones every day for listening to music, watching movies, or communicating with others. Nevertheless, the sounds presented via headphones are usually perceived inside the head rather than at their natural spatial positions. This limited perception is inherent to the technique and results in unrealistic listening situations.

    When listening to a sound without headphones, the acoustic information of the sound source is modified by our head and torso, an effect described by the head-related transfer functions (HRTFs). The shape of our ears contributes to that modification by filtering the sound depending on the source direction. But the ear is very listener-specific; its individuality is similar to that of a fingerprint, and thus HRTFs are very listener-specific as well. When listening to sounds via headphones, this listener-specific filtering is usually not available. One of the main reasons is the difficulty of acquiring a person's ear shape, and thus of calculating listener-specific HRTFs.

    Thus, in softpinna, we will work on new methods for a better acquisition of a person's listener-specific ear shape. Specifically, we will investigate and improve so-called "non-rigid registration" (NRR) algorithms, applied to 3-D ear geometries calculated from 2-D photos of a person's ears. The improved quality of the acquired 3-D ear geometries will allow computer programs to accurately calculate listener-specific HRTFs, thus enabling the incorporation of listener-specific HRTFs into future headphone systems that provide realistic presentation of spatial sounds. The new ear-shape acquisition method will vastly reduce the technical requirements for the accurate calculation of listener-specific HRTFs.

    This project is carried out in collaboration with Dreamwaves GmbH and is supported by the Bridge Programme of the FFG.

  • Spectral Cues in Auditory Localization with Cochlear Implants (CI HRTF)

    Objective:

    Bilateral use of current cochlear implant (CI) systems allows for the localization of sound sources in the left-right dimension. However, localization in the front-back and up-down dimensions (within the so-called sagittal planes) is restricted as a result of insufficient transmission of the relevant information.

    Method:

    In normal-hearing listeners, localization within the sagittal planes is mediated by the pinna (outer ear), which evaluates the spectral coloring of incoming waveforms at higher frequencies. Current CI systems do not provide these so-called pinna cues (or spectral cues) because of the behind-the-ear microphone placement and the processor's limited analysis-frequency range.

    While these technical limitations are relatively manageable, some fundamental questions arise:

    • What is the minimum number of channels required to encode the pinna cues relevant to vertical plane localization?
    • To what extent can CI listeners learn to localize sound sources using pinna cues that are mapped to tonotopic regions associated with lower characteristic frequencies (according to the position of typically implanted electrodes)?
    • Which modifications of stimulation strategies are required to facilitate the localization of sound sources for CI listeners?

    Application:

    The improvement of sound source localization in the front-back dimension is regarded as an important aspect in daily traffic safety.

    Funding:

    FWF (Austrian Science Fund): Project #P18401-B15

    Status:

    Finished in Sept. 2010

    Subprojects:

    • ElecRang: Effects of upper-frequency boundary and spectral warping on speech intelligibility in electrical stimulation
    • SpecSens: Sensitivity to spectral peaks and notches
    • Loca-BtE-CI: Localization with behind-the-ear microphones
    • Loca Methods: Pointer method for localizing sound sources
    • Loca#Channels: Number of channels required for median-plane localization
    • SpatStrat: Development and evaluation of a spatialization strategy for cochlear implants
    • HRTF-Sim: Numerical simulation of HRTFs
  • SpExCue: Role of spectral cues in sound externalization - objective measures & modeling


    Spatial hearing is important for continuously monitoring the environment for interesting or dangerous sounds and for directing attention to them. The spatial separation of the two ears and the complex geometry of the human body provide acoustic information about the location of a sound source. Depending on the direction of sound incidence, the pinna in particular modifies the spectrum of a sound before it reaches the eardrum. Since the shape of the pinna is highly individual (even more so than a fingerprint), its spectral filtering is also highly individual. For the artificial creation of realistic auditory percepts, this individuality must be represented as precisely as necessary, yet it has remained unclear what is actually necessary. SpExCue therefore investigated electrophysiological measures and predictive models of how spatially realistic (“externalized”) a virtual sound source is perceived to be.

    Since artificial sources are preferentially perceived inside the head, studying these sound spectra was also suited to investigating a bias in auditory perception: sound events approaching a listener are perceived more intensely than those receding from the listener. Previous studies had demonstrated this bias exclusively via loudness changes (increasing/decreasing loudness was used to simulate approaching/receding sound events). It was therefore unclear whether the bias truly reflects perceptual differences with respect to the direction of motion or merely the different loudness levels. Our study demonstrated that spatial changes in spectral coloration can evoke this bias (both behaviorally and electrophysiologically) even at constant loudness, so a general perceptual bias can be assumed.

    Furthermore, SpExCue investigated how the combination of different spatial auditory cues influences attentional control in a speech-recognition task with concurrent talkers, as at a cocktail party. We found that natural combinations of spatial auditory cues evoke more brain activity in preparation for the test signal, thereby optimizing the neural processing of the upcoming speech.

    SpExCue also compared different computational modeling approaches that aim to predict the spatial perception of sound changes. Although many previous experimental results could be predicted by at least one of the modeling approaches, none of them could explain all of these results. To support the future development of more general computational models of spatial hearing, we finally developed a conceptual cognitive model of it.

    Funding

    Erwin Schrödinger Fellowship from the Austrian Science Fund (FWF, J3803-N30), awarded to Robert Baumgartner. Duration: May 2016 - November 2017.

    Follow-up funding provided by Facebook Reality Labs since March 2018. Principal Investigator: Robert Baumgartner.

    Publications

    • Baumgartner, R., Reed, D.K., Tóth, B., Best, V., Majdak, P., Colburn H.S., Shinn-Cunningham B. (2017): Asymmetries in behavioral and neural responses to spectral cues demonstrate the generality of auditory looming bias, in: Proceedings of the National Academy of Sciences of the USA 114, 9743-9748. (article)
    • Baumgartner, R., Majdak, P., Colburn H.S., Shinn-Cunningham B. (2017): Modeling Sound Externalization Based on Listener-specific Spectral Cues, presented at: Acoustics ‘17 Boston: The 3rd Joint Meeting of the Acoustical Society of America and the European Acoustics Association. Boston, MA, USA. (conference)
    • Deng, Y., Choi, I., Shinn-Cunningham, B., Baumgartner, R. (2019): Impoverished auditory cues limit engagement of brain networks controlling spatial selective attention, in: Neuroimage 202, 116151. (article)
    • Baumgartner, R., Majdak, P. (2019): Predicting Externalization of Anechoic Sounds, in: Proceedings of ICA 2019. (proceedings)
    • Majdak, P., Baumgartner, R., Jenny, C. (2019): Formation of three-dimensional auditory space, in: arXiv:1901.03990 [q-bio]. (preprint)