Project

  • Objective:

    Head-related transfer functions (HRTFs) describe the sound transmission from the free field to a place in the ear canal in terms of linear time-invariant systems. Because listeners' outer ears differ physiologically, measuring each subject's individual HRTFs is crucial for sound localization in virtual environments (virtual reality).

    Measurement of an HRTF can be considered a system identification of the weakly non-linear electro-acoustic chain: sound source, room, HRTF, microphone. An optimized formulation of system identification with exponential sweeps, called the "multiple exponential sweep method" (MESM), was used to measure the transfer functions. With this method, either the measurement duration or the signal-to-noise ratio can be optimized.

    Initial heuristic experiments have shown that using Gabor multipliers to extract the relevant sweeps in the MESM post-processing procedure improves the signal-to-noise ratio of the measured data even further. The objective of this project is to study, in detail, how frame multipliers can optimally be used during this post-processing procedure. In particular, wavelet frames, which best fit the structure of an exponential sweep, will be studied.

    Method:

    Systematic numeric experiments will be conducted with simulated slowly time-variant, weakly non-linear systems. As the parameters of the involved signals are precisely known and controlled, an optimal symbol will automatically be created. Finally, the efficiency of the new method will be tested on a "real world" system, which was developed and installed in the semi-anechoic room of the Institute. It uses in-ear microphones, a subject turntable, 22 loudspeakers on a vertical arc, and a head tracker.

    Application:

    The new method will be used for improved HRTF measurement.

  • Description:

    A formant synthesizer based on the Klatt synthesizer is being implemented. It can be used to generate stationary vowels as well as time-variant formant and fundamental-frequency tracks. It is implemented as an SP-Atom.
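
    As a rough illustration (not the SP-Atom implementation itself), Klatt-type formant synthesizers are built from second-order digital resonators, one per formant. The sketch below cascades such resonators and excites them with an impulse train. The function names, the impulse-train source, and the formant and bandwidth values are assumptions chosen for illustration only.

```python
import numpy as np

def resonator_coefficients(freq_hz, bandwidth_hz, fs):
    """Coefficients of the two-pole resonator y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    t = 1.0 / fs
    c = -np.exp(-2.0 * np.pi * bandwidth_hz * t)
    b = 2.0 * np.exp(-np.pi * bandwidth_hz * t) * np.cos(2.0 * np.pi * freq_hz * t)
    a = 1.0 - b - c  # normalizes the gain at 0 Hz to one
    return a, b, c

def resonate(x, freq_hz, bandwidth_hz, fs):
    """Filter x through one formant resonator (direct recursion)."""
    a, b, c = resonator_coefficients(freq_hz, bandwidth_hz, fs)
    y = np.zeros(len(x))
    y1 = y2 = 0.0
    for n, xn in enumerate(x):
        y[n] = a * xn + b * y1 + c * y2
        y2, y1 = y1, y[n]
    return y

def synthesize_vowel(formants_hz, bandwidths_hz, f0_hz, duration_s, fs):
    """Stationary vowel: impulse train at f0 passed through cascaded resonators."""
    n = int(round(duration_s * fs))
    source = np.zeros(n)
    source[::int(round(fs / f0_hz))] = 1.0  # crude glottal source (assumption)
    out = source
    for f, bw in zip(formants_hz, bandwidths_hz):
        out = resonate(out, f, bw, fs)
    return out

# Illustrative values only; they are not taken from the project description.
vowel = synthesize_vowel([700.0, 1200.0, 2600.0], [80.0, 90.0, 120.0], 120.0, 0.3, 16000)
```

    For time-variant formant or fundamental-frequency tracks, the same recursion could be used with coefficients recomputed for each frame.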

    Application:

    The synthesis will be integrated as a control tool into the applications Viewer2 (spectrogram and parameter plot) and SPEXL (segmentation tool). For this purpose, a graphical control interface will be implemented that provides suitable functions for entering formant data (vowel synthesis) and for graphically selecting parameter sets (resynthesis of parameter tracks).

  • General Information

    Funded by the Vienna Science and Technology Fund (WWTF) within the "Mathematics and …2016" Call (MA16-053)

    Principal Investigator: Georg Tauböck

    Co-Principal Investigator: Peter Balazs

    Project Team: Günther Koliander, José Luis Romero  

    Duration: 01.07.2017 – 01.07.2021

    Abstract

    Signal processing is a key technology that forms the backbone of important developments like MP3, digital television, mobile communications, and wireless networking and is thus of exceptional relevance to economy and society in general. The overall goal of the proposed project is to derive highly efficient signal processing algorithms and to tailor them to dedicated applications in acoustics. We will develop methods that are able to exploit structural properties in infinite-dimensional signal spaces, since typically ad hoc restrictions to finite dimensions do not sufficiently preserve physically available structure. The approach adopted in this project is based on a combination of the powerful mathematical methodologies frame theory (FT), compressive sensing (CS), and information theory (IT). In particular, we aim at extending finite-dimensional CS methods to infinite dimensions, while fully maintaining their structure-exploiting power, even if only a finite number of variables are processed. We will pursue three acoustic applications, which will strongly benefit from the devised signal processing techniques, i.e., audio signal restoration, localization of sound sources, and underwater acoustic communications. The project is set up as an interdisciplinary endeavor in order to leverage the interrelations between mathematical foundations, CS, FT, IT, time-frequency representations, wave propagation, transceiver design, the human auditory system, and performance evaluation.

    Keywords

    compressive sensing, frame theory, information theory, signal processing, super resolution, phase retrieval, audio, acoustics


     

  • Objective:

    So-called Gabor multipliers are particular cases of time-variant filters. Recently, Gabor systems on irregular grids have become a popular research topic. This project deals with Gabor multipliers, as a specialization of frame multipliers on irregular grids.

    Method:

    The initial stage of this project aims to investigate the continuous dependence of an irregular Gabor multiplier on its parameter (i.e. the symbol), window, and lattice. Furthermore, an algorithm to find the best approximation of any matrix (i.e. any time-variant system) by such an irregular Gabor multiplier is being developed.
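
    The best-approximation problem can be sketched in finite dimensions as follows. This is only an illustration under stated assumptions, not the algorithm developed in the project: the multiplier is written as a weighted sum of rank-one operators g_k g_k^H, and the symbol is found by least squares in the Hilbert-Schmidt (Frobenius) inner product. The cyclic shifts, the Gaussian window, the point set, and all function names are hypothetical choices.

```python
import numpy as np

def gabor_atoms(window, points):
    """Row k is g_k[t] = exp(2*pi*i*b*t/n) * window[(t - a) mod n] for the k-th point (a, b)."""
    n = len(window)
    t = np.arange(n)
    return np.array([np.exp(2j * np.pi * b * t / n) * np.roll(window, a) for a, b in points])

def multiplier_matrix(symbol, atoms):
    """Matrix of the multiplier sum_k symbol[k] * g_k g_k^H."""
    return (atoms.T * symbol) @ atoms.conj()

def best_symbol(target, atoms):
    """Least-squares symbol: solve the normal equations gram @ symbol = rhs."""
    # <P_j, P_k>_HS = |<g_j, g_k>|^2 for the rank-one operators P_k = g_k g_k^H
    gram = (np.abs(atoms.conj() @ atoms.T) ** 2).astype(complex)
    # <target, P_k>_HS = g_k^H target g_k
    rhs = np.einsum("ki,ij,kj->k", atoms.conj(), target, atoms)
    symbol, *_ = np.linalg.lstsq(gram, rhs, rcond=None)
    return symbol

n = 16
window = np.exp(-np.pi * (np.arange(n) - n / 2) ** 2 / n)
# Irregular time-frequency points (time shift, frequency shift); illustrative only.
points = [(0, 0), (3, 1.5), (5, 6), (9, 2.2), (12, 7), (14, 4)]
atoms = gabor_atoms(window, points)

rng = np.random.default_rng(0)
target = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
symbol = best_symbol(target, atoms)
approximation = multiplier_matrix(symbol, atoms)
```

    Using `lstsq` rather than a direct inverse keeps the sketch usable when the rank-one operators are linearly dependent, in which case the best symbol is not unique.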

    Application:

    Gabor multipliers have been used implicitly for quite some time. Investigating the properties of these operators is a current topic for signal processing engineers. If the standard time-frequency grid is not useful to the application, it is natural to work with irregular grids. An example of this is the use of non-linear frequency scales, such as the Bark scale.

    Partners:

    H. G. Feichtinger, NuHAG, Faculty of Mathematics, University of Vienna

    Project-completion:

    This project ended on 28.02.2008 and is incorporated into the 'High Potential'-Project of the WWTF, MULAC (WWTF 2007).

  • Objective:

    General frame theory can be more specialized if a structure is imposed on the elements of the frame in question. One possible, very natural structure is sequences of shifts of the same function. In this project, irregular shifts are investigated.

    Method:

    In this project, the connection to irregular Gabor multipliers will be explored. Using the Kohn-Nirenberg correspondence, the space spanned by Gabor multipliers is just a space spanned by translates. Furthermore, the special connection between the Gramian function and the Gram matrix in this case will be investigated.

    Application:

    A typical example of frames of translates is filter banks, which have constant shapes. For example, the phase vocoder corresponds to a filter bank with regular shifts. Introducing an irregular shift gives rise to a generalization of this analysis / synthesis system.

    Partners:

    • S. Heineken, Research Group on Real and Harmonic Analysis, University of Buenos Aires
  • Objective:

    An irrelevance algorithm based on simultaneous masking is implemented in STx. In the years following its first development by Eckel, the efficiency of this algorithm has been clearly shown. In this project, the irrelevance model will be based on modern mathematical and psychoacoustic theories and knowledge.

    Method:

    This algorithm can be described as a Gabor multiplier with an adaptive symbol. With existing related theory, it becomes clear that a high redundancy must be selected. This guarantees:

    • perfect reconstruction synthesis
    • an under-spread operator for good time-frequency localization
    • a smoothing-out of easily detectable quick on/off cycles

    Furthermore, it can be shown that the model used for the spreading function here is mathematically equivalent to the excitation pattern.

    Application:

    This algorithm has been used for several years in applications such as:

    • automobile sound design
    • over-masking for background-foreground separation
    • improved speech recognition in noise
    • contrast increase for hearing-impaired persons

    Partners:

    • G. Eckel, Institut für Elektronische Musik und Akustik, Graz

    Publications:

    • P. Balazs, B. Laback, G. Eckel, W. Deutsch, "Introducing Time-Frequency Sparsity by Removing Perceptually Irrelevant Components Using a Simple Model of Simultaneous Masking", IEEE Transactions on Audio, Speech and Language Processing, Vol. 17 (7), in press (2009), preprint

    Project-completion:

    This project ended on 01.01.2010, and leads to a sub-project of the 'High Potential'-Project of the WWTF, MULAC.

  • ITD MultEl: Binaural-Timing Sensitivity in Multi-Electrode Stimulation

    Binaural hearing is extremely important in everyday life, most notably for sound localization and for understanding speech embedded in competing sound sources (e.g., other speech sources). While bilateral implantation has been shown to provide cochlear implant (CI) listeners with some basic left/right localization ability, the performance with current CI systems is clearly reduced compared to normal hearing. Moreover, the binaural advantage in speech understanding in noise has been shown to be mediated mainly by the better-ear effect, while there is only very little binaural unmasking.

    There exists now a body of literature on binaural sensitivity of CI listeners stimulated at a single interaural electrode pair. However, the CI listener’s sensitivity to binaural cues under more realistic conditions, i.e., with stimulation at multiple electrodes, has not been systematically addressed in depth so far.

    This project attempts to fill this gap. In particular, given the high perceptual importance of interaural time differences (ITDs), this project focuses on the systematic investigation of ITD sensitivity under various conditions of multi-electrode stimulation, including interference from neighboring channels, integration of ITD information across channels, and the perceptually tolerable room for degradations of binaural timing information.


    Start: January 2013

    Duration: 3 years

    Funding: MED-EL

  • Bilateral Cochlear Implants: Physiology and Psychophysics

    Current cochlear implants (CIs) are very successful in restoring speech understanding in individuals with profound or complete hearing loss by electrically stimulating the auditory nerve. However, the ability of CI users to localize sound sources and to understand speech in complex listening situations, e.g. with interfering speakers, is dramatically reduced compared to normal-hearing (acoustically hearing) listeners. From acoustic hearing studies it is known that interaural time difference (ITD) cues are essential for sound localization and speech understanding in noise. Users of current bilateral CI systems are, however, rather limited in their ability to perceive salient ITD cues. One particular problem is that their ITD sensitivity is especially low when stimulating at the relatively high pulse rates that are required for proper encoding of speech signals.

    In this project we combine psychophysical studies in human bilaterally implanted listeners and physiological studies in bilaterally implanted animals to find ways to improve ITD sensitivity in electric hearing. We build on the previous finding that ITD sensitivity can be enhanced by introducing temporal jitter (Laback and Majdak, 2008) or short inter-pulse intervals (Hancock et al., 2012) in high-rate pulse sequences. Physiological experiments, performed at the Eaton-Peabody Laboratories Neural Coding Group (Massachusetts Eye and Ear Infirmary, Harvard Medical School, PI: Bertrand Delgutte), are combined with matched psychoacoustic experiments, performed at the EAP group of ARI (PI: Bernhard Laback). The main project milestones are the following:

    • Aim 1: Effects of auditory deprivation and electric stimulation through CIs on neural ITD sensitivity. Physiological experiments study whether chronic CI stimulation can reverse the effect of neonatal deafness on neural ITD sensitivity.

    • Aim 2: Improving the delivery of ITD information with high-rate strategies for CI processors.

      A. Improving ITD sensitivity at high pulse rates by introducing short inter-pulse intervals.

      B. Using short inter-pulse intervals to enhance ITD sensitivity with "pseudo-syllable" stimuli.

    Co-operation partners:

    • External: Eaton-Peabody Laboratories Neural Coding Group of the Massachusetts Eye and Ear Infirmary at Harvard Medical School (PI: Bertrand Delgutte)

    • Internal: Mathematics and Signal Processing in Acoustics

    Funding:

    • This project is funded by the National Institutes of Health (NIH): http://grantome.com/grant/NIH/R01-DC005775-11A1

    • It is planned to run from 2014 to 2019.

    Press information:

    • Article in DER STANDARD: http://derstandard.at/2000006635467/OeAW-und-Harvard-Medical-School-forschenCochleaimplantaten

    • Article in DIE PRESSE: http://diepresse.com/home/science/3893396/Eine-Prothese-die-in-der-Horschnecke-sitzt

    • OEAW website: http://www.oeaw.ac.at/oesterreichische-akademie-der-wissenschaften/news/article/transatlantische-hoerhilfe/


    See Also

    ITD MultEl

  • The aim of this project is to maintain the experimental facilities in our institute's laboratory.

    The lab consists of four testing places:

    • GREEN and BLUE: Two sound-booths (IAC-1202A) are used for audio recording and psychoacoustic testing performed with headphones. Each of the booths is controlled from outside by a computer. Two bidirectional audio channels with sampling rates up to 192 kHz are available.
    • RED: A visually-separated corner can be used for experiments with cochlear implant listeners. A computer controls the experimental procedure using a bilateral, direct-electric stimulation.
    • YELLOW: A semi-anechoic room, measuring 6 x 6 x 3 m, can be used for acoustic tests and measurements in a nearly free field. As many as 24 bidirectional audio channels, virtual environments generated by a head-mounted display, and audio and video surveillance are available for projects like HRTF measurement, localization tests or acoustic holography.

    The rooms are used not only for measurements and experiments; the Acoustic Phonetics group also makes speech recordings there for dialect research and speaker identification, for example for survey reports. The facilities are also used for psychoacoustic validations.

    During the breaks in experiments, the subjects can use an Internet terminal or relax on a couch while sipping hot coffee...

  • Introduction

    Rumble strips are (typically periodic) grooves placed at the side of the road. When a vehicle passes over a rumble strip, the noise and vibration in the car should alert the driver to the imminent danger of running off the road. Rumble strips have been shown to have a positive effect on traffic safety. Unfortunately, their use in the close vicinity of populated areas is problematic due to the increased noise burden.

    Aims

    The aim of the project LARS (LärmArme RumpelStreifen or low noise rumble strips) was to find rumble strip designs that cause less noise in the environment without significantly affecting the alerting effect inside the vehicle. For this purpose, a number of conventional designs as well as three alternative concepts were investigated: conical grooves to guide the noise under the car, pseudo-random groove spacing to reduce tonality and thus annoyance, as well as sinusoidal depth profiles which should produce mostly vibration and only little noise and which are already used in practice.

    Methods

    Two test tracks covering a range of different milling patterns were established in order to measure the effects of rumble strips on a car and a commercial vehicle running over them. Acoustic measurements using microphones and a head-and-torso simulator were made inside the vehicle as well as in the surroundings of the track. Furthermore, the vibrations of the steering wheel and the driver's seat were measured. Based on the acoustic measurements, synthetic rumble strip noises were produced in order to cover a wider range of possible rumble strip designs than measurements alone would allow.

    Perception tests were performed with 16 listeners. Using both the measured and the synthetic stimuli, the tests determined the annoyance of the noise immissions in the surroundings as well as the perceived urgency and the reaction times for the sounds generated in the vehicle interior.

    LARS was funded by the FFG (project 840515) and the ASFINAG. The project was done in cooperation with the Research Center of Railway Engineering, Traffic Economics and Ropeways, Institute of Transportation, Vienna University of Technology, and ABF Strassensanierungs GmbH.

  • Objective:

    The aim of this study is to investigate the phonetics of second language acquisition and first language attrition, based on the acoustic and articulatory lateral realizations of Bosnian migrants living in Vienna. Bosnian has two lateral phonemes (a palatalized and an alveolar/velarized one), whereas Standard Austrian German (SAG) features only one lateral phoneme (an alveolar lateral). In the Viennese dialect (Vd), however, this phoneme also has a velarized variant.

    This phonetic investigation will be conducted with respect to the influence of language contact between Bosnian and SAG, and Bosnian and the Viennese dialect, as well as concerning the influence of gender and identity construction.

    Method:

    The recordings will be conducted with female and male Bosnian speakers, aged between 20 and 35 years at the time of emigration, who came to Vienna during the Bosnian war (1992-1995). Additionally, control groups of monolingual L1 speakers of Bosnian, SAG and Vd will be recorded. All recordings will include reading tasks in order to elicit controlled speech, as well as spontaneous speech in the form of biographical interviews. The analyses will comprise quantitative and qualitative aspects. Quantitatively, the acoustic parameters formant frequencies (especially F2 and F3), duration and intensity of the laterals and their phonetic surroundings will be analyzed. Additionally, articulatory analyses will be performed using electropalatography (EPG) and ultrasound tongue imaging (UTI) data. Qualitatively, biographical information, language attitudes and social networks will be analyzed in order to obtain information about speaker-specific or group-specific characteristics.

    Application:

    The results of this study are relevant to understanding the processes of sound realization and sound change in the domains of language contact (phonetic processes in second language acquisition and first language attrition), sociolinguistics, and the sociology of identity construction.

  • Description

     

    We thank the FWF for funding this project under the number I 4299-N32.

    Sound source localization methods are widely used in the automotive, railway, and aviation industries. Many different methods are available for the analysis of stationary sound sources. Suitable methods for moving sound sources still struggle with the problems of Doppler shift, comparatively short measurement times, and propagation effects caused by the surrounding atmosphere. The LION project combines the expertise of four research groups from three different countries in the field of sound source localization: Beuth Hochschule für Technik Berlin (Beuth), the Chair of Turbomachinery and Thermoacoustics at TU Berlin (TUB), the Acoustics Research Institute (ARI) of the Austrian Academy of Sciences, and the Swiss Laboratory for Acoustics / Noise Control of EMPA. These institutions cooperate to extend and improve the existing methods for the analysis of moving sound sources. The aim is to extend the dynamic range and to increase the spatial and frequency resolution. The new methods will be applied to complex problems such as the analysis of tonal sources with strong directivity or of coherent, spatially distributed sources.

     

    Methods

    The partners will jointly develop and validate the methods and exploit the synergies that arise from this constellation of partners. Beuth plans to extend the equivalent source method in the frequency domain to moving sources in a half-space, taking into account the influence of the ground and of sound propagation in the atmosphere. ARI contributes acoustic holography, principal component analysis, and independent component analysis, and intends to use these together with its expertise on passing trains to improve numerical boundary element methods, including the transformation from the stationary to the moving reference frame. TUB develops optimization methods and model-based approaches for the localization of moving sound sources and brings to the project an extensive database of fly-over measurements recorded with a large number of microphones. EMPA adds its expertise in sound propagation modeling with atmospheric turbulence and ground effects based on time-variant digital filters. It will also set up a synthetic test case for validating the extended and improved sound localization methods. The project is planned to run for three years. The work program is organized into four work packages: 1) development of the algorithms and models, 2) development of a virtual test environment for the methods, 3) simulation of scenarios in the virtual test environment, and 4) application of the improved and extended methods to existing microphone measurements of trains and aircraft.

     

  • Objective and Method:

    Current cochlear implant (CI) systems are not designed for sound localization in the sagittal planes (front-back and up-down dimensions). Nevertheless, some of the spectral cues that are important for sagittal-plane localization in normal-hearing (NH) listeners might be audible for CI listeners. Here, we studied 3-D localization with bilateral CI listeners using "clinical" CI systems and with NH listeners. Noise sources were filtered with subject-specific head-related transfer functions, and a structured virtual environment was presented via a head-mounted display to provide feedback for learning.

    Results:

    The CI listeners generally performed worse than NH listeners, both in the horizontal and vertical dimensions. The localization error decreased with increasing training duration. The front/back confusion rate of trained CI listeners was comparable to that of untrained (naive) NH listeners and twice as high as that of the trained NH listeners.

    Application:

    The results indicate that some spectral localization cues are available to bilateral CI listeners, even though the localization performance is much worse than for NH listeners. These results clearly show the need for new strategies to encode spectral localization cues for CI listeners, and thus improve sagittal plane localization. Front-back discrimination is particularly important in traffic situations.

    Funding:

    FWF (Austrian Science Fund): Project # P18401-B15

    Publications:

    • Majdak, P., Goupell, M., and Laback, B. (2011). Two-Dimensional Localization of Virtual Sound Sources in Cochlear-Implant Listeners, Ear & Hearing.
    • Majdak, P., Laback, B., and Goupell, M. (2008). 3D-localization of virtual sound sources in normal-hearing and cochlear-implant listeners, presented at Acoustics '08  (ASA-EAA joint) conference, Paris
  • Objective:

    Humans' ability to localize sound sources in a 3-D space was tested.

    Method:

    The subjects listened to noises filtered with subject-specific head-related transfer functions (HRTFs). In the first experiment, with new subjects, the conditions included the type of visual environment (darkness or structured virtual world) presented via a head-mounted display (HMD) and the pointing method (head or finger/shooter pointing).

    Results:

    The results show that the errors in the horizontal dimension were smaller when head pointing was used. Finger/shooter pointing showed smaller errors in the vertical dimension. Generally, the difference between the two pointing methods was significant but small. The presence of a structured virtual visual environment significantly improved the localization accuracy in all conditions. This supports the idea that using a visual virtual environment in acoustic tasks, like sound localization, is beneficial. In Experiment II, the subjects were trained before performing acoustic tasks for data collection. The performance improved for all subjects over time, which indicates that training is necessary to obtain stable results in localization experiments.

    Funding:

    FWF (Austrian Science Fund): Project # P18401-B15

    Publications:

    • Majdak, P., Goupell, M., and Laback, B. (2010). 3-D localization of virtual sound sources: effects of visual environment, pointing method, and training, Attention, Perception, & Psychophysics 72, 454-469.
    • Majdak, P., Laback, B., Goupell, M., and Mihocic M. (2008). "The Accuracy of Localizing Virtual Sound Sources: Effects of Pointing Method and Visual Environment", presented at AES convention, Amsterdam.
  • Localization of sound sources is an important task of the human auditory system and much research effort has been put into the development of audio devices for virtual acoustics, i.e. the reproduction of spatial sounds via headphones. Even though the process of sound localization is not completely understood yet, it is possible to simulate spatial sounds via headphones by using head-related transfer functions (HRTFs). HRTFs describe the filtering of the incoming sound due to head, torso and particularly the pinna and thus they strongly depend on the particular details in the listener's geometry. In general, for realistic spatial-sound reproduction via headphones, the individual HRTFs must be measured. As of 2012, the available HRTF acquisition methods were acoustic measurements: a technically-complex process, involving placing microphones into the listener's ears, and lasting for tens of minutes.

    In LocaPhoto, we were working on an easily accessible method to acquire and evaluate listener-specific HRTFs. The idea was to numerically calculate HRTFs based on a geometrical representation of the listener (3-D mesh) obtained from 2-D photos by means of photogrammetric reconstruction.

    As a result, we have developed a software package for numerical HRTF calculations, a method for geometry acquisition, and models able to evaluate HRTFs in terms of broadband ITDs and sagittal-plane sound localization performance.

     

    Further information:

    http://www.kfs.oeaw.ac.at/LocaPhoto

     

  • Objective:

    It is known in psychoacoustics that not all information contained in a "real world" acoustic signal is processed by the human auditory system. More precisely, it turns out that some time-frequency components mask (overshadow) other components that are close in time or frequency.

    In the software S_TOOLS-STx developed by the Institute, an algorithm based on simultaneous masking has been implemented. This algorithm removes perceptually irrelevant time-frequency components. In this implementation, the model is described as a Gabor multiplier with an adaptive symbol.

    In this project, the masking model will be extended to a true time-frequency model, incorporating frequency and temporal masking.

    Method:

    Experiments have been conducted (in cooperation with the Laboratory for Mechanics and Acoustics / CNRS Marseille) to test the time-frequency masking properties of a single Gaussian atom, and to study the additivity of these masking properties for several Gaussian atoms.

    The results of these experiments will be used, in combination with theoretical results obtained in the parallel projects studying the mathematical properties of frame multipliers, to approximate or identify the masking model by wavelet and Gabor multipliers.

    The obtained model will then be validated by appropriate psychoacoustical experiments.

    Application:

    Efficient implementation of a masking filter offers many applications:

    • Sound / Data Compression
    • Sound Design
    • Background-Foreground Separation
    • Optimization of Speech and Music Perception

    After completing the testing phase, the algorithms are to be implemented in S_TOOLS-STx. 

    Publications:

    • P. Balazs, B. Laback, G. Eckel, W. Deutsch, "Introducing Time-Frequency Sparsity by Removing Perceptually Irrelevant Components Using a Simple Model of Simultaneous Masking", IEEE Transactions on Audio, Speech and Language Processing (2009), in press
    • B. Laback, P. Balazs, G. Toupin, T. Necciari, S. Savel, S. Meunier, S. Ystad and R. Kronland-Martinet, "Additivity of auditory masking using Gaussian-shaped tones", Acoustics'08, Paris, 29.06.-04.07.2008 (03.07.2008)
    • B. Laback, P. Balazs, T. Necciari, S. Savel, S. Ystad, S. Meunier and R. Kronland-Martinet, "Additivity of auditory masking for Gaussian-shaped tone pulses", preprint
  • Objective:

    This project is part of a project cluster that investigates time-frequency masking in the auditory system, in cooperation with the Laboratory for Mechanics and Acoustics / CNRS Marseille. While other subprojects study the spread of masking across the time-frequency plane using Gaussian-shaped tones, this subproject investigates how multiple Gaussian maskers distributed across the time-frequency plane create masking that adds up at a given time-frequency point. This question is important in determining the total masking effect resulting from the multiple time-frequency components (that can be modeled as Gaussian Atoms) of a real-life signal.

    Method:

    Both the maskers and the target are Gaussian-shaped tones with a frequency of 4 kHz. A two-stage approach is applied to measure the additivity of auditory masking. In the first stage, the levels of the maskers are adjusted to cause the same amount of masking in the target. In the second stage, various combinations of those maskers are tested to study their additivity.

    In the first study, the maskers are spread either in time OR in frequency. In the second study, the maskers are spread in time AND in frequency.

    Application:

    New insight into the coding of sound in the auditory system could help to design more efficient audio codecs. These codecs could take the additivity of time-frequency masking into account.

    Funding:

    WTZ (project AMADEUS)

    Publications:

    • Laback, B., Balazs, P., Toupin, G., Necciari, T., Savel, S., Meunier, S., Ystad, S., Kronland-Martinet, R. (2008). Additivity of auditory masking using Gaussian-shaped tones, presented at Acoustics '08 conference, Paris.
  • Objective:

    Many problems in physics, such as differential or integral equations, can be formulated as operator theory problems. To be treated numerically, the operators must be discretized. One way to achieve discretization is to find (possibly infinite) matrices describing these operators using orthonormal bases (ONBs). In this project, we will use frames to investigate a way to describe an operator as a matrix.

    Method:

    The standard matrix description of an operator O using an ONB (e_k) involves constructing a matrix M with the entries M_{j,k} = <O e_k, e_j>. In past publications, a concept that describes an operator R in a very similar way has been presented; that description, however, uses a frame and its canonical dual. Currently, a similar representation is being used for the description of operators using Gabor frames. In this project, we are going to develop and completely generalize this idea for Bessel sequences, frames, and Riesz sequences. We will also look at the dual function that assigns an operator to a matrix.
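
    A small finite-dimensional sketch may help to make the idea concrete. It assumes real-valued vectors, a randomly generated frame, and its canonical dual; the variable names and the choice of which of the two sequences appears on which side are illustrative assumptions, not the project's definitions.

```python
import numpy as np

rng = np.random.default_rng(1)
dim, num_elements = 4, 7                          # 7 frame vectors in R^4 (redundant)

frame = rng.standard_normal((num_elements, dim))  # row k is the frame element f_k
frame_operator = frame.T @ frame                  # S = sum_k f_k f_k^T
dual = frame @ np.linalg.inv(frame_operator)      # row k is the canonical dual S^{-1} f_k

operator = rng.standard_normal((dim, dim))        # the operator O to be described

# Matrix description: M[j, k] = <O f_k, dual_j>
matrix = dual @ operator @ frame.T

# Mapping back from the matrix to the operator:
# O x = sum_j ( sum_k M[j, k] <x, dual_k> ) f_j
reconstructed = frame.T @ matrix @ dual
```

    In contrast to the ONB case, the 7 x 7 matrix here is larger than the 4 x 4 operator, which reflects the redundancy of the frame.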

    Application:

    This "sampling of operators" is especially important for application areas where frames are heavily used, so that the link between model and discretization is maintained. To facilitate implementations, operator equations can be transformed into a finite, discrete problem with the finite section method (much in the same way as in the ONB case).

    Publications:

    • P. Balazs, "Matrix Representation of Operators Using Frames", Sampling Theory in Signal and Image Processing (STSIP) (2007, accepted), preprint
    • P. Balazs, "Hilbert-Schmidt Operators and Frames - Classification, Approximation by Multipliers and Algorithms", International Journal of Wavelets, Multiresolution and Information Processing (2007, accepted), preprint, Codes and Pictures: here
  • Objective:

    Head-related transfer functions (HRTFs) describe sound transmission from the free field to a place in the ear canal in terms of linear time-invariant systems. They contain spectral and temporal features that vary according to the sound direction. Because these features differ among subjects, each subject's individual HRTFs must be measured for studies on localization in virtual environments. In this project, a system for HRTF measurement was developed and installed in the semi-anechoic room at the Austrian Academy of Sciences.

    Method:

    Measurement of an HRTF was considered a system identification of the electro-acoustic chain: sound source-room-HRTF-microphone. The sounds in the ear canals were captured using in-ear microphones. The direction of the sound source was varied horizontally by rotating the subject on a turntable, and vertically by selecting one of the 22 loudspeakers positioned in the median plane. An optimized form of system identification with sweeps, the multiple exponential sweep method (MESM), was used to measure the transfer functions with a satisfactory signal-to-noise ratio within a reasonable amount of time. Subjects' positions were tracked during the measurement to ensure sufficient measurement accuracy. Measurement of headphone transfer functions was included in the HRTF measurement procedure. This allows the influence of the headphones to be equalized during the presentation of virtual stimuli.
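
    The building block of this kind of measurement is a single exponential sweep. The sketch below does not implement MESM itself, and the sweep parameters (start frequency f1, end frequency f2, duration, sampling rate) are hypothetical, since the text does not state them. It illustrates a standard property of exponential sweeps: the k-th harmonic of the sweep is a copy of the sweep shifted by a constant time, so harmonic distortion products of a weakly non-linear system can be separated in time from the linear response after deconvolution.

```python
import math

def sweep_rate(f1, f2, duration):
    """Time constant L = T / ln(f2 / f1) of an exponential sweep."""
    return duration / math.log(f2 / f1)

def exponential_sweep(f1, f2, duration, fs):
    """Samples of sin(2*pi*f1*L*(exp(t/L) - 1)) for t in [0, duration)."""
    L = sweep_rate(f1, f2, duration)
    n = int(round(duration * fs))
    return [math.sin(2.0 * math.pi * f1 * L * (math.exp(i / fs / L) - 1.0))
            for i in range(n)]

def instantaneous_frequency(t, f1, f2, duration):
    """Instantaneous frequency f1 * exp(t / L) of the sweep at time t."""
    return f1 * math.exp(t / sweep_rate(f1, f2, duration))

def harmonic_advance(k, f1, f2, duration):
    """Constant time shift L * ln(k) between the sweep and its k-th harmonic."""
    return sweep_rate(f1, f2, duration) * math.log(k)

# Hypothetical parameters, chosen only for illustration.
f1, f2, duration = 50.0, 20000.0, 1.5
shift = harmonic_advance(2, f1, f2, duration)
t0 = 0.3
# The sweep reaches twice the frequency it had at t0 exactly `shift` seconds later,
# independent of t0.
print(instantaneous_frequency(t0 + shift, f1, f2, duration),
      2.0 * instantaneous_frequency(t0, f1, f2, duration))
```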

    Results:

    Multi-channel audio equipment has been installed in the semi-anechoic room, giving access to recording and stimuli presentation via 24 channels simultaneously.

    The multiple exponential sweep method was developed, allowing fast transfer function measurement of weakly non-linear, time-invariant systems for multiple sources.

    The measurement procedure was developed and a database of HRTFs was created. To date, HRTFs of over 200 subjects have been published (see http://sofacoustics.org/data/database/ari/). The HRTFs can be used to create virtual stimuli and present them binaurally via headphones.

    To virtually position sounds in space, the HRTFs are used for filtering free-field sounds. This results in virtual acoustic stimuli (VAS). To create VAS and present them via headphones, applications called Virtual Sound Positioning (VSP) and Loca (part of our ExpSuite Software Project) have been implemented. They allow virtual sound positioning in a free-field environment using both stationary and moving sound sources.
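
    The filtering step can be sketched as a convolution of a mono signal with the left-ear and right-ear impulse responses for one direction. This is a minimal illustration only, not the VSP or Loca implementation: the function names are hypothetical, and the short impulse responses are made-up toy values rather than measured HRTF data.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def render_virtual_stimulus(mono, hrir_left, hrir_right):
    """Filter a mono free-field signal with the impulse responses of both ears."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)

# Toy impulse responses (not measured data): the right ear receives the sound
# two samples later and attenuated, as for a source on the listener's left.
hrir_left = [1.0, 0.5]
hrir_right = [0.0, 0.0, 0.8, 0.4]
mono = [1.0, 0.0, -1.0]

left, right = render_virtual_stimulus(mono, hrir_left, hrir_right)
print(left)
print(right)
```

    For a moving source, the impulse-response pair would have to change over time. That is beyond this sketch.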

  • Objective:

    In this project, head-related transfer functions (HRTFs) are measured and prepared for localization tests with cochlear implant listeners. The method and apparatus used for the measurement are the same as those used for the general HRTF measurement (see project HRTF-System); however, the place where sound is acquired is different. In this project, the microphones built into the behind-the-ear (BtE) processors of cochlear implantees are used. The processors are located on the pinna, and the unprocessed microphone signals are used to calculate the BtE-HRTFs for different spatial positions.

    The BtE-HRTFs are then used in localization tests like Loca BtE-CI.