• Measurement of Head-Related Transfer Functions (HRTFs)


    Head-related transfer functions (HRTFs) describe sound transmission from the free field to a place in the ear canal in terms of linear time-invariant systems. They contain spectral and temporal features that vary with the sound direction. Because these features differ among subjects, studies on localization in virtual environments require measuring each subject's individual HRTFs. In this project, a system for HRTF measurement was developed and installed in the semi-anechoic room at the Austrian Academy of Sciences.


    Measurement of an HRTF was treated as a system identification of the electro-acoustic chain: sound source, room, HRTF, microphone. The sounds in the ear canals were captured using in-ear microphones. The direction of the sound source was varied horizontally by rotating the subject on a turntable, and vertically by selecting one of the 22 loudspeakers positioned in the median plane. An optimized form of system identification with sweeps, the multiple exponential sweep method (MESM), was used to measure transfer functions with satisfactory signal-to-noise ratios within a reasonable amount of time. Subjects' positions were tracked during the measurement to ensure sufficient measurement accuracy. Measurement of headphone transfer functions was included in the HRTF measurement procedure. This allows equalization of the headphone influence during the presentation of virtual stimuli.
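
    The system-identification idea can be illustrated with a minimal sketch. It is not the measurement software used in this project: the sampling rate, the sweep range, the toy two-tap impulse response standing in for the electro-acoustic chain, and the function names `exp_sweep` and `identify` are all assumptions made for illustration. The sketch excites a known system with a single exponential sweep and recovers its impulse response by regularized spectral division.

```python
import numpy as np

def exp_sweep(f1, f2, duration, fs):
    """Exponential sine sweep from f1 to f2 Hz."""
    t = np.arange(int(duration * fs)) / fs
    rate = duration / np.log(f2 / f1)            # sweep time constant in seconds
    return np.sin(2 * np.pi * f1 * rate * (np.exp(t / rate) - 1.0))

def identify(x, y, n_taps, eps=1e-10):
    """Estimate an impulse response from excitation x and response y."""
    n = len(x) + len(y)                          # zero-pad to avoid circular wrap-around
    X = np.fft.rfft(x, n)
    Y = np.fft.rfft(y, n)
    # Regularized division Y / X; eps keeps bins with little sweep energy stable.
    H = Y * np.conj(X) / (np.abs(X) ** 2 + eps * np.max(np.abs(X)) ** 2)
    return np.fft.irfft(H, n)[:n_taps]

fs = 8000                                        # assumed sampling rate in Hz
x = exp_sweep(50.0, 3500.0, 1.0, fs)

# Toy system (assumption): direct sound at sample 40, one reflection at sample 100.
h_true = np.zeros(128)
h_true[40], h_true[100] = 0.5, 0.2

y = np.convolve(x, h_true)                       # simulated microphone signal
h_est = identify(x, y, len(h_true))
```

    In a real measurement, `y` would be the recorded in-ear microphone signal and the estimate would additionally contain noise and the responses of the loudspeaker and the room.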


    Multi-channel audio equipment has been installed in the semi-anechoic room, giving access to recording and stimuli presentation via 24 channels simultaneously.

    The multiple exponential sweep method was developed, allowing fast transfer function measurement of weakly nonlinear time-invariant systems for multiple sources.

    The measurement procedure was developed and a database of HRTFs was created. To date, HRTFs of over 200 subjects have been published. The HRTFs can be used to create virtual stimuli and present them binaurally via headphones.

    To virtually position sounds in space, the HRTFs are used for filtering free-field sounds. This results in virtual acoustic stimuli (VAS). To create VAS and present them via headphones, applications called Virtual Sound Positioning (VSP) and Loca (part of our ExpSuite Software Project) have been implemented. They allow virtual sound positioning in a free-field environment using both stationary and moving sound sources.
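
    The filtering step can be sketched as follows. This is a minimal illustration, not the VSP or Loca implementation: the function name `render_binaural` and the toy impulse responses (a pure delay and attenuation at the right ear) are assumptions, whereas real head-related impulse responses would come from the measured database.

```python
import numpy as np

def render_binaural(mono, hrir_left, hrir_right):
    """Filter a free-field (mono) signal with a left/right impulse response pair."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right], axis=1)       # shape: (samples, 2)

# Toy impulse responses (assumption) for a source on the listener's left:
# the right ear receives the sound 30 samples later and attenuated.
hrir_left = np.zeros(64)
hrir_left[0] = 1.0
hrir_right = np.zeros(64)
hrir_right[30] = 0.6

click = np.zeros(256)
click[0] = 1.0
stimulus = render_binaural(click, hrir_left, hrir_right)
```

    For a moving source, the impulse response pair would have to change over time, for example by processing short blocks with the responses of successive directions and cross-fading between them.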

  • Measurement of HRTFs for the Project CI-HRTF


    In this project, head-related transfer functions (HRTFs) are measured and prepared for localization tests with cochlear implant listeners. The method and apparatus used for the measurement are the same as those used for the general HRTF measurement (see project HRTF-System); however, the place where sound is acquired is different. In this project, the microphones built into the behind-the-ear (BtE) processors of cochlear implantees are used. The processors are located on the pinna, and the unprocessed microphone signals are used to calculate the BtE-HRTFs for different spatial positions.

    The BtE-HRTFs are then used in localization tests like Loca BtE-CI.

  • Measurements of test runs of the Austrian railways OEBB


    The Acoustic Research Institute was commissioned to carry out measurements with the acoustic 64-channel microphone array, using the beamforming method to derive a source model for high-speed trains according to the new guideline CNOSSOS-EU.


    The beamforming method was used because the train is a fast-moving vehicle and therefore a transient acoustic source. Five source heights based on CNOSSOS-EU were used in the evaluation, and five additional heights that fit the geometry of the trains were evaluated.
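
    The principle behind beamforming can be sketched for the simplest case. The example below is a narrowband delay-and-sum beamformer for a stationary point source and is only an illustration under stated assumptions: the 16-microphone vertical line array, the distance, the frequency, and the candidate heights are invented, and the compensation of the train's motion that a pass-by measurement requires is not included.

```python
import numpy as np

SPEED_OF_SOUND = 343.0   # m/s (assumed)

def distances(mics, point):
    """Distance from every microphone to a point in the (x, z) plane."""
    return np.linalg.norm(mics - point, axis=1)

def simulate_point_source(mics, source, freq):
    """Complex pressure of a unit monopole at the microphones."""
    r = distances(mics, source)
    return np.exp(-2j * np.pi * freq * r / SPEED_OF_SOUND) / r

def das_power(pressures, mics, focus, freq):
    """Delay-and-sum output power when the array is focused on one point."""
    r = distances(mics, focus)
    steering = np.exp(-2j * np.pi * freq * r / SPEED_OF_SOUND)
    # np.vdot conjugates its first argument, which undoes the propagation phase.
    return np.abs(np.vdot(steering, pressures) / len(mics)) ** 2

# Hypothetical geometry: vertical line array at x = 0 m, 0.1 m spacing.
mics = np.stack([np.zeros(16), 0.1 * np.arange(16)], axis=1)
source = np.array([7.5, 2.0])                    # 7.5 m away, 2 m high
freq = 1000.0

pressures = simulate_point_source(mics, source, freq)
heights = np.arange(0.0, 4.5, 0.5)               # candidate source heights in m
powers = [das_power(pressures, mics, np.array([7.5, z]), freq) for z in heights]
best_height = float(heights[int(np.argmax(powers))])
```

    Scanning the focus point over a set of heights and comparing the output power is what allows source contributions to be assigned to individual heights.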


    Speeds from 200 km/h up to 330 km/h were tested for the ICEs and from 200 km/h up to 250 km/h for the Railjet. At the same speed both trains had the same acoustic level.

  • MESMGabMul: Improving the Multiple Exponential Sweep Method (MESM) using Gabor Multipliers


    The Multiple Exponential Sweep Method (MESM) is an optimized method for the semi-simultaneous system identification of multiple systems. It uses an appropriate overlapping of the excitation signals. This leads to a faster identification of weakly nonlinear systems when only the linear impulse response is to be retrieved. Using a Gabor multiplier in the post-processing procedure of the system identification may reduce the measurement noise and thus further improve the signal-to-noise ratio of the measured data.
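
    As background (a general property of exponential sweeps, not a description of the MESM algorithm itself): when a weakly nonlinear system is excited with an exponential sweep of duration T from f1 to f2 and the response is deconvolved, the response belonging to the k-th harmonic appears ahead of the linear impulse response by T·ln(k)/ln(f2/f1). This separation in time is what allows the linear part to be isolated. The helper name and the sweep parameters below are illustrative assumptions.

```python
import math

def harmonic_advance(k, duration, f1, f2):
    """Time in seconds by which the k-th harmonic response leads the linear one."""
    return duration * math.log(k) / math.log(f2 / f1)

# Illustrative values only: a 1.5 s sweep from 50 Hz to 20 kHz.
advances = {k: harmonic_advance(k, 1.5, 50.0, 20000.0) for k in (1, 2, 3)}
```

    Knowing these offsets makes it possible to check whether the harmonic responses of one sweep would fall onto the linear response of another when excitation signals overlap.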


    A Gabor multiplier is used to cut the interesting parts out of the measured signals in the time-frequency plane. This allows a specific optimization of signal parts, independent of the frequency. Initial tests applying a Gabor multiplier to simulated data showed that the depth of spectral notches could be increased. A systematic investigation of this method is the main goal of this project.


    This method may improve the signal-to-noise ratio in system identification tasks of any weakly nonlinear system, such as those involving acoustic measurements with electric equipment.


    • P. Majdak, P. Balazs, B. Laback, "Multiple Exponential Sweep Method for Fast Measurement of Head-Related Transfer Functions", Journal of the Audio Engineering Society, Vol. 55, No. 7/8, July/August 2007, pp. 623-637


    This project ended on 28.02.2008 and is incorporated into the 'High Potential'-Project of the WWTF, MULAC (q.v.).

  • MissiSIPI: Towards Improving Selective Hearing in Cochlear Implant Listeners

    Selective hearing refers to the ability of the human auditory system to selectively attend to a desired speaker while ignoring undesired, concurrent speakers. This is often referred to as the cocktail-party problem. In normal hearing, selective hearing is remarkably powerful. However, in so-called electric hearing, i.e., hearing with cochlear implants (CIs), selective hearing is severely degraded, close to not present at all. CIs are commonly used for treatment of severe-to-profound hearing loss or deafness because they provide good speech understanding in quiet. The reasons for the deficits in selective hearing are mainly twofold. First, they arise from structural limitations of current CI electrode designs which severely limit the spectral resolution. Second, they arise from a lack of salient timing cues, most importantly interaural time difference (ITD) and temporal pitch. The second limitation is assumed to be partly “software”-sided and conquerable with perception-driven signal processing. Yet, success achieved so far is at best moderate.

    A recently proposed approach to provide precise ITD and temporal-pitch cues in addition to speech understanding is to insert extra pulses with short inter-pulse intervals (so-called SIPI pulses) into periodic high-rate pulse trains. Results gathered so far in our previous project ITD PsyPhy in single-electrode configurations are encouraging in that both ITD and temporal-pitch sensitivity improved when SIPI pulses were inserted at the signals’ temporal-envelope peaks. Building on those results, this project aims to answer the most urgent research questions towards determining whether the SIPI approach improves selective hearing in CI listeners: Does the SIPI benefit translate into multi-electrode configurations? Does the multi-electrode SIPI approach harm speech understanding? Does the multi-electrode SIPI approach improve speech-in-speech understanding?

    Psychophysical experiments with CI listeners are planned to examine the research questions. To ensure high temporal precision and stimulus control, clinical CI signal processors will be bypassed by using a laboratory stimulation system directly connecting the CIs with a laboratory computer. The results are expected to shed light on parts of both electric and acoustic hearing that are still not fully understood to date, such as the role and the potential of temporal cues in selective hearing.

    Duration: May 2020 - April 2022

    Funding: DOC Fellowship Program of the Austrian Academy of Sciences (A-25606)

    PI: Martin Lindenbeck

    Supervisors: Bernhard Laback and Ulrich Ansorge (University of Vienna)

  • MPEG4-Features for Diadem


    In cooperation with National Instruments, an implementation of MPEG4 features in the software package DIADEM is planned.


    The applicability of MPEG4 features to noise has been demonstrated. In preparation for the project, additional features were implemented in STX. The implementation in DIADEM is planned as the next step.


    DIADEM is a database that allows for a rapid search of measurement recordings. New search indexes can be generated based on the MPEG4 features of the recordings.

  • MulAcARI: Theory and Application of Multipliers in Acoustics

    Basic Description:

    Time-variant filters are gaining importance in today's signal processing applications. Gabor multipliers in particular are popular in current scientific investigations. These multipliers are a specialization of Bessel multipliers to Gabor frames. These operators are interesting in regard to both theory and application:

    Theory of Multipliers:

    • Bessel and Frame Multipliers in Banach Spaces: In this project, the concept of frame multipliers should be generalized to work with Banach spaces.
    • Theory of Wavelet Multipliers: The concept of multipliers can be easily extended to wavelet frames. The influence of the special structures of these sequences will be investigated.
    • Basic Properties of Irregular Gabor Multipliers: Here multipliers for Gabor frames on irregular lattices are investigated.

    Application of Multipliers:

    • Time Frequency Masking: Gabor Multiplier Models and Evaluation: The symbol for the Gabor multiplier is calculated adaptively and the resulting model incorporates both time and frequency masking components. The goal is to obtain an algorithm using 2-D convolution.
    • Improving the Multiple Exponential Sweep Method (MESM) using Gabor Multipliers: The MESM is an efficient system identification method. Initial tests have shown that this method can be improved with a Gabor multiplier applied as a mask for the original sweep.
    • Wavelet Multipliers and Their Application to Reflection Measurements: One method to calculate the absorption coefficient of a sound proof wall requires separation of the impulse responses of different reflections. They can be easily separated in a scalogram and they can be extracted using a wavelet multiplier.
    • Mathematical Foundation of the Irrelevance Model: In this project, the theoretical foundation of the irrelevance algorithms implemented in STx is being developed.


    • H.G. Feichtinger, K. Gröchenig et al., NuHAG, Faculty of Mathematics, University of Vienna
    • R. Kronland-Martinet, S. Ystad, T. Necciari, Modélisation, Synthèse et Contrôle des Signaux Sonores et Musicaux of the LMA / CNRS Marseille
    • S. Meunier, S. Savel, Acoustique perceptive et qualité de l’environnement sonore of the LMA / CNRS Marseille


    • P. Balazs, B. Laback, G. Eckel, W. Deutsch, "Introducing Time-Frequency Sparsity by Removing Perceptually Irrelevant Components Using a Simple Model of Simultaneous Masking", IEEE Transactions on Audio, Speech and Language Processing, Vol. 17 (7), in press (2009), preprint
    • P. Majdak, P. Balazs, B. Laback, "Multiple Exponential Sweep Method for Fast Measurement of Head-Related Transfer Functions", Journal of the Audio Engineering Society, Vol. 55, No. 7/8, July/August 2007, pp. 623-637


    This project ended on 01.01.2010; most subprojects ended on 28.02.2008 and are incorporated into the 'High Potential'-Project of the WWTF, MULAC.

  • MulAcWWTF: Frame Multipliers: Theory and Application in Acoustics

    Basic Description:

    Signal processing has entered many areas of today's life, from mobile phones, UMTS, xDSL, and digital television to scientific research such as psychoacoustic modeling, acoustic measurements, and hearing prostheses. Such applications often use time-invariant filters, applying the Fourier transform to calculate the complex spectrum. The spectrum is then multiplied by a function, the so-called transfer function. Such an operator can therefore be called a Fourier multiplier. Real-life signals, however, are seldom stationary. Quasi-stationarity and fast time variance characterize the majority of speech signals, transients in music, and environmental sounds, and therefore imply the need for non-stationary system models. Considerable progress can be achieved by reaching beyond traditional Fourier techniques and improving current time-variant filter concepts through application of the basic mathematical concepts of frame multipliers.
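
    A Fourier multiplier as described above can be written in a few lines. The sketch below is an illustration only; the function name, the signal length, and the ideal low-pass transfer function are assumptions chosen to keep the example small.

```python
import numpy as np

def fourier_multiplier(signal, transfer):
    """Transform, multiply the spectrum by the transfer function, transform back."""
    spectrum = np.fft.rfft(signal)
    return np.fft.irfft(spectrum * transfer, len(signal))

n = 256
t = np.arange(n)
low = np.sin(2 * np.pi * 4 * t / n)              # component at frequency bin 4
high = 0.5 * np.sin(2 * np.pi * 60 * t / n)      # component at frequency bin 60

# Ideal low-pass transfer function (assumption): keep bins 0..10, remove the rest.
transfer = (np.arange(n // 2 + 1) <= 10).astype(float)
filtered = fourier_multiplier(low + high, transfer)
```

    The same transfer function acts on the whole signal, which is exactly the time-invariance that becomes a limitation for non-stationary signals.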

    Several transforms, such as the Gabor transform (the sampled version of the Short-Time Fourier Transform), the wavelet transform, and the Bark, Mel, and Gammatone filter banks are already in use in a large number of signal processing applications. These techniques can be generalized via mathematical frame theory. The particular advantage of introducing frame theory lies in the interpretability of filter and analysis coefficients in terms of frequency and time localization, as opposed to techniques based on orthonormal bases.

    One possibility to construct time-variant filters exists through the use of Gabor multipliers. For these operators the result of a Gabor transform is multiplied by a given function, called the time-frequency mask or symbol, followed by re-synthesis. These operators are already used implicitly in engineering applications, and have been investigated as Gabor filters in the fields of mathematics and signal processing theory. If alternative transforms are used, the concept of multipliers can be extended appropriately. So, for example, the concept of wavelet multipliers could be investigated for a wavelet transform.
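
    The three steps of a Gabor multiplier (analysis, multiplication by the symbol, re-synthesis) can be sketched as below. This is a minimal illustration under assumptions, not code from the project: it uses a circular short-time Fourier transform with a periodic Hann window, weighted overlap-add synthesis, and a hypothetical binary symbol that removes the upper frequency bins during the first half of the signal only.

```python
import numpy as np

def frame_indices(length, win_len, hop):
    """Sample indices of every (circularly wrapped) analysis frame."""
    return (np.arange(0, length, hop)[:, None] + np.arange(win_len)) % length

def gabor_multiplier(x, symbol, window, hop):
    """Analysis, pointwise multiplication by the symbol, overlap-add synthesis."""
    idx = frame_indices(len(x), len(window), hop)
    coeffs = np.fft.rfft(x[idx] * window, axis=1)          # Gabor coefficients
    frames = np.fft.irfft(symbol * coeffs, len(window), axis=1) * window
    out = np.zeros(len(x))
    norm = np.zeros(len(x))
    np.add.at(out, idx, frames)                            # overlap-add
    np.add.at(norm, idx, np.tile(window ** 2, (len(idx), 1)))
    return out / norm                                      # undo the window weighting

n, win_len, hop = 1024, 64, 16
t = np.arange(n)
low = np.sin(2 * np.pi * 4 * t / win_len)
high = np.sin(2 * np.pi * 20 * t / win_len)
x = low + high
window = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(win_len) / win_len)

# Time-frequency mask: one row per time shift, one column per frequency bin.
symbol = np.ones((n // hop, win_len // 2 + 1))
symbol[: n // hop // 2, 11:] = 0.0     # first half of the frames: keep bins 0..10 only
y = gabor_multiplier(x, symbol, window, hop)
```

    With a symbol of all ones the signal is reconstructed unchanged; with the mask above, the high component is removed in the first half of the signal and kept in the second half, which a Fourier multiplier cannot do.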

    Different kinds of applications call for different frames. Multipliers can be generalized to the abstract level of frames without any further structure. This concept will be further investigated in this project. Its feasibility will be evaluated in acoustic applications using special cases of Gabor and wavelet systems.

    The project goal is to study both the mathematical theory of frame multipliers and their application among selected problems in acoustics. The project is divided into the following subprojects:

    Theory of Multipliers:

    1. General Frame Multiplier Theory
    2. Analytic and Numeric Properties of Gabor Multipliers
    3. Analytic and Numeric Properties of Wavelet Multipliers

    Application of Multipliers:

    1. Mathematical Modeling of Auditory Time-Frequency Masking Functions
    2. Improvement of Head-Related Transfer Function Measurements
    3. Advanced Method of Sound Absorption Measurements


    • H.G. Feichtinger et al., NuHAG, Faculty of Mathematics, University of Vienna
    • R. Kronland-Martinet et al., Modélisation, Synthèse et Contrôle des Signaux Sonores et Musicaux of the LMA / CNRS Marseille
    • B. Torrésani et al., LATP Université de Provence / CNRS Marseille
    • J.P. Antoine et al., FYMA Université Catholique de Louvain


    • P. Balazs, J.-P. Antoine, A. Gryboś, "Weighted and Controlled Frames: Mutual relationship and first Numerical Properties", accepted for publication in International Journal of Wavelets, Multiresolution and Information Processing (2009), preprint
    • P. Balazs, "Matrix Representation of Bounded Linear Operators By Bessel Sequences, Frames and Riesz Sequence", SampTA'09, 8th International Conference on Sampling and Applications, May 2009, Marseille, France
    • A. Rahimi, P. Balazs, "Multipliers for p-Bessel sequences in Banach spaces", submitted (2009)
    • D. Stoeva, P. Balazs, "Unconditional convergence and Invertibility of Multipliers", preprint (2009)
    • Monika Dörfler and Bruno Torrésani, “Representation of operators in the time-frequency domain and generalized Gabor multipliers”, J. Fourier Anal. Appl., 2009 (in press)
    • Yohan Frutiger: "Multiplicateurs de Gabor pour les transformations sonores" (Gabor Multipliers for sound transformations) Master thesis under the supervision of R. Kronland-Martinet, June 2008 
    • F. Jaillet, P. Balazs, M. Dörfler and N. Engelputzeder, "On the Structure of the Phase around the Zeros of the Short-Time Fourier Transform", NAG/DAGA 2009, International Conference on Acoustics, March 2009, Rotterdam, The Netherlands
    • F. Jaillet, P. Balazs and M. Dörfler, “Nonstationary Gabor Frames”, SampTA'09, 8th International Conference on Sampling and Applications, May 2009, Marseille, France
    • P. Balazs, B. Laback, G. Eckel, W. Deutsch, "Introducing Time-Frequency Sparsity by Removing Perceptually Irrelevant Components Using a Simple Model of Simultaneous Masking", IEEE Transactions on Audio, Speech and Language Processing (2009), in press
    • B. Laback, P. Balazs, G. Toupin, T. Necciari, S. Savel, S. Meunier, S. Ystad and R. Kronland-Martinet, "Additivity of auditory masking using Gaussian-shaped tones", Acoustics'08, Paris, 29.06.-04.07.2008 (03.07.2008)
    • B. Laback, P. Balazs, T. Necciari, S. Savel, S. Ystad, S. Meunier and R. Kronland-Martinet, "Additivity of auditory masking for Gaussian-shaped tone pulses", preprint
    • Anaïk Olivero: "Expérimentation des multiplicateurs temps-échelle" (On the time-scale multipliers) Master thesis under the supervision of R. Kronland-Martinet and B. Torrésani, June 2008
  • Multilevel Fast Multipole Method (MLFMM)


    The Multilevel Fast Multipole Method, when used in combination with the Boundary Element Method (BEM), is a tool to significantly speed up the simulation of large objects almost without loss in accuracy.


    The Fast Multipole Method subdivides the Boundary Element mesh into different clusters. If two clusters are sufficiently far away from each other (i.e. they are in each other's far field), all calculations that would have to be made for every pair of nodes can be reduced to the midpoints of the clusters with almost no loss of accuracy. For clusters not in the far field, the traditional BEM has to be applied. The Multilevel Fast Multipole Method introduces different levels of clustering (clusters made out of smaller clusters) to additionally enhance computation speed.
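
    The far-field idea can be illustrated with a deliberately simplified sketch. It is not the MLFMM itself: only the lowest-order term is used, the static kernel 1/(4πr) replaces the Helmholtz kernel, cluster centroids stand in for the cluster midpoints, and the two point clouds are invented. The sketch compares the sum over all pairs of points with a single kernel evaluation between the two clusters.

```python
import numpy as np

def kernel(r):
    """Static 1/(4*pi*r) kernel, used here in place of the Helmholtz kernel."""
    return 1.0 / (4.0 * np.pi * r)

rng = np.random.default_rng(0)
cluster_a = rng.random((200, 3))                                # points in a unit cube
cluster_b = rng.random((300, 3)) + np.array([30.0, 0.0, 0.0])   # far-away unit cube

# Direct evaluation: one kernel call per pair of points (200 * 300 calls).
pair_dist = np.linalg.norm(cluster_a[:, None, :] - cluster_b[None, :, :], axis=2)
direct = kernel(pair_dist).sum()

# Far-field shortcut: a single kernel call between the cluster centroids.
center_dist = np.linalg.norm(cluster_a.mean(axis=0) - cluster_b.mean(axis=0))
approx = len(cluster_a) * len(cluster_b) * kernel(center_dist)

rel_error = abs(approx - direct) / direct
```

    The relative error shrinks as the clusters move further apart relative to their size, which is why the shortcut is restricted to clusters in each other's far field. An actual fast multipole implementation adds higher-order expansion terms to control this error.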


    The MLFMM is used for the simulation of head-related transfer functions. The diagram above compares the result of a classical BEM with the MLFMM.

  • Musicology

    This area is involved with the analysis of the acoustics of music and with human perception thereof.

    In close cooperation with em.o.Univ.Prof. Dr. Franz Födermayr (Inst. of Musicology, Univ. Vienna), historic recordings of Georgian multipart songs are analyzed and transcribed.

  • New Approaches in Ray-Tracing and Boundary Element Method


    A major difficulty of ray tracing and the boundary element method is the fine grid that is needed in the high-frequency region.


    By means of new alternative shape functions at the boundary, e.g. wavelets, it may be possible to define a grid on the boundary that is independent of the wave number.

  • Noise - Effects and Control

    Noise Abatement: investigates the acoustic and psychoacoustic description of unwanted sounds and supports the specification of methods for reducing noise, from whatever source (Sound Quality Design).

    Perceiving sound as noise is a subjective reaction to disturbing acoustic signals. The intensity, pitch, sharpness, variation and roughness as well as the subjective attitude and motivation all play a role in the perceived noisiness. Railway noise is the main detractor when planning new high-speed tracks. The condition of the wheels and the track has a significant effect on the sound generation (see also: harmonisation). Literature:

    • NOIDESC: Deskriptoren von Lärmsignalen (Descriptors of Noise Signals): Deutsch Werner A. & Waubke Holger (2004).
    • Descriptors for aircraft noise
    • Erschütterungen an Bahntrassen (Vibrations along Railway Lines). Waubke Holger (2004).
    • Visualisierung von Bahnlärm (Visualization of Railway Noise) (1996). AK08. in: Deutsch, Werner A. & Elisabeth Hilscher & Herta Spielmann (eds.): Tagungsband der Österreichischen Physikalischen Gesellschaft, Johannes Kepler, Universität Linz. Wien: Forschungsstelle für Schallforschung der Österreichischen Akademie der Wissenschaften, pp. 27-29.
  • Number of Channels Required for Vertical-Plane Localization (Loca#Channels)

    Objective and Methods:

    This study investigates the effect of the number of frequency channels on vertical-plane sound localization, especially front/back discrimination. This is important for determining how many of the basal-most channels/electrodes of a cochlear implant (CI) are needed to encode spectral localization cues. Normal-hearing subjects listening to a CI simulation (the newly developed GET vocoder) will perform the experiment using the localization method developed in the subproject "Loca Methods". Learning effects will be studied by providing visual feedback.


    Experiments are underway.


    Knowing the number of channels required to encode spectral cues for localization in the vertical planes is an important step in the development of a 3-D localization strategy for CIs. 


    FWF (Austrian Science Fund): Project #P18401-B15


    • Goupell, M., Majdak, P., and Laback, B. (2010). Median-plane sound localization as a function of the number of spectral channels using a channel vocoder, J. Acoust. Soc. Am. 127, 990-1001.
  • Numerics of Block Matrices


    During the current project of efficiently calculating a resynthesis window and an iterative scheme for a finite element method algorithm for vibrations in soils and liquids, it became apparent that block matrices are a powerful tool to find numerically efficient algorithms.


    In this project, the focus should be the investigation of the numeric features of block matrices. How can this structure be used to calculate or approximate the inverse of a matrix or its norm? How can this be used to speed up iterative schemes?
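
    One standard way in which block structure helps with inversion is the Schur complement: a matrix partitioned into blocks A, B, C, D can be inverted using only inverses of A and of S = D - C A^{-1} B. The sketch below illustrates this textbook identity; it is not taken from the project, and the function name and the random, diagonally dominant test matrix are assumptions.

```python
import numpy as np

def block_inverse(A, B, C, D):
    """Invert [[A, B], [C, D]] via the Schur complement of A."""
    A_inv = np.linalg.inv(A)
    S_inv = np.linalg.inv(D - C @ A_inv @ B)     # inverse of the Schur complement
    top_left = A_inv + A_inv @ B @ S_inv @ C @ A_inv
    top_right = -A_inv @ B @ S_inv
    bottom_left = -S_inv @ C @ A_inv
    return np.block([[top_left, top_right], [bottom_left, S_inv]])

# Diagonally dominant test matrix, so that all required inverses exist.
rng = np.random.default_rng(0)
M = rng.random((6, 6)) + 10.0 * np.eye(6)
A, B, C, D = M[:3, :3], M[:3, 3:], M[3:, :3], M[3:, 3:]
M_inv = block_inverse(A, B, C, D)
```

    Only matrices of half the size are inverted here. Dropping or approximating the off-diagonal terms turns the same formula into a cheap approximate inverse, which is one way such a structure can serve as a preconditioner in an iterative scheme.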


    The results will be used for the two projects mentioned below:

    • double preconditioning for Gabor frames
    • vibrations in random layers
  • Operators and Frames

    Basic Description:

    Practical experience quickly revealed that the concept of an orthonormal basis is not always useful. This led to the concept of frames. Models in physics and other application areas (for example sound vibration analysis) are mostly continuous models. Many continuous model problems can be formulated as operator theory problems, such as in differential or integral equations. Operators provide an opportunity to describe scientific models, and frames provide a way to discretize them.

    Sequences that allow only numerically unstable resynthesis are often used in physical models. Such a sequence can be called an "unbounded frame". How this inversion can be regularized is being investigated. For many applications, a certain frame is very useful in describing the model. Therefore, it is also beneficial to use the same sequence to find a discretization of the operators involved.


    Frames in Finite Dimensional Spaces:

    In this project, the theory of frames in the finite discrete case is investigated further.
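
    In finite dimensions the basic objects are small matrices, which the following sketch illustrates. The example is an assumption chosen for illustration: three unit vectors in the plane at 120 degrees to each other, a redundant system that is a frame but not a basis. The sketch computes the frame operator, the frame bounds, and the canonical dual frame, and reconstructs a vector from its frame coefficients.

```python
import numpy as np

# Rows of F are the three frame vectors in R^2.
angles = np.pi / 2 + 2 * np.pi * np.arange(3) / 3
F = np.stack([np.cos(angles), np.sin(angles)], axis=1)

S = F.T @ F                          # frame operator
bounds = np.linalg.eigvalsh(S)       # smallest and largest eigenvalue = frame bounds
dual = F @ np.linalg.inv(S)          # rows are the canonical dual frame vectors

x = np.array([0.3, -1.2])
coeffs = F @ x                       # analysis: inner products with the frame vectors
x_rec = dual.T @ coeffs              # synthesis with the dual frame
```

    For this example the frame operator is 1.5 times the identity, so both frame bounds coincide and the frame is tight. For a general frame the ratio of the two bounds governs how well conditioned the reconstruction is.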

    Matrix Representation of Operators using Frames:

    The standard matrix description of operators using orthonormal bases is extended to the more general case of frames.

    Weighted and Controlled Frames:

    Weighted and controlled frames were introduced to speed up the inversion algorithm for the frame matrix of a wavelet frame. In this project, these kinds of frames are investigated further.

    Basic Properties of Unbounded Frames

    Irregular Frames of Translates:

    In this project, sequences of irregular shifts of a single function are investigated.


    • S. Heineken, Research Group on Real and Harmonic Analysis, University of Buenos Aires
    • J. P. Antoine, Unité de physique théorique et de physique mathématique – FYMA
    • M. El-Gebeily,  Department of Mathematical Sciences, King Fahd University of Petroleum and Minerals, Saudi Arabia
  • Optimal Gaussian Mixture Model (GMM) Initialization for Speaker Modeling


    The modeling step in speaker detection has an enormous influence on the classification task, because the quality of the model depends on the parameters chosen in this step. False classifications, false identifications, and false verifications can result from malformed speaker models. The initial model parameters have an influence on the final determined parameters of the speaker models. To obtain optimized speaker models, different initialization methods are explored.


    Speaker models are represented as Gaussian Mixture Models (GMMs). These models are mixtures of multivariate Gaussian distributions that are parameterized by the means and covariance matrices of the distributions and by the mixture weights. The parameters are estimated by the expectation-maximization (EM) algorithm, which iteratively increases the likelihood of the model. Initial model parameters have to be selected for this algorithm, and different initial parameters can lead the algorithm to converge to different local maxima. The effect of different initialization methods on the identification rate is analyzed.
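
    The role of the initial parameters can be explored with a small sketch. It is a simplified illustration, not the speaker-modeling code of this project: the data are one-dimensional and synthetic, only the initial means are varied, and the function name `em_gmm_1d` and all numbers are assumptions.

```python
import numpy as np

def em_gmm_1d(x, init_means, n_iter=200):
    """EM for a 1-D Gaussian mixture; returns weights, means, variances, log-likelihoods."""
    means = np.array(init_means, dtype=float)
    variances = np.full(len(means), np.var(x))
    weights = np.full(len(means), 1.0 / len(means))
    log_liks = []
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        diff = x[:, None] - means
        dens = weights * np.exp(-0.5 * diff ** 2 / variances) / np.sqrt(2 * np.pi * variances)
        total = dens.sum(axis=1)
        log_liks.append(float(np.log(total).sum()))
        resp = dens / total[:, None]
        # M-step: re-estimate weights, means and variances (small variance floor).
        counts = resp.sum(axis=0)
        weights = counts / len(x)
        means = (resp * x[:, None]).sum(axis=0) / counts
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / counts + 1e-6
    return weights, means, variances, log_liks

# Synthetic data (assumption): two well-separated Gaussian clusters.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 0.5, 200), rng.normal(3.0, 1.0, 200)])

_, means_a, _, log_liks_a = em_gmm_1d(x, [-2.5, 2.5])   # initial means near the clusters
_, means_b, _, log_liks_b = em_gmm_1d(x, [0.0, 0.5])    # initial means close together
```

    Comparing `log_liks_a` and `log_liks_b` iteration by iteration shows how strongly the starting point affects the path of the algorithm. In a multivariate speaker model with many components, different starting points can also end in different local maxima.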


    Optimized speaker models reflect the speech behavior of the speakers in an optimal way. The inter-speaker variability is maximized while the intra-speaker variability is minimized by avoidance of malformed speaker models. The usage of optimal initialization methods improves the robustness and the reliability of automatic speaker identification and verification systems.

  • Orthobem: Simulation of Vibrations in Tunnels


    Methods to predict the propagation of vibrations in soil are relatively undeveloped. Reasons for this include the complexity of the wave propagation in soil and the insufficient knowledge of material parameters. During this project a method was developed to simulate the propagation of vibrations that are caused by a load at the base of a tunnel.


    When dealing with the model of a tunnel in a semi-infinite domain like soil, the boundary element method (BEM) seems to be an appropriate tool. Unfortunately, it cannot be applied directly to layered orthotropic media because no closed form of the Green's function, which is essential for BEM, is available. However, by transforming the whole system into the Fourier domain with respect to space and time, it is possible to numerically construct an approximation of this function on a predefined grid. With this approximation, the boundary integral equation that describes the propagation of waves caused by a vibrating load at the base of a tunnel can be solved.


    Models that can help to predict the propagation of vibrations inside soil layers are of great interest in earthquake sciences or when constructing railway lines and tunnels.

  • PAAB


    Railway vehicles passing through tight curves can produce a high-pitched noise called curve squeal. Curve squeal is a very salient type of noise in the high-frequency range that can vary between tonal narrow-band noise and wide-band noise. The tonal noise is caused by lateral creepage on the top of the rail, which excites wheel vibration at frequencies corresponding to the wheel's modes. Wide-band noise, however, is caused by wheel flanges touching the rail.


    The project PAAB aims at investigating the perceived annoyance of such noises using a perception test. The resulting perceptual characterization of curve squeal should help to consider this type of noise more adequately in noise mapping.


    Based on previous conventional large-scale emission measurements as well as new measurements at immission distances using a head-and-torso simulator, representative samples of curve squeal will be derived and used in a perception test. Synthetic, well-defined curve squeal noise will be used in addition.

    PAAB is funded by the FFG (project 860523) and the Austrian Federal Railways (ÖBB). The project is carried out in cooperation with the Research Center of Railway Engineering, Traffic Economics and Ropeways, Institute of Transportation, Vienna University of Technology (project leader), Kirisits Engineering Consultants, and psiacoustic Umweltforschung und Engineering GmbH.



  • Parkinson Speech

    Effects of subthalamic stimulation on the speech characteristics of patients with Parkinson's disease.

  • PASS - Psychoacoustic Analysis of Noise Emissions Induced by Railway Traffic

    The project PASS, which is carried out in cooperation with the IEW of the TU Vienna and psiacoustic GmbH, deals with the psychoacoustic evaluation of noise. The project is a continuation of the project RELSKG and deals with high and low noise barriers that are simulated with the 2.5 D boundary element method (BEM) assuming incoherent line sources. The comparison of the 2.5 D BEM with measurements showed good agreement. Additionally, measurements with rail dampers were taken into account in the psychoacoustic tests. The evaluation was done in two tests with 40 test persons. The first test determined the relative annoyance and the second the just noticeable difference in annoyance. The results were that, at the same A-level, freight trains are less annoying than passenger trains and the noise behind a noise barrier is slightly more annoying than the noise without a mitigation measure. The project started in 2013 and lasts until the end of 2014.