• Objective

    Computational models for speech production and analysis have been of research interest since the 1960s. Most models assume the vocal tract (VT) to be a segmented straight tube, but when pronouncing nasals like /m/ and /n/ or nasalized vowels, the nasal part of the vocal tract plays an important role and a single-tube model is no longer adequate. Thus, it is necessary to consider a branched tube model that includes an additional tube model for the nasal tract. For these branched models, estimating the cross-sectional area of each segment from a given signal is highly nontrivial and in general requires the solution of a nonlinear system of equations.


    The problem is overdetermined, and we have to add additional restrictions to our solution, for example upper and lower bounds on the area functions or smoothness assumptions about the vocal tract. To that end, we introduced, among others, probabilistic methods (variational Bayes) into our model estimation.
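
    As a point of reference for the simpler single-tube case, the following minimal sketch relates the segment areas of a lossless segmented tube to the reflection coefficients at the junctions between segments, and back. The function names, the sign convention of the reflection coefficient, and the example areas are assumptions made for illustration; the branched model with a nasal tract described above does not reduce to this simple recursion.

```python
def reflection_coefficients(areas):
    """Junction reflection coefficients of a lossless segmented tube.

    Assumed convention: r_k = (A_{k+1} - A_k) / (A_{k+1} + A_k).
    Other texts use the opposite sign.
    """
    if any(a <= 0 for a in areas):
        raise ValueError("cross-sectional areas must be positive")
    return [(a_next - a) / (a_next + a) for a, a_next in zip(areas, areas[1:])]


def areas_from_reflections(first_area, coeffs):
    """Invert the relation above, given the area of the first segment."""
    areas = [first_area]
    for r in coeffs:
        if not -1.0 < r < 1.0:
            raise ValueError("reflection coefficients must lie in (-1, 1)")
        # Solving r = (A_next - A) / (A_next + A) for A_next.
        areas.append(areas[-1] * (1.0 + r) / (1.0 - r))
    return areas


# Hypothetical area function in arbitrary units.
example_areas = [1.0, 2.0, 4.0, 1.0]
example_coeffs = reflection_coefficients(example_areas)
recovered_areas = areas_from_reflections(example_areas[0], example_coeffs)
```

    The bound check on the coefficients corresponds to the requirement that all areas stay positive, which is one of the simplest restrictions that can be placed on a solution.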

  • Objective

    Railway tunnels prevent direct acoustic annoyance from railway traffic. However, vibrations from tunnels propagate through the soil and cause disturbances in the form of perceptible low-frequency vibrations.

    The objective of this project is to develop and implement a mathematical model that takes a moving vibrating load into account. Furthermore, the surrounding soil is modeled as an anisotropic material, consisting of layers with arbitrary orientation.



    The propagation of the vibrations inside the tunnel is modelled by the finite element method (FEM), which takes the superstructure of the tunnel and the railway into account. Vibrations outside the tunnel, propagating through the (infinite) soil, are modelled by the boundary element method (BEM). For a detailed model of the whole system, both methods have to be coupled.

  • The rapid increase in available computing power and the fast evolution of audio interfacing and transmission technologies have led to a new age of immersive audio systems to reproduce spatial sound with surrounding loudspeakers. Many of these approaches require a precise and robust space-time-frequency analysis of sound fields. The joint project of ARI and IRCAM  combines the mathematical concepts provided by the ARI with the profound knowledge in real-time signal processing and acoustics of IRCAM. It addresses fundamental research questions in both fields and aims at developing improved methods for the target applications mentioned above.

    The main questions that this project addresses are:

    • Is it possible to apply the frame-based signal-processing tools to a predefined geometrical alignment of microphones and/or loudspeakers (e.g. to the 64-channel spherical microphone array that is currently under development at IRCAM)?
    • How can acoustic fields on the sphere (e.g. measured with a spherical microphone array) be represented with frames in order to have better control of the space-time-frequency resolutions on different parts of the sphere?
    • Is it possible to apply this multi-resolution space-time-frequency representation to room acoustic sensing with multichannel spherical microphone arrays (e.g. to measure the spatial distribution of early reflections with higher resolution than provided with spherical harmonic analysis)?
  • AABBA is an intellectual grouping collaborating on the applications and development of models of human binaural hearing

    AABBA's goal is to promote exploration and development of binaural models and their applications. AABBA members are academic scientists willing to participate in our activities. We meet annually for an open discussion and progress presentation, and we especially encourage members to bring students and young scientists associated with their projects to our meetings. Our activities consolidate in joint publications and special sessions at international conferences. As a relevant tangible outcome, we provide validated (source) codes for published models of binaural and spatial hearing to our collection of auditory models, known as the auditory modeling toolbox (AMT).


    • Executive board: Piotr Majdak, Armin Kohlrausch, Ville Pulkki

    • Members:

      • Aachen: Janina Fels, ITA, RWTH Aachen
      • Berlin: Klaus Obermayer, NI, TU Berlin
      • Bochum: Dorothea Kolossa & Jens Blauert, Ruhr-Universität Bochum
      • Cardiff: John Culling, School of Psychology, Cardiff University
      • Copenhagen: Torsten Dau & Tobias May, DTU, Lyngby
      • Dresden: Ercan Altinsoy, TU Dresden
      • Ghent: Sarah Verhulst, Ghent University
      • Guangzhou: Bosun Xie, South China University of Technology, Guangzhou
      • Helsinki: Ville Pulkki & Nelli Salminen, Aalto University
      • Ilmenau: Alexander Raake, TU Ilmenau
      • Kosice: Norbert Kopčo, Safarik University, Košice
      • Lyon: Mathieu Lavandier, Université de Lyon
      • Munich I: Werner Hemmert, TUM München
      • Munich II: Bernhard Seeber, TUM München 
      • Oldenburg: Bernd Meyer, Carl von Ossietzky Universität Oldenburg
      • Oldenburg-Eindhoven: Steven van de Par & Armin Kohlrausch, Universität Oldenburg
      • Patras: John Mourjopoulos, University of Patras
      • Rostock: Sascha Spors, Universität Rostock
      • Sheffield: Guy Brown, The University of Sheffield
      • Tabriz: Masoud Geravanchizadeh, University of Tabriz
      • Toulouse: Patrick Danès, Université de Toulouse
      • Troy: Jonas Braasch, Rensselaer Polytechnic Institute, Troy
      • Vienna: Bernhard Laback & Robert Baumgartner, Austrian Academy of Sciences, Wien
      • The AMT (Umbrella Project): Piotr Majdak
    AABBA Group as of the 10th meeting 2018 in Vienna.


    Annual meetings are held at the beginning of each year:

    • 11th meeting: 19-20 February 2019, Vienna. Schedule.
    • 10th meeting: 30-31 January 2018, Vienna. Schedule.
    • 9th meeting: 27-28 February 2017, Vienna. Schedule.
    • 8th meeting: 21-22 January 2016, Vienna. Schedule.
    • 7th meeting: 22-23 February 2015, Berlin.
    • 6th meeting: 17-18 February 2014, Berlin.
    • 5th meeting: 24-25 January 2013, Berlin.
    • 4th meeting: 19-20 January 2012, Berlin.
    • 3rd meeting: 13-14 January 2011, Berlin.
    • 2nd meeting: 29-30 September 2009, Bochum.
    • 1st meeting: 23-26 March 2009, Rotterdam.


    • Special Session "Binaural models: development and applications" at the ICA 2019, Aachen.
    • Special Session "Models and reproducible research" at the Acoustics'17 (EAA/ASA) 2017, Boston.
    • Structured Session "Applied Binaural Signal Processing" at the Forum Acusticum 2014, Kraków.
    • Structured Session "The Technology of Binaural Listening & Understanding" at the ICA 2016, Buenos Aires.

    Contact person: Piotr Majdak

  • Introduction:

    The ability of listeners to discriminate literal meanings from figurative language, affective language, or rhetorical devices such as irony is crucial for successful social interaction. This discriminative ability might be reduced in listeners supplied with cochlear implants (CIs), widely used auditory prostheses that restore auditory perception in the deaf or hard-of-hearing. Acoustically, irony is characterised in particular by a lower fundamental frequency (F0), a lower intensity, and a longer duration in comparison to literal utterances. In auditory perception experiments, listeners mainly rely on F0 and intensity values to distinguish between context-free ironic and literal utterances. As CI listeners have great difficulties in F0 perception, the use of frequency information for the detection of irony is impaired. However, irony is often additionally conveyed by characteristic facial expressions.


    The aim of the project is two-fold: The first (“Production”) part of the project will study the role of paraverbal cues in verbal irony of Standard Austrian German (SAG) speakers under well-controlled experimental conditions without acoustic context information. The second (“Perception”) part will investigate the performance in recognizing irony in a normal-hearing control group and a group of CI listeners.


    Recordings of speakers of SAG will be conducted. During the recording session, the participants will be presented with scenarios that evoke either a literal or an ironic utterance. The response utterances will be audio- and video-recorded. Subsequently, the context-free stimuli obtained in this way will be presented in a discrimination test to normal-hearing and to postlingually deafened CI listeners in three modes: auditory only, auditory+visual, and visual only.


    The results will not only provide information on irony production in SAG and on multimodal irony perception and processing, but will, most importantly, identify the cues that need to be improved in cochlear implants in order to allow CI listeners full participation in daily life.

  • Objective:

    Speaker models generated from training recordings of different speakers should differentiate between speakers. These models are estimated using feature vectors that are based on acoustic observations. So, the feature vectors should themselves show a high degree of inter-speaker variability and a low degree of intra-speaker variability.


    Cepstral coefficients of transformed short-time spectra (e.g. Mel-Frequency Cepstral Coefficients, MFCC) are experimentally developed features that are widely used in the domain of automatic speech and speaker detection. Because the feature extraction process has many possible parameter settings and there is no theoretically motivated way of choosing them, only a stepwise investigation of the extraction process can lead to stable acoustic features.


    Optimized acoustic features for the representation of speakers enable the improvement of automatic speaker identification and verification. Additionally, they support the development of methods for the forensic investigation of speakers (manual and automatic).


    Scientific and Technological Cooperation between Austria and Serbia (SRB 01/2018)

    Duration of the project: 01.07.2018 - 30.06.2020


    Project partners:

    Acoustics Research Institute, ÖAW (Austria)

    University of Vienna (Austria)

    University of Novi Sad (Republic of Serbia)


    Project website:

  • Objective:

    The generation of speaker models is based on acoustic features obtained from speech corpora. From a closed set of speakers, the target speaker has to be identified in an unsupervised identification task.


    Training and comparison recordings exist for every speaker. The training set is used to generate parametric speaker models (Gaussian Mixture Models – GMMs), while the test set is needed for the comparisons of all test recordings to all models. The model with the highest similarity (maximum likelihood) is chosen as the target speaker. The efficiency of the identification task is measured as the identification rate (i.e. the number of correctly chosen target speakers).
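
    The decision rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the models are diagonal-covariance GMMs whose parameters are given by hand instead of being trained, and all names (`gmm_log_likelihood`, `identify`, `speaker_a`, `speaker_b`) and numbers are hypothetical.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance at point x."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def gmm_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of feature vectors under a GMM.

    gmm is a list of (weight, mean, variance) tuples.
    """
    total = 0.0
    for x in frames:
        logs = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
        peak = max(logs)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(l - peak) for l in logs))
    return total / len(frames)


def identify(frames, models):
    """Closed-set identification: return the best-scoring model name."""
    scores = {name: gmm_log_likelihood(frames, gmm)
              for name, gmm in models.items()}
    return max(scores, key=scores.get), scores


# Hypothetical two-component models for two speakers (2-D features).
models = {
    "speaker_a": [(0.5, (0.0, 0.0), (1.0, 1.0)), (0.5, (1.0, 1.0), (1.0, 1.0))],
    "speaker_b": [(0.5, (4.0, 4.0), (1.0, 1.0)), (0.5, (5.0, 5.0), (1.0, 1.0))],
}
test_frames = [(0.2, -0.1), (0.9, 1.1), (0.4, 0.6)]
best_speaker, all_scores = identify(test_frames, models)
```

    Repeating `identify` over all test recordings and counting how often the returned name matches the true speaker would yield the identification rate mentioned above.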


    Aside from biometric commercial applications, the forensic domain is another important field where speaker identification is used. Because speaker identification is a closed-set classification task, it is useful in cases where a target speaker has to be selected from a set of known speakers (e.g. in the case of hidden observations).

  • Objective:

    The offender recording (a speaker recorded at the scene of a crime) is verified by relating its similarity to a recording of a suspect to its typicality in a reference population.


    A universal background model (UBM) is generated via the training of a parametric Gaussian Mixture Model (GMM) that reflects the distribution of feature vectors in a reference population. Every comparison recording is used to derive a GMM from the UBM by adaptation of the model parameters. Similarity is measured through computation of the likelihoods of the offender recordings in the GMM while typicality is measured by computation of the likelihoods of the offender recordings in the UBM. The verification is expressed as the likelihood ratio of these likelihood values.
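
    A minimal sketch of this scoring scheme is given below, using one-dimensional features for brevity. The adaptation step is assumed here to be a mean-only MAP adaptation with a relevance factor, which is a common choice but is not confirmed by the text above; the function names, the relevance value, and all data are hypothetical.

```python
import math


def component_logs(x, gmm):
    """Log of weight times Gaussian density for each component (1-D)."""
    return [math.log(w) - 0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
            for w, m, v in gmm]


def logsumexp(values):
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def avg_log_likelihood(samples, gmm):
    return sum(logsumexp(component_logs(x, gmm)) for x in samples) / len(samples)


def adapt_means(ubm, samples, relevance=16.0):
    """Derive a speaker GMM from the UBM by adapting only the means (assumed)."""
    counts = [0.0] * len(ubm)
    sums = [0.0] * len(ubm)
    for x in samples:
        logs = component_logs(x, ubm)
        norm = logsumexp(logs)
        for k, l in enumerate(logs):
            gamma = math.exp(l - norm)  # responsibility of component k
            counts[k] += gamma
            sums[k] += gamma * x
    adapted = []
    for (w, m, v), n, s in zip(ubm, counts, sums):
        alpha = n / (n + relevance)
        new_mean = alpha * (s / n) + (1.0 - alpha) * m if n > 0.0 else m
        adapted.append((w, new_mean, v))
    return adapted


def log_likelihood_ratio(offender, suspect_gmm, ubm):
    """Similarity (suspect model) minus typicality (UBM), in the log domain."""
    return avg_log_likelihood(offender, suspect_gmm) - avg_log_likelihood(offender, ubm)


# Hypothetical UBM as (weight, mean, variance) and toy feature values.
ubm = [(0.5, -2.0, 1.0), (0.5, 2.0, 1.0)]
suspect_samples = [2.8, 3.1, 3.3, 2.9, 3.0, 3.2, 2.7, 3.1]
suspect_gmm = adapt_means(ubm, suspect_samples)
llr_same = log_likelihood_ratio([3.0, 3.2, 2.9], suspect_gmm, ubm)
llr_other = log_likelihood_ratio([-2.1, -1.9, -2.0], suspect_gmm, ubm)
```

    A positive log-likelihood ratio means the offender recording is better explained by the suspect model than by the reference population; how such a value is turned into a decision or reported as strength of evidence is a separate question.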


    While fully unsupervised automatic verification is performed with a binary decision using a likelihood ratio threshold and is used in biometric commercial applications, the usage of the likelihood ratio as an expression of the strength of the evidence in forensic speaker verification has become an important issue.

  • Objective:

    The beamforming method focuses a microphone array on an arbitrary point using time delays and amplitude weighting, and sums either the time signals of the microphones or their short-time Fourier transforms.


    The signals of 64 microphones are recorded with a microphone array of arbitrary shape. For compatibility with acoustic holography, an equally spaced grid of 8 x 8 microphones is used.


    Localization of sound sources on high-speed trains is a typical application. The method is used to separate source locations along the train and especially the heights of different sound sources. Typical sound sources on high-speed trains are the rail-wheel contact and regions of aerodynamic noise. Aerodynamic sources occur at all heights, especially at the pantograph.
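
    The time-domain variant of the method can be sketched as a simple delay-and-sum operation. The sketch below assumes integer sample delays, a fixed speed of sound, and a static source; the function names, the sampling rate, and the toy geometry are hypothetical and chosen only for illustration.

```python
import math


def steering_delays(mic_positions, focus_point, fs, speed_of_sound=343.0):
    """Integer sample delays of each microphone relative to the closest one."""
    times = [math.dist(p, focus_point) / speed_of_sound for p in mic_positions]
    earliest = min(times)
    return [round((t - earliest) * fs) for t in times]


def delay_and_sum(signals, delays, weights=None):
    """Advance every channel by its delay, weight it, and sum the channels."""
    n_mics = len(signals)
    if weights is None:
        weights = [1.0 / n_mics] * n_mics  # plain averaging
    length = len(signals[0])
    output = [0.0] * length
    for sig, delay, w in zip(signals, delays, weights):
        for n in range(length):
            src = n + delay  # sample that left the focus point at time n
            if 0 <= src < length:
                output[n] += w * sig[src]
    return output


def pulse(length, index):
    return [1.0 if n == index else 0.0 for n in range(length)]


# Toy case: one pulse reaching three microphones 0, 2 and 4 samples late.
arrival_delays = [0, 2, 4]
signals = [pulse(16, 5 + d) for d in arrival_delays]
focused = delay_and_sum(signals, arrival_delays)   # steered at the source
unfocused = delay_and_sum(signals, [0, 0, 0])      # steered elsewhere

# Delays for a hypothetical two-microphone geometry (metres, 48 kHz).
example_delays = steering_delays([(0.0, 0.0), (1.0, 0.0)], (0.0, 1.0), 48000)
```

    When the array is steered at the true source position the pulses add coherently, while for other steering positions they remain spread out, which is what allows sources at different positions to be separated.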

  • BiPhase: Binaural Hearing and the Cochlear Phase Response

    Project Description

    While it is often assumed that our auditory system is phase-deaf, there is a body of literature showing that listeners are very sensitive to phase differences between spectral components of a sound. Particularly, for spectral components falling into the same perceptual filter, the so-called auditory filter, a change in relative phase across components causes a change in the temporal pattern at the output of the filter. The phase response of the auditory filter is thus important for any auditory tasks that rely on within-channel temporal envelope information, most notably temporal pitch or interaural time differences.

    Within-channel phase sensitivity has been used to derive a psychophysical measure of the phase response of auditory filters (Kohlrausch and Sander, 1995). The basic idea of the widely used masking paradigm is that a harmonic complex whose phase curvature roughly mirrors the phase response of the auditory filter spectrally centered on the complex causes a maximally modulated (peaked) internal representation and, thus, elicits minimal masking of a pure tone target at the same center frequency. Therefore, systematic variation of the phase curvature of the harmonic complex (the masker) allows the auditory filter’s phase response to be estimated: the masker phase curvature causing minimal masking reflects the mirrored phase response of the auditory filter.

    Besides the obvious importance of detecting the target in the temporal dips of the masker, particularly if the target is short compared to the modulation period of the masker (Kohlrausch and Sander, 1995), there are several indications that fast compression in the cochlea is important to obtain the masker-phase effect (e.g., Carlyon and Datta, 1997; Oxenham and Dau, 2004). One indication is that listeners with sensorineural hearing impairment (HI), characterized by reduced or absent cochlear compression due to loss of outer hair cells, show only a very weak masker-phase effect, making it difficult to estimate the cochlear phase response.

    In the BiPhase project we propose a new paradigm for measuring the cochlear phase response that does not rely on cochlear compression and thus should be applicable in HI listeners. It relies on the idea that the amount of modulation (peakedness) in the internal representation of a harmonic complex, as given by its phase curvature, determines the listener’s sensitivity to envelope interaural time difference (ITD) imposed on the stimulus. Assuming that listeners’ sensitivity to envelope ITD does not rely on compression, systematic variation of the stimulus phase curvature should allow estimation of the cochlear phase response both in normal-hearing (NH) and HI listeners. The main goals of BiPhase are the following:

    • Aim 1: Assessment of the importance of cochlear compression for the masker-phase effect at different masker levels. Masking experiments are performed with NH listeners using Schroeder-phase harmonic complexes with and without a precursor stimulus, intended to reduce cochlear compression by activation of the efferent system controlling outer-hair cell activity. In addition, a quantitative model approach is used to estimate the contribution of compression from outer hair cell activity and other factors to the masker-phase effect. The results are described in Tabuchi, Laback, Necciari, and Majdak (2016). A follow-up study on the dependency of the masker-phase effect on masker and target duration, the target’s position within the masker, the masker level, and the masker bandwidth and conclusions on the role of compression of underlying mechanisms in simultaneous and forward masking is underway.
    • Aim 2: Development and evaluation of an envelope ITD-based paradigm to estimate the cochlear phase response. The experimental results on NH listeners, complemented with a modeling approach and predictions, are described in Tabuchi and Laback (2017). This paper also provides model predictions for HI listeners.
      Besides the consistency of the overall pattern of ITD thresholds across phase curvatures with data on the masking paradigm and predictions of the envelope ITD model, an unexpected peak in the ITD thresholds was found for a negative phase curvature which was not predicted by the ITD model and is not found in masking data. Furthermore, the pattern of results for individual listeners appeared to reveal more variability than the masking paradigm. Data were also collected with an alternative method, relying on the extent of laterality of a target with supra-threshold ITD, as measured with an interaural-level-difference-based pointing stimulus. These data showed no nonmonotonic behavior at negative phase curvatures. Rather, they showed good correspondence with the ITD model prediction and more consistent results across individuals compared to the ITD threshold-based method (Zenke, Laback, and Tabuchi, 2016).
    • Aim 3: Development of an ITD-based method to account for potentially non-uniform curvatures of the phase response in HI listeners. Using two independent iterative approaches, NH listeners adjusted the phase of individual harmonics of an ITD-carrying complex so that it elicited maximum extent of laterality. Although the pattern of adjusted phases very roughly resembled the expected pattern, there was a large amount of uncertainty (Zenke, 2014), preventing the method from further use. Modified versions of the method will be considered in a future study.
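
    To illustrate how phase curvature controls the peakedness of a harmonic complex, the following sketch generates Schroeder-phase complexes and compares their crest factors. It assumes the commonly used scaled Schroeder-phase rule, in which harmonic n receives the phase C·π·n(n+1)/N with N components and a curvature scalar C; this formula, the function names, and all stimulus parameters are assumptions for illustration and are not taken from the studies cited above.

```python
import math


def schroeder_phases(n_low, n_high, curvature):
    """Starting phase per harmonic number under the assumed scaled rule."""
    n_components = n_high - n_low + 1
    return {n: curvature * math.pi * n * (n + 1) / n_components
            for n in range(n_low, n_high + 1)}


def harmonic_complex(f0, n_low, n_high, curvature, fs, duration):
    """Sum of equal-amplitude cosine harmonics with Schroeder phases."""
    phases = schroeder_phases(n_low, n_high, curvature)
    n_samples = int(round(fs * duration))
    return [sum(math.cos(2.0 * math.pi * n * f0 * t / fs + phase)
                for n, phase in phases.items())
            for t in range(n_samples)]


def crest_factor(samples):
    """Peak magnitude divided by the RMS value; larger means peakier."""
    rms = math.sqrt(sum(v * v for v in samples) / len(samples))
    return max(abs(v) for v in samples) / rms


# Hypothetical stimulus: harmonics 1 to 20 of 100 Hz, two periods at 16 kHz.
peaky = harmonic_complex(100.0, 1, 20, 0.0, 16000, 0.02)  # zero curvature
flat = harmonic_complex(100.0, 1, 20, 1.0, 16000, 0.02)   # full curvature
crest_peaky = crest_factor(peaky)
crest_flat = crest_factor(flat)
```

    This only describes the acoustic waveform. In the paradigms above, the relevant quantity is the peakedness of the internal representation after the auditory filter has added its own phase response.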


    This project is funded by the Austrian Science Fund (FWF, Project # P24183-N24, awarded to Bernhard Laback). It ran from 2013 to 2017.


    Peer-reviewed papers

    • Tabuchi, H. and Laback, B. (2017): Psychophysical and modeling approaches towards determining the cochlear phase response based on interaural time differences, The Journal of the Acoustical Society of America 141, 4314–4331.
    • Tabuchi, H., Laback, B., Necciari, T., and Majdak, P. (2016). The role of compression in the simultaneous masker phase effect, The Journal of the Acoustical Society of America 140, 2680-2694.

    Conference talks

    • Tabuchi, H., Laback, B., Majdak, P., and Necciari, T. (2014). The role of precursor in tone detection with Schroeder-phase complex maskers. Poster presented at 37th Association for Research in Otolaryngology (ARO) Meeting, San Diego, California.
    • Tabuchi, H., Laback, B., Majdak, P., and Necciari, T. (2014). The perceptual consequences of a precursor on tone detection with Schroeder-phase harmonic maskers. Invited talk at Alps Adria Acoustics Association, Graz, Austria.
    • Tabuchi, H., Laback, B., Majdak, P., Necciari, T., and Zenke, K. (2015). Measuring the auditory phase response based on interaural time differences. Talk at 169th Meeting of the Acoustical Society of America, Pittsburgh, Pennsylvania.
    • Zenke, K., Laback, B., and Tabuchi, H. (2016). Towards an Efficient Method to Derive the Phase Response in Hearing-Impaired Listeners. Talk at 37th Association for Research in Otolaryngology (ARO) Meeting, San Diego, California.
    • Tabuchi, H., Laback, B., Majdak, P., Necciari, T., and Zenke, K. (2016). Modeling the cochlear phase response estimated in a binaural task. Talk at 39th Association for Research in Otolaryngology (ARO) Meeting, San Diego, California.
    • Laback, B., and Tabuchi, H. (2017). Psychophysical and modeling approaches towards determining the cochlear phase response based on interaural time differences. Invited Talk at AABBA Meeting, Vienna, Austria.
    • Laback, B., and Tabuchi, H. (2017). Psychophysical and Modeling Approaches towards determining the Cochlear Phase Response based on Interaural Time Differences. Invited Talk at 3rd Workshop “Cognitive neuroscience of auditory and cross-modal perception”, Kosice, Slovakia.


    • Carlyon, R. P., and Datta, A. J. (1997). "Excitation produced by Schroeder-phase complexes: evidence for fast-acting compression in the auditory system," J Acoust Soc Am 101, 3636-3647.
    • Kohlrausch, A., and Sander, A. (1995). "Phase effects in masking related to dispersion in the inner ear. II. Masking period patterns of short targets," J Acoust Soc Am 97, 1817-1829.
    • Oxenham, A. J., and Dau, T. (2004). "Masker phase effects in normal-hearing and hearing-impaired listeners: evidence for peripheral compression at low signal frequencies," J Acoust Soc Am 116, 2248-2257.

    See also


  • Project leader: Michael Pucher

    Project start: 1 February 2019


    To document the current state of a language, one is conventionally supposed to analyse the speech of an old, rural, non-mobile man. To capture developmental tendencies of a variety, however, one should study the speech of a young, educated woman in an urban setting. The language use of young women is a particularly interesting field of research: they are regarded as initiators and drivers of linguistic innovations, both phonetic and lexical, which can spread from large cities into the wider language area. It is also assumed that open-minded young women adopt linguistic innovations more quickly than their male peers. They take up a new way of speaking faster and pass it on to their future children. Women also tend to use linguistic features as social identifiers to show that they belong to the same peer group, and can thereby contribute to language change.

    The city of Vienna has changed considerably over the past 30 years: the population has grown by 15%, and with it the number of languages spoken. According to a survey by the Arbeiterkammer (Chamber of Labour), about 100 different languages are used in Vienna, and there is no denying that Vienna remains a melting pot of the most diverse languages and cultures in Central Europe. The starting point of this study is that these social and socio-political changes are reflected not only in the lexical usage of the Viennese but also in their physiological voice.

    In this investigation, the voice is understood as the physiological sound, modulated in the vocal tract, that underlies human vocal utterances. Beyond that, the voice can be regarded as the site of the embodied heart of spoken language, which anchors the body in social space through indexicality. As a vehicle of personal identity, the voice can reflect not only sociocultural but also socio-political characteristics (e.g. "women in leadership positions have a lower voice"). Sociophonetics plays a key role here, as it is an important instrument that makes it possible to link social space and its socially relevant discourses with the individual.

    Studies from the Anglo-American sphere suggest that the voice of young women is undergoing a change. The sociophonetic voice phenomenon vocal fry has by now become a prominent speech feature of young, educated, urban women in the Anglo-American sphere.

    Based on two corpora, a longitudinal study will be carried out that traces to what extent the voice of young Viennese women has changed. There are no sociophonetic studies of women's voices in Austria, particularly none of the quality targeted by this study. Thanks to its longitudinal character, the study can show to what extent social developments influence women's voices.

    In addition, this study offers a unique opportunity to obtain a snapshot of the Viennese woman and her voice and to place it in a historical context.

  • Objective:

    ExpSuite is a software framework for implementing psychoacoustic experiments. It serves as the basis for an application and can be extended with customized, experiment-dependent methods (applications). The framework consists of a user interface (experimenter and subject interface), signal processing modules (offline and real-time), and input/output modules.

    The user interface is implemented in Visual Basic.NET and benefits from the "Rapid Application Development" environment, which allows experiments to be developed quickly. To compensate for the sometimes slow processing performance of VB, the stimulation signals can be processed in a vector-oriented way using a direct link to MATLAB. Because of this direct link, numerous built-in MATLAB functions are available to the ExpSuite applications.

    The interface for the experimenters administering the tests contains several templates that can be chosen for a specific experiment. The keyboard, mouse, joypad, or joystick can be chosen as the input device. The user interface is designed for dual-screen setups and allows permanent monitoring of the experiment status on the same computer. Additionally, the current experiment status can be transmitted to another computer via a network connection.

    The framework supports two types of stimulation:

    • the standard acoustic stimulation using an audio interface for experiments with normal or impaired hearing subjects, and
    • the direct electric stimulation of cochlear implants for experiments with cochlear implant listeners.
  • START project of P. Balazs.



    This international, multi-disciplinary and team-oriented project will expand the group Mathematics and Acoustical Signal Processing at the Acoustics Research Institute in cooperation with NuHAG Vienna (Hans G. Feichtinger, M. Dörfler, K. Gröchenig), Institute of Telecommunications Vienna (Franz Hlawatsch), LATP Marseille (Bruno Torrésani), LMA (Richard Kronland-Martinet), CAHR (Torsten Dau, Peter Soendergaard), the FYMA Louvain-la-Neuve (Jean-Pierre Antoine), AG Numerics (Stephan Dahlke), School of Electrical Engineering and Computer Science (Damian Marelli) as well as the BKA Wiesbaden (Timo Becker).

    Within the institute the groups Audiological Acoustics and Psychoacoustics, Computational Acoustics, Acoustic Phonetics and Software Development are involved in the project.

    This project is funded by the FWF as a START prize. It is planned to run from May 2012 to April 2018.






    General description:

    We live in the age of information where the analysis, classification, and transmission of information is of essential importance. Signal processing tools and algorithms form the backbone of important technologies like MP3, digital television, mobile phones and wireless networking. Many signal processing algorithms have been adapted for applications in audio and acoustics, also taking into account the properties of the human auditory system.

    The mathematical concept of frames describes a theoretical background for signal processing. Frames are generalizations of orthonormal bases that give more freedom for the analysis and modification of information. However, this concept is still not firmly rooted in applied research. The link between the mathematical frame theory, the signal processing algorithms, their implementations and finally acoustical applications is a very promising, synergetic combination of research in different fields.

    Therefore the main goal of this multidisciplinary project is to

    -> Establish Frame Theory as Theoretical Backbone of Acoustical Modeling

    in particular in psychoacoustics, phonetics and computational acoustics as well as audio engineering.
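
    The statement that frames generalize orthonormal bases can be illustrated with a small textbook example that is not specific to this project: three unit vectors in the plane, 120 degrees apart (often called the Mercedes-Benz frame). They are redundant, yet any vector can be recovered from its three inner products because the frame operator is 3/2 times the identity. The function names below are hypothetical.

```python
import math


def mercedes_benz_frame():
    """Three unit vectors in the plane, spaced 120 degrees apart."""
    angles = [math.pi / 2.0 + k * 2.0 * math.pi / 3.0 for k in range(3)]
    return [(math.cos(a), math.sin(a)) for a in angles]


def analysis(x, frame):
    """Frame coefficients: inner products of x with every frame vector."""
    return [x[0] * f[0] + x[1] * f[1] for f in frame]


def frame_operator(frame):
    """Matrix S = sum over k of f_k f_k^T, as nested lists."""
    return [[sum(f[i] * f[j] for f in frame) for j in range(2)]
            for i in range(2)]


def synthesis_tight(coeffs, frame, bound):
    """Reconstruction for a tight frame with frame bound `bound`."""
    return tuple(sum(c * f[i] for c, f in zip(coeffs, frame)) / bound
                 for i in range(2))


frame = mercedes_benz_frame()
S = frame_operator(frame)               # expected to be 1.5 * identity
x = (0.3, -1.2)
coeffs = analysis(x, frame)             # three numbers for a 2-D vector
x_rec = synthesis_tight(coeffs, frame, 1.5)
```

    An orthonormal basis is the special case with as many vectors as dimensions and frame bound 1. The redundancy of a frame is what gives the additional freedom in analysis and modification mentioned above.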



    Through this auspicious connection of disciplines, FLAME will produce substantial impact on both theory and applied research.

    The theory-based part of FLAME consists of the following topics:

    • T1 Frame Analysis and Reconstruction Beyond Classical Approaches
    • T2 Frame Multipliers, Extended
    • T3 Novel Frame Representation of Operators Motivated by Computational Acoustics

    The application-oriented part of FLAME consists of:

    • A1 Advanced Frame Methods for Perceptual Sparsity in the Time-Frequency Plane
    • A2 Advanced Frame Methods for the Analysis and Classification of Speech
    • A3 Advanced Frame Methods for Signal Enhancement and System Estimation

    Press information:




  • Scientific and Technological Cooperation with Macedonia 2016-18
    Project duration: 01.07.2016 – 30.06.2018

    The main aim of the project is to combine the research areas of Frame Theory and Generalized Asymptotic Analysis.

    Project partner institutions:
    Acoustics Research Institute (ARI), Austrian Academy of Sciences, Vienna, Austria
    Ss. Cyril and Methodius University, Skopje, The Former Yugoslav Republic of Macedonia

    Project members:
    Diana T. Stoeva (Project coordinator Austria), Peter Balazs, Nicki Holighaus, Zdenek Prusa
    Katerina Hadzi-Velkova Saneva (Project coordinator FYROM), Sanja Atanasova, Pavel Dimovski, Zoran Hadzi-Velkov, Bojan Prangoski, Biljana Stanoevska-Angelova, Daniel Velinov, Jasmina Veta Buralieva

    Project Workshops and Activities:

    1) Nov. 24-26, 2016, Ss. Cyril and Methodius University, Skopje

    Project Kickoff-workshop

    Program of the workshop

    2) Nov. 15-19, 2017, ARI, Vienna

    Research on project-related topics

    3) April 14-19, 2018, ARI, Vienna

    Research on project-related topics


    ARI-Guest-Talk given at ARI on the 17th of April, 2018: Prof. Zoran Hadzi-Velkov, "The Emergence of Wireless Powered Communication Networks"

    4) May 25-30, Ss. Cyril and Methodius University, Skopje

    Research on project-related topics


    Workshop "Women in mathematics in the Balkan region" (May 28 - May 29, Ss. Cyril and Methodius University, Skopje)

    5) June 14-18, Ss. Cyril and Methodius University, Skopje

    Research on project-related topics


    Summer course "An Introduction to Frame Theory and the Large Time/Frequency Analysis Toolbox" (June 14-15), Lecturers: Diana Stoeva and Zdenek Prusa (from ARI)

    6) Mini-Symposium "Frame Theory and Asymptotic Analysis" organized at the European Women in Mathematics General Meeting 2018, Karl-Franzens-Universität Graz, Austria, 3-7 September 2018.

    Link to Conference website

    7) November 17-20, 2018, ARI, Vienna

    Work on project-related topics




  • Objective:

    Up to now, a boundary element method based on collocation has been combined with the Fast Multipole Method (FMM) to speed up the numerical calculation.


    This approach was chosen because collocation is a fast way to build up the matrix and the FMM is a fast way to solve large systems of equations. For compatibility with the software HAMS, the FMM has to be ported to a Galerkin-based boundary element (BE) approach.

  • General Information

    Funded by the Vienna Science and Technology Fund (WWTF) within the "Mathematics and …2016" Call (MA16-053)

    Principal Investigator: Georg Tauböck

    Co-Principal Investigator: Peter Balazs

    Project Team: Günther Koliander, José Luis Romero  

    Duration: 01.07.2017 – 01.07.2021


    Signal processing is a key technology that forms the backbone of important developments like MP3, digital television, mobile communications, and wireless networking and is thus of exceptional relevance to economy and society in general. The overall goal of the proposed project is to derive highly efficient signal processing algorithms and to tailor them to dedicated applications in acoustics. We will develop methods that are able to exploit structural properties in infinite-dimensional signal spaces, since typically ad hoc restrictions to finite dimensions do not sufficiently preserve physically available structure. The approach adopted in this project is based on a combination of the powerful mathematical methodologies frame theory (FT), compressive sensing (CS), and information theory (IT). In particular, we aim at extending finite-dimensional CS methods to infinite dimensions, while fully maintaining their structure-exploiting power, even if only a finite number of variables are processed. We will pursue three acoustic applications, which will strongly benefit from the devised signal processing techniques, i.e., audio signal restoration, localization of sound sources, and underwater acoustic communications. The project is set up as an interdisciplinary endeavor in order to leverage the interrelations between mathematical foundations, CS, FT, IT, time-frequency representations, wave propagation, transceiver design, the human auditory system, and performance evaluation.


    compressive sensing, frame theory, information theory, signal processing, super resolution, phase retrieval, audio, acoustics




  • Objective:

    General frame theory can be specialized further when a structure is imposed on the elements of the frame in question. One very natural structure is a sequence of shifts of a single function. In this project, irregular shifts are investigated.


    In this project, the connection to irregular Gabor multipliers will be explored. Under the Kohn-Nirenberg correspondence, the space spanned by Gabor multipliers is just a space spanned by translates. Furthermore, the special connection between the Gramian function and the Gram matrix for this case will be investigated.


    A typical example of frames of translates is filter banks, which have constant shapes. For example, the phase vocoder corresponds to a filter bank with regular shifts. Introducing an irregular shift gives rise to a generalization of this analysis / synthesis system.
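
    The role of the Gram matrix for irregular shifts can be illustrated numerically. The following is a minimal sketch, not the project's method: it assumes a unit-norm Gaussian window g(t) = 2^(1/4) exp(-pi t^2), for which the inner product of two translates has the closed form exp(-pi (a - b)^2 / 2), and it looks only at a finite set of shifts. The function names and the jitter range of ±0.2 are hypothetical choices made for illustration. The smallest and largest eigenvalues of the Gram matrix are the optimal Riesz bounds of the finite family of translates.

```python
import numpy as np


def gram_matrix_gaussian_translates(shifts):
    """Gram matrix of translates of g(t) = 2**0.25 * exp(-pi * t**2).

    Assumption: for this unit-norm Gaussian, <T_a g, T_b g> = exp(-pi * (a - b)**2 / 2).
    """
    shifts = np.asarray(shifts, dtype=float)
    diff = shifts[:, None] - shifts[None, :]
    return np.exp(-np.pi * diff**2 / 2.0)


def riesz_bounds(shifts):
    """Smallest and largest eigenvalue of the Gram matrix of the finite family."""
    eigvals = np.linalg.eigvalsh(gram_matrix_gaussian_translates(shifts))
    return eigvals[0], eigvals[-1]


rng = np.random.default_rng(0)
regular = np.arange(20, dtype=float)  # regular shifts with spacing 1
irregular = regular + rng.uniform(-0.2, 0.2, size=regular.size)  # perturbed shifts

print("regular shifts:  ", riesz_bounds(regular))
print("irregular shifts:", riesz_bounds(irregular))
```

    Comparing the two printed pairs shows how perturbing the shift positions changes the conditioning of the system of translates. The sketch says nothing about frame bounds for infinite sequences of shifts.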


    • S. Heineken, Research Group on Real and Harmonic Analysis, University of Buenos Aires
  • Bilateral Cochlear Implants: Physiology and Psychophysics

    Current cochlear implants (CIs) are very successful in restoring speech understanding in individuals with profound or complete hearing loss by electrically stimulating the auditory nerve. However, the ability of CI users to localize sound sources and to understand speech in complex listening situations, e.g. with interfering speakers, is dramatically reduced compared to normal-hearing (acoustically hearing) listeners. From acoustic hearing studies it is known that interaural time difference (ITD) cues are essential for sound localization and speech understanding in noise. Users of current bilateral CI systems are, however, rather limited in their ability to perceive salient ITD cues. One particular problem is that their ITD sensitivity is especially low when stimulating at the relatively high pulse rates that are required for proper encoding of speech signals.

    In this project we combine psychophysical studies in human bilaterally implanted listeners and physiological studies in bilaterally implanted animals to find ways to improve ITD sensitivity in electric hearing. We build on the previous finding that ITD sensitivity can be enhanced by introducing temporal jitter (Laback and Majdak, 2008) or short inter-pulse intervals (Hancock et al., 2012) in high-rate pulse sequences. Physiological experiments, performed at the Eaton-Peabody Laboratories Neural Coding Group (Massachusetts Eye and Ear Infirmary, Harvard Medical School, PI: Bertrand Delgutte), are combined with matched psychoacoustic experiments, performed at the EAP group of ARI (PI: Bernhard Laback). The main project milestones are the following:

    ·        Aim 1: Effects of auditory deprivation and electric stimulation through CI on neural ITD sensitivity. Physiological experiments study whether chronic CI stimulation can reverse the effect of neonatal deafness on neural ITD sensitivity.

    ·        Aim 2: Improving the delivery of ITD information with high-rate strategies for CI processors.

      A. Improving ITD sensitivity at high pulse rates by introducing short inter-pulse intervals

      B. Using short inter-pulse intervals to enhance ITD sensitivity with “pseudo-syllable” stimuli.
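
    The idea of temporal jitter in a high-rate pulse sequence can be sketched in code. This is an illustration under stated assumptions, not the stimulus generation used in the cited studies: it assumes that each inter-pulse interval is drawn uniformly from the nominal interval scaled by (1 ± jitter_ratio) and that the same jittered sequence is presented to both ears, with one ear delayed by the ITD. The function name, parameter names and example values are hypothetical.

```python
import numpy as np


def jittered_pulse_times(rate_pps, duration_s, jitter_ratio, itd_s, seed=0):
    """Return pulse onset times (seconds) for the left and right ear.

    Assumptions (illustrative only): inter-pulse intervals are uniform in
    nominal_ipi * (1 +/- jitter_ratio); the jitter is identical at both ears,
    so every pulse pair carries the same ITD.
    """
    if not 0.0 <= jitter_ratio < 1.0:
        raise ValueError("jitter_ratio must lie in [0, 1)")
    rng = np.random.default_rng(seed)
    nominal_ipi = 1.0 / rate_pps
    n_pulses = int(round(duration_s * rate_pps))
    ipis = nominal_ipi * (1.0 + rng.uniform(-jitter_ratio, jitter_ratio, size=n_pulses - 1))
    left = np.concatenate(([0.0], np.cumsum(ipis)))  # first pulse at t = 0
    right = left + itd_s  # right ear lags the left ear by the ITD
    return left, right


# Example: 800 pulses per second for 300 ms, with a 400 microsecond ITD.
left, right = jittered_pulse_times(rate_pps=800, duration_s=0.3, jitter_ratio=0.5, itd_s=400e-6)
print(len(left), "pulses per ear")
```

    Setting jitter_ratio to 0 gives a strictly periodic pulse train, which serves as the reference condition in this sketch.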

    Co-operation partners:

    ·        External: Eaton-Peabody Laboratories Neural Coding Group of the Massachusetts Eye and Ear Infirmary at Harvard Medical School (PI: Bertrand Delgutte)

    ·        Internal: Mathematics and Signal Processing for Acoustics


    ·     This project is funded by the National Institutes of Health (NIH).

    ·     It is planned to run from 2014 to 2019.

    Press information:

    ·     Article in DER STANDARD:

    ·     Article in DIE PRESSE:

    ·     OEAW website:


    See Also

    ITD MultEl

  • The aim of this project is to maintain the experimental facilities in our institute's laboratory.

    The lab consists of four testing places:

    • GREEN and BLUE: Two sound-booths (IAC-1202A) are used for audio recording and psychoacoustic testing performed with headphones. Each of the booths is controlled from outside by a computer. Two bidirectional audio channels with sampling rates up to 192 kHz are available.
    • RED: A visually-separated corner can be used for experiments with cochlear implant listeners. A computer controls the experimental procedure using a bilateral, direct-electric stimulation.
    • YELLOW: A semi-anechoic room, with a size of 6 x 6 x 3 m, can be used for acoustic tests and measurements in a nearly free field. As many as 24 bidirectional audio channels, virtual environments generated by a head-mounted display, and audio and video surveillance are available for projects like HRTF measurement, localization tests or acoustic holography.

    The rooms are used not only for measurements and experiments: the Acoustic Phonetics group also makes speech recordings there for dialect research and speaker identification, for example for survey reports. The facilities are also used for psychoacoustic validations.

    During the breaks in experiments, the subjects can use an Internet terminal or relax on a couch while sipping hot coffee...