  • Introduction

    Rumble strips are (typically periodic) grooves placed at the side of the road. When a vehicle passes over a rumble strip, the noise and vibration in the car alert the driver to the imminent danger of running off the road. Rumble strips have been shown to have a positive effect on traffic safety. Unfortunately, their use in the close vicinity of populated areas is problematic due to the increased noise burden.


    The aim of the project LARS (LärmArme RumpelStreifen, or low-noise rumble strips) was to find rumble strip designs that cause less noise in the environment without significantly reducing the alerting effect inside the vehicle. For this purpose, a number of conventional designs as well as three alternative concepts were investigated: conical grooves to guide the noise under the car, pseudo-random groove spacing to reduce tonality and thus annoyance, and sinusoidal depth profiles, which are already used in practice and should produce mostly vibration and only little noise.


    Two test tracks covering a range of different milling patterns were established in order to measure the effects of rumble strips on a car and a commercial vehicle running over them. Acoustic measurements using microphones and a head-and-torso simulator were made inside the vehicle as well as in the surroundings of the track. Furthermore, the vibrations of the steering wheel and the driver seat were measured. Based on the acoustic measurements, synthetic rumble strip noises were produced in order to cover a wider range of possible rumble strip designs than measurements alone would allow.

    Perception tests with 16 listeners were performed to determine the annoyance of the immissions as well as the urgency of, and the reaction times to, the sounds generated in the vehicle interior. The synthetic stimuli were also used in these tests.

    LARS was funded by the FFG (project 840515) and the ASFINAG. The project was done in cooperation with the Research Center of Railway Engineering, Traffic Economics and Ropeways, Institute of Transportation, Vienna University of Technology, and ABF Strassensanierungs GmbH.

  • Objective:

    The aim of this study is to investigate the phonetics of second language acquisition and first language attrition, based on the acoustic and articulatory realizations of laterals by Bosnian migrants living in Vienna. Bosnian has two lateral phonemes (a palatalized and an alveolar/velarized one), whereas Standard Austrian German (SAG) features only one lateral phoneme (an alveolar lateral). In the Viennese dialect (Vd), however, this phoneme also has a velarized variant.

    This phonetic investigation will be conducted with respect to the influence of language contact between Bosnian and SAG, and Bosnian and the Viennese dialect, as well as concerning the influence of gender and identity construction.


    The recordings will be conducted with female and male Bosnian speakers, aged between 20 and 35 years at the time of emigration, who came to Vienna during the Bosnian War (1992-1995). Additionally, control groups of monolingual L1 speakers of Bosnian, SAG and Vd will be recorded. All recordings will include reading tasks in order to elicit controlled speech, as well as spontaneous speech in the form of biographical interviews. The analyses will comprise quantitative and qualitative aspects. Quantitatively, the acoustic parameters of the laterals and their phonetic surroundings will be analyzed: formant frequencies (especially F2 and F3), duration and intensity. Additionally, articulatory analyses will be performed using electropalatography (EPG) and ultrasound tongue imaging (UTI) data. Qualitatively, biographical information, language attitudes and social networks will be analyzed in order to obtain information about speaker-specific or group-specific characteristics.


    The results of this study are relevant to understanding the processes of sound realization and sound change in the domains of language contact (phonetic processes in second language acquisition and first language attrition), sociolinguistics, and the sociology of identity construction.

  • Objective:

    Head-related transfer functions (HRTFs) describe sound transmission from the free field to a place in the ear canal in terms of linear time-invariant systems. They contain spectral and temporal features that vary according to the sound direction. Because these features differ among subjects, studies on localization in virtual environments require measuring each subject's individual HRTFs. In this project, a system for HRTF measurement was developed and installed in the semi-anechoic room at the Austrian Academy of Sciences.


    Measurement of an HRTF was treated as a system identification of the electro-acoustic chain: sound source, room, HRTF, microphone. The sounds in the ear canals were captured using in-ear microphones. The direction of the sound source was varied horizontally by rotating the subject on a turntable, and vertically by selecting one of the 22 loudspeakers positioned in the median plane. An optimized form of system identification with sweeps, the multiple exponential sweep method (MESM), was used to measure the transfer functions with satisfactory signal-to-noise ratios within a reasonable amount of time. Subjects' positions were tracked during the measurement to ensure sufficient measurement accuracy. Measurement of headphone transfer functions was included in the HRTF measurement procedure, which allows the headphone influence to be equalized during the presentation of virtual stimuli.
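    As a rough illustration of the excitation signal behind sweep-based system identification, the sketch below generates a single exponential sine sweep and evaluates its instantaneous frequency. It covers only this basic building block and does not reproduce how MESM schedules sweeps across multiple loudspeakers. The function names, the frequency range and the sampling rate are assumptions chosen for the example.

```python
import math


def exponential_sweep(f_start, f_end, duration, sample_rate):
    """Return samples of an exponential sine sweep from f_start to f_end (Hz)."""
    log_ratio = math.log(f_end / f_start)
    n_samples = int(round(duration * sample_rate))
    # Phase is the integral of the exponentially growing instantaneous frequency.
    scale = 2.0 * math.pi * f_start * duration / log_ratio
    return [
        math.sin(scale * (math.exp(log_ratio * i / (sample_rate * duration)) - 1.0))
        for i in range(n_samples)
    ]


def instantaneous_frequency(f_start, f_end, duration, t):
    """Frequency (Hz) of the sweep at time t (seconds)."""
    return f_start * (f_end / f_start) ** (t / duration)


# Hypothetical parameters: 50 Hz to 20 kHz in 0.1 s at 48 kHz.
sweep = exponential_sweep(50.0, 20000.0, 0.1, 48000)
print(len(sweep), instantaneous_frequency(50.0, 20000.0, 0.1, 0.05))
```

    Because the frequency grows exponentially, each octave takes the same amount of time in the sweep.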


    Multi-channel audio equipment has been installed in the semi-anechoic room, giving access to recording and stimuli presentation via 24 channels simultaneously.

    The multiple exponential sweep method was developed, allowing fast transfer function measurement of weakly nonlinear time-invariant systems for multiple sources.

    The measurement procedure was developed and a database of HRTFs was created. Until now, HRTF data for over 20 subjects had not been available to create virtual stimuli and present them via headphones.

    To virtually position sounds in space, free-field sounds are filtered with the HRTFs. This results in virtual acoustic stimuli (VAS). To create VAS and present them via headphones, applications called Virtual Sound Positioning (VSP) and Loca (part of our ExpSuite Software Project) have been implemented. They allow virtual sound positioning in a free-field environment using both stationary and moving sound sources.
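    The filtering step can be sketched as a time-domain convolution of a mono signal with the left-ear and right-ear head-related impulse responses (HRIRs, the time-domain counterparts of HRTFs). This is a minimal sketch, not the VSP or Loca implementation. The function names and the short toy impulse responses are invented for illustration; measured HRIRs are much longer.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Filter a mono free-field signal with one HRIR per ear."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy HRIRs (assumed values): the right ear receives the sound later and weaker,
# as for a source located to the listener's left.
hrir_left = [1.0, 0.5]
hrir_right = [0.0, 0.0, 0.6, 0.3]
left, right = render_binaural([1.0, 0.0, -1.0], hrir_left, hrir_right)
print(left)
print(right)
```

    For a moving source, the HRIR pair would have to change over time, which this static sketch does not cover.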

  • Objective:

    The modeling step in speaker detection has an enormous influence on the classification task, because the quality of the model depends on the parameters chosen in this step. False classifications, false identifications, and false verifications can result from malformed speaker models. The initial model parameters influence the final parameters of the speaker models. To obtain optimized speaker models, different initialization methods are explored.


    Speaker models are represented as Gaussian Mixture Models (GMMs). These models are mixtures of multivariate distributions that are parameterized by the means and the covariance matrices of the distributions and by the mixture weights. The parameters are estimated by the expectation-maximization (EM) algorithm, which maximizes the likelihood of the model. Initial model parameters have to be selected for this algorithm, and different initial parameters can lead the algorithm to converge to different local maxima. The effect of different initialization methods on the identification rate is analyzed.
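    The dependence of EM on its starting point can be tried out with a small sketch. The code below runs EM for a one-dimensional mixture from two different initializations and prints the resulting log-likelihoods. It is a simplified, univariate stand-in for the multivariate speaker models described above. The data, the function names and the variance floor are assumptions made for the example.

```python
import math


def gaussian(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(data, means, variances, weights):
    """Log-likelihood of the data under the mixture."""
    return sum(
        math.log(sum(w * gaussian(x, m, v) for m, v, w in zip(means, variances, weights)))
        for x in data
    )


def em_gmm_1d(data, means, variances, weights, iterations=50, var_floor=1e-6):
    """Run EM from the given initial parameters and return the final parameters."""
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    for _ in range(iterations):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            p = [weights[j] * gaussian(x, means[j], variances[j]) for j in range(k)]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, var_floor)
    return means, variances, weights


# Toy data with two clusters (assumed values).
data = [0.0, 0.2, -0.1, 0.1, -0.2, 5.0, 5.1, 4.9, 5.2, 4.8]
for init_means in ([0.0, 5.0], [-0.2, 0.2]):
    m, v, w = em_gmm_1d(data, init_means, [1.0, 1.0], [0.5, 0.5])
    print(init_means, [round(x, 3) for x in m], round(log_likelihood(data, m, v, w), 3))
```

    On harder data than this toy set, different starting points may end in different local maxima, which is why initialization methods are compared.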


    Optimized speaker models reflect the speech behavior of the speakers in an optimal way. The inter-speaker variability is maximized while the intra-speaker variability is minimized by avoidance of malformed speaker models. The usage of optimal initialization methods improves the robustness and the reliability of automatic speaker identification and verification systems.

  • Objective:

    Methods to predict the propagation of vibrations in soil are relatively undeveloped. Reasons for this include the complexity of the wave propagation in soil and the insufficient knowledge of material parameters. During this project a method was developed to simulate the propagation of vibrations that are caused by a load at the base of a tunnel.


    When dealing with the model of a tunnel in a semi-infinite domain like soil, the boundary element method (BEM) seems to be an appropriate tool. Unfortunately, it cannot be applied directly to layered orthotropic media, because no closed form is available for the Green's function, which is essential for BEM. However, by transforming the whole system into the Fourier domain with respect to space and time, it is possible to numerically construct an approximation of this function on a predefined grid. With this approximation, the boundary integral equation that describes the propagation of waves caused by a vibrating load at the base of a tunnel can be solved.


    Models that can help to predict the propagation of vibrations inside soil layers are of great interest in earthquake sciences or when constructing railway lines and tunnels.

  • Introduction

    Railway vehicles passing through tight curves can produce a high-pitched noise called curve squeal. Curve squeal is a very salient type of noise in the high-frequency range that can vary between a tonal narrow-band noise and a wide-band noise. The tonal noise is caused by lateral creepage on the top of the rail, which excites wheel vibrations at frequencies corresponding to the wheel modes. Wide-band noise, however, is caused by wheel flanges touching the rail.


    The project PAAB aims to investigate the perceived annoyance of such noises in a perception test. The resulting perceptual characterization of curve squeal should help to consider this type of noise more adequately in noise mapping.


    Based on previous conventional large-scale emission measurements as well as new measurements at immission distances using a head-and-torso simulator, representative samples of curve squeal will be derived and used in a perception test. Synthetic, well-defined curve squeal noise will also be used.

    PAAB is funded by the FFG (project 860523) and the Austrian Federal Railways (ÖBB). The project is done in cooperation with the Research Center of Railway Engineering, Traffic Economics and Ropeways, Institute of Transportation, Vienna University of Technology (project leader), Kirisits Engineering Consultants, and psiacoustic Umweltforschung und Engineering GmbH.



  • Objective:

    Numerous implementations and algorithms for time-frequency analysis can be found in the literature or on the internet. Most of them are either not well documented or no longer maintained. P. Soendergaard started to develop the Linear Time Frequency Toolbox for MATLAB. The goal of this project is to find typical uses of this toolbox in acoustic applications and to incorporate successful, not-yet-implemented algorithms in STx.


    The Linear Time Frequency Toolbox is a small open-source MATLAB toolbox with functions for working with Gabor frames for finite sequences. It includes a 1D discrete Gabor transform (sampled STFT) with its inverse, works with both full-length and short windows, and computes the canonical dual and canonical tight windows.
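    To make the notion of a discrete Gabor transform concrete, the sketch below evaluates the textbook sum for a finite, periodically extended signal with a full-length window. It is written in plain Python for illustration and is not the toolbox's MATLAB code or its fast algorithm. The function names, the Gaussian window and the parameter names `hop` and `channels` are assumptions for this example.

```python
import cmath
import math


def periodic_gaussian(length):
    """Full-length Gaussian window, periodic and centered at index 0."""
    return [math.exp(-math.pi * min(l, length - l) ** 2 / length) for l in range(length)]


def dgt(signal, window, hop, channels):
    """Discrete Gabor transform of a finite signal (direct evaluation).

    coeffs[m][n] = sum_l signal[l] * conj(window[(l - hop*n) mod L]) * exp(-2*pi*i*m*l/channels)
    """
    length = len(signal)
    if len(window) != length or length % hop or length % channels:
        raise ValueError("window must be full length; hop and channels must divide it")
    shifts = length // hop
    coeffs = [[0j] * shifts for _ in range(channels)]
    for m in range(channels):
        for n in range(shifts):
            total = 0j
            for l in range(length):
                g = window[(l - hop * n) % length]
                total += signal[l] * g.conjugate() * cmath.exp(-2j * math.pi * m * l / channels)
            coeffs[m][n] = total
    return coeffs


# A complex exponential at frequency channel 2 of 8 (assumed test signal).
signal = [cmath.exp(2j * math.pi * 2 * l / 8) for l in range(16)]
coeffs = dgt(signal, periodic_gaussian(16), hop=4, channels=8)
print([round(abs(coeffs[m][0]), 3) for m in range(8)])
```

    The direct sum costs on the order of L operations per coefficient, so it is only suitable for short signals.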


    These algorithms are used for acoustic applications such as formant analysis, data compression, or denoising. The implementations are compared to the ones in STx and will be integrated into this software package if they improve its performance.


    • H. G. Feichtinger et al., NuHAG, Faculty of Mathematics, University of Vienna
    • B. Torrésani, Groupe de Traitement du Signal, Laboratoire d'Analyse Topologie et Probabilités, LATP/CMI, Université de Provence, Marseille
    • P. Soendergaard, Department of Mathematics, Technical University of Denmark
  • The spatially oriented format for acoustics (SOFA) is designed to store all kinds of acoustic information related to a specified geometrical setup. Its main task is to describe simple HRTF measurements, but SOFA also aims to store more elaborate measurements, such as BRIRs measured with a 64-channel microphone array in a multi-source excitation situation, or directivity measurements of a loudspeaker. The format is intended to be easily extendable, highly portable, and the greatest common divisor of all HRTF databases publicly available at the time of writing.

    SOFA defines the structure of data and metadata and stores them in a numerical container. The data description is hierarchical, ranging from free-field HRTFs (simple setup) to more complex setups such as microphone-array measurements in reverberant spaces excited by a loudspeaker array (complex setup). We will use a global geometry description (related to the room) and a local geometry description (related to the listener/source) without limiting the number of acoustic transmitters and receivers. Room descriptions will be available by linking a CAD file within SOFA. Networking support will be provided as well, allowing remote access to HRTFs and BRIRs from client computers.

    SOFA is being developed by many contributors worldwide. The development is coordinated at ARI by Piotr Majdak.

    Further information:
  • Baumgartner et al. (2017a)

    Spatial hearing is important for continuously monitoring the environment for interesting or dangerous sounds and for directing attention toward them. The spatial separation of the two ears and the complex geometry of the human body provide acoustic information about the location of a sound source. Depending on the direction of sound incidence, the pinna in particular changes the sound spectrum before the sound reaches the eardrum. Because the shape of the pinna is highly individual (even more so than a fingerprint), its spectral coloration is also highly individual. To artificially create realistic auditory percepts, this individuality has to be represented as precisely as necessary, although it has not yet been clarified what is actually necessary. SpExCue therefore searched for electrophysiological measures and prediction models that can capture how spatially realistic ("externalized") a virtual source is perceived to be.

    Because artificial sources are preferentially perceived inside the head, studying these sound spectra was also suited to investigating a bias in auditory perception: sound events approaching the listener are perceived as more intense than those receding from the listener. Earlier studies demonstrated this bias exclusively through changes in loudness (increasing/decreasing loudness was used to simulate approaching/receding sound events). It was therefore unclear whether the bias really stems from perceptual differences with respect to the direction of motion or only from the different loudness levels. Our study showed that spatial changes in timbre can elicit this bias (behaviorally and electrophysiologically) even at constant loudness, so a general perceptual bias can be assumed.

    Furthermore, SpExCue investigated how the combination of different spatial auditory cues affects attentional control in a speech recognition task with simultaneous talkers, as at a cocktail party. We found that natural combinations of spatial auditory cues evoke more brain activity in preparation for the test signal, thereby optimizing the neural processing of the speech that follows.

    SpExCue also compared different approaches to computational models that aim to predict the spatial perception of sound changes. Although many previous experimental results could be predicted by at least one of the model approaches, none of them could explain all of these results. To support the future development of more general computational models of spatial hearing, we finally developed a conceptual cognitive model for it.


    Erwin Schrödinger Fellowship from the Austrian Science Fund (FWF, J3803-N30) awarded to Robert Baumgartner. Duration: May 2016 - November 2017.

    Follow-up funding provided by Oculus VR, LLC, since March 2018. Project Investigator: Robert Baumgartner.


    • Baumgartner, R., Reed, D.K., Tóth, B., Best, V., Majdak, P., Colburn, H.S., Shinn-Cunningham, B. (2017a): Asymmetries in behavioral and neural responses to spectral cues demonstrate the generality of auditory looming bias, in: Proceedings of the National Academy of Sciences of the USA 114, 9743-9748. (article)
    • Baumgartner, R., Majdak, P., Colburn, H.S., Shinn-Cunningham, B. (2017b): Modeling Sound Externalization Based on Listener-specific Spectral Cues, presented at: Acoustics '17 Boston: The 3rd Joint Meeting of the Acoustical Society of America and the European Acoustics Association. Boston, MA, USA. (conference)
    • Deng, Y., Choi, I., Shinn-Cunningham, B.G., Baumgartner, R. (2019): Impoverished auditory cues fail to engage brain networks controlling spatial selective attention, in: bioRxiv, 533117. (preprint)
    • Majdak, P., Baumgartner, R., Jenny, C. (2019): Formation of three-dimensional auditory space, in: arXiv:1901.03990 [q-bio]. (preprint)
  • Objective:

    In the past, an FWF project dealing with the basics of stochastic transformation methods was carried out at the ARI. Specifically, the Karhunen-Loève expansion and the polynomial chaos transformation were applied in the wave number domain. The procedure is based on the assumption of Gaussian distributed variables. This assumption shall be generalized to arbitrary random variables.


    Working in the wave number domain limits the model to a horizontally layered half space. This limitation shall be overcome by using wavelet kernels instead of Fourier kernels in the transformation. The aim is to be able to calculate one-sided statistical distributions for the physical parameters and to handle arbitrary boundaries with the new method.

  • Multilateral Scientific and Technological Cooperation in the Danube Region 2017-2018
    Austria, Czech Republic, Republic of Serbia, and Slovak Republic
    Project duration: 01.01.2017 - 31.12.2018

    Project website:

  • Project part 02 of the Special Research Programme "Deutsch in Österreich. Variation - Kontakt - Perzeption" (German in Austria: Variation - Contact - Perception), funded by the FWF (FWF6002), in cooperation with the University of Salzburg

    Principal investigators: Stephan Elspaß, Hannes Scheutz, Sylvia Moosmüller

    Start of the project: 1 January 2016


    The project deals with the diversity and dynamics of the various dialects in Austria. Based on a new survey, a range of research questions will be addressed over the coming years, for example: What differences and changes (e.g., through processes of convergence and divergence) can be observed within and between the Austrian dialect regions? How does dialect change differ between urban and rural areas? Are there generational and gender differences in dialect change? What can a comprehensive comparison of real-time and apparent-time analyses contribute to a general theory of language change?

    To answer these questions, the first survey phase will record and analyze speech samples from a total of 160 dialect speakers from two different age groups at 40 locations in Austria. In addition, selected speakers will be recorded in the speech laboratory in order to determine pronunciation characteristics as precisely as possible in phonetic terms. In the second survey phase, supplementary laboratory recordings will be made at 100 further locations in Austria in order to analyze the differences and movements between the dialect regions in even more detail. State-of-the-art dialectometric methods will also be used here in order to make probabilistic statements about the variation and change of dialects in Austria. The analyses cover all linguistic levels, from pronunciation to grammar and vocabulary. The collected data will be documented digitally, among other formats. The plan is to make the data accessible to a broad audience on the platform "Deutsch in Österreich" at the end of the project, in particular in the form of the first "talking language atlas" of all of Austria.

    Project homepage of the cooperation partners in Salzburg


  • Objective:

    One of the biggest problems encountered when building numerical models for layers is the lack of exact deterministic material parameters. Therefore, stochastic models should be used. However, these models have the general drawback of placing heavy demands on computer resources. This project developed a stochastic model that can use a stochastic shear modulus in conjunction with a special iteration scheme allowing efficient implementation.


    With the Karhunen-Loève expansion (KLE), it is possible to split the stochastic shear modulus, and therefore the whole system, into a deterministic and a stochastic part. These parts can then be transformed into a linear system of equations using finite elements and polynomial chaos decomposition. Combining the KLE with the Fourier transformation and Plancherel's theorem enables decoupling of the deterministic part into smaller subsystems. An iteration scheme was developed that restricts the application of "costly" routines to these smaller deterministic subsystems instead of the whole higher-dimensional system matrix (up to a dimension of 10,000).
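    For orientation, the split into a deterministic and a stochastic part can be written in the usual form of a truncated KLE. The notation is assumed here and not taken from the project: G is the shear modulus, \bar{G} its mean, C its covariance kernel on the domain D, \theta the random outcome, and N the truncation order.

```latex
G(x,\theta) \approx \bar{G}(x) + \sum_{i=1}^{N} \sqrt{\lambda_i}\,\xi_i(\theta)\,\varphi_i(x),
\qquad
\int_D C(x,y)\,\varphi_i(y)\,\mathrm{d}y = \lambda_i\,\varphi_i(x)
```

    Here \bar{G} is the deterministic part, and the sum carries the randomness through the uncorrelated, zero-mean, unit-variance random variables \xi_i. The pairs (\lambda_i, \varphi_i) are the eigenvalues and eigenfunctions of the covariance kernel.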


    As concerns about vibrations produced by machinery and traffic have increased in recent years, models that can predict vibrations in soil have become more important. However, since material parameters for soil layers cannot be measured exactly in practice, it is reasonable to use stochastic models.

  • Vowel and consonant quantity in Southern German varieties: D - A - CH project granted by DFG, FWF, SNF

    Principal investigators: Felicitas Kleber, Michael Pucher, Sylvia Moosmüller†, Stephan Schmid

    Start of the project: 1st of June 2015


    Many studies have shown a relationship between diachronic change and synchronic variation, language acquisition, and internal and external factors. In this project we will investigate these topics by means of a methodologically comprehensive study of the (in)stability of quantity relations in vowel-consonant (VC) sequences in standard varieties and dialects of the Southern German language area. While quantity relations have remained relatively stable diachronically in Swiss German (modern stages of the language also have a vowel length contrast and a consonant length contrast), the Old High German consonant length contrast has been lost in Standard German; in the Central Bavarian varieties, in turn, vowel and consonant length are in complementary distribution. The Bavarian quantity relations, however, appear to be changing, presumably due to dialect leveling. The project takes the opportunity to investigate this prosodic change in progress in a large-scale apparent-time study and to compare the unstable quantity relations with more stable ones in other varieties. The main goal is to model the conditions under which quantity relations change diachronically, in order to gain a better understanding of prosodic change processes in the world's languages. The project has four specific aims: (1) to establish a typology of quantity in Upper German dialects and the three national standard varieties; (2) to investigate further the prosodic change and the reasons for change processes in some varieties; (3) to test the influence of internal and external factors (e.g., speech rate vs. language attitudes) on synchronic variation in the production-perception relationship in adults; (4) to investigate the development of quantity relations in first language acquisition and a possible relationship between synchronic variation in child speech and prosodic change.

    The innovation of the project lies in the large-scale, cross-varietal apparent-time study, which also includes children of different ages, in the combination of articulatory, acoustic, and auditory methods, and in the consideration of social and phonetic factors.

    The planned project combines the acoustic and sociophonetic analyses of quantity relations in Austrian varieties established at the ARI (Vienna) with the acoustic, perceptual, and physiological methods for studying the production-perception relationship in speakers of all age groups developed at the IPS (Munich) and the methods for the typological classification of Swiss German varieties established at the Phonetisches Laboratorium (Zurich). In the long term, the collaboration serves to develop a model of sound change at the phonetics-phonology interface that integrates the results obtained in this project on internal and external factors in the stability of quantity, language acquisition, and linguistic change.