FWF

  • The FWF project "Time-Frequency Implementation of HRTFs" has started.

    Principal Investigator: Damian Marelli

    Co-Applicants: Peter Balazs, Piotr Majdak

  • Proposal for a Master studentship (f/m)

     

    Title: Measurements of auditory time-frequency masking kernels for various masker frequencies and levels.

     

    Duration: 6 months, working time = 20 hours/week.

     

    Starting date: ASAP.

     

    Closing date for applications: until the position is filled.

    Description

     

    Background: Over the last decades, many psychoacoustical studies investigated auditory masking, an important property of auditory perception. Masking refers to the degradation of the detection of a sound (referred to as the “target”) in the presence of another sound (the “masker”). In the literature, masking has been extensively investigated with simultaneous (spectral masking) and non-simultaneous (temporal masking) presentation of masker and target. The results were used to develop models of either spectral or temporal masking. Attempts were made to simply combine these models to account for time-frequency masking in perceptual audio codecs like MP3. However, a recent study on time-frequency masking conducted at our lab [1] revealed the inaccuracy of such simple models. The development of an efficient model of time-frequency masking for short-duration and narrow-band signals still remains a challenge. For instance, such a model is crucial for the prediction of masking in time-frequency representations of sounds and is expected to improve current perceptual audio codecs.

     

    In the previous study [1], the time-frequency masking kernel for a 10-ms Gaussian-shaped sinusoid was measured at a frequency of 4 kHz and a sensation level of 60 dB. A Gaussian envelope is used because it allows for maximum compactness in the time-frequency domain. While these data constitute a crucial basis for the development of an efficient model of time-frequency masking, additional psychoacoustical data are required, particularly the time-frequency masking kernels for different Gaussian masker frequencies and sensation levels.
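
    For illustration, such a Gaussian-shaped target can be generated in a few lines. A minimal Python sketch, assuming a 44.1-kHz sampling rate; the exact width convention for the “10-ms” Gaussian follows [1], so the value of sigma below is an illustrative choice only:

        import numpy as np

        fs = 44100.0    # sampling rate in Hz (assumed)
        f0 = 4000.0     # carrier frequency, as in [1]
        sigma = 0.005   # Gaussian width in s; illustrative "10-ms" choice

        t = np.arange(-0.025, 0.025, 1.0 / fs)              # time axis centered on the peak
        envelope = np.exp(-0.5 * (t / sigma) ** 2)          # Gaussian envelope
        target = envelope * np.sin(2.0 * np.pi * f0 * t)    # Gaussian-shaped sinusoid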

     

    The proposed work is part of the ongoing research project POTION: “Perceptual Optimization of audio representaTIONs and coding”, jointly funded by the Austrian Science Fund (FWF) and the French National Research Agency (ANR).

     

    Aims: The first goal of the work is to conduct psychoacoustical experiments to measure the time-frequency masking kernels for three masker sensation levels (20, 40, and 60 dB) and three masker frequencies (0.75, 4.0, and 8.0 kHz), following the methods in [1]. This part will consist of experimental design, programming, and data collection. The second goal of the work is to interpret the data and compare them to literature data for maskers with various spectro-temporal shapes. This step shall involve the use of state-of-the-art models of the auditory periphery to predict the data.

     

    Applications: The data will be used to develop a new model of time-frequency masking that should later be implemented and tested in a perceptual audio codec.

     

    Required skills: qualification for a Master thesis, knowledge of psychophysical methods and psychoacoustics, Matlab programming, good communication skills, and proficient spoken/written English; experience with auditory models would be a plus.

     

    Gross salary: 948.80€/month.

     

    Supervisors: Thibaud Necciari and Bernhard Laback
    Tel: +43 1 51581-2538

     

    Reference:

    [1] T. Necciari. Auditory time-frequency masking: Psychoacoustical measures and application to the analysis-synthesis of sound signals. PhD thesis, Aix-Marseille I University, France, October 2010. Available online.

  • BE-SyMPHONic: French-Austrian joint project granted by ANR and FWF

    Principal investigators: Basilio Calderone, Wolfgang U. Dressler
    Co-applicants: Hélène Giraudo, Sylvia Moosmüller

    Start of the project: 13th January 2014

    Introduction:

    Language sounds are realized in several different ways. Every language exploits no more than a sub-set of the sounds that the vocal tract can produce, as well as a reduced number of their possible combinations. The restrictions on, and the allowed combinations of, phonemes in a language define the branch of phonology called phonotactics.

    Phonotactics refers to the sequential arrangement of phonemic segments in morphemes, syllables, and words and underlies a wide range of phonological issues, from acceptability judgements (pseudowords like <poiture> in French or <Traus> in German are phonotactically plausible) to syllable processes (the syllabic structure in a given language is based on what that language's phonotactics permits) and the nature and length of possible consonant clusters (which may be seen as intrinsically marked structures with respect to the basic CV template).

    Objective:

    The aim of this research project is to explore the psycho-computational representation of phonotactics in French and German.

    In particular, our research will focus on the interplay between phonotactics and word structure in French and German, and investigate the behavioural and computational representations of phonotactic vs. morphonotactic clusters.

    The basic hypothesis underlying this research project is that there exist different cognitive and computational representations for the same consonant cluster according to its phonotactic setting. In particular, the occurrence of a cluster across a morpheme boundary (a morphonotactic cluster) is considered particularly interesting.

    Method:

    Our research will focus on the interplay between phonotactics and morphology and investigate the behavioural and computational representations of consonant clusters according to whether they are: a) exclusively phonotactic clusters, i.e. the consonant cluster occurs only without morpheme boundaries (e.g. Stein in German); b) exclusively morphonotactic clusters, i.e. the consonant cluster occurs only across morpheme boundaries (e.g. lach+st); or c) both, with one of the two being more or less dominant (e.g. dominant lob+st vs. Obst) [1]. Thus we test the existence of different ‘cognitive and computational representations’ and processes for the same and for similar consonant clusters according to whether they belong to a), b), or c).
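
    As a toy illustration of conditions a)-c), the following Python sketch classifies a cluster from a morphologically annotated word list. The mini-lexicon is invented, '+' marks a morpheme boundary, and orthographic strings stand in for phonemic transcriptions:

        # Toy, morphologically segmented lexicon (invented; '+' marks a morpheme boundary).
        LEXICON = ["stein", "lach+st", "lob+st", "obst"]

        def occurrences(word, cluster):
            """Count tautomorphemic vs. boundary-spanning occurrences of a cluster."""
            flat = word.replace("+", "")
            boundaries, pos = set(), 0
            for morph in word.split("+")[:-1]:   # boundary p lies between flat[p-1] and flat[p]
                pos += len(morph)
                boundaries.add(pos)
            tauto = hetero = 0
            for i in range(len(flat) - len(cluster) + 1):
                if flat[i:i + len(cluster)] == cluster:
                    internal = set(range(i + 1, i + len(cluster)))
                    if internal & boundaries:
                        hetero += 1   # cluster spans a morpheme boundary
                    else:
                        tauto += 1    # cluster lies within a single morph
            return tauto, hetero

        def classify(cluster, lexicon):
            tauto = sum(occurrences(w, cluster)[0] for w in lexicon)
            hetero = sum(occurrences(w, cluster)[1] for w in lexicon)
            if tauto and not hetero:
                return "a) exclusively phonotactic"
            if hetero and not tauto:
                return "b) exclusively morphonotactic"
            if tauto and hetero:
                return "c) both (dominance from relative frequency)"
            return "unattested"

        print(classify("bst", LEXICON))   # -> c) both: within 'obst', across the boundary in 'lob+st'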

    The central hypothesis which we test is that speakers not only reactively exploit the potential boundary-signaling function of clusters that result from morphological operations, but take active measures to maintain or even enhance this functionality, for example by treating morphologically produced clusters differently than morpheme-internal clusters in production or language acquisition. We call this hypothesis the ‘Strong Morphonotactic Hypothesis’ (henceforth: SMH) (Dressler & Dziubalska-Kołaczyk 2006, Dressler, Dziubalska-Kołaczyk & Pestal 2010).

    In particular, we suppose that sequences of phonemes spanning morpheme boundaries (the ‘morphonotactic clusters’) should provide speakers with functional evidence about the morphological operation that occurred in that sequence; such evidence should be absent in the case of a sequence of phonemes without morpheme boundaries (the ‘phonotactic clusters’).

    Hence our idea is to investigate the psycho-computational mechanisms underlying the phonotactic-morphonotactic distinction by approaching the problem from two angles simultaneously: (a) psycholinguistic experimental studies of language acquisition and production and (b) computational language modelling.

    We therefore aim at providing, on the one hand, psycholinguistic and behavioural support for the hypothesis that morphologically produced clusters are treated differently from morpheme-internal clusters in French and German; on the other hand, we will focus on the distributional and statistical properties of the two languages in order to verify whether this difference in cluster treatment can be modelled inductively by appealing to distributional regularities of the language.

    The competences of the two research teams overlap and complement each other. The French team will lead in modelling, computational simulation and psycholinguistic experiments, the Austrian team in first language acquisition, phonetic production and microdiachronic change. These synergies are expected to enrich each group in innovative ways.


    [1] An equivalent example for French is given by a) prise (/priz/ ‘grip’, exclusively phonotactic cluster), b) affiche+rai (/afiʃʁɛ/ ‘I (will) post’, exclusively morphonotactic cluster), and c) navigue+rai (/naviɡʁɛ/ ‘I (will) sail’) vs. engrais (/ãɡʁɛ/ ‘fertilizer’), where both conditions hold, with the morphonotactic condition dominant.

  • BiPhase: Binaural Hearing and the Cochlear Phase Response

    Project Description

    While it is often assumed that our auditory system is phase-deaf, there is a body of literature showing that listeners are very sensitive to phase differences between spectral components of a sound. Particularly, for spectral components falling into the same perceptual filter, the so-called auditory filter, a change in relative phase across components causes a change in the temporal pattern at the output of the filter. The phase response of the auditory filter is thus important for any auditory task that relies on within-channel temporal envelope information, most notably temporal pitch or interaural time differences.

    Within-channel phase sensitivity has been used to derive a psychophysical measure of the phase response of auditory filters (Kohlrausch and Sander, 1995). The basic idea of the widely used masking paradigm is that a harmonic complex whose phase curvature roughly mirrors the phase response of the auditory filter spectrally centered on the complex causes a maximally modulated (peaked) internal representation and thus elicits minimal masking of a pure-tone target at the same center frequency. Therefore, systematic variation of the phase curvature of the harmonic complex (the masker) makes it possible to estimate the auditory filter’s phase response: the masker phase curvature causing minimal masking reflects the mirrored phase response of the auditory filter.
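
    For illustration, such maskers can be generated with the scalar Schroeder-phase rule commonly used in this literature, in which a curvature factor C is varied between -1 and +1. A minimal Python sketch; the fundamental, component range, duration, and sampling rate are illustrative choices, and level calibration is omitted:

        import numpy as np

        def schroeder_complex(C, f0=100.0, n_low=2, n_high=40, dur=0.3, fs=44100.0):
            """Harmonic complex whose phase curvature is set by the scalar C in [-1, 1]."""
            t = np.arange(int(dur * fs)) / fs
            N = n_high - n_low + 1
            x = np.zeros_like(t)
            for n in range(n_low, n_high + 1):
                phi = C * np.pi * n * (n + 1) / N    # scalar Schroeder phase
                x += np.cos(2.0 * np.pi * n * f0 * t + phi)
            return x / np.max(np.abs(x))             # normalized; no SPL calibration

        # Sweep the curvature; the C yielding minimal masking mirrors the filter's phase response.
        maskers = {C: schroeder_complex(C) for C in np.arange(-1.0, 1.25, 0.25)}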

    Besides the obvious importance of detecting the target in the temporal dips of the masker, particularly if the target is short compared to the modulation period of the masker (Kohlrausch and Sander, 1995), there are several indications that fast compression in the cochlea is important to obtain the masker-phase effect (e.g., Carlyon and Datta, 1997; Oxenham and Dau, 2004). One indication is that listeners with sensorineural hearing impairment (HI), characterized by reduced or absent cochlear compression due to loss of outer hair cells, show only a very weak masker-phase effect, making it difficult to estimate the cochlear phase response.

    In the BiPhase project we propose a new paradigm for measuring the cochlear phase response that does not rely on cochlear compression and thus should be applicable in HI listeners. It relies on the idea that the amount of modulation (peakedness) in the internal representation of a harmonic complex, as given by its phase curvature, determines the listener’s sensitivity to envelope interaural time differences (ITDs) imposed on the stimulus. Assuming that the listener’s sensitivity to envelope ITD does not rely on compression, systematic variation of the stimulus phase curvature should make it possible to estimate the cochlear phase response both in normal-hearing (NH) and HI listeners. The main goals of BiPhase are the following:
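
    The core stimulus manipulation can be sketched as follows (a minimal Python version, assuming a Hilbert-based envelope/fine-structure decomposition and a simple whole-waveform envelope delay; the stimuli actually used in the project may have been generated differently):

        import numpy as np
        from scipy.signal import hilbert

        def impose_envelope_itd(x, itd_s, fs):
            """Impose an envelope ITD: delay the temporal envelope in one ear only."""
            analytic = hilbert(x)
            env = np.abs(analytic)              # temporal envelope
            tfs = np.cos(np.angle(analytic))    # temporal fine structure
            shift = int(round(itd_s * fs))
            env_delayed = np.roll(env, shift)   # circular shift as a crude delay
            left = env * tfs                    # leading ear
            right = env_delayed * tfs           # lagging ear: same fine structure
            return left, right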

    • Aim 1: Assessment of the importance of cochlear compression for the masker-phase effect at different masker levels. Masking experiments are performed with NH listeners using Schroeder-phase harmonic complexes with and without a precursor stimulus, intended to reduce cochlear compression by activation of the efferent system controlling outer-hair-cell activity. In addition, a quantitative modeling approach is used to estimate the contribution of compression from outer-hair-cell activity and other factors to the masker-phase effect. The results are described in Tabuchi, Laback, Necciari, and Majdak (2016). A follow-up study on the dependency of the masker-phase effect on masker and target duration, the target’s position within the masker, the masker level, and the masker bandwidth, with conclusions on the role of compression in the mechanisms underlying simultaneous and forward masking, is underway.
    • Aim 2: Development and evaluation of an envelope ITD-based paradigm to estimate the cochlear phase response. The experimental results on NH listeners, complemented with a modeling approach and predictions, are described in Tabuchi and Laback (2017). This paper also provides model predictions for HI listeners.
      Besides the consistency of the overall pattern of ITD thresholds across phase curvatures with data on the masking paradigm and predictions of the envelope ITD model, an unexpected peak in the ITD thresholds was found for a negative phase curvature which was not predicted by the ITD model and is not found in masking data. Furthermore, the pattern of results for individual listeners appeared to reveal more variability than the masking paradigm. Data were also collected with an alternative method, relying on the extent of laterality of a target with supra-threshold ITD, as measured with an interaural-level-difference-based pointing stimulus. These data showed no nonmonotonic behavior at negative phase curvatures. Rather, they showed good correspondence with the ITD model prediction and more consistent results across individuals compared to the ITD threshold-based method (Zenke, Laback, and Tabuchi, 2016).
    • Aim 3: Development of an ITD-based method to account for potentially non-uniform curvatures of the phase response in HI listeners. Using two independent iterative approaches, NH listeners adjusted the phase of individual harmonics of an ITD-carrying complex so that it elicited the maximum extent of laterality. Although the pattern of adjusted phases very roughly resembled the expected pattern, there was a large amount of uncertainty (Zenke, 2014), preventing the method from further use. Modified versions of the method will be considered in a future study.

    Funding

    This project is funded by the Austrian Science Fund (FWF, Project # P24183-N24, awarded to Bernhard Laback). It ran from 2013 to 2017.

    Publications

    Peer-reviewed papers

    • Tabuchi, H. and Laback, B. (2017): Psychophysical and modeling approaches towards determining the cochlear phase response based on interaural time differences, The Journal of the Acoustical Society of America 141, 4314–4331.
    • Tabuchi, H., Laback, B., Necciari, T., and Majdak, P. (2016). The role of compression in the simultaneous masker phase effect, The Journal of the Acoustical Society of America 140, 2680-2694.

    Conference talks

    • Tabuchi, H., Laback, B., Majdak, P., and Necciari, T. (2014). The role of precursor in tone detection with Schroeder-phase complex maskers. Poster presented at 37th Association for Research in Otolaryngology (ARO) Meeting, San Diego, California.
    • Tabuchi, H., Laback, B., Majdak, P., and Necciari, T. (2014). The perceptual consequences of a precursor on tone detection with Schroeder-phase harmonic maskers. Invited talk at Alps Adria Acoustics Association, Graz, Austria.
    • Tabuchi, H., Laback, B., Majdak, P., Necciari, T., and Zenke, K. (2015). Measuring the auditory phase response based on interaural time differences. Talk at 169th Meeting of the Acoustical Society of America, Pittsburgh, Pennsylvania.
    • Zenke, K., Laback, B., and Tabuchi, H. (2016). Towards an Efficient Method to Derive the Phase Response in Hearing-Impaired Listeners. Talk at 39th Association for Research in Otolaryngology (ARO) Meeting, San Diego, California.
    • Tabuchi, H., Laback, B., Majdak, P., Necciari, T., and Zenke, K. (2016). Modeling the cochlear phase response estimated in a binaural task. Talk at 39th Association for Research in Otolaryngology (ARO) Meeting, San Diego, California.
    • Laback, B., and Tabuchi, H. (2017). Psychophysical and modeling approaches towards determining the cochlear phase response based on interaural time differences. Invited Talk at AABBA Meeting, Vienna, Austria.
    • Laback, B., and Tabuchi, H. (2017). Psychophysical and modeling approaches towards determining the cochlear phase response based on interaural time differences. Invited talk at 3rd Workshop “Cognitive neuroscience of auditory and cross-modal perception”, Kosice, Slovakia.

    References

    • Carlyon, R. P., and Datta, A. J. (1997). "Excitation produced by Schroeder-phase complexes: evidence for fast-acting compression in the auditory system," J Acoust Soc Am 101, 3636-3647.
    • Kohlrausch, A., and Sander, A. (1995). "Phase effects in masking related to dispersion in the inner ear. II. Masking period patterns of short targets," J Acoust Soc Am 97, 1817-1829.
    • Oxenham, A. J., and Dau, T. (2004). "Masker phase effects in normal-hearing and hearing-impaired listeners: evidence for peripheral compression at low signal frequencies," J Acoust Soc Am 116, 2248-2257.

    See also

    POTION

  • BIOTOP: Description
    Workflow BIOTOP

    Introduction

    The localization of sound sources plays an important role in everyday life. The shape of the human head, the torso, and above all the outer ear (pinna) filter incoming sound and therefore play an important role in locating a sound source. This filtering effect can be described by means of so-called head-related transfer functions (HRTFs). These filter functions can be calculated using numerical methods, for example the boundary element method (BEM). In BIOTOP, these calculations are to be made more efficient by applying adaptive wavelet and frame methods.

    Objective

    Compared with conventional BEM basis functions, wavelets have the advantage that they can be adapted better to given sound distributions. As a generalization of wavelets, frames should help to develop an even more flexible calculation method and thus an even better adaptation to the given problem. BIOTOP combines abstract mathematical theory with numerical and applied mathematics. It is an international DACH project (DFG-FWF-SNF) between Philipps-Universität Marburg (Stephan Dahlke), Universität Basel (Helmut Harbrecht), and the Acoustics Research Institute. The combined experience of these three research groups should help to develop new numerical strategies and methods. The project is funded by the FWF (project number I-1018 N25).

     

  • START project of P. Balazs.

    FLAME

     


    This international, multi-disciplinary and team-oriented project will expand the group Mathematics and Acoustical Signal Processing at the Acoustics Research Institute, in cooperation with NuHAG Vienna (Hans G. Feichtinger, M. Dörfler, K. Gröchenig), the Institute of Telecommunications Vienna (Franz Hlawatsch), LATP Marseille (Bruno Torrésani), LMA Marseille (Richard Kronland-Martinet), CAHR (Torsten Dau, Peter Soendergaard), FYMA Louvain-la-Neuve (Jean-Pierre Antoine), AG Numerics (Stephan Dahlke), and the School of Electrical Engineering and Computer Science (Damian Marelli), as well as the BKA Wiesbaden (Timo Becker).

    Within the institute, the groups Audiological Acoustics and Psychoacoustics, Computational Acoustics, Acoustic Phonetics, and Software Development are involved in the project.

    This project is funded by the FWF as a START prize. It is planned to run from May 2012 to April 2018.

     


    General description:

    We live in the age of information, where the analysis, classification, and transmission of information is of essential importance. Signal processing tools and algorithms form the backbone of important technologies like MP3, digital television, mobile phones, and wireless networking. Many signal processing algorithms have been adapted for applications in audio and acoustics, also taking into account the properties of the human auditory system.

    The mathematical concept of frames describes a theoretical background for signal processing. Frames are generalizations of orthonormal bases that give more freedom for the analysis and modification of information; however, this concept is still not firmly rooted in applied research. The link between mathematical frame theory, signal processing algorithms, their implementations, and finally acoustical applications is a very promising, synergetic combination of research in different fields.
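
    For reference, the defining property (the standard textbook definition, not anything specific to FLAME): a family {φ_n} in a Hilbert space H is a frame if there exist constants 0 < A ≤ B < ∞ such that

        A \|f\|^2 \le \sum_n |\langle f, \varphi_n \rangle|^2 \le B \|f\|^2 \qquad \text{for all } f \in \mathcal{H}.

    An orthonormal basis satisfies this with A = B = 1 (Parseval's identity); frames drop orthogonality and linear independence, which is exactly the extra freedom for analysis and modification mentioned above.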

    Therefore the main goal of this multidisciplinary project is to

    -> Establish Frame Theory as Theoretical Backbone of Acoustical Modeling

    in particular in psychoacoustics, phonetics, and computational acoustics, as well as audio engineering.

    Overview

     

    Through this promising connection of disciplines, FLAME will produce substantial impact on both theory and applied research.

    The theory-based part of FLAME consists of the following topics:

    • T1 Frame Analysis and Reconstruction Beyond Classical Approaches
    • T2 Frame Multipliers, Extended (see the basic definition after this list)
    • T3 Novel Frame Representation of Operators Motivated by Computational Acoustics
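
    For orientation, the basic object behind T2 is the frame multiplier (standard definition from the frame literature): given frames {φ_n} and {ψ_n} and a scalar symbol (m_n),

        M_{m,\Phi,\Psi} f = \sum_n m_n \, \langle f, \psi_n \rangle \, \varphi_n.

    For Gabor frames and a time-frequency mask m, this reduces to the classical Gabor multiplier; T2 studies extensions of this operator class.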

    The application-oriented part of FLAME consists of:

    • A1 Advanced Frame Methods for Perceptual Sparsity in the Time-Frequency Plane
    • A2 Advanced Frame Methods for the Analysis and Classification of Speech
    • A3 Advanced Frame Methods for Signal Enhancement and System Estimation


  • Localization of sound sources is an important task of the human auditory system and much research effort has been put into the development of audio devices for virtual acoustics, i.e., the reproduction of spatial sounds via headphones. Even though the process of sound localization is not completely understood yet, it is possible to simulate spatial sounds via headphones by using head-related transfer functions (HRTFs). HRTFs describe the filtering of the incoming sound by the head, torso, and particularly the pinna, and thus they strongly depend on the particular details of the listener's geometry. In general, for realistic spatial-sound reproduction via headphones, the individual HRTFs must be measured. As of 2012, the available HRTF acquisition methods were acoustic measurements: a technically complex process involving placing microphones into the listener's ears and lasting for tens of minutes.
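
    The rendering step itself is plain filtering: the mono source signal is convolved with the listener-specific head-related impulse responses (HRIRs, the time-domain equivalents of HRTFs) for the desired direction. A minimal sketch, with hrir_left and hrir_right as hypothetical, already-measured arrays:

        import numpy as np
        from scipy.signal import fftconvolve

        def render_binaural(mono, hrir_left, hrir_right):
            """Spatialize a mono signal by convolving it with a pair of HRIRs."""
            left = fftconvolve(mono, hrir_left)
            right = fftconvolve(mono, hrir_right)
            return np.stack([left, right], axis=-1)   # (samples, 2) headphone signal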

    In LocaPhoto, we were working on an easily accessible method to acquire and evaluate listener-specific HRTFs. The idea was to numerically calculate HRTFs based on a geometrical representation of the listener (3-D mesh) obtained from 2-D photos by means of photogrammetric reconstruction.

    As a result, we have developed a software package for numerical HRTF calculations, a method for geometry acquisition, and models able to evaluate HRTFs in terms of broadband ITDs and sagittal-plane sound localization performance.

     

    Further information:

    http://www.kfs.oeaw.ac.at/LocaPhoto

     

  • French-Austrian bilateral research project funded by the French National Agency of Research (ANR) and the Austrian Science Fund (FWF, project no. I 1362-N30). The project involves two academic partners, namely the Laboratory of Mechanics and Acoustics (LMA - CNRS UPR 7051, France) and the Acoustics Research Institute. At the ARI, two research groups are involved in the project: the Mathematics and Signal Processing in Acoustics and the Psychoacoustics and Experimental Audiology groups.

    Principal investigators: Thibaud Necciari (ARI), Piotr Majdak (ARI) and Olivier Derrien (LMA).

    Running period: 2014-2017 (project started on March 1, 2014).

    Abstract:

    One of the greatest challenges in signal processing is to develop efficient signal representations. An efficient representation extracts relevant information and describes it with a minimal amount of data. In the specific context of sound processing, and especially in audio coding, where the goal is to minimize the size of binary data required for storage or transmission, it is desirable that the representation take into account human auditory perception and allow reconstruction with a controlled amount of perceived distortion. Over the last decades, many psychoacoustical studies investigated auditory masking, an important property of auditory perception. Masking refers to the degradation of the detection threshold of a sound in the presence of another sound. The results were used to develop models of either spectral or temporal masking. Attempts were made to simply combine these models to account for time-frequency (t-f) masking effects in perceptual audio codecs. Psychoacoustical studies on t-f masking that we recently conducted revealed the inaccuracy of such simple models. These new data on t-f masking represent a crucial basis to account for masking effects in t-f representations of sounds. Although t-f representations are standard tools in audio processing, the development of a t-f representation of audio signals that is mathematically founded, perception-based, perfectly invertible, and possibly with a minimum amount of redundancy remains a challenge. POTION thus addresses the following questions:

    1. To what extent is it possible to obtain a perception-based (i.e., as close as possible to “what we see is what we hear”), perfectly invertible, and possibly minimally redundant t-f representation of sound signals? Such a representation is essential for modeling complex masking interactions in the t-f domain and is expected to improve our understanding of auditory processing of real-world sounds. Moreover, it is of fundamental interest for many audio applications involving sound analysis-synthesis.
    2. Is it possible to improve current perceptual audio codecs by considering a joint t-f approach? To reduce the size of digital audio files, perceptual audio codecs like MP3 decompose sounds into variable-length time segments, apply a frequency transform, and use masking models to control the quantization of transform coefficients within each segment. Thus, current codecs follow mainly a spectral approach, although temporal masking effects are taken into account in some implementations. By combining an efficient perception-based t-f transform with a joint t-f masking model in an audio codec, we expect to achieve significant performance improvements.

    Working program:

    POTION is structured in three main tasks:

    1. Perception-based t-f representation of audio signals with perfect reconstruction: A linear and perfectly invertible t-f representation will be created by exploiting the recently developed non-stationary Gabor theory as a mathematical background. The transform will be designed so that its t-f resolution mimics the t-f analysis properties of the auditory system and possibly no redundancy is introduced, to maximize the coding efficiency (a baseline illustration of the perfect-reconstruction property follows this list).
    2. Development and implementation of a t-f masking model: Based on psychoacoustical data on t-f masking collected by the partners in previous projects and on literature data, a new, complex model of t-f masking will be developed and implemented in the computationally efficient representation built in task 1. Additional psychoacoustical data required for the development of the model, involving frequency, level, and duration effects in masking for either single or multiple maskers, will be collected. The resulting signal processing algorithm should represent and re-synthesize only the perceptually relevant components of the signal. It will be calibrated and validated by conducting listening tests with synthetic and real-world sounds.
    3. Optimization of perceptual audio codecs: This task represents the main application of POTION. It will consist in combining the new efficient representation built in task 1 with the new t-f masking model built in task 2 for implementation in a perceptual audio codec.
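
    As a baseline for the perfect-reconstruction property required in task 1, the following sketch checks the analysis-synthesis round trip of a standard, stationary Gabor/STFT pair (SciPy is used purely for illustration; the non-stationary, auditory-adapted transform developed in POTION generalizes this setting):

        import numpy as np
        from scipy.signal import stft, istft

        fs = 44100
        x = np.random.randn(fs)                 # 1 s of noise as a test signal

        # Analysis and synthesis with a COLA-compliant Hann window
        f, t, X = stft(x, fs=fs, nperseg=1024, noverlap=512)
        _, x_rec = istft(X, fs=fs, nperseg=1024, noverlap=512)

        err = np.max(np.abs(x - x_rec[:len(x)]))
        print(f"max reconstruction error: {err:.2e}")   # on the order of machine precision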

    More information on the project can be found on the POTION web page.

    Publications:

    • Chardon, G., Necciari, Th., Balazs, P. (2014): Perceptual matching pursuit with Gabor dictionaries and time-frequency masking, in: Proceedings of the 39th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2014). Florence, Italy, 3126-3130. (proceedings)


  • SpExCue

    Spatial hearing is important for continuously monitoring the environment for interesting or dangerous sounds and for being able to direct attention to them. The spatial separation of the two ears and the complex geometry of the human body provide acoustic information about the location of a sound source. Depending on the direction of sound incidence, it is above all the pinna that modifies the sound spectrum before the sound reaches the eardrum. Since the pinna is shaped very individually (even more so than a fingerprint), its spectral coloration is also highly individual. For the artificial creation of realistic auditory percepts, this individuality must be represented as precisely as necessary, whereby it has so far remained unclear what is actually necessary. SpExCue therefore investigated electrophysiological measures and prediction models that can capture how spatially realistic (“externalized”) a virtual sound source is perceived.

    Since artificial sources are predominantly perceived inside the head, the investigation of these sound spectra was at the same time suited to studying a bias in auditory perception: sound events approaching a listener are perceived as more intense than those receding from the listener. Earlier studies demonstrated this bias exclusively via loudness changes (increasing/decreasing loudness was used to simulate approaching/receding sound events). It was therefore unclear whether the bias truly reflects perceptual differences with respect to the direction of motion or merely the different loudness levels. Our study was able to show that spatial changes in timbre can elicit these biases (behaviorally and electrophysiologically) even at constant loudness, so that a general perceptual bias can be assumed.

    SpExCue further investigated how the combination of different spatial auditory cues influences attentional control in a speech recognition task with simultaneous talkers, as at a cocktail party, for example. We found that natural combinations of spatial auditory cues evoke more brain activity in preparation for the test signal, thereby optimizing the neural processing of the upcoming speech.

    SpExCue also compared different computational modeling approaches aimed at predicting the spatial perception of sound changes. Although many previous experimental results could be predicted by at least one of the modeling approaches, none of them could explain all of these results. To support the future development of more general computational models of spatial hearing, we finally developed a conceptual cognitive model for this purpose.

    Funding

    Erwin Schrödinger Fellowship from the Austrian Science Fund (FWF, J3803-N30) awarded to Robert Baumgartner. Duration: May 2016 - November 2017.

    Follow-up funding provided by Oculus VR, LLC, since March 2018. Project Investigator: Robert Baumgartner.

    Publications

    • Baumgartner, R., Reed, D.K., Tóth, B., Best, V., Majdak, P., Colburn H.S., Shinn-Cunningham B. (2017a): Asymmetries in behavioral and neural responses to spectral cues demonstrate the generality of auditory looming bias, in: Proceedings of the National Academy of Sciences of the USA 114, 9743-9748. (article)
    • Baumgartner, R., Majdak, P., Colburn H.S., Shinn-Cunningham B. (2017b): Modeling Sound Externalization Based on Listener-specific Spectral Cues, presented at: Acoustics ‘17 Boston: The 3rd Joint Meeting of the Acoustical Society of America and the European Acoustics Association. Boston, MA, USA. (conference)
    • Deng, Y., Choi, I., Shinn-Cunningham, B. G., Baumgartner, R. (2019): Impoverished auditory cues fail to engage brain networks controlling spatial selective attention, in: bioRxiv, 533117. (preprint)
    • Majdak, P., Baumgartner, R., Jenny, C. (2019): Formation of three-dimensional auditory space, in: arXiv:1901.03990 [q-bio]. (preprint)
  • Project part 02 of the Special Research Programme (SFB) “Deutsch in Österreich. Variation - Kontakt - Perzeption” (German in Austria: Variation - Contact - Perception), funded by the FWF (FWF6002) in cooperation with the University of Salzburg

    Project leaders: Stephan Elspaß, Hannes Scheutz, Sylvia Moosmüller

    Start of the project: 1 January 2016

    Project description:

    The project addresses the diversity and the dynamics of the various dialects in Austria. On the basis of a new survey, various research questions are to be answered in the coming years, for example: Which differences and changes (e.g., through convergence and divergence processes) can be observed within and between the Austrian dialect regions? Which differences in dialect change exist between urban and rural areas? Can generational and gender differences affecting dialect change be identified? What can a comprehensive comparison of real-time and apparent-time analyses contribute to a general theory of language change?

    To answer these questions, speech samples from a total of 160 dialect speakers from two different age groups will be recorded and analyzed at 40 Austrian locations in a first survey phase. In addition, recordings of selected speakers will be made in the speech laboratory in order to determine pronunciation characteristics as precisely as possible phonetically. In a second survey phase, complementary laboratory recordings will be made at 100 further locations in Austria in order to analyze the differences and movements between the dialect regions in even more detail. The latest dialectometric methods will also be employed here in order to make probabilistic statements about the variation and change of dialects in Austria. The analyses cover all linguistic levels, from pronunciation to grammar and vocabulary. The data obtained will be documented digitally, among other formats. It is planned to finally make the data accessible to a broad audience on the platform “Deutsch in Österreich”, in particular in the form of the first ‘talking language atlas’ of the whole of Austria.

    Project homepage of the cooperation partners in Salzburg

     

  • FWF DACH I 536-G20: 2011-2013
    Cooperation with the Institute of Phonetics and Speech Processing, LMU Munich.

    Project leader (Austria): Sylvia Moosmüller
    Project leader (Germany): Jonathan Harrington

    Objective:

    Across languages, a distinction between so-called tense and lax vowels is frequently encountered, e.g., Miete - Mitte ("rent" - "center") or Höhle - Hölle ("cave" - "hell"). However, many different articulatory adjustments might underlie this distinction, and these are language-specific.

    In the current project, we address this issue by analysing high tense and lax vowel pairs of the type bieten - bitten ("to offer" - "to request"), Hüte - Hütte ("hats" - "hut"), and Buße - Busse ("penance" - "busses") in two related language varieties: Standard Austrian German (SAG) and Standard German German (SGG). Previous studies suggest that high lax vowels, as in bitten, Hütte, or Busse, tend to approximate their respective tense cognates in bieten, Hüte, and Buße.

    The research questions were investigated by a) comparing the tense and lax vowel pairs in SAG and SGG, b) investigating whether high lax vowels approximate their tense cognates in SAG, c) investigating whether the high vowel pairs in SAG are distinguished by quality, by quantity, or by quantity relations with the following consonant, and d) investigating whether an ongoing sound change can be observed in SAG, with young SAG speakers exhibiting a stronger tendency to merge the vowels than old SAG speakers.

    Main Results:

    SGG speakers clearly distinguish the high vowel pairs by quality, whereas speaker-specific strategies can be observed in SAG: some speakers distinguish high tense and lax vowel pairs by quality, others merge the quality contrast but restrict the merger to velar contexts only, and still others merge high tense and lax vowels altogether. In case of distinction, the differences between high tense and high lax vowels are less pronounced in SAG than in SGG, and still less pronounced in the speech of young SAG speakers as compared to old SAG speakers. The same result was observed for quantity distinctions: all speakers differentiate the high vowel pairs by quantity, meaning that the tense vowels of the type bieten, Hüte, and Buße are longer than their respective lax cognates. Again, the differences are most pronounced in SGG and least pronounced in the speech of the young SAG speakers, meaning that the tense vowels of the type bieten, Hüte, and Buße are truncated in the speech of young SAG speakers as compared to old SAG speakers and SGG speakers. Results on the quantity interactions of vowel + consonant sequences confirm the role of quantity relations in SAG. Again, some age-specific differences emerged: overall, young SAG speakers have shorter durations than old SAG speakers, but they maintain the timing relations observed for the old SAG speakers. Results on perception strongly suggest that SAG listeners make use of quantity cues to distinguish the vowel pairs, whereas SGG listeners rather rely on cues connected with quality. Generally, it can be concluded that quantity distinctions are more relevant in SAG than in SGG.

    Project Related Publications:

    Harrington, Jonathan, Hoole, Philip, & Reubold, Ulrich. (2012). A physiological analysis of high front, tense-lax vowel pairs in Standard Austrian and Standard German. Italian Journal of Linguistics, 24, 158-183.

    Brandstätter, Julia, & Moosmüller, Sylvia. (in print). Neutralisierung der hohen Vokale in der Wiener Standardsprache – A sound change in progress? In M. Glauninger & A. Lenz (Eds.), Standarddeutsch in Österreich – Theoretische und empirische Ansätze. Vienna: Vandenhoeck & Ruprecht.

    Brandstätter, Julia, Kaseß, Christian H., & Moosmüller, Sylvia. (accepted). Quality and quantity in high vowels in Standard Austrian German. In A. Leemann, M.-J. Kolly & V. Dellwo (Eds.), Trends in phonetics and phonology in German-speaking Europe. Zurich: Peter Lang.

    Cunha, Conceição, Harrington, Jonathan, Moosmüller, Sylvia, & Brandstätter, Julia. (accepted). The influence of consonantal context on the tense-lax contrast in two standard varieties of German. In A. Leemann, M.-J. Kolly & V. Dellwo (Eds.), Trends in phonetics and phonology in German-speaking Europe. Zurich: Peter Lang.

    Moosmüller, Sylvia. (in print). Methodisches zur Bestimmung der Standardaussprache in Österreich. In M. Glauninger & A. Lenz (Eds.), Standarddeutsch in Österreich – Theoretische und empirische Ansätze. Vienna: Vandenhoeck & Ruprecht (= Wiener Arbeiten zur Linguistik).

    Moosmüller, Sylvia, & Brandstätter, Julia. (in print). Phonotactic information in the temporal organisation of Standard Austrian German and the Viennese dialect. Language Sciences.