Research Topics/Groups

This is the companion webpage of the manuscript:

Audlet Filter Banks: A Versatile Analysis/Synthesis Framework using Auditory Frequency Scales

Thibaud Necciari, Nicki Holighaus, Peter Balazs, Zdeněk Průša, Piotr Majdak, and Olivier Derrien.

Abstract: Many audio applications rely on filter banks (FBs) to analyze, process, and re-synthesize sounds. For these applications, an important property of the analysis-synthesis system is the reconstruction error; it has to be kept to a minimum to avoid audible artifacts. Other advantageous properties include stability and low redundancy. To exploit some aspects of human auditory perception in the signal chain, some applications rely on FBs that approximate the frequency analysis performed in the auditory periphery, the gammatone FB being a popular example. However, current gammatone FBs only allow partial reconstruction and stability at high redundancies. In this article, we construct an analysis-synthesis system for audio applications. The proposed system, named Audlet, is based on an oversampled FB with filters distributed on auditory frequency scales. It allows perfect reconstruction for a wide range of FB settings (e.g., the shape and density of filters), efficient FB design, and adaptable redundancy. In particular, we show how to construct a gammatone FB with perfect reconstruction. Experiments demonstrate performance improvements of the proposed gammatone FB when compared to current gammatone FBs in terms of reconstruction error and stability, especially at low redundancies. An application of the framework to audio source separation illustrates its utility for audio processing.
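To make the idea of "filters distributed on auditory frequency scales" concrete, the following is a minimal sketch of how center frequencies could be placed uniformly on the ERB-number scale. It is not the Audlet implementation; it assumes the Glasberg and Moore (1990) ERB formulas, and the function names and the `filters_per_erb` parameter are hypothetical choices made for illustration.

```python
import numpy as np

def hz_to_erb_number(f_hz):
    """Map frequency in Hz to the ERB-number scale (Glasberg and Moore, 1990)."""
    return 21.4 * np.log10(1.0 + 0.00437 * np.asarray(f_hz, dtype=float))

def erb_number_to_hz(erb):
    """Inverse of hz_to_erb_number."""
    return (10.0 ** (np.asarray(erb, dtype=float) / 21.4) - 1.0) / 0.00437

def erb_bandwidth(f_hz):
    """Equivalent rectangular bandwidth in Hz at frequency f_hz."""
    return 24.7 * (1.0 + 0.00437 * np.asarray(f_hz, dtype=float))

def erb_spaced_center_frequencies(f_min, f_max, filters_per_erb=1.0):
    """Center frequencies spaced uniformly on the ERB-number scale.

    filters_per_erb is a hypothetical density parameter: the number of
    filters placed per unit of the ERB-number scale.
    """
    e_min = hz_to_erb_number(f_min)
    e_max = hz_to_erb_number(f_max)
    n = int(np.floor((e_max - e_min) * filters_per_erb)) + 1
    erbs = e_min + np.arange(n) / filters_per_erb
    return erb_number_to_hz(erbs)

# Example: one filter per ERB between 100 Hz and 8 kHz.
fc = erb_spaced_center_frequencies(100.0, 8000.0, filters_per_erb=1.0)
bw = erb_bandwidth(fc)  # bandwidths grow with center frequency
```

With this spacing, filters are dense at low frequencies and sparse at high frequencies, which mirrors the frequency resolution of the auditory periphery more closely than the uniform spacing of an STFT.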

Sound examples for the source separation experiment: click on a system's acronym to hear the corresponding reconstruction.
Reference signals: original mixture -- target

Rt    β = 1                                β = 1/6                              1024-point STFT
1.1   trev_gfb  Audlet_gfb  Audlet_hann    trev_gfb  Audlet_gfb  Audlet_hann    STFT_hann
1.5   trev_gfb  Audlet_gfb  Audlet_hann    trev_gfb  Audlet_gfb  Audlet_hann    STFT_hann
4.0   trev_gfb  Audlet_gfb  Audlet_hann    trev_gfb  Audlet_gfb  Audlet_hann    STFT_hann

Machine Learning

Machine learning has become an integral part of our everyday lives over the last few years. Whether we use a smartphone, shop online, consume media, or drive a car, machine learning (ML) and, more generally, artificial intelligence (AI) support, influence, and analyze us in many situations. In particular, deep learning methods based on artificial neural networks are used in many areas.

ML and AI have also provided important stimuli in the sciences, and their influence is expected to spread to an even wider range of scientific disciplines.

This increases both the interest in a deeper, science-based understanding of ML methods and the need for scientists from various disciplines to develop a solid understanding of how such methods are applied and designed.

The Institute for Acoustic Research, which conducts application-oriented basic research in the field of acoustics, is rising to this challenge and has founded the Machine Learning research group.

The group examines the different aspects of machine learning and artificial intelligence, with a particular focus on potential applications in acoustics. The collaboration of scientists from different disciplines in the areas of ML and AI will not only enable the Institute for Acoustic Research to make pioneering progress in all areas of sound research, but will also make essential contributions to theoretical questions in the highly topical research field of artificial intelligence.



Millions of people use headphones every day to listen to music, watch movies, or communicate with others. Nevertheless, sounds presented via headphones are usually perceived inside the head rather than at their natural spatial position. This limitation is inherent and results in unrealistic listening situations.

When listening to a sound without headphones, the acoustic information of the sound source is modified by our head and our torso, an effect described by the head-related transfer functions (HRTFs). The shape of our ears contributes to that modification by filtering the sound depending on the source direction. But the ear shape is very listener-specific, as individual as a fingerprint, and thus HRTFs are very listener-specific as well. When listening to sounds via headphones, this listener-specific filtering is usually not available. One of the main reasons is the difficulty of acquiring a person's ear shape, and thus of calculating listener-specific HRTFs.
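As an illustration of what "listener-specific filtering" means in signal-processing terms, the sketch below renders a mono signal for headphones by convolving it with a pair of head-related impulse responses (HRIRs, the time-domain counterpart of HRTFs). It is a minimal sketch, not the method used in the project: the function name `render_binaural` is hypothetical, and the two HRIRs are toy placeholders (a pure delay and attenuation at the right ear), not measured or calculated data.

```python
import numpy as np

def render_binaural(mono, hrir_left, hrir_right):
    """Filter a mono signal with one HRIR per ear; returns an array of shape (2, N)."""
    if len(hrir_left) != len(hrir_right):
        raise ValueError("HRIRs for both ears must have the same length")
    left = np.convolve(mono, hrir_left)    # full linear convolution
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right], axis=0)

fs = 48000  # sampling rate in Hz (assumed)

# Toy HRIRs for a source on the listener's left (hypothetical, not measured):
# the left ear receives the sound directly, the right ear receives it
# 30 samples later (0.625 ms at 48 kHz) and attenuated by half.
hrir_left = np.zeros(64)
hrir_left[0] = 1.0
hrir_right = np.zeros(64)
hrir_right[30] = 0.5

# 100 ms test tone at 440 Hz.
t = np.arange(fs // 10) / fs
mono = np.sin(2 * np.pi * 440.0 * t)

binaural = render_binaural(mono, hrir_left, hrir_right)
```

In a realistic setting, the two impulse responses would be replaced by the listener-specific HRIRs for the desired source direction, which is exactly the data that is hard to obtain without an accurate ear geometry.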

Thus, in softpinna, we will develop new methods for better acquisition of listener-specific ear shapes. Specifically, we will investigate and improve so-called "non-rigid registration" (NRR) algorithms, applied to 3-D ear geometries calculated from 2-D photos of a person's ears. Better 3-D ear geometries will allow computer programs to calculate listener-specific HRTFs accurately, thus enabling future headphone systems to incorporate listener-specific HRTFs and present spatial sounds realistically. The new ear-shape acquisition method will vastly reduce the technical requirements for accurate calculation of listener-specific HRTFs.

This project is carried out in collaboration with Dreamwaves GmbH. It is supported by the Bridge Programme of the FFG.

Phonetics at the Österreichische Linguistiktagung

Since the 37th Österreichische Linguistiktagung (ÖLT, the Austrian Linguistics Conference) in 2009, the Institut für Schallforschung (Institute for Acoustic Research) has regularly contributed phonetics topics to the conference. Since 2013, a dedicated phonetics workshop has been held continuously as part of the ÖLT. This page collects information on the ÖLT phonetics workshops and makes it accessible, so that anyone interested can look up past events and find announcements of upcoming ones.

News

Next ÖLT: 46th Österreichische Linguistiktagung, 4-6 December 2020, Vienna

Call for the phonetics workshop, ÖLT 2020

Conference website, 46th ÖLT 2020

Previous conferences

Information on the phonetics workshops held at the ÖLT in previous years is listed here.

45th Österreichische Linguistiktagung 2019, Salzburg

"Phonetik in und über Österreich 2019" (Phonetics in and about Austria 2019)

Organization: Nicola Klingler, Hannah Leykum, Jan Luttenberger, Michael Pucher, Carolin Schmid (Institut für Schallforschung, Österreichische Akademie der Wissenschaften) and Johanna Fanta-Jende (Institut für Germanistik, Universität Wien)

Call for Abstracts, 45th ÖLT 2019

Programme, 45th ÖLT 2019, Friday, 6 December 2019

Programme, 45th ÖLT 2019, Saturday, 7 December 2019

44th Österreichische Linguistiktagung 2018, Innsbruck

"Phonetik und Sprachtechnologie" (Phonetics and Speech Technology)

Organization: Nicola Klingler, Hannah Leykum, and Michael Pucher (Institut für Schallforschung, Österreichische Akademie der Wissenschaften)

Call for Abstracts, 44th ÖLT 2018

43rd Österreichische Linguistiktagung 2017, Klagenfurt

"Phonetik in und über Österreich" (Phonetics in and about Austria)

Chair: Sylvia Moosmüller† (Institut für Schallforschung, Österreichische Akademie der Wissenschaften)
Coordination: Michaela Rausch-Supola (Institut für Schallforschung, Österreichische Akademie der Wissenschaften)

Call for Abstracts, 43rd ÖLT 2017

42nd Österreichische Linguistiktagung 2016, Graz

"Phonetik & Phonologie" (Phonetics & Phonology)
Organization: Dina El Zarka, Petra Hödl, Ralf Vollmann (Universität Graz)

Programme, 42nd ÖLT 2016

41st Österreichische Linguistiktagung 2014, Vienna

"Phonetik in und über Österreich" (Phonetics in and about Austria)
Chair: Sylvia Moosmüller† and Carolin Schmid (Institut für Schallforschung, Österreichische Akademie der Wissenschaften)

Call for Abstracts, 41st ÖLT 2014
Workshop programme, 41st ÖLT 2014

40th Österreichische Linguistiktagung 2013, Salzburg

"Arbeitsgemeinschaft Soziophonetik" (Sociophonetics Working Group)
Chair: Manfred Sellner (Universität Salzburg)

Programme, 40th ÖLT 2013

Computational Hearing and Psychoacoustics investigates several areas which rely on human hearing:

  • Psychoacoustics (proper): concerned with the perception of sound in general. Main topics include pitch, timbre, loudness, and temporal aspects of sounds.
  • Noise Abatement: investigates the acoustic and psychoacoustic description of unwanted sounds and supports the specification of methods for reducing noise, from whatever source (Sound Quality Design).
  • Speech and Language Processing: deals with the acoustic aspects of phonetics and linguistics.
  • Comparative and Systematic Musicology: application of psychoacoustic models in the acoustic analysis of music and the human perception thereof.


Publications: W. Deutsch

Current projects