Research Topics/Groups

This is the companion webpage for the manuscript:

Audlet Filter Banks: A Versatile Analysis/Synthesis Framework using Auditory Frequency Scales

Thibaud Necciari, Nicki Holighaus, Peter Balazs, Zdeněk Průša, Piotr Majdak, and Olivier Derrien.

Abstract: Many audio applications rely on filter banks (FBs) to analyze, process, and re-synthesize sounds. For these applications, an important property of the analysis-synthesis system is the reconstruction error; it has to be kept to a minimum to avoid audible artifacts. Other advantageous properties include stability and low redundancy. To exploit some aspects of human auditory perception in the signal chain, some applications rely on FBs that approximate the frequency analysis performed in the auditory periphery, the gammatone FB being a popular example. However, current gammatone FBs only allow partial reconstruction and stability at high redundancies. In this article, we construct an analysis-synthesis system for audio applications. The proposed system, named Audlet, is based on an oversampled FB with filters distributed on auditory frequency scales. It allows perfect reconstruction for a wide range of FB settings (e.g., the shape and density of filters), efficient FB design, and adaptable redundancy. In particular, we show how to construct a gammatone FB with perfect reconstruction. Experiments demonstrate performance improvements of the proposed gammatone FB when compared to current gammatone FBs in terms of reconstruction error and stability, especially at low redundancies. An application of the framework to audio source separation illustrates its utility for audio processing.
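To give a flavor of how a perfect-reconstruction auditory filter bank can be built, the sketch below implements the simplest (non-subsampled, "painless") case: raised-cosine filters spaced uniformly on the ERB-number scale of Glasberg and Moore (1990), with synthesis through the canonical dual filters. This is a minimal illustration under our own assumptions, not the paper's construction; the Audlet framework additionally covers gammatone filter shapes, other auditory scales, and subsampled channels, and the authors provide implementations in the Large Time-Frequency Analysis Toolbox (LTFAT). All function names here are ours.

import numpy as np

def freq_to_erb(f):
    # ERB-number scale (Glasberg & Moore, 1990)
    return 21.4 * np.log10(1.0 + 0.00437 * f)

def erb_to_freq(e):
    # Inverse of the ERB-number scale
    return (10.0 ** (e / 21.4) - 1.0) / 0.00437

def erb_filterbank(n, fs, fmin=50.0, fmax=None, v=1.0):
    """Frequency responses of a non-subsampled filter bank with
    raised-cosine shapes on the ERB scale; v = filters per ERB."""
    fmax = fs / 2 if fmax is None else fmax
    e = np.arange(freq_to_erb(fmin), freq_to_erb(fmax), 1.0 / v)
    centers = erb_to_freq(e)
    fgrid = np.fft.rfftfreq(n, 1.0 / fs)
    egrid = freq_to_erb(fgrid)
    width = 1.5 / v  # half-width in ERB units; ensures neighbors overlap
    H = np.cos(np.pi / 2 * np.clip(np.abs(egrid - e[:, None]) / width, 0.0, 1.0)) ** 2
    H[0, fgrid <= centers[0]] = 1.0    # low-pass plateau covers DC..first center
    H[-1, fgrid >= centers[-1]] = 1.0  # high-pass plateau covers last center..Nyquist
    return H, centers

def analyze(x, H):
    # One time-domain band signal per row (no subsampling)
    return np.fft.irfft(H * np.fft.rfft(x), n=x.size)

def synthesize(C, H, n):
    # Canonical dual synthesis: divide by the sum of squared responses,
    # which is strictly positive because the filters cover the whole band
    num = np.sum(np.conj(H) * np.fft.rfft(C, n=n), axis=0)
    return np.fft.irfft(num / np.sum(np.abs(H) ** 2, axis=0), n=n)

fs, n = 16000, 4096
H, fc = erb_filterbank(n, fs, v=1.0)
x = np.random.default_rng(0).standard_normal(n)
err = np.max(np.abs(synthesize(analyze(x, H), H, n) - x))
print(f"{len(fc)} channels, max reconstruction error: {err:.2e}")

Without subsampling the redundancy of such a bank is high (one full-rate band per channel); the point of the Audlet framework is to retain perfect reconstruction when the channels are subsampled down to low redundancies.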

Sound examples for the source separation experiment: click on a system's acronym to hear the corresponding reconstruction.
Reference signals: original mixture -- target

Rt  | β = 1                             | β = 1/6                           | 1024-point STFT
1.1 | trev_gfb, Audlet_gfb, Audlet_hann | trev_gfb, Audlet_gfb, Audlet_hann | STFT_hann
1.5 | trev_gfb, Audlet_gfb, Audlet_hann | trev_gfb, Audlet_gfb, Audlet_hann | STFT_hann
4.0 | trev_gfb, Audlet_gfb, Audlet_hann | trev_gfb, Audlet_gfb, Audlet_hann | STFT_hann

AABBA is an open group of scientists collaborating on the development and application of models of human spatial hearing.

AABBA's goal is to promote exploration and development of binaural and spatial models and their applications.

AABBA members are academic scientists willing to participate in our activities. We meet annually for open discussions and progress presentations, and we especially encourage members to bring students and young scientists associated with their projects to our meetings. Our activities consolidate in joint publications and special sessions at international conferences. As a tangible outcome, we provide validated source code for published models of binaural and spatial hearing in our collection of auditory models, the Auditory Modeling Toolbox (AMT).

Structure

  • Executive board: Piotr Majdak, Armin Kohlrausch, Ville Pulkki

  • Members:

    • Aachen: Janina Fels, ITA, RWTH Aachen
    • Bochum: Dorothea Kolossa & Jens Blauert, Ruhr-Universität Bochum
    • Cardiff: John Culling, School of Psychology, Cardiff University
    • Copenhagen: Torsten Dau & Tobias May, DTU, Lyngby
    • Dresden: Ercan Altinsoy, TU Dresden
    • Ghent: Sarah Verhulst, Ghent University
    • Guangzhou: Bosun Xie, South China University of Technology, Guangzhou
    • Helsinki: Ville Pulkki & Nelli Salminen, Aalto University
    • Ilmenau: Alexander Raake, TU Ilmenau
    • Kosice: Norbert Kopčo, Safarik University, Košice
    • London: Lorenzo Picinali, Imperial College, London
    • Lyon: Mathieu Lavandier, Université de Lyon
    • Munich I: Werner Hemmert, TUM München
    • Munich II: Bernhard Seeber, TUM München 
    • Oldenburg I: Bernd Meyer, Carl von Ossietzky Universität Oldenburg
    • Oldenburg II: Mathias Dietz, Carl von Ossietzky Universität Oldenburg
    • Oldenburg-Eindhoven: Steven van de Par & Armin Kohlrausch, Universität Oldenburg
    • Paris: Brian Katz, Sorbonne Université
    • Patras: John Mourjopoulos, University of Patras
    • Rostock: Sascha Spors, Universität Rostock
    • Sheffield: Guy Brown, The University of Sheffield
    • Tabriz: Masoud Geravanchizadeh, University of Tabriz
    • Toulouse: Patrick Danès, Université de Toulouse
    • Troy: Jonas Braasch, Rensselaer Polytechnic Institute, Troy
    • Vienna: Bernhard Laback & Robert Baumgartner, Austrian Academy of Sciences, Wien
    • The AMT (Umbrella Project): Piotr Majdak
Group photo: AABBA group as of the 11th meeting (2019, Vienna).

Meetings

Annual meetings are held at the beginning of each year:

  • 12th meeting: 16-17 January 2020, Vienna
  • 11th meeting: 19-20 February 2019, Vienna. Schedule.
  • 10th meeting: 30-31 January 2018, Vienna. Schedule. Group photo
  • 9th meeting: 27-28 February 2017, Vienna. Schedule.
  • 8th meeting: 21-22 January 2016, Vienna. Schedule.
  • 7th meeting: 22-23 February 2015, Berlin.
  • 6th meeting: 17-18 February 2014, Berlin.
  • 5th meeting: 24-25 January 2013, Berlin.
  • 4th meeting: 19-20 January 2012, Berlin.
  • 3rd meeting: 13-14 January 2011, Berlin.
  • 2nd meeting: 29-30 September 2009, Bochum.
  • 1st meeting: 23-26 March 2009, Rotterdam.

Activities

  • Upcoming: Structured Session "Binaural models: development and applications" at the Forum Acusticum 2020, Lyon.
  • Special Session "Binaural models: development and applications" at the ICA 2019, Aachen.
  • Special Session "Models and reproducible research" at the Acoustics'17 (EAA/ASA) 2017, Boston.
  • Structured Session "The Technology of Binaural Listening & Understanding" at the ICA 2016, Buenos Aires.
  • Structured Session "Applied Binaural Signal Processing" at the Forum Acusticum 2014, Kraków.

Contact person: Piotr Majdak

Machine Learning

Machine learning has become an integral part of our everyday lives over the last few years. Whether we use a smartphone, shop online, consume media, or drive a car, machine learning (ML) and, more generally, artificial intelligence (AI) support, influence, and analyze us in many situations. In particular, deep learning methods based on artificial neural networks are now used in many areas.

ML and AI have also provided important impetus in the sciences, and this influence is expected to spread to an even wider range of scientific disciplines in the future.

This increases both the interest in a deeper, science-based understanding of ML methods and the need for scientists of various disciplines to develop a solid command of the application and design of such methods.

The Institute for Acoustic Research, which conducts application-oriented basic research in the field of acoustics, is rising to this challenge and has founded the Machine Learning research group.

The group sheds light on different aspects of machine learning and artificial intelligence, with a particular focus on potential applications in acoustics. By bringing together scientists from different disciplines working on ML and AI, it will not only enable the Institute for Acoustic Research to make pioneering progress in all areas of sound research, but also contribute to theoretical questions in the rapidly evolving field of artificial intelligence.


Born2Hear

The auditory system constantly monitors the environment to protect us from harmful events such as collisions with approaching objects. Auditory looming bias is an astoundingly fast perceptual bias favoring approaching over receding auditory motion; it has been demonstrated behaviorally even in four-month-old infants. The role of learning in the development of this perceptual bias, and its underlying mechanisms, have yet to be investigated.

Supervised learning and statistical learning are two distinct mechanisms enabling neural plasticity. In the auditory system, statistical learning refers to the implicit ability to extract and represent regularities, such as frequently occurring sound patterns or frequent acoustic transitions, with or without attention, whereas supervised learning refers to the ability to attentively encode auditory events based on explicit feedback. It is currently unclear how these two mechanisms are involved in learning auditory spatial cues at different stages of life. While newborns already possess basic spatial-hearing skills, adults can still adapt to changing circumstances such as modifications of spectral-shape cues.

Spectral-shape cues arise naturally when the complex geometry of the listener's body, especially the pinna, shapes the spectrum of an incoming sound depending on its source location. Auditory stimuli lacking familiar spectral-shape cues are often perceived as originating from inside the head rather than as external sound sources. Changes in the salience or familiarity of spectral-shape cues can thus be used to elicit auditory looming bias. The importance of spectral-shape cues for both auditory looming bias and auditory plasticity makes them ideal for studying the two phenomena together.
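To make the cue manipulation concrete, here is a hedged sketch of one common way to vary the salience of spectral-shape cues: scaling the deviation of a directional magnitude spectrum (in dB) around its mean by a contrast factor. This mirrors, in spirit, spectral-contrast manipulations used in looming-bias experiments; the function name, data, and parameter values are purely illustrative assumptions, not the project's actual methods.

import numpy as np

def scale_spectral_contrast(mag_db, c):
    """Scale the deviation of a directional magnitude spectrum (in dB)
    around its mean by the contrast factor c: c = 1 keeps the natural
    spectral-shape cues, c = 0 removes them entirely."""
    mean = np.mean(mag_db)
    return mean + c * (mag_db - mean)

# Illustrative (randomly generated) directional magnitude spectrum in dB:
rng = np.random.default_rng(1)
dtf_db = 10.0 * rng.standard_normal(64)
weakened = scale_spectral_contrast(dtf_db, 0.5)   # cues weakened
flat = scale_spectral_contrast(dtf_db, 0.0)       # cues removed
print(np.ptp(dtf_db), np.ptp(weakened), np.ptp(flat))  # spectral range shrinks with c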

Figure: Born2Hear project overview.

Born2Hear will combine auditory psychophysics and neurophysiological measures in order to 1) identify the auditory cognitive subsystems underlying auditory looming bias, 2) investigate the principal cortical mechanisms for statistical and supervised learning of auditory spatial cues, and 3) reveal cognitive and neural mechanisms of auditory plasticity across the human lifespan. These general research questions will be addressed in three studies. Study 1 will investigate differences in the bottom-up processing of different spatial cues and top-down attention effects on auditory looming bias by analyzing functional interactions between brain regions in young adults, and will then test in newborns whether these functional interactions are innate. Study 2 will investigate the cognitive and neural mechanisms of supervised learning of spectral-shape cues in young and older adults, based on individualized perceptual training in sound source localization. Study 3 will focus on the cognitive and neural mechanisms of statistical learning of spectral-shape cues in infants as well as in young and older adults.

Project investigator (PI): Robert Baumgartner

Project partner / Co-PI: Brigitta Tóth, Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, Hungary

Supported by the Austrian Science Fund (FWF, I 4294-B) and the Hungarian NKFIH.

Computational Hearing and Psychoacoustics investigates several areas that rely on human hearing (a small worked example follows the list):

  • Psychoacoustics (proper): the perception of sound in general. Main topics include pitch, timbre, loudness, and temporal aspects of sounds.
  • Noise Abatement: the acoustic and psychoacoustic description of unwanted sounds, supporting the specification of methods for reducing noise from whatever source (Sound Quality Design).
  • Speech and Language Processing: the acoustic aspects of phonetics and linguistics.
  • Comparative and Systematic Musicology: the application of psychoacoustic models in the acoustic analysis of music and the human perception thereof.
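As a small illustration of the kind of quantity loudness models deal with, here is a sketch of the textbook phon-to-sone conversion: perceived loudness roughly doubles for every 10-phon increase above 40 phon. The exponent used for the low-level branch is a common approximation; this is illustrative only, not the group's model code.

import numpy as np

def phon_to_sone(l_phon):
    """Textbook phon-to-sone conversion: loudness doubles per 10-phon
    increase above 40 phon; below 40 phon a power-law approximation
    (exponent ~2.642) is commonly used."""
    l = np.asarray(l_phon, dtype=float)
    return np.where(l >= 40, 2.0 ** ((l - 40.0) / 10.0), (l / 40.0) ** 2.642)

print(phon_to_sone([40, 50, 60, 70]))  # -> [1. 2. 4. 8.] sone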


Publications: W. Deutsch

Current projects