POTION: Perceptual optimization of time-frequency audio representations and coding

Principal investigators: Thibaud Necciari, Piotr Majdak
Co-applicants: Bernhard Laback, Peter Balazs
Collaborators: Olivier Derrien, Richard Kronland-Martinet, Sølvi Ystad (Laboratoire de Mécanique et d'Acoustique, CNRS, France)

Tentative start of the Austrian project part: 1st March 2014

Project goals: The fundamental research in POTION aims to develop new methods for the representation and interpretation of audio signals. More specifically, we propose to develop an efficient, perfectly invertible, possibly non-redundant, and perceptually optimized time-frequency representation, i.e., one that displays only the audible components of sound signals. To our knowledge, no such representation is currently available. The originality of POTION lies in combining the mathematical theory of time-frequency representations and its application to signal processing with psychoacoustical data on auditory time-frequency masking. The technical objective of POTION is to develop a new perceptual audio codec that combines new results on non-stationary time-frequency transforms and on time-frequency masking. This codec will constitute the “end-product” of the project. While current audio codecs are mainly based on a frequency approach, the codec created in POTION will take a joint time-frequency approach. It is therefore expected to achieve higher compression ratios than current codecs at the same perceived audio quality.