Project

  • Objective:

    The Multiple Exponential Sweep Method (MESM) is an optimized method for the semi-simultaneous system identification of multiple systems. It uses an appropriate overlapping of the excitation signals. This leads to a faster identification of weakly nonlinear systems when only the linear impulse response is to be retrieved. Using a Gabor multiplier in the post-processing procedure of the system identification may reduce the measurement noise and thus further improve the signal-to-noise ratio of the measured data.

    Method:

    A Gabor multiplier is used to cut the parts of interest out of the measured signals in the time-frequency plane. This allows a specific optimization of signal parts, independent of the frequency. Initial tests applying a Gabor multiplier to simulated data showed that the depth of spectral notches could be increased. A systematic investigation of this method is the main goal of this project.

    Application:

    This method may improve the signal-to-noise ratio in system identification tasks of any weakly nonlinear system, such as those involving acoustic measurements with electric equipment.

    Publications:

    • P. Majdak, P. Balazs, B. Laback, "Multiple Exponential Sweep Method for Fast Measurement of Head Related Transfer Functions", Journal of the Audio Engineering Society, Vol. 55, No. 7/8, July/August 2007, Pages 623-637

    Project-completion:

    This project ended on 28.02.2008 and is incorporated into the 'High Potential'-Project of the WWTF, MULAC (q.v.).

  • Objective:

    In cooperation with National Instruments, an implementation of MPEG4 features in the software package DIADEM is planned.

    Method:

    The applicability of MPEG4 features to noise has been demonstrated. In preparation for the project, additional features were implemented in STx. The implementation in DIADEM is planned for the future.

    Application:

    DIADEM is a database that allows for a rapid search of measurement recordings. New search indexes can be generated based on the MPEG4 features of the recordings.

  • Basic Description:

    Time-variant filters are gaining importance in today's signal processing applications. Gabor multipliers in particular are popular in current scientific investigations. These multipliers are a specialization of Bessel multipliers to Gabor frames. These operators are interesting in regard to both theory and application:

    Theory of Multipliers:

    • Bessel and Frame Multipliers in Banach Spaces: In this project, the concept of frame multipliers should be generalized to work with Banach spaces.
    • Theory of Wavelet Multipliers: The concept of multipliers can be easily extended to wavelet frames. The influence of the special structures of these sequences will be investigated.
    • Basic Properties of Irregular Gabor Multipliers: Here multipliers for Gabor frames on irregular lattices are investigated.

    Application of Multipliers:

    • Time Frequency Masking: Gabor Multiplier Models and Evaluation: The symbol for the Gabor multiplier is calculated adaptively and the resulting model incorporates both time and frequency masking components. The goal is to obtain an algorithm using 2-D convolution.
    • Improving the Multiple Exponential Sweep Method (MESM) using Gabor Multipliers: The MESM is an efficient system identification method. Initial tests have shown that this method can be improved with a Gabor multiplier applied as a mask for the original sweep.
    • Wavelet Multipliers and Their Application to Reflection Measurements: One method of calculating the absorption coefficient of a soundproof wall requires separating the impulse responses of different reflections. These can be easily separated in a scalogram and extracted using a wavelet multiplier.
    • Mathematical Foundation of the Irrelevance Model: In this project, the theoretical foundation of the irrelevance algorithms implemented in STx is being developed.

    Partners:

    • H.G. Feichtinger, K. Gröchenig et al., NuHAG, Faculty of Mathematics, University of Vienna
    • R. Kronland-Martinet, S. Ystad, T. Necciari, Modélisation, Synthèse et Contrôle des Signaux Sonores et Musicaux of the LMA / CNRS Marseille
    • S. Meunier, S. Savel, Acoustique perceptive et qualité de l’environnement sonore of the LMA / CNRS Marseille

    Publications:

    • P. Balazs, B. Laback, G. Eckel, W. Deutsch, "Introducing Time-Frequency Sparsity by Removing Perceptually Irrelevant Components Using a Simple Model of Simultaneous Masking", IEEE Transactions on Audio, Speech and Language Processing, Vol. 17 (7), in press (2009), preprint
    • P. Majdak, P. Balazs, B. Laback, "Multiple Exponential Sweep Method for Fast Measurement of Head Related Transfer Functions", Journal of the Audio Engineering Society, Vol. 55, No. 7/8, July/August 2007, Pages 623-637

    Project-completion:

    This project ended on 01.01.2010; most subprojects ended on 28.02.2008 and are incorporated into the 'High Potential'-Project of the WWTF, MULAC.

  • Basic Description:

    Signal processing has entered today's life on a broad scale, from mobile phones, UMTS, xDSL, and digital television to scientific research such as psychoacoustic modeling, acoustic measurements, and hearing prostheses. Such applications often use time-invariant filters by applying the Fourier transform to calculate the complex spectrum. The spectrum is then multiplied by a function, the so-called transfer function. Such an operator can therefore be called a Fourier multiplier. Real-life signals are seldom stationary. Quasi-stationarity and fast time variance characterize the majority of speech signals, transients in music, and environmental sounds, and therefore imply the need for non-stationary system models. Considerable progress can be achieved by reaching beyond traditional Fourier techniques and improving current time-variant filter concepts through application of the basic mathematical concepts of frame multipliers.
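
    As a minimal illustration of the Fourier multiplier described above, the following Python sketch filters a signal by multiplying its spectrum with a transfer function. The test signal, the cutoff at 10 bins, and the name `fourier_multiplier` are assumptions made for this example only.

```python
import numpy as np

def fourier_multiplier(signal, transfer_function):
    """Time-invariant filter: FFT, pointwise multiplication, inverse FFT."""
    spectrum = np.fft.fft(signal)
    return np.fft.ifft(spectrum * transfer_function)

# Hypothetical test signal: a slow and a fast sinusoid on 256 samples.
n = 256
t = np.arange(n)
low = np.sin(2 * np.pi * 4 * t / n)
high = np.sin(2 * np.pi * 60 * t / n)

# Ideal low-pass transfer function keeping bins |k| <= 10 (arbitrary choice).
k = np.fft.fftfreq(n, d=1.0 / n)      # signed integer bin indices
lowpass = (np.abs(k) <= 10).astype(float)

# The transfer function is symmetric in k, so the output is real up to rounding.
filtered = fourier_multiplier(low + high, lowpass).real
```

    Because the transfer function is fixed, the same weighting is applied at every instant; a filter that should act differently at different times needs a time-frequency representation instead.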

    Several transforms, such as the Gabor transform (the sampled version of the Short-Time Fourier Transform), the wavelet transform, and the Bark, Mel, and Gammatone filter banks, are already in use in a large number of signal processing applications. These techniques can be generalized via mathematical frame theory. The advantage of introducing frame theory lies particularly in the interpretability of filter and analysis coefficients in terms of frequency and time localization, as opposed to techniques based on orthonormal bases.

    One possibility to construct time-variant filters exists through the use of Gabor multipliers. For these operators the result of a Gabor transform is multiplied by a given function, called the time-frequency mask or symbol, followed by re-synthesis. These operators are already used implicitly in engineering applications, and have been investigated as Gabor filters in the fields of mathematics and signal processing theory. If alternative transforms are used, the concept of multipliers can be extended appropriately. So, for example, the concept of wavelet multipliers could be investigated for a wavelet transform.
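
    The three steps named above (Gabor transform, multiplication by the symbol, re-synthesis) can be sketched in Python as follows. This is an illustrative sketch, not the implementation used in the project: it assumes a periodic signal whose length is a multiple of the hop size and a window no longer than the number of frequency channels (the so-called painless case), in which the dual window is obtained by a pointwise division. All function names and parameter values are hypothetical.

```python
import numpy as np

def frame_indices(L, Lg, a):
    """Sample positions (n + k*a) mod L for every time shift k (periodic signal)."""
    return (np.arange(Lg)[None, :] + a * np.arange(L // a)[:, None]) % L

def gabor_analysis(f, g, a, M):
    """Gabor coefficients c[m, k]: windowed frames followed by an M-point FFT."""
    idx = frame_indices(len(f), len(g), a)
    return np.fft.fft(f[idx] * g[None, :], n=M, axis=1).T

def dual_window(g, a, L):
    """Dual window in the painless case: g divided by the sum of its squared shifts.
    The 1/M factor of the frame operator is absorbed by np.fft.ifft below."""
    idx = frame_indices(L, len(g), a)
    weight = np.zeros(L)
    np.add.at(weight, idx, np.tile(g ** 2, (idx.shape[0], 1)))
    return g / weight[:len(g)]      # weight is periodic with period a

def gabor_synthesis(c, gamma, a, L):
    """Overlap-add re-synthesis with the synthesis window gamma."""
    idx = frame_indices(L, len(gamma), a)
    frames = np.fft.ifft(c.T, axis=1)[:, :len(gamma)] * gamma[None, :]
    out = np.zeros(L, dtype=complex)
    np.add.at(out, idx, frames)
    return out

def gabor_multiplier(f, symbol, g, a, M):
    """Analysis, pointwise multiplication by the symbol, re-synthesis."""
    c = gabor_analysis(f, g, a, M)
    return gabor_synthesis(symbol * c, dual_window(g, a, len(f)), a, len(f))

# Hypothetical parameters: signal length, window length, hop size, channels.
L, Lg, a, M = 512, 64, 16, 64
g = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(Lg) / Lg)   # periodic Hann window
f = np.random.default_rng(0).standard_normal(L)

# A symbol of ones leaves the signal unchanged.
ones = np.ones((M, L // a))
identity_out = gabor_multiplier(f, ones, g, a, M).real

# A symbol that is zero for the second half of the time shifts acts as a
# time-variant filter: it removes the later part of the signal.
symbol = ones.copy()
symbol[:, (L // a) // 2:] = 0.0
masked = gabor_multiplier(f, symbol, g, a, M).real
```

    A symbol that varies along the frequency axis as well gives a filter whose frequency response changes over time, which is the property that a Fourier multiplier cannot offer.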

    Different kinds of applications call for different frames. Multipliers can be generalized to the abstract level of frames without any further structure. This concept will be further investigated in this project. Its feasibility will be evaluated in acoustic applications using special cases of Gabor and wavelet systems.

    The project goal is to study both the mathematical theory of frame multipliers and their application to selected problems in acoustics. The project is divided into the following subprojects:

    Theory of Multipliers:

    1. General Frame Multiplier Theory
    2. Analytic and Numeric Properties of Gabor Multipliers
    3. Analytic and Numeric Properties of Wavelet Multipliers

    Application of Multipliers:

    1. Mathematical Modeling of Auditory Time-Frequency Masking Functions
    2. Improvement of Head-Related Transfer Function Measurements
    3. Advanced Method of Sound Absorption Measurements

    Partners:

    • H.G. Feichtinger et al., NuHAG, Faculty of Mathematics, University of Vienna
    • R. Kronland-Martinet et al., Modélisation, Synthèse et Contrôle des Signaux Sonores et Musicaux of the LMA / CNRS Marseille
    • B. Torrésani et al., LATP Université de Provence / CNRS Marseille
    • J.P. Antoine et al., FYMA Université Catholique de Louvain

    Publications:

    • P. Balazs, J.-P. Antoine, A. Gryboś, "Weighted and Controlled Frames: Mutual relationship and first Numerical Properties", accepted for publication in International Journal of Wavelets, Multiresolution and Information Processing (2009), preprint
    • P. Balazs, “Matrix Representation of Bounded Linear Operators By Bessel Sequences, Frames and Riesz Sequence”, SampTA'09, 8th International Conference on Sampling and Applications, May 2009, Marseille, France
    • A. Rahimi, P. Balazs, "Multipliers for p-Bessel sequences in Banach spaces", submitted (2009)
    • D. Stoeva, P. Balazs, "Unconditional convergence and Invertibility of Multipliers", preprint (2009)
    • Monika Dörfler and Bruno Torrésani, “Representation of operators in the time-frequency domain and generalized Gabor multipliers”, J. Fourier Anal. Appl., 2009 (in press)
    • Yohan Frutiger: "Multiplicateurs de Gabor pour les transformations sonores" (Gabor Multipliers for sound transformations) Master thesis under the supervision of R. Kronland-Martinet, June 2008
    • F. Jaillet, P. Balazs, M. Dörfler and N. Engelputzeder, “On the Structure of the Phase around the Zeros of the Short-Time Fourier Transform”, NAG/DAGA 2009, International Conference on Acoustics, March 2009, Rotterdam, The Netherlands
    • F. Jaillet, P. Balazs and M. Dörfler, “Nonstationary Gabor Frames”, SampTA'09, 8th International Conference on Sampling and Applications, May 2009, Marseille, France
    • P. Balazs, B. Laback, G. Eckel, W. Deutsch, "Introducing Time-Frequency Sparsity by Removing Perceptually Irrelevant Components Using a Simple Model of Simultaneous Masking", IEEE Transactions on Audio, Speech and Language Processing (2009), in press
    • B. Laback, P. Balazs, G. Toupin, T. Necciari, S. Savel, S. Meunier, S. Ystad and R. Kronland-Martinet, "Additivity of auditory masking using Gaussian-shaped tones", Acoustics'08, Paris, 29.06.-04.07.2008 (03.07.2008)
    • B. Laback, P. Balazs, T. Necciari, S. Savel, S. Ystad, S. Meunier and R. Kronland-Martinet, "Additivity of auditory masking for Gaussian-shaped tone pulses", preprint
    • Anaïk Olivero: "Expérimentation des multiplicateurs temps-échelle" (On the time-scale multipliers) Master thesis under the supervision of R. Kronland-Martinet and B. Torrésani, June 2008
  • Objective:

    The Multilevel Fast Multipole Method, when used in combination with the Boundary Element Method (BEM), is a tool to significantly speed up the simulation of large objects almost without loss in accuracy.

    Method:

    The Fast Multipole Method subdivides the Boundary Element mesh into different clusters. If two clusters are sufficiently far away from each other (i.e. they are in each other's far field), all calculations that would have to be made for every pair of nodes can be reduced to the midpoints of the clusters with almost no loss of accuracy. For clusters not in the far field, the traditional BEM has to be applied. The Multilevel Fast Multipole Method introduces different levels of clustering (clusters made out of smaller clusters) to additionally enhance computation speed.
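
    The far-field shortcut described above can be illustrated with a small Python sketch. It is a strong simplification and rests on several assumptions: the static kernel 1/(4πr) replaces the Helmholtz kernel used in acoustic BEM, only the lowest-order (midpoint-to-midpoint) term is kept, whereas an actual Fast Multipole Method uses higher-order expansions, and the cluster geometry and source strengths are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two hypothetical clusters of mesh nodes, each inside a unit cube, 50 units apart.
sources = rng.uniform(-0.5, 0.5, size=(200, 3))
targets = rng.uniform(-0.5, 0.5, size=(150, 3)) + np.array([50.0, 0.0, 0.0])
strengths = rng.uniform(0.5, 1.5, size=200)      # positive source strengths

def kernel(r):
    """Static free-space kernel (assumption: stands in for the Helmholtz kernel)."""
    return 1.0 / (4.0 * np.pi * r)

# Direct evaluation: one kernel call per (target, source) pair.
dist = np.linalg.norm(targets[:, None, :] - sources[None, :, :], axis=2)
direct = kernel(dist) @ strengths

# Far-field shortcut: a single kernel call between the two cluster midpoints.
d_mid = np.linalg.norm(targets.mean(axis=0) - sources.mean(axis=0))
approx = kernel(d_mid) * strengths.sum()

# Largest deviation of the shortcut from the direct sum, relative to its size.
rel_error = np.max(np.abs(direct - approx)) / np.max(np.abs(direct))
```

    The direct sum needs 150 x 200 kernel evaluations, the shortcut needs one. The error grows as the clusters move closer together relative to their size, which is why nearby clusters are still handled by the traditional BEM.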

    Application:

    The MLFMM is used for the simulation of head-related transfer functions. The diagram above compares the result of a classical BEM with the MLFMM.

  • This area is concerned with the analysis of the acoustics of music and with the human perception thereof.

    In close cooperation with em. o. Univ.-Prof. Dr. Franz Födermayr (Institute of Musicology, University of Vienna), historic recordings of Georgian multipart songs are analyzed and transcribed.

  • Objective:

    A major difficulty of ray tracing and the boundary element method is the fine grid needed in the high-frequency region.

    Method:

    By means of new alternative shape functions at the boundary, e.g. wavelets, it may be possible to define a grid on the boundary that is independent of the wave number.

  • Noise Abatement: investigates the acoustic and psychoacoustic description of unwanted sounds and supports the specification of methods for reducing noise, from whatever source (Sound Quality Design).

    Perceiving sound as noise is a subjective reaction to disturbing acoustic signals. The intensity, pitch, sharpness, variation and roughness as well as the subjective attitude and motivation all play a role in the perceived noisiness. Railway noise is the main obstacle when planning new high-speed tracks. The condition of the wheels and the track has a significant effect on the sound generation (see also: harmonisation).

    Literature:

    • NOIDESC: Deskriptoren von Lärmsignalen: Deutsch Werner A. & Waubke Holger (2004) .
    • Descriptoren für aircraft noise
    • Erschütterungen an Bahntrassen. Waubke Holger (2004).
    • Visualisierung von Bahnlärm (1996). AK08. in: Deutsch, Werner A. & Elisabeth Hilscher & Herta Spielmann (eds.): Tagungsband der Österreichischen Physikalischen Gesellschaft, Johannes Kepler, Universität Linz. Wien: Forschungsstelle für Schallforschung der Österreichischen Akademie der Wissenschaften, pp.27-29.
  • Objective and Methods:

    This study investigates the effect of the number of frequency channels on vertical-plane sound localization, especially front/back discrimination. This is important to determine how many of the basal-most channels/electrodes of a cochlear implant (CI) are needed to encode spectral localization cues. Normal hearing subjects listening to a CI simulation (the newly developed GET vocoder) will perform the experiment using the localization method developed in the subproject "Loca Methods". Learning effects will be studied by providing visual feedback.

    Results:

    Experiments are underway.

    Application:

    Knowing the number of channels required to encode spectral cues for localization in the vertical planes is an important step in the development of a 3-D localization strategy for CIs. 

    Funding:

    FWF (Austrian Science Fund): Project #P18401-B15

    Publications:

    • Goupell, M., Majdak, P., and Laback, B. (2010). Median-plane sound localization as a function of the number of spectral channels using a channel vocoder, J. Acoust. Soc. Am. 127, 990-1001.
  • Objective:

    During the current project of efficiently calculating a resynthesis window and an iterative scheme for a finite element method algorithm for vibrations in soils and liquids, it became apparent that block matrices are a powerful tool to find numerically efficient algorithms.

    Method:

    In this project, the focus should be the investigation of the numeric features of block matrices. How can this structure be used to calculate or approximate the inverse of a matrix or its norm? How can this be used to speed up iterative schemes?
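
    One standard way in which block structure helps with inversion is the Schur complement. The Python sketch below shows this textbook identity only as an illustration of the question raised above; it is not taken from the project, and the function name and the test matrix are assumptions.

```python
import numpy as np

def block_inverse(A, B, C, D):
    """Inverse of the block matrix [[A, B], [C, D]] via the Schur complement of A.
    Requires A and the Schur complement S = D - C A^{-1} B to be invertible."""
    A_inv = np.linalg.inv(A)
    S_inv = np.linalg.inv(D - C @ A_inv @ B)
    top_left = A_inv + A_inv @ B @ S_inv @ C @ A_inv
    top_right = -A_inv @ B @ S_inv
    bottom_left = -S_inv @ C @ A_inv
    return np.block([[top_left, top_right], [bottom_left, S_inv]])

# Hypothetical well-conditioned 6 x 6 test matrix, split into 3 x 3 blocks.
rng = np.random.default_rng(5)
full = rng.standard_normal((6, 6)) + 10.0 * np.eye(6)
A, B = full[:3, :3], full[:3, 3:]
C, D = full[3:, :3], full[3:, 3:]

inverse_from_blocks = block_inverse(A, B, C, D)
```

    Only two matrices of half the size are inverted. When one of the blocks has additional structure (for example, diagonal or circulant), its inverse is cheap, and the same identity can serve as a preconditioner in an iterative scheme.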

    Application:

    The results will be used for the two projects mentioned below:

    • double preconditioning for Gabor frames
    • vibrations in random layers
  • Basic Description:

    Practical experience quickly revealed that the concept of an orthonormal basis is not always useful. This led to the concept of frames. Models in physics and other application areas (for example sound vibration analysis) are mostly continuous models. Many continuous model problems can be formulated as operator theory problems, such as in differential or integral equations. Operators provide an opportunity to describe scientific models, and frames provide a way to discretize them.

    Physical models often use sequences that allow only numerically unstable re-synthesis. Such a sequence can be called an "unbounded frame". How this inversion can be regularized is being investigated. For many applications, a certain frame is very useful in describing the model. Therefore, it is also beneficial to use the same sequence to find a discretization of the operators involved.

    Subprojects:

    Frames in Finite Dimensional Spaces:

    In this project, the theory of frames in the finite discrete case is investigated further.

    Matrix Representation of Operators using Frames:

    The standard matrix description of operators using orthonormal bases is extended to the more general case of frames.
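
    In finite dimensions this extension can be sketched in a few lines of Python. The sketch assumes a real-valued random frame of 7 vectors in a 4-dimensional space and uses the canonical dual frame for reconstruction; the dimensions and variable names are chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 4, 7                                  # space dimension, number of frame elements

frame = rng.standard_normal((n, m))          # columns are the frame vectors psi_l
operator = rng.standard_normal((n, n))       # some linear operator O

# Matrix of O with respect to the frame: M[k, l] = <O psi_l, psi_k>.
M = frame.T @ operator @ frame

# Canonical dual frame: the inverse frame operator S^{-1} applied to each element.
S = frame @ frame.T
dual = np.linalg.solve(S, frame)

# The operator is recovered from its frame matrix by means of the dual frame.
recovered = dual @ M @ dual.T
```

    Unlike the orthonormal-basis case, the matrix `M` is 7 x 7 for an operator on a 4-dimensional space, so it is redundant and not every 7 x 7 matrix arises in this way.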

    Weighted and Controlled Frames:

    Weighted and controlled frames were introduced to speed up the inversion algorithm for the frame matrix of a wavelet frame. In this project, these kinds of frames are investigated further.

    Basic Properties of Unbounded Frames

    Irregular Frames of Translates:

    In this project, sequences of irregular shifts of a single function are investigated.

    Partners:

    • S. Heineken, Research Group on Real and Harmonic Analysis, University of Buenos Aires
    • J. P. Antoine, Unité de physique théorique et de physique mathématique – FYMA
    • M. El-Gebeily, Department of Mathematical Sciences, King Fahd University of Petroleum and Minerals, Saudi Arabia
  • Objective:

    The modeling step in speaker detection has an enormous influence on the classification task, because the quality of the model depends on the parameters chosen in this step. False classifications, false identifications, and false verifications can result from malformed speaker models. The initial model parameters have an influence on the final determined parameters of the speaker models. To obtain optimized speaker models, different initialization methods are explored.

    Method:

    Speaker models are represented as Gaussian Mixture Models (GMMs). These models are mixtures of multivariate distributions that are parameterized by the means and the covariance matrices of the distributions and the mixture weights. The parameters are estimated by the expectation maximization algorithm (EM algorithm), which maximizes the likelihood of the model. Initial model parameters have to be selected for this algorithm. Different initial parameters can cause the algorithm to converge to different local maxima. The effect of different initialization methods on the identification rate is analyzed.
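
    The role of the initial parameters can be made concrete with a small EM sketch in Python. It is deliberately reduced to one-dimensional data and two components, whereas speaker models use multivariate features; the synthetic data, the two initializations, and the function name `em_gmm_1d` are assumptions for illustration only.

```python
import numpy as np

def em_gmm_1d(x, means, variances, weights, n_iter=50):
    """EM for a 1-D Gaussian mixture; returns the parameters and the
    log-likelihood evaluated before each update."""
    means, variances, weights = (np.array(p, dtype=float)
                                 for p in (means, variances, weights))
    trace = []
    for _ in range(n_iter):
        # E-step: weighted component densities and responsibilities.
        dens = (weights * np.exp(-0.5 * (x[:, None] - means) ** 2 / variances)
                / np.sqrt(2.0 * np.pi * variances))
        total = dens.sum(axis=1, keepdims=True)
        trace.append(np.log(total).sum())
        resp = dens / total
        # M-step: re-estimate weights, means and variances.
        nk = resp.sum(axis=0)
        weights = nk / len(x)
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk
        variances = np.maximum(variances, 1e-6)   # guard against collapse
    return means, variances, weights, np.array(trace)

# Synthetic data drawn from two Gaussians (hypothetical stand-in for features).
rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 0.5, 200)])

# Two different initializations of the same model.
fit_a = em_gmm_1d(x, means=[-1.0, 1.0], variances=[1.0, 1.0], weights=[0.5, 0.5])
fit_b = em_gmm_1d(x, means=[0.0, 0.1], variances=[1.0, 1.0], weights=[0.5, 0.5])

final_loglik_a, final_loglik_b = fit_a[3][-1], fit_b[3][-1]
```

    Each EM step cannot decrease the likelihood, so the algorithm climbs to whichever maximum lies "uphill" from its starting point. Comparing the final log-likelihoods of several initializations is one simple way to detect that a run has stopped at a poorer local maximum.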

    Application:

    Optimized speaker models reflect the speech behavior of the speakers in an optimal way. The inter-speaker variability is maximized while the intra-speaker variability is minimized by avoidance of malformed speaker models. The usage of optimal initialization methods improves the robustness and the reliability of automatic speaker identification and verification systems.

  • Objective:

    Methods to predict the propagation of vibrations in soil are relatively undeveloped. Reasons for this include the complexity of the wave propagation in soil and the insufficient knowledge of material parameters. During this project a method was developed to simulate the propagation of vibrations that are caused by a load at the base of a tunnel.

    Method:

    When dealing with the model of a tunnel in a semi-infinite domain like soil, the boundary element method (BEM) seems to be an appropriate tool. Unfortunately, it cannot be applied directly to layered orthotropic media because of the lack of a closed form of the Green's function, which is essential for BEM. However, by transforming the whole system into the Fourier domain with respect to space and time, it is possible to numerically construct an approximation of this function on a predefined grid. With this approximation, the boundary integral equation that describes the propagation of waves caused by a vibrating load at the base of a tunnel can be solved.

    Application:

    Models that can help to predict the propagation of vibrations inside soil layers are of great interest in earthquake sciences or when constructing railway lines and tunnels.

  • Introduction

    Railway vehicles passing through tight curves can produce a high-pitched noise called curve squeal. Curve squeal is a very salient type of noise in the high-frequency range that can vary between a tonal narrow-band noise and a wide-band noise. The tonal noise is caused by lateral creepage on the top of the rail, which excites wheel vibrations at frequencies corresponding to the wheel modes. Wide-band noise, however, is caused by wheel flanges touching the rail.

    Aims

    The project PAAB aims at investigating the effect of such noises on perceived annoyance using a perception test. The resulting perceptual characterization of curve squeal should help to consider this type of noise more adequately in noise mapping.

    Methods

    Based on previous conventional large-scale emission measurements as well as new measurements at immission distances using a head-and-torso simulator, representative samples of curve squeal will be derived and used in a perception test. This will be supported by synthetic, well-defined curve squeal noise.

    PAAB is funded by the FFG (project 860523) and the Austrian Federal Railways (ÖBB). The project is carried out in cooperation with the Research Center of Railway Engineering, Traffic Economics and Ropeways, Institute of Transportation, Vienna University of Technology (project leader), Kirisits Engineering Consultants, and psiacoustic Umweltforschung und Engineering GmbH.

  • Effects of subthalamic stimulation on the speech characteristics of Parkinson's patients.

  • The PASS project, carried out in cooperation with the IEW of TU Wien and psiacoustic GmbH, deals with the psychoacoustic evaluation of noise. Building on the results of the RELSKG project, high and low noise barriers are simulated numerically using the 2.5-dimensional boundary element method (2.5 D). Comparison with measurements shows that the assumption of an incoherent line source, which the 2.5 D method makes possible, is required to reproduce the measurement results. In addition, rail web dampers are evaluated psychoacoustically on the basis of measurement data. The evaluation is carried out in two tests with 40 subjects. The first test compares the relative annoyance, and the second determines the thresholds for "more annoying" and "less annoying". The results showed that, at the same A-weighted level, freight trains are rated as less annoying than passenger trains, and that, at the same A-weighted level, the noise behind a noise barrier is perceived as slightly more annoying. The project started in 2013 and runs until the end of 2014.

  • Objective:

    This project investigated the perception of interaural intensity differences among cochlear implant (CI) listeners in relation to the spectral composition and the temporal structure of the signal.

    Method:

    The perception thresholds (just noticeable differences, JND) of CI listeners were examined using differently structured signals. The stimuli were applied directly to the clinical signal processing units, while the parameters of the ongoing stimulation were closely monitored.

    Results:

    JNDs of IIDs in CI listeners ranged from 1.5 to 2.5 dB for a detection level of 80 percent. The type of stimulus seems to have little bearing on detection performance, with the exception of a single type of signal, a pulse train with a frequency of 20 Hz. This means that the JNDs of CI listeners are only marginally higher than those of normal hearing listeners. CI listeners are sensitive to IIDs, and the JNDs correspond to a difference in arrival angle of 5 to 10 degrees. Since the JNDs are within the minimum amplitude step size transmitted by the CI system, a reduction of the step size in future systems seems advisable.

    Publication:

    • Laback, B., Pok, S. M., Baumgartner, W. D., Deutsch, W. A., and Schmid, K. (2004). “Sensitivity to interaural level and envelope time differences of two bilateral cochlear implant listeners using clinical sound processors,” Ear and Hearing 25, 5, 488-500.
  • Objective and Methods:

    This project cluster includes several studies on the perception of interaural time differences (ITD) in cochlear implant (CI), hearing impaired (HI), and normal hearing (NH) listeners. Studying different groups of listeners allows for identification of the factors that are most important to ITD perception. Furthermore, the comparison between the groups allows for the development of strategies to improve ITD sensitivity in CI and HI listeners.

    Subprojects:

    • FsGd: Effects of ITD in Ongoing, Onset, and Offset in Cochlear Implant Listeners
    • ITD Sync: Effects of interaural time difference in fine structure and envelope on lateral discrimination in electric hearing
    • ITD Jitter CI: Recovery from binaural adaptation with cochlear implants
    • ITD Jitter NH: Recovery from binaural adaptation in normal hearing
    • ITD Jitter HI: Recovery from binaural adaptation with sensorineural hearing impairment
    • ITD CF: Effect of center frequency and rate on the sensitivity to interaural delay in high-frequency click trains
    • IID-CI: Perception of Interaural Intensity Differences by Cochlear Implant Listeners

  • Objective:

    In signal processing, synthesis is important in addition to analysis. This is especially true for the modification of data. For the Short-Time Fourier Transform, the synthesis is often done using a simple overlap add (OLA), which is the sum of the outputs of the filter. Often the output is additionally re-weighted with the analysis window, as is done in the phase vocoder. It is often presumed that with standard windows this will give satisfactory results.

    From Gabor frame theory, the construction of dual synthesis windows is well known; where it is possible, it guarantees perfect reconstruction. However, this method is not often used in signal processing algorithms.

    Method:

    In this project, we will systematically investigate if and for which parameters the respective OLA synthesis with the original window gives good reconstruction. We will compare it to the reconstruction with the dual window, introducing and motivating it as perfect reconstruction overlap add (PROLA). We will show that this method is always preferable to others and that it can be calculated very efficiently.
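
    The comparison described above can be sketched in Python for the simplest case, in which the coefficients are not modified between analysis and synthesis, so that the Fourier transform and its inverse cancel and only the windowing and the overlap add remain. The sketch assumes a periodic signal whose length is a multiple of the hop size; the Gaussian window, the hop sizes, and the function names are arbitrary choices for illustration.

```python
import numpy as np

def ola_reconstruct(f, g, gamma, a):
    """Window every frame with g, re-weight with gamma, overlap-add (periodic signal)."""
    L, Lg = len(f), len(g)
    idx = (np.arange(Lg)[None, :] + a * np.arange(L // a)[:, None]) % L
    out = np.zeros(L)
    np.add.at(out, idx, f[idx] * (g * gamma)[None, :])
    return out

def dual_window(g, a):
    """Dual window for unmodified coefficients: g divided by the
    a-periodic sum of its squared shifts."""
    pos = np.arange(len(g)) % a
    weight = np.zeros(a)
    np.add.at(weight, pos, g ** 2)
    return g / weight[pos]

L, Lg = 512, 64
n = np.arange(Lg)
g = np.exp(-0.5 * ((n - (Lg - 1) / 2) / 10.0) ** 2)   # Gaussian window, arbitrary width
f = np.random.default_rng(4).standard_normal(L)

errors = {}
for a in (8, 32):                                      # small and large hop size
    plain = ola_reconstruct(f, g, g, a)                # OLA with the original window
    plain *= (f @ plain) / (plain @ plain)             # best global gain correction
    exact = ola_reconstruct(f, g, dual_window(g, a), a)
    errors[a] = (np.linalg.norm(plain - f) / np.linalg.norm(f),
                 np.linalg.norm(exact - f) / np.linalg.norm(f))
```

    Each entry of `errors` holds the relative reconstruction error of plain OLA and of the dual-window variant for one hop size. The dual window costs one pointwise division, computed once per window and hop size.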

    Application:

    This is currently being implemented in STx. There the phase vocoder will have the option to guarantee perfect reconstruction, either with dual or tight windows.

    Partners:

    Department of Mathematics, University of Wisconsin-Eau Claire

  • Introduction:

    As is customary for urban varieties, the varieties of Vienna are predominantly social varieties. Education and social background form the primary factors which define the language behaviour of the speakers.

    The Viennese dialect belongs to the Middle Bavarian dialect group. Around the turn of the century, a sound change arose which monophthongized the diphthongs /aɛ/ and /ɑɔ/ to /æ:/ and /ɒ:/ respectively. This sound change was completed around 1950. As a result of the Viennese monophthongization, the palatal constriction location became overloaded. As early as the thirties, Kranzmayer observed what he called the "e-confusion", i.e., people stopped distinguishing the /e/-vowels, so that "Segen" (blessing) and "sehen" (to see) became homophones: [se:ŋ].

    Method:

    Five female and five male speakers of the Viennese dialect were asked to name pictures, to read sentences, and to speak spontaneously.

    Results:

    As a consequence of the Viennese monophthongization and the consecutive overcrowding of the palatal constriction location, speakers of the Viennese dialect developed two strategies. One group, in the sense Kranzmayer observed, neutralized /e/ and /ɛ/ to /e/. This neutralization made room for the new palatal vowel /æ/.

    The other group, however, preserved /e/ and /ɛ/, but sometimes applied the two vowels incorrectly, i.e., produced /ɛ/ instead of /e/ and the other way round. However, since no neutralization took place, the vowel /i/ is shifted to the pre-palatal constriction location. By this shift, room is created on the palatal bar for the new vowel /æ/.

    Group I, consequently, discerns the following vowels:

    • palatal: /i:, i, e:, e, æ:/
    • velar: /u:, u/
    • uvular: /o:, o, ɔ:, ɔ/
    • pharyngeal: /ɑ:, ɑ, ɒ:/

    Group II discerns the vowels as follows:

    • pre-palatal: /i:, i/
    • palatal: /e:, e, ɛ:, ɛ, æ:/
    • velar: /u:, u/
    • uvular: /o:, o, ɔ:, ɔ/
    • pharyngeal: /ɑ:, ɑ, ɒ:/

    Lip rounding and duration are distinctive in each vowel system.