Completed

  • Noise Abatement: investigates the acoustic and psychoacoustic description of unwanted sounds and supports the specification of methods for reducing noise, from whatever source (Sound Quality Design).

    Perceiving sound as noise is a subjective reaction to disturbing acoustic signals. The intensity, pitch, sharpness, variation and roughness of the sound, as well as the listener's subjective attitude and motivation, all play a role in the perceived noisiness. Railway noise is the main detractor when planning new high-speed tracks. The condition of the wheels and the track has a significant effect on the sound generation (see also: harmonisation).

    Literature:

    • NOIDESC: Descriptors of noise signals (Deskriptoren von Lärmsignalen). Deutsch, Werner A. & Waubke, Holger (2004).
    • Descriptors for aircraft noise
    • Vibrations along railway lines (Erschütterungen an Bahntrassen). Waubke, Holger (2004).
    • Visualization of railway noise (Visualisierung von Bahnlärm) (1996). AK08. in: Deutsch, Werner A. & Elisabeth Hilscher & Herta Spielmann (eds.): Tagungsband der Österreichischen Physikalischen Gesellschaft, Johannes Kepler Universität Linz. Wien: Forschungsstelle für Schallforschung der Österreichischen Akademie der Wissenschaften, pp. 27-29.
  • Objective and Methods:

    This study investigates the effect of the number of frequency channels on vertical-plane sound localization, especially front/back discrimination. This is important for determining how many of the basal-most channels/electrodes of a cochlear implant (CI) are needed to encode spectral localization cues. Normal hearing subjects listening to a CI simulation (the newly developed GET vocoder) will perform the experiment using the localization method developed in the subproject "Loca Methods". Learning effects will be studied by providing visual feedback.

    Results:

    Experiments are underway.

    Application:

    Knowing the number of channels required to encode spectral cues for localization in the vertical planes is an important step in the development of a 3-D localization strategy for CIs. 

    Funding:

    FWF (Austrian Science Fund): Project #P18401-B15

    Publications:

    • Goupell, M., Majdak, P., and Laback, B. (2010). Median-plane sound localization as a function of the number of spectral channels using a channel vocoder, J. Acoust. Soc. Am. 127, 990-1001.
  • Basic Description:

    Practical experience quickly revealed that the concept of an orthonormal basis is not always useful. This led to the concept of frames. Models in physics and other application areas (for example sound vibration analysis) are mostly continuous models. Many continuous model problems can be formulated as operator theory problems, such as in differential or integral equations. Operators provide an opportunity to describe scientific models, and frames provide a way to discretize them.

    Physical models often use sequences that allow only numerically unstable re-synthesis. Such a sequence can be called an "unbounded frame". How this inversion can be regularized is being investigated. For many applications, a certain frame is very useful in describing the model. Therefore, it is also beneficial to use the same sequence to find a discretization of the operators involved.

    Subprojects:

    Frames in Finite Dimensional Spaces:

    In this project, the theory of frames in the finite discrete case is investigated further.

    Matrix Representation of Operators using Frames:

    The standard matrix description of operators using orthonormal bases is extended to the more general case of frames.

    Weighted and Controlled Frames:

    Weighted and controlled frames were introduced to speed up the inversion algorithm for the frame matrix of a wavelet frame. In this project, these kinds of frames are investigated further.

    Basic Properties of Unbounded Frames:

    Irregular Frames of Translates:

    In this project, sequences of irregular shifts of a single function are investigated.

    Partners:

    • S. Heineken, Research Group on Real and Harmonic Analysis, University of Buenos Aires
    • J. P. Antoine, Unité de physique théorique et de physique mathématique – FYMA
    • M. El-Gebeily,  Department of Mathematical Sciences, King Fahd University of Petroleum and Minerals, Saudi Arabia
  • Effects of subthalamic stimulation on the speech characteristics of Parkinson's patients.

  • Objective:

    This project investigated the perception of interaural intensity differences (IID) among cochlear implant (CI) listeners in relation to the spectral composition and the temporal structure of the signal.

    Method:

    The perception thresholds (just noticeable differences, JND) of CI listeners were examined using differently structured signals. The stimuli were applied directly to the clinical signal processing units, while the parameters of the ongoing stimulation were closely monitored.

    Results:

    JNDs of IIDs in CI listeners ranged from 1.5 to 2.5 dB for a detection level of 80 percent. The type of stimulus seems to have little influence on detection performance, with the exception of a single type of signal: a pulse train with a frequency of 20 Hz. This means that the JNDs of CI listeners are only marginally higher than those of normal hearing listeners. CI listeners are sensitive to IIDs, and the JNDs correspond to a difference in arrival angle of 5 to 10 degrees. Since the JNDs are of the same order as the smallest level steps with which the CI system transmits amplitudes, reducing the level step size in future systems seems advisable.

    Publication:

    • Laback, B., Pok, S. M., Baumgartner, W. D., Deutsch, W. A., and Schmid, K. (2004). “Sensitivity to interaural level and envelope time differences of two bilateral cochlear implant listeners using clinical sound processors,” Ear and Hearing 25, 5, 488-500.
  • Objective and Methods:

    This project cluster includes several studies on the perception of interaural time differences (ITD) in cochlear implant (CI), hearing impaired (HI), and normal hearing (NH) listeners. Studying different groups of listeners allows for identification of the factors that are most important to ITD perception. Furthermore, the comparison between the groups allows for the development of strategies to improve ITD sensitivity in CI and HI listeners.

    Subprojects:

    • FsGd: Effects of ITD in Ongoing, Onset, and Offset in Cochlear Implant Listeners
    • ITD Sync: Effects of interaural time difference in fine structure and envelope on lateral discrimination in electric hearing
    • ITD Jitter CI: Recovery from binaural adaptation with cochlear implants
    • ITD Jitter NH: Recovery from binaural adaptation in normal hearing
    • ITD Jitter HI: Recovery from binaural adaptation with sensorineural hearing impairment
    • ITD CF: Effect of center frequency and rate on the sensitivity to interaural delay in high-frequency click trains
    • IID-CI: Perception of Interaural Intensity Differences by Cochlear Implant Listeners

       

  • Objective:

    In signal processing, synthesis is important in addition to analysis. This is especially true for the modification of data. For the Short-Time Fourier Transform (STFT), the synthesis is often done using a simple overlap-add (OLA), i.e., by summing the outputs of the filters. Often, the output is additionally re-weighted with the analysis window, as is done in the phase vocoder. It is often presumed that with standard windows this will give satisfactory results.

    Gabor frame theory provides a well-known construction of synthesis windows that guarantees perfect reconstruction. However, this method is not often used in signal processing algorithms.

    Method:

    In this project, we will systematically investigate if and for which parameters the respective OLA synthesis with the original window gives good reconstruction. We will compare it to the reconstruction with the dual window, introducing and motivating it as perfect reconstruction overlap add (PROLA). We will show that this method is always preferable to others and that it can be calculated very efficiently.
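
    The comparison described above can be sketched in a few lines. The following is a minimal illustration, not the STx implementation: it assumes a circular STFT in which the window length equals the number of frequency channels (in that case the dual window is simply the window divided by the sum of its squared, hop-shifted copies). All function names, the Hann window, and the hop size are choices made for this example only.

```python
import numpy as np

def frame_indices(N, M, a):
    """Sample indices of each length-M frame (hop a), with circular wrap-around."""
    return (np.arange(0, N, a)[:, None] + np.arange(M)[None, :]) % N

def stft(x, g, a):
    """Windowed FFT of every frame; len(g) frequency channels (assumption)."""
    idx = frame_indices(len(x), len(g), a)
    return np.fft.fft(x[idx] * g, axis=1)

def overlap_add(coeffs, gs, a, N):
    """Inverse FFT per frame, weight with synthesis window gs, then overlap-add."""
    idx = frame_indices(N, len(gs), a)
    frames = np.fft.ifft(coeffs, axis=1).real * gs
    y = np.zeros(N)
    np.add.at(y, idx, frames)  # accumulate where frames overlap
    return y

def dual_window(g, a):
    """Dual window for the case 'window length == number of channels'."""
    # d[r] = sum of g**2 over all window samples that land on the same signal sample
    d = np.array([np.sum(g[r::a] ** 2) for r in range(a)])
    return g / d[np.arange(len(g)) % a]

rng = np.random.default_rng(0)
N, M, a = 512, 64, 32              # signal length, window length, hop size
x = rng.standard_normal(N)
g = np.hanning(M)                  # a "standard" analysis window

c = stft(x, g, a)
y_ola = overlap_add(c, g, a, N)                   # OLA with the original window
y_dual = overlap_add(c, dual_window(g, a), a, N)  # OLA with the dual window

err_ola = np.max(np.abs(y_ola - x))
err_dual = np.max(np.abs(y_dual - x))
print("max error, original window:", err_ola)
print("max error, dual window:    ", err_dual)
```

    With these example parameters the original window leaves a clearly audible amplitude ripple, while the dual window reconstructs the signal up to rounding error. For other hop sizes a standard window may come much closer to perfect reconstruction, which is exactly the parameter dependence the project sets out to map.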

    Application:

    This is currently being implemented in STx. There the phase vocoder will have the option to guarantee perfect reconstruction, either with dual or tight windows.

    Partners:

    Department of Mathematics, University of Wisconsin-Eau Claire

  • Introduction:

    As is customary for urban varieties, the varieties of Vienna are predominantly social varieties. Education and social background form the primary factors which define the language behaviour of the speakers.

    The Viennese dialect belongs to the Middle Bavarian dialect group. Around the turn of the century, a sound change arose which monophthongized the diphthongs /aɛ/ and /ɑɔ/ to /æ:/ and /ɒ:/ respectively. This sound change was completed around 1950. As a result of the Viennese monophthongization, the palatal constriction location became overcrowded. As early as the thirties, Kranzmayer observed what he called the "e-confusion": speakers stopped distinguishing the /e/-vowels, so that "Segen" (blessing) and "sehen" (to see) became homophones: [se:ŋ].

    Method:

    Five female and five male speakers of the Viennese dialect were asked to name pictures, to read sentences, and to speak spontaneously.

    Results:

    As a consequence of the Viennese monophthongization and the resulting overcrowding of the palatal constriction location, speakers of the Viennese dialect developed two strategies. One group, in line with Kranzmayer's observation, neutralized /e/ and /ɛ/ to /e/. This neutralization made room for the new palatal vowel /æ/.

    The other group preserved /e/ and /ɛ/, but sometimes used the two vowels incorrectly, i.e., produced /ɛ/ instead of /e/ and vice versa. Since no neutralization took place, the vowel /i/ was shifted to the pre-palatal constriction location. This shift created room in the palatal region for the new vowel /æ/.

    Group I consequently distinguishes the following vowels:

    • palatal: /i:, i, e:, e, æ:/
    • velar: /u:, u/
    • uvular: /o:, o, ɔ:, ɔ/
    • pharyngeal: /ɑ:, ɑ, ɒ:/

    Group II discerns the vowels as follows:

    • pre-palatal: /i:, i/
    • palatal: /e:, e, ɛ:, ɛ, æ:/
    • velar: /u:, u/
    • uvular: /o:, o, ɔ:, ɔ/
    • pharyngeal: /ɑ:, ɑ, ɒ:/

    Lip rounding and duration are distinctive in each vowel system.

  • Objective:

    Pitch and timbre are closely interrelated. Both determine the perception of complex tones, and variations of both characterize realistic signals. In diphthongs in particular, pitch and timbre changes occur simultaneously and continuously.

    Method:

    A slow (e.g. 0.5 Hz) triangular frequency modulation (range: 1 octave) of a harmonic sound with a fundamental frequency of 220 Hz produces a specific pitch phenomenon. If one of the resolved partials is accentuated by a sharp onset, this partial gives rise to a temporary spectral pitch according to its position on the frequency continuum. At the same time, the pitch movement of the complex tone continues. After a short transition period of approximately 100 ms, the partial loses its accentuated spectral pitch and is completely integrated into the timbre and pitch movement of the complex sound.
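
    A stimulus of this kind can be approximated with the following sketch. It is an illustration only and not the stimulus code used in the study: the glide from 220 Hz up to 440 Hz on a logarithmic frequency axis, the number of partials, the sampling rate, and the way a partial is "accentuated" (it is simply silent until its onset time) are all assumptions made for this example.

```python
import numpy as np

def triangle(t, rate):
    """Triangle wave in [0, 1] with `rate` cycles per second, starting at 0."""
    phase = (t * rate) % 1.0
    return 1.0 - np.abs(2.0 * phase - 1.0)

def gliding_complex(duration=4.0, fs=16000, f0=220.0, rate=0.5, octaves=1.0,
                    n_partials=10, onset_partial=4, onset_time=1.5):
    """Harmonic complex whose fundamental glides triangularly over `octaves`.

    onset_partial / onset_time are hypothetical parameters: that partial is
    switched on abruptly at onset_time to give it a sharp onset.
    """
    t = np.arange(int(duration * fs)) / fs
    # Instantaneous fundamental: f0 ... f0 * 2**octaves (log-frequency glide, assumed)
    f_inst = f0 * 2.0 ** (octaves * triangle(t, rate))
    # Integrate frequency to obtain phase, so every partial stays harmonic
    phase = 2.0 * np.pi * np.cumsum(f_inst) / fs
    sig = np.zeros_like(t)
    for k in range(1, n_partials + 1):
        partial = np.sin(k * phase)
        if k == onset_partial:
            partial[t < onset_time] = 0.0   # sharp onset of the accentuated partial
        sig += partial
    return t, f_inst, sig / n_partials

t, f_inst, sig = gliding_complex()
print("fundamental range: %.1f Hz to %.1f Hz" % (f_inst.min(), f_inst.max()))
```

    The highest partial stays below 4.4 kHz with these values, well under the Nyquist frequency of the assumed 16 kHz sampling rate.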

    Application:

    The purpose of the present pilot study was to explore starting points for determining and explaining a new pitch glide transition and pitch ambiguity effect. The effect occurs when the continuously varying pitch percept of a complex tone is interrupted by the onset transients of harmonic partials that emerge in successive order, each followed by a momentarily dominating spectral pitch of the corresponding harmonic. Immediately after the initial spectral pitch dominance, which competes with the pitch of the complex tone, the latter is reinstated as the harmonic is integrated into the timbre in a smoothly gliding manner.

    References:

    PACS: 43.66.Hg; Pitch perception.

  • Objective:

    The identification of the parameters of the vocal tract system can be used for speaker identification.

    Method:

    A preferred speech coding technique is the so-called Model-Based Speech Coding (MBSC), which involves modeling the vocal tract as a linear time-variant system (synthesis filter). The system's input is either white noise or a train of impulses. For coding purposes, the synthesis filter is assumed to be time-invariant during a short time interval (time slot) of typically 10-20 msec. Then, the signal is represented by the coefficients of the synthesis filter corresponding to each time slot.

    A successful MBSC method is the so-called Linear Prediction Coding (LPC). Roughly speaking, the LPC technique models the synthesis filter as an all-pole linear system. This all-pole linear system has coefficients obtained by adapting a predictor of the output signal, based on its own previous samples. The use of an all-pole model provides a good representation for the majority of speech sounds. However, the representation of nasal sounds, fricative sounds, and stop consonants requires the use of a zero-pole model. Also, the LPC technique is not adequate when the voice signal is corrupted by noise.
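
    As a point of reference for the all-pole case, the following sketch estimates LPC coefficients with the autocorrelation method and the Levinson-Durbin recursion. It is a textbook illustration, not the zero-pole method proposed in this project; the function name `lpc` and the synthetic test signal (white noise passed through a known second-order all-pole filter) are assumptions made for the example.

```python
import numpy as np

def lpc(x, order):
    """All-pole model A(z) = 1 + a[1] z^-1 + ... via the autocorrelation method."""
    n = len(x)
    r = np.array([np.dot(x[: n - k], x[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]                      # prediction error energy of the order-0 model
    for i in range(1, order + 1):
        # Reflection coefficient of the Levinson-Durbin recursion
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

# Synthetic "time slot": white noise driving a known all-pole synthesis filter,
# x[n] = 1.3 x[n-1] - 0.6 x[n-2] + e[n], i.e. A(z) = 1 - 1.3 z^-1 + 0.6 z^-2
rng = np.random.default_rng(1)
e = rng.standard_normal(20000)
x = e.copy()
for n in range(2, len(x)):
    x[n] = e[n] + 1.3 * x[n - 1] - 0.6 * x[n - 2]

a_est, err = lpc(x, order=2)
print("estimated coefficients:", a_est)
```

    The estimate recovers the filter because the test signal really is an all-pole process. Sounds whose spectra contain zeros do not satisfy that assumption, which is the limitation addressed next.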

    We propose a numerically efficient method for estimating a zero-pole model. It provides synthesis filter coefficients that are optimal with respect to a logarithmic criterion.

    Evaluation:

    In order to evaluate the perceptual relevance of the proposed method, we used the model estimated from a speech signal to re-synthesize it:

    Re-Synthesized Sound

    Original Sound

  • French-Austrian bilateral research project funded by the French National Agency of Research (ANR) and the Austrian Science Fund (FWF, project no. I 1362-N30). The project involves two academic partners, namely the Laboratory of Mechanics and Acoustics (LMA - CNRS UPR 7051, France) and the Acoustics Research Institute. At the ARI, two research groups are involved in the project: the Mathematics and Signal Processing in Acoustics and the Psychoacoustics and Experimental Audiology groups.

    Principal investigators: Thibaud Necciari (ARI), Piotr Majdak (ARI) and Olivier Derrien (LMA).

    Running period: 2014-2017 (project started on March 1, 2014).

    Abstract:

    One of the greatest challenges in signal processing is to develop efficient signal representations. An efficient representation extracts relevant information and describes it with a minimal amount of data. In the specific context of sound processing, and especially in audio coding, where the goal is to minimize the size of binary data required for storage or transmission, it is desirable that the representation takes into account human auditory perception and allows reconstruction with a controlled amount of perceived distortion. Over the last decades, many psychoacoustical studies investigated auditory masking, an important property of auditory perception. Masking refers to the degradation of the detection threshold of a sound in the presence of another sound. The results were used to develop models of either spectral or temporal masking. Attempts were made to simply combine these models to account for time-frequency (t-f) masking effects in perceptual audio codecs. We recently conducted psychoacoustical studies on t-f masking, which revealed the inaccuracy of such simple models. These new data on t-f masking represent a crucial basis to account for masking effects in t-f representations of sounds. Although t-f representations are standard tools in audio processing, the development of a t-f representation of audio signals that is mathematically founded, perception-based, perfectly invertible, and possibly with a minimum amount of redundancy, remains a challenge. POTION thus addresses the following questions:

    1. To what extent is it possible to obtain a perception-based (i.e., as close as possible to “what we see is what we hear”), perfectly invertible, and possibly minimally redundant t-f representation of sound signals? Such a representation is essential for modeling complex masking interactions in the t-f domain and is expected to improve our understanding of auditory processing of real-world sounds. Moreover, it is of fundamental interest for many audio applications involving sound analysis-synthesis.
    2. Is it possible to improve current perceptual audio codecs by considering a joint t-f approach? To reduce the size of digital audio files, perceptual audio codecs like MP3 decompose sounds into variable-length time segments, apply a frequency transform, and use masking models to control the sub-quantization of transform coefficients within each segment. Thus, current codecs follow mainly a spectral approach, although temporal masking effects are taken into account in some implementations. By combining an efficient perception-based t-f transform with a joint t-f masking model in an audio codec, we expect to achieve significant performance improvements.

    Working program:

    POTION is structured in three main tasks:

    1. Perception-based t-f representation of audio signals with perfect reconstruction: A linear and perfectly invertible t-f representation will be created by exploiting the recently developed non-stationary Gabor theory as a mathematical background. The transform will be designed so that its t-f resolution mimics the t-f analysis properties of the auditory system and, if possible, so that no redundancy is introduced, in order to maximize the coding efficiency.
    2. Development and implementation of a t-f masking model: Based on psychoacoustical data on t-f masking collected by the partners in previous projects and on literature data, a new, complex model of t-f masking will be developed and implemented in the computationally efficient representation built in task 1. Additional psychoacoustical data required for the development of the model, involving frequency, level, and duration effects in masking for either single or multiple maskers will be collected. The resulting signal processing algorithm should represent and re-synthesize only the perceptually relevant components of the signal. It will be calibrated and validated by conducting listening tests with synthetic and real-world sounds.
    3. Optimization of perceptual audio codecs: This task represents the main application of POTION. It will consist in combining the new efficient representation built in task 1 with the new t-f masking model built in task 2 for implementation in a perceptual audio codec.

    More information on the project can be found on the POTION web page.

    Publications:

    • Chardon, G., Necciari, Th., Balazs, P. (2014): Perceptual matching pursuit with Gabor dictionaries and time-frequency masking, in: Proceedings of the 39th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2014). Florence, Italy, 3126-3130.

  • Objective:

    If measurements are possible only at the hull of a machine, a tool is needed to separate the dominating near-field components from the far-field components. This, in turn, allows the far-field levels to be estimated. The separation is often not possible using spectral methods, because both components have nearly the same frequency. Using a limited number of microphones, a modal separation is also impossible. Instead of a modal analysis, a principal component analysis is applied.

    Method:

    The narrow-band Fourier transform method is used, and a separate analysis is conducted for each frequency. The cross-power matrix spanning all microphone positions is used. The components are then calculated using the PCA. As long as the modes at the microphone positions have different relative values, PCA can be used to separate them. In an initial test, the far field is observed and the transfer function for every component from the near field to the far field is estimated. These transfer functions are assumed to be constant in time. They are used for the estimation of the overall far-field level.
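
    The core step for one frequency can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's measurement code: the data are simulated, the two "mode shapes" (relative complex values at the microphones) are random placeholders, and the names `cross_power_matrix` and `principal_components` are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
n_mics, n_snapshots = 6, 400

def crandn(*shape):
    """Complex Gaussian samples (helper for the simulation only)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

# Hypothetical relative values of two components at the microphone positions
shape_near = crandn(n_mics)
shape_far = crandn(n_mics)

# One narrow-band FFT bin per time block: a strong near-field component,
# a weak far-field component, and a little sensor noise
X = (np.outer(10.0 * crandn(n_snapshots), shape_near)
     + np.outer(1.0 * crandn(n_snapshots), shape_far)
     + 0.01 * crandn(n_snapshots, n_mics))          # shape: (snapshots, mics)

def cross_power_matrix(X):
    """G[i, j] = average over snapshots of X_i * conj(X_j)."""
    return X.T @ X.conj() / X.shape[0]

def principal_components(G):
    """Eigen-decomposition of the Hermitian matrix G, largest component first."""
    w, v = np.linalg.eigh(G)
    order = np.argsort(w)[::-1]
    return w[order], v[:, order]

eigvals, eigvecs = principal_components(cross_power_matrix(X))
print("component powers:", eigvals)
```

    In this simulation two eigenvalues stand out from the noise floor, one per component. Note that the principal components are mutually orthogonal, so each one is in general a mixture of the physical shapes; the per-component transfer functions to the far field are then estimated on top of this decomposition.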

    Application:

    Observation of the far-field level of machines.

  • Objective:

    The sensitivity of normal hearing listeners to interaural time differences (ITD) in the envelope of high-frequency carriers is limited with respect to the envelope modulation rate. Increasing the envelope rate reduces the sensitivity, an effect that has been termed binaural adaptation (Hafter and Dye, 1983). Cochlear implant (CI) listeners show a similar limitation in ITD sensitivity with respect to the rate of unmodulated pulse trains containing ITD. Unfortunately, such high rates are needed to appropriately sample the modulation information of the acoustic signal. This study tests the ideas that (1) similar "binaural adaptation" mechanisms are limiting the performance in both subject groups, (2) the effect is related to the periodicity of pulse trains, and (3) introducing jitter (randomness) into the pulse timing causes a recovery from binaural adaptation and thus improves ITD sensitivity at higher pulse rates.

    Method and Results:

    These ideas have been studied by testing the ITD sensitivity of five CI listeners. The parameters pulse rate, amount of jitter (where the minimum represents the periodic condition), and ITD were all varied. We showed that introducing binaurally synchronized jitter in the stimulation timing causes large improvements in ITD sensitivity at higher pulse rates (≥ 800 pps). Our experimental results demonstrate that a purely temporal trigger can cause recovery from binaural adaptation.
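
    To make the stimulus parameters concrete, the following sketch generates pulse timings for both ears. It is an illustration under assumptions, not the stimulus code of the study: the jitter is modelled by drawing each interpulse interval uniformly between (1 - k) and (1 + k) times the nominal interval, with k = 0 giving the periodic condition, and the function names are hypothetical.

```python
import numpy as np

def jittered_pulse_times(rate_pps, duration_s, k, rng):
    """Pulse times in seconds; k in [0, 1) is the assumed jitter amount."""
    ipi = 1.0 / rate_pps                       # nominal interpulse interval
    n = int(round(duration_s * rate_pps))      # number of pulses
    intervals = ipi * (1.0 + k * rng.uniform(-1.0, 1.0, size=n - 1))
    return np.concatenate(([0.0], np.cumsum(intervals)))

def binaural_pulse_times(rate_pps, duration_s, k, itd_s, rng):
    """Binaurally synchronized jitter: both ears share the same jitter pattern."""
    left = jittered_pulse_times(rate_pps, duration_s, k, rng)
    right = left + itd_s                       # the ITD is the only interaural difference
    return left, right

rng = np.random.default_rng(3)
left_p, right_p = binaural_pulse_times(800, 0.3, 0.0, 200e-6, rng)   # periodic
left_j, right_j = binaural_pulse_times(800, 0.3, 0.5, 200e-6, rng)   # jittered
print("interval range, jittered (ms):",
      1e3 * np.diff(left_j).min(), 1e3 * np.diff(left_j).max())
```

    Because the same jitter pattern is applied to both ears, the ITD stays constant from pulse to pulse and only the monaural timing becomes irregular.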

    Application:

    Applying binaural jitter in stimulation strategies may improve several aspects of binaural hearing in bilateral recipients of CIs, including localization of sound sources and speech segregation in noise.

    Funding:

    Internal

    Publications:

    • Laback, B., and Majdak, P. (2008). Binaural jitter improves interaural time-difference sensitivity of cochlear implantees at high pulse rates, Proc Natl Acad Sci USA (PNAS) 105, 2, 814-817.
    • Laback, B., and Majdak, P. (2008). Reply to van Hoesel: Binaural jitter with cochlear implants, improved interaural time-delay sensitivity, and normal hearing, letter to Proc Natl Acad Sci USA 12, 105, 32.
    • Laback, B., and Majdak, P. (2007). Binaural stimulation in neural auditory prostheses or hearing aids, provisional US and EP patent application (submitted 20.06.07).
  • Objective:

    The sensitivity of normal hearing (NH) listeners to interaural time differences (ITD) in the envelope of high-frequency carriers is limited with respect to the envelope modulation rate. Increasing the envelope rate reduces the sensitivity, an effect that has been termed binaural adaptation (Hafter and Dye, 1983). In another study (Laback and Majdak, 2008), it was hypothesized that introducing binaural jitter may improve ITD sensitivity in bilateral cochlear implant (CI) listeners by avoiding periodicity. Indeed, the results showed large improvements at high rates (≥ 800 pps). This was interpreted as an indication for a recovery from binaural adaptation. 

    In this study, we further investigated this effect using NH subjects. We attempted to understand the underlying mechanisms by applying a well-established model of peripheral auditory processing. 

    Method and Results:

    Bandpass-filtered clicks (4 kHz) were used at a nominal pulse rate of 600 pulses per second (pps). It was found that randomly jittering the timing of the pulses significantly increases the detectability of the ITD. A second experiment was performed to observe the effect of place and rate for pulse trains. It showed that ITD sensitivity for jittered pulse trains at 1200 pps was significantly higher than for periodic pulse trains at 600 pps. Therefore, with the addition of jitter, listeners were not solely benefiting from the longest interpulse intervals and instances of reduced rate. A third experiment, using a 900 pps pulse train, confirmed the improvement in ITD sensitivity. This occurred even when random amplitude modulation, a side effect of large amounts of jitter, was ruled out. A model of peripheral auditory processing up to the brain stem (cochlear nucleus) was applied to study the mechanisms underlying the improvements in ITD sensitivity. It was found that the irregular timing of the jittered pulses increases the synchrony of firing in the cochlear nucleus. These results suggest that a recovery from binaural adaptation activated by a temporal irregularity is possibly occurring at the level of the cochlear nucleus.

    Application:

    Together with the results of Laback and Majdak (2008) on the effect of binaural jitter in CI listeners, these results suggest that the binaural adaptation effect first observed by Hafter and Dye (1983) is related to the synchrony of neural firings across auditory nerve fibers. The nerve fibers, in turn, innervate cochlear nucleus cells. At higher rates, periodic pulse trains result in little synchrony of the response to the ongoing signal. Jittering the pulse timing increases the probability of synchronous firing across AN fibers at certain instances of time. Further studies are required to determine if other aspects of binaural adaptation can also be attributed to this explanation. 

    Funding:

    Internal

    Publications:

    • Goupell, M. J., Laback, B., Majdak, P. (2009): Enhancing sensitivity to interaural time differences at high modulation rates by introducing temporal jitter, in: J. Acoust. Soc. Am. 126, 2511-2521.
    • Laback, B., and Majdak, P. (2008): Binaural jitter improves interaural time-difference sensitivity of cochlear implantees at high pulse rates, in: Proc. Natl. Acad. Sci. USA (PNAS) 105, 2, 814-817.
    • Laback, B., and Majdak, P. (2008): Reply to van Hoesel: Binaural jitter with cochlear implants, improved interaural time-delay sensitivity, and normal hearing, letter to Proc. Natl. Acad. Sci. USA 12, 105, 32.
  • Objective:

    Normal hearing (NH) listener sensitivity to interaural time differences (ITD) in the envelope of high-frequency carriers is limited with respect to the envelope modulation rate. Increasing the envelope rate reduces the sensitivity, an effect that has been termed binaural adaptation (Hafter and Dye, 1983). In other studies (Laback and Majdak, 2008; Goupell et al., 2008), it has been shown that introducing binaural jitter improves ITD sensitivity at higher rates in bilateral cochlear implant (CI) listeners as well as in NH listeners. The results were interpreted in terms of a recovery from binaural adaptation. Sensorineural hearing impairment often results in reduced ITD sensitivity (e.g. Hawkins and Wightman, 1980). The present study investigates if a similar recovery from binaural adaptation, and thus an improvement in ITD sensitivity, can be achieved in hearing impaired listeners. 

    Method and Results:

    Bandpass-filtered clicks (4 kHz) with pulse rates of 400 and 600 pulses per second (pps) are used. Different amounts of jitter (the minimum representing the periodic condition) and different ITDs are tested. Listeners with a moderate cochlear hearing loss are selected. Additional stimuli tested are bandpass-filtered noise bands at 4 kHz and low-frequency stimuli at 500 Hz (sinusoids, SAMs, noise bands, and jittered pulse trains). The levels of the stimuli are adjusted in pretests to achieve a centered auditory image at a comfortable loudness.

    Data collected so far show improvements in ITD sensitivity in some individuals but not in others.

    Application:

    The results may lead to the design of a new hearing aid processing algorithm that attempts to improve ITD sensitivity.

    Funding:

    Internal

  • This project consists of three subprojects:

    1.1 Frame & Gabor Multiplier:

    Recently, Gabor multipliers have been used to implement time-variant filtering as Gabor filters. This idea can be generalized further. To investigate the basic properties of such operators, the concept of abstract, i.e. unstructured, frames is used. Such multipliers are operators in which a fixed mask, the so-called symbol, is applied to the coefficients of the frame analysis, after which synthesis is performed. The properties found for this general case can then be used for all kinds of frames, for example regular and irregular Gabor frames, wavelet frames, or auditory filterbanks.

    The basic definition of a frame multiplier is as follows:
    $$\mathbf{M}_{m,(g_k),(f_k)}\, f = \sum_k m_k \,\langle f, f_k \rangle\, g_k ,$$
    where $(f_k)$ is the analysis frame, $(g_k)$ the synthesis frame, and $m = (m_k)$ the symbol.
    As a special case, multipliers for irregular Gabor systems will be investigated and implemented. This corresponds to an irregularly sampled Short-Time Fourier Transform (STFT). As an application, an STFT corresponding to the Bark scale can be examined.
    This mathematical, basic-research-oriented project is important for many other projects, such as time-frequency masking or system identification.
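
    In finite dimensions the analysis-mask-synthesis chain described above reduces to a matrix product, which the following sketch illustrates. It is a toy example under assumptions: the frame is a random matrix whose columns are the frame vectors, the analysis frame is taken to be the canonical dual of the synthesis frame, and the names `frame_multiplier` and `canonical_dual` are chosen for this example.

```python
import numpy as np

def frame_multiplier(synth, analysis, symbol):
    """Matrix of f -> sum_k symbol[k] * <f, analysis[:, k]> * synth[:, k]."""
    return (synth * symbol) @ analysis.conj().T

def canonical_dual(frame):
    """Canonical dual frame S^{-1} g_k, with frame operator S = G G^H."""
    S = frame @ frame.conj().T
    return np.linalg.solve(S, frame)

rng = np.random.default_rng(4)
n, K = 4, 7                              # dimension and number of frame vectors
G = rng.standard_normal((n, K))          # columns: an (unstructured) frame
Gd = canonical_dual(G)

# Symbol of all ones: analysis followed by synthesis reproduces the signal
identity_like = frame_multiplier(Gd, G, np.ones(K))

# A non-trivial symbol attenuates or removes individual frame coefficients
symbol = np.array([1.0, 1.0, 0.5, 0.0, 1.0, 0.0, 1.0])
M = frame_multiplier(Gd, G, symbol)
f = rng.standard_normal(n)
print("filtered signal:", M @ f)
```

    With a constant symbol of ones the multiplier is the identity; any other symbol gives a time-variant filter once the frame has a time-frequency structure such as a Gabor system.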

    References:

    • O. Christensen, An Introduction To Frames And Riesz Bases, Birkhäuser Boston (2003)
    • M. Dörfler, Gabor Analysis for a Class of Signals called Music, Dissertation Univ. Wien (2002)
    • R.J. Duffin, A.C. Schaeffer, A Class of nonharmonic Fourier series, Trans.Amer.Math.Soc., vol.72, pp. 341-366 (1952)
    • H. G. Feichtinger, K. Nowak, A First Survey of Gabor Multipliers, in H. G. Feichtinger, T. Strohmer

  • Objective:

    Standard noise mapping software uses geometrical approaches to determine the insertion loss of a noise barrier. These methods are not well suited for evaluating complex geometries, e.g., curved noise barriers or noise barriers with multiple diffracting edges. Here, we aim to derive adjustments that depend on frequency and on source and receiver position, using the boundary element method (BEM). Further, the effect of absorbing layers will be investigated as a function of the geometry. Results will be incorporated into a standard noise mapping software.

    Method:

    The cross-sections of different geometries are first parameterized and discretized and then evaluated using two-dimensional boundary element simulations. The BEM code was developed at our institute. Different parameter sets are evaluated in order to derive the adjustments for the specific geometries compared to a straight noise barrier. To make the simulations more realistic, a grassland impedance model is used instead of a fully reflecting half plane. Simulations will also be evaluated using measurements from actual noise barriers.

    Figure: Effect of a T-shaped barrier at 800 Hz

    Project partners:

    • TAS Schreiner (measurements)
    • Soundplan (implementation in sound mapping software)

    Funding:

    This project is funded from the VIF2011 call of the FFG (BMVIT, ASFINAG, ÖBB)

  • Objective and Methods:

    Spectral peaks and notches are important cues that normal hearing listeners use to localize sounds in the vertical planes (the front/back and up/down dimensions). This study investigates to what extent cochlear implant (CI) listeners are sensitive to spectral peaks and notches imposed upon a constant-loudness background. 

    Results:

    Listeners could always detect peaks, but not always notches. Increasing the bandwidth beyond two electrodes showed no improvement in thresholds. Performance at the high-frequency place was significantly worse than at the low and middle places, although listeners showed highly individual tendencies. Thresholds decreased with an increase in the height of the peak. Thresholds for detecting a change in the frequency of a peak or notch were approximately one electrode. Level roving significantly increased thresholds. Thus, there is currently no indication that CI listeners can perform a "true" profile analysis. Future studies will explore if adding temporal cues or roving the level in equal-loudness steps, instead of equal-current steps (as in the present study), is relevant for profile analysis.

    Application:

    Data on the sensitivity to spectral peaks and notches are required to encode spectral localization cues in future CI stimulation strategies. 

    Funding:

    FWF (Austrian Science Fund): Project #P18401-B15

    Publications:

    • Goupell, M., Laback, B., Majdak, P., and Baumgartner, W. D. (2008). Current-level discrimination and spectral profile analysis in multi-channel electrical stimulation, J. Acoust. Soc. Am. 124, 3142-57.
    • Goupell, M. J., Laback, B., Majdak, P., and Baumgartner, W-D. (2007). Sensitivity to spectral peaks and notches in cochlear implant listeners, presented at Conference on Implantable Auditory Prostheses (CIAP), Lake Tahoe.
  • Objective:

    Measuring sound absorption is essential to performing acoustic measurements and experiments under controlled acoustic conditions, especially considering the acoustic influence of room boundaries.

    So-called "in-situ" methods allow measurement of the reflection and absorption coefficients under real conditions in a single measurement procedure. The proposed method captures the direct signal and the reflections in one measurement. These reflections include not only the reflection of interest, but also others from the surroundings. To isolate the reflection coming from the tested surface, the influence of the direct signal and of the other reflections must be cancelled.

    One known separation method uses a time-windowing technique to separate the direct signal from the reflections. When the impulse responses of the direct signal and the reflections overlap in time, this method is no longer satisfactory, and frequency-dependent windowing is necessary to separate the different parts of the signal. In the wavelet domain, however, it is possible to observe a separation of the reflection of interest.

    The objective of this project is to study how the use of wavelet multipliers could improve the efficiency of the in-situ methods in this context.

    Method:

    A demonstrator system will be built to acquire the necessary measurements for the evaluation of absorption coefficients. This demonstrator will be used to evaluate the usefulness of the new methods in a semi-anechoic room.

    A systematic numeric study will be carried out on the acquired signals, in order to manually determine the symbol of a wavelet multiplier for the extraction of the reflected signal. The best parameters for optimal separation will then be investigated. This, in combination with the use of physical models, will help design a semi-automatic method for the calculation of the optimal multiplier symbol.

    Application:

    The improved measurement method will be available for in-situ measurement of reflection and absorption coefficients.

  • Objective:

    In speaker identification and speaker verification, wrong classifications can result from a high similarity between speakers that is reflected in the speaker models. These similarities can be explored using cluster analysis.

    Method:

    In speaker detection, every speaker is represented as a Gaussian Mixture Model (GMM). By using a dissimilarity measure for these models (e.g. cross-entropy), cluster analysis can be applied. Hierarchical agglomerative clustering methods are able to show structures in the form of a dendrogram.
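
    The procedure can be illustrated with a deliberately reduced sketch. Several simplifications are assumptions of this example and not part of the project: each speaker model is a single diagonal Gaussian instead of a full GMM, the dissimilarity is the symmetric Kullback-Leibler divergence rather than a cross-entropy measure, the speaker data are invented, and the merge history returned by `agglomerate` stands in for the dendrogram.

```python
import numpy as np

def sym_kl_diag(mu1, var1, mu2, var2):
    """Symmetric Kullback-Leibler divergence between two diagonal Gaussians."""
    d = mu1 - mu2
    kl12 = 0.5 * np.sum(var1 / var2 + d * d / var2 - 1.0 + np.log(var2 / var1))
    kl21 = 0.5 * np.sum(var2 / var1 + d * d / var1 - 1.0 + np.log(var1 / var2))
    return kl12 + kl21

def agglomerate(D, labels):
    """Average-linkage agglomerative clustering; returns the merge history."""
    clusters = {i: [i] for i in range(len(labels))}
    merges = []
    while len(clusters) > 1:
        keys = list(clusters)
        best = None
        for ia in range(len(keys)):
            for ib in range(ia + 1, len(keys)):
                a, b = clusters[keys[ia]], clusters[keys[ib]]
                dist = np.mean(D[np.ix_(a, b)])   # average pairwise dissimilarity
                if best is None or dist < best[0]:
                    best = (dist, keys[ia], keys[ib])
        dist, ka, kb = best
        members = clusters[ka] + clusters[kb]
        merges.append((sorted(labels[i] for i in members), dist))
        clusters[ka] = members
        del clusters[kb]
    return merges

# Invented speaker models: (mean vector, variance vector) in a 2-D feature space
labels = ["A", "B", "C", "D"]
models = [(np.array([0.0, 0.0]), np.ones(2)), (np.array([0.2, 0.0]), np.ones(2)),
          (np.array([5.0, 5.0]), np.ones(2)), (np.array([5.3, 5.0]), np.ones(2))]

n = len(models)
D = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        D[i, j] = sym_kl_diag(*models[i], *models[j])

merges = agglomerate(D, labels)
for members, dist in merges:
    print(members, "merged at dissimilarity", round(float(dist), 3))
```

    Speakers that merge at a low dissimilarity are the ones most likely to be confused with each other, which is the kind of structure the dendrogram is meant to reveal.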

    Application:

    Structures in speech corpora can be visualized and can therefore be used to select groups of highly similar or dissimilar speakers. The investigation of the structures concerning the aspect of misclassification can lead to model generation improvements.