Acoustic Phonetics

  • Introduction:

    The ability of listeners to discriminate literal meanings from figurative language, affective language, or rhetorical devices such as irony is crucial for successful social interaction. This discriminative ability might be reduced in listeners supplied with cochlear implants (CIs), widely used auditory prostheses that restore auditory perception in the deaf or hard-of-hearing. Acoustically, irony is characterised in particular by a lower fundamental frequency (F0), a lower intensity, and a longer duration in comparison to literal utterances. In auditory perception experiments, listeners mainly rely on F0 and intensity values to distinguish between context-free ironic and literal utterances. As CI listeners have great difficulty with F0 perception, their use of frequency information for the detection of irony is impaired. However, irony is often additionally conveyed by characteristic facial expressions.

    Objective:

    The aim of the project is two-fold: The first (“Production”) part of the project will study the role of paraverbal cues in verbal irony of Standard Austrian German (SAG) speakers under well-controlled experimental conditions without acoustic context information. The second (“Perception”) part will investigate the performance in recognizing irony in a normal-hearing control group and a group of CI listeners.

    Method:

    Recordings of speakers of SAG will be conducted. During the recording session, the participants will be presented with scenarios that evoke either a literal or an ironic utterance. The response utterances will be audio- and video-recorded. Subsequently, the context-free stimuli thus obtained will be presented in a discrimination test to normal-hearing and to postlingually deafened CI listeners in three modes: auditory only, auditory+visual, and visual only.

    Application:

    The results will not only provide information on irony production in SAG and on multimodal irony perception and processing, but will, most importantly, identify the cues that need to be improved in cochlear implants in order to allow CI listeners full participation in daily life.

  • Objective:

    Speaker models generated from training recordings should differentiate between speakers. These models are estimated using feature vectors based on acoustic observations. The feature vectors themselves should therefore show a high degree of inter-speaker variability and a low degree of intra-speaker variability.

    Method:

    Cepstral coefficients of transformed short-time spectra (e.g. Mel-Frequency Cepstral Coefficients, MFCCs) are empirically developed features that are widely used in automatic speech and speaker recognition. Because the feature extraction process offers many possible parameter settings, and because theoretically motivated guidelines for choosing them are lacking, only a stepwise investigation of the extraction process can lead to stable acoustic features.
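
    The extraction chain behind such features can be sketched in a few steps. The following pure-Python example shows the standard pipeline (power spectrum → mel filterbank → log → DCT); the frame size, filter count, and number of coefficients are arbitrary illustrative choices, not settings used in the project:

```python
# MFCC-style feature extraction sketch in pure Python.  Illustrative only:
# real front-ends add pre-emphasis, windowing, FFT, liftering, and deltas.
import math

def power_spectrum(frame):
    """Naive DFT power spectrum (first half of the bins)."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(frame))
        im = -sum(x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(frame))
        spec.append((re * re + im * im) / n)
    return spec

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale."""
    pts = [mel_to_hz(i * hz_to_mel(sr / 2) / (n_filters + 1))
           for i in range(n_filters + 2)]
    bins = [int((n_fft // 2) * f / (sr / 2)) for f in pts]
    bank = []
    for j in range(1, n_filters + 1):
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(bins[j - 1], bins[j]):          # rising slope
            filt[k] = (k - bins[j - 1]) / (bins[j] - bins[j - 1])
        for k in range(bins[j], bins[j + 1]):          # falling slope
            filt[k] = (bins[j + 1] - k) / (bins[j + 1] - bins[j])
        bank.append(filt)
    return bank

def mfcc(frame, sr, n_filters=12, n_ceps=6):
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sr)
    energies = [math.log(max(sum(f * s for f, s in zip(filt, spec)), 1e-10))
                for filt in bank]
    # DCT-II decorrelates the log filterbank energies -> cepstral coefficients
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m, e in enumerate(energies))
            for c in range(n_ceps)]
```

    Every step of this chain (number of filters, frequency warping, number of retained coefficients) is a parameter of exactly the kind whose stepwise investigation the project describes.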

    Application:

    Optimized acoustic features for the representation of speakers enable the improvement of automatic speaker identification and verification. They also support the development of methods for the forensic investigation of speakers, both manual and automatic.

  • Objective:  

    The aim of this project is to conduct basic research on the audio-visual speech synthesis of Austrian dialects. The project extends our previous work.

    Method:

    10 speakers (5 male, 5 female) will be recorded for each dialect. The recordings comprise spontaneous speech, read speech, and naming tasks, eliciting the essential phonemic distinctions and phonotactics. Subsequently, a detailed acoustic-phonetic and phonological analysis will be performed for each dialect. Based on this analysis, 600 phonetically balanced sentences will be created and recorded with 4 speakers (2 male, 2 female) for each dialect. In these recordings the acoustic and the visual signal, resulting from the same speech production process, will be recorded jointly to account for the multimodal nature of human speech. The recorded material will serve as a basis for the development, training, and testing of speech synthesizers at the Telecommunications Research Center.

    Funding:

    FWF (Wissenschaftsfonds): 2011-2013

    Project Manager: Michael Pucher, Telecommunications Research Center, Vienna

    Project Partner: Sylvia Moosmüller, Acoustics Research Institute, Austrian Academy of Sciences, Vienna

  • Objective:

    The generation of speaker models is based on acoustic features obtained from speech corpora. From a closed set of speakers, the target speaker has to be identified in an unsupervised identification task.

    Method:

    Training and comparison recordings exist for every speaker. The training set is used to generate parametric speaker models (Gaussian Mixture Models, GMMs), while the test set is used to compare every test recording to all models. The model with the highest similarity (maximum likelihood) is chosen as the target speaker. The performance of the identification task is measured as the identification rate (i.e. the proportion of correctly identified target speakers).
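
    The maximum-likelihood decision rule can be illustrated with a toy example. For brevity, each speaker model below is a single diagonal Gaussian (a one-component GMM) and the "feature vectors" are synthetic 2-D points; a real system would fit multi-component GMMs to MFCC frames:

```python
# Closed-set speaker identification sketch in pure Python.  All data are
# synthetic; single Gaussians stand in for full GMM speaker models.
import math, random

def fit_gaussian(frames):
    """Per-dimension mean and variance from training feature vectors."""
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / len(frames) for d in range(dim)]
    variances = [max(sum((f[d] - means[d]) ** 2 for f in frames) / len(frames), 1e-6)
                 for d in range(dim)]
    return means, variances

def log_likelihood(frames, model):
    means, variances = model
    ll = 0.0
    for f in frames:
        for x, m, v in zip(f, means, variances):
            ll += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return ll

def identify(test_frames, models):
    """Pick the model with maximum likelihood (the claimed target speaker)."""
    return max(models, key=lambda name: log_likelihood(test_frames, models[name]))

random.seed(1)
def sample(center, n=200):            # synthetic 2-D "feature vectors"
    return [[random.gauss(c, 0.3) for c in center] for _ in range(n)]

models = {name: fit_gaussian(sample(center))
          for name, center in {"spk_a": [0, 0], "spk_b": [1.5, -1], "spk_c": [-1, 2]}.items()}
print(identify(sample([1.5, -1], n=50), models))   # -> spk_b
```

    Running the same test over many held-out recordings and counting correct decisions yields the identification rate described above.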

    Application:

    Aside from biometric commercial applications, the forensic domain is another important field where speaker identification is used. Because speaker identification is a closed-set classification task, it is useful in cases where a target speaker has to be selected from a set of known speakers (e.g. in the case of hidden observations).

  • Objective:

    The offender recording (a speaker recorded at the scene of a crime) is compared to a recording of a suspect by weighing its similarity to the suspect's model against its typicality in a reference population.

    Method:

    A universal background model (UBM) is generated by training a parametric Gaussian Mixture Model (GMM) that reflects the distribution of feature vectors in a reference population. Each comparison recording is used to derive a speaker GMM from the UBM by adapting the model parameters. Similarity is measured by computing the likelihood of the offender recording under the speaker GMM, while typicality is measured by computing its likelihood under the UBM. The verification result is expressed as the ratio of these likelihood values.
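
    The likelihood-ratio computation can be sketched as follows. For readability, the suspect model and the UBM are plain Gaussians rather than adapted multi-component GMMs, and all data are synthetic:

```python
# Likelihood-ratio sketch for verification in pure Python.  Illustrative
# stand-in: single Gaussians instead of UBM-adapted GMMs; invented data.
import math, random

def fit(frames):
    dim = len(frames[0])
    mu = [sum(f[d] for f in frames) / len(frames) for d in range(dim)]
    var = [max(sum((f[d] - mu[d]) ** 2 for f in frames) / len(frames), 1e-6)
           for d in range(dim)]
    return mu, var

def avg_loglik(frames, model):
    mu, var = model
    total = 0.0
    for f in frames:
        for x, m, v in zip(f, mu, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total / len(frames)

def log_lr(offender, suspect_model, ubm):
    # similarity: p(offender | suspect model); typicality: p(offender | UBM)
    return avg_loglik(offender, suspect_model) - avg_loglik(offender, ubm)

random.seed(2)
pop = [[random.gauss(0, 1.0)] for _ in range(2000)]          # reference population
ubm = fit(pop)
suspect = fit([[random.gauss(0.8, 0.2)] for _ in range(200)])
same = [[random.gauss(0.8, 0.2)] for _ in range(100)]        # offender == suspect
diff = [[random.gauss(-0.9, 0.2)] for _ in range(100)]       # offender != suspect
print(log_lr(same, suspect, ubm) > 0, log_lr(diff, suspect, ubm) < 0)  # -> True True
```

    A positive log likelihood ratio supports the same-speaker hypothesis, a negative one the different-speaker hypothesis; thresholding this value gives the binary decision used in biometric applications.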

    Application:

    While fully automatic verification is performed as a binary decision against a likelihood-ratio threshold in commercial biometric applications, the use of the likelihood ratio as an expression of the strength of the evidence has become an important issue in forensic speaker verification.

  • BE-SyMPHONic: French-Austrian joint project granted by ANR and FWF

    Principal investigators: Basilio Calderone, Wolfgang U. Dressler
    Co-applicants: Hélène Giraudo, Sylvia Moosmüller

    Start of the project: 13th January 2014

    Introduction:

    Language sounds are realized in several different ways. Every language exploits only a subset of the sounds that the vocal tract can produce, as well as a reduced number of their possible combinations. The restrictions on and the phonemic combinations allowed in a language define a branch of phonology called phonotactics.

    Phonotactics refers to the sequential arrangement of phonemic segments in morphemes, syllables, and words and underlies a wide range of phonological issues, from acceptability judgements (pseudowords like poiture in French or Traus in German are phonotactically plausible) to syllable processes (the syllabic structure in a given language is based on the phonotactic permissions in that language) and the nature and length of possible consonant clusters (which may be seen as intrinsically marked structures with respect to the basic CV template).

    Objective:

    The aim of this research project is to explore the psycho-computational representation of phonotactics in French and German.

    In particular, our research will focus on the interplay between phonotactics and word structure in French and German, and investigate the behavioural and computational representations of phonotactic vs. morphonotactic clusters.

    The basic hypothesis underlying this research project is that different cognitive and computational representations exist for the same consonant cluster depending on its phonotactic setting. In particular, the occurrence of a cluster across a morpheme boundary (a morphonotactic cluster) is considered particularly interesting.

    Method:

    Our research will focus on the interplay between phonotactics and morphology and investigate the behavioural and computational representations of consonant clusters according to whether they are: a) exclusively phonotactic clusters, i.e. the consonant cluster occurs only without morpheme boundaries (e.g. Stein in German); b) exclusively morphonotactic clusters, i.e. the consonant cluster occurs only across morpheme boundaries (e.g. lach+st); or c) both, with one of the two being more or less dominant (e.g. dominant lob+st vs. Obst) [1]. We thus test the existence of different cognitive and computational representations and processes for the same and for similar consonant clusters according to whether they belong to a), b), or c).

    The central hypothesis which we test is that speakers not only reactively exploit the potential boundary-signalling function of clusters that result from morphological operations, but take active measures to maintain or even enhance this functionality, for example by treating morphologically produced clusters differently from morpheme-internal clusters in production or language acquisition. We call this hypothesis the ‘Strong Morphonotactic Hypothesis’ (henceforth: SMH) (Dressler & Dziubalska-Kołaczyk 2006, Dressler, Dziubalska-Kołaczyk & Pestal 2010).

    In particular, we suppose that sequences of phonemes spanning morpheme boundaries (the ‘morphonotactic clusters’) provide speakers with functional evidence of the morphological operation that has occurred in that sequence; such evidence should be absent in a sequence of phonemes without morpheme boundaries (the ‘phonotactic clusters’).

    Hence our idea is to investigate the psycho-computational mechanisms underlying the phonotactic-morphonotactic distinction by approaching the problem from two angles simultaneously: (a) psycholinguistic experimental study of language acquisition and production and (b) language computational modelling.

    We aim therefore at providing, on one hand, the psycholinguistic and behavioural support to the hypothesis that morphologically produced clusters are treated differently than morpheme internal clusters in French and German; on the other, we will focus on the distributional and statistical properties of the language in order to verify whether such difference in clusters’ treatment can be inductively modelled by appealing to distributional regularities of the language.

    The competences of the two research teams overlap and complement each other. The French team will lead in modelling, computational simulation and psycholinguistic experiments, the Austrian team in first language acquisition, phonetic production and microdiachronic change. These synergies are expected to enrich each group in innovative ways.


    [1] An equivalent example for French is given by a) prise (/priz/ ‘grip’, exclusively phonotactic cluster); b) affiche+rai (/afiʃʁɛ/ ‘I will post’, exclusively morphonotactic cluster); and c) navigue+rai (/naviɡʁɛ/ ‘I will sail’) vs. engrais (/ãɡʁɛ/ ‘fertilizer’), where both conditions are true with the morphonotactic condition dominant.

  • Speakers from Germany currently constitute the largest group of foreign nationals in Austria, and in Vienna in particular. This project, funded by the Cultural Department of the City of Vienna, investigates whether and to what extent, as a result of this contact, the German standard pronunciation influences the Austrian standard pronunciation, and vice versa. Acoustic recordings are conducted with several groups of speakers who have contact of varying intensity with speakers from Germany.

  • Project Manager: Michael Pucher

    Start of the project: 1st of February 2019

    Project description:

    To assess the current state of a language, it is traditionally the speech of an old, rural, non-mobile man that is analysed. For the developmental tendencies of a variety, however, one should examine the speech of a young, educated woman in an urban area. The language use of young women is a particularly interesting field of research: they are considered initiators and driving forces of linguistic innovations, phonetic as well as lexical, which can spread from large cities into the wider language area. It is likewise assumed that open-minded young women adopt linguistic innovations more quickly than their male peers. They internalise a new way of speaking faster and pass it on to their future children. Women also tend to use linguistic features as social identifiers to signal membership of a peer group, and can thereby contribute to language change.

    The city of Vienna has changed considerably over the past 30 years: the population has grown by 15%, and with it the number of languages spoken. According to a survey by the Chamber of Labour, around 100 different languages are used in Vienna, and the city can still rightly be regarded as a melting pot of languages and cultures in Central Europe. The starting point of this study is the assumption that these social and socio-political changes are reflected not only in the lexical usage of the Viennese, but also in their physiological voice.

    In this study, the voice is regarded as the physiological sound of human vocalisation, modulated in the vocal tract. Beyond that, the voice can be seen as the embodied heart of spoken language, anchoring the body in social space through indexicality. As a vehicle of personal identity, the voice can reflect not only socio-cultural but also socio-political characteristics (e.g. "women in leadership positions have lower-pitched voices"). Sociophonetics plays a central role here, as it provides an important instrument for linking social space and its socially relevant discourses to the individual.

    Studies from the Anglo-American sphere suggest that the voice of the young woman is undergoing change. The sociophonetic voice phenomenon of vocal fry has meanwhile developed into a prominent speech feature of young, educated, urban women in the Anglo-American sphere.

    Based on two corpora, a longitudinal study will be carried out that traces how the voice of the young Viennese woman has changed. No sociophonetic studies of women's voices exist in Austria, particularly not of the scope and quality intended here. Thanks to its longitudinal character, the study can show to what extent social developments influence women's voices.

    Furthermore, this study offers a unique opportunity to obtain a snapshot of the Viennese woman and her voice and to place it in a historical context.

     

    Information on how to participate can be found here!

  • Forensic Speech Analysis is currently being developed using two main methodologies:

    • Automatic methods, applying digital signal processing algorithms and Bayesian statistics.
    • Acoustic Phonetics and Phonology based on acoustic measurements of speech parameters, such as formant frequencies and fundamental frequency of speech segments. 

    The Institute investigates both approaches in the framework of the FSAAWG (Forensic Speech and Audio Working Group) of ENFSI (the European Network of Forensic Science Institutes).

  • Implications for pathological speech

    Coordinated Project 2016-17 Scuola Normale Superiore (SNS), Pisa – Acoustics Research Institute (ARI), Austrian Academy of Sciences, Vienna
    PIs: Chiara Celata (SNS), Sylvia Moosmueller (ARI)
    Research personnel: Chiara Meluzzi (SNS), Bettina Hobel (ARI)

    Short Description

    The project aims at modeling the impact of speech gesture coordination on the rhythmical properties of languages.

    Speech gestural structures are sets of gestures together with a specification of how they are temporally and spatially coordinated with respect to one another. Gestural anticipations, posticipations (delays), and overlap are the ingredients of coarticulation, i.e., the coordinatory activity of speech movements that allows adjacent vowels and consonants to be produced simultaneously, resulting in one smooth whole.

    Rhythm is the systematic patterning of timing, accent, and grouping in sequences of events and encompasses both speech and music domains. We only become aware of how important it is in verbal communication when we listen to non-fluent speech. For example, deaf people with impaired or absent auditory feedback can be taught, after cochlear implantation and logopedic rehabilitation, to develop an “auditory” map for speech processing and imitation, but native-like patterns of gestural and rhythmical coordination are much more difficult to achieve.

    Both gestural coordination and rhythm thus contribute to the way fluent speech is programmed, produced, and even perceived.

    However, we still lack a global understanding of how the two dimensions of gestural coordination and speech rhythm interact in natural languages.

    Indeed, the gestural and the rhythmical approaches sometimes make different predictions. For example, we do not know whether the consonants composing heterosyllabic clusters are articulatorily independent from one another and are timed with respect to different vocalic nuclei, as some theoretical frameworks in the domain of gestural coordination would predict, or whether they are rather globally timed with the preceding vocalic nucleus, especially if it is stressed, as some proposals in the domain of speech rhythm assume. Also, we do not know if cross-linguistic differences in how heterosyllabic clusters are articulatorily coordinated to vocalic nuclei reflect or are reflected by cross-linguistic differences in the languages’ rhythmical properties.

    This project thus tries to reconcile the gestural and the rhythmical perspective into a unified research framework devoted to uncovering how inter-segmental coordination influences, and is influenced by, the rhythmical properties of supra-segmental entities.

    To that aim, we develop a series of cross-linguistic experiments on Italian and Standard Austrian German to clarify some critical aspects of speech organization in the two languages and to establish a link between language-specific phonotactic constraints and the temporal and spatial properties of segments’ production.

    The experiments, based on a reading task, include acoustic analyses for the identification of the temporal patterns and articulatory (ultrasound tongue imaging, UTI) analyses for the investigation of gestural coordination.

    In addition, it is a purpose of the project to set the stage for an analysis of how the speech of cochlear-implanted speakers differs from normal speech with respect to gestural coordination and rhythmic patterns. Spontaneous conversations of both Italian and Standard Austrian German speakers will be recorded. The acoustic analyses will target the identification of the areas of most prominent difficulty concerning both the coarticulatory and the temporal aspects of spontaneous speech produced by CI speakers.

  • Objective:

    The aim of this study is to investigate the phonetics of second language acquisition and first language attrition, based on acoustic and articulatory analyses of the lateral realizations of Bosnian migrants living in Vienna. Bosnian has two lateral phonemes (a palatalized and an alveolar/velarized one), whereas Standard Austrian German features only one lateral phoneme (an alveolar lateral). In the Viennese dialect, however, this phoneme also has a velarized variant.

    This phonetic investigation will be conducted with respect to the influence of language contact between Bosnian and SAG, and Bosnian and the Viennese dialect, as well as concerning the influence of gender and identity construction.

    Method:

    The recordings will be conducted with female and male Bosnian speakers, aged between 20 and 35 years at the time of emigration, who came to Vienna during the Bosnian war 1992-1995. Additionally, control groups of monolingual L1 speakers of Bosnian, SAG, and the Viennese dialect will be recorded. All recordings will include reading tasks in order to elicit controlled speech, as well as spontaneous speech in the form of biographical interviews. The analyses will comprise quantitative and qualitative aspects. Quantitatively, the acoustic parameters formant frequencies (especially F2 and F3), duration, and intensity of the laterals and their phonetic surroundings will be analyzed. Additionally, articulatory analyses will be performed using EPG and UTI data. Qualitatively, biographical information, language attitudes, and social networks will be analysed in order to obtain information about speaker-specific or group-specific characteristics.

    Application:

    The results of this study are relevant to understanding the processes of sound realization and sound change in the domains of language contact (phonetic processes in second language acquisition and first language attrition), sociolinguistics, and the sociology of identity construction.

  • Objective:

    The modeling step in speaker detection has a strong influence on the classification task, because the quality of a model depends on the parameters chosen in this step. Malformed speaker models can lead to false classifications, false identifications, and false verifications. The initial model parameters influence the finally estimated parameters of the speaker models. To obtain optimized speaker models, different initialization methods are explored.

    Method:

    Speaker models are represented as Gaussian Mixture Models (GMMs). These models are mixtures of multivariate normal distributions parameterized by the means and covariance matrices of the component distributions and by the mixture weights. The parameters are estimated with the expectation-maximization (EM) algorithm, which maximizes the likelihood of the training data under the model. Initial model parameters have to be selected for this algorithm, and different initial parameters can lead the algorithm to converge to different local maxima. The effect of different initialization methods on the identification rate is analyzed.
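
    The sensitivity of EM to its starting point can be demonstrated on synthetic 1-D data: a sensible initialization recovers both mixture components, while a degenerate symmetric initialization leaves EM stuck in an inferior stationary point (a collapsed single-Gaussian fit). The two-component, one-dimensional setting below is a deliberate simplification of the multivariate case described above:

```python
# Sketch of how EM initialisation affects the fitted GMM, in pure Python.
# Illustrative only: production systems fit multivariate GMMs with many
# components, but the effect is the same -- EM only climbs to a local
# maximum near its start point.
import math, random

def em_gmm2(data, mu1, mu2, iters=50):
    """EM for a two-component 1-D GMM, started from the given means."""
    w1 = w2 = 0.5
    v1 = v2 = 1.0
    for _ in range(iters):
        # E-step: posterior responsibility of component 1 for each point
        r = []
        for x in data:
            p1 = w1 * math.exp(-0.5 * (x - mu1) ** 2 / v1) / math.sqrt(2 * math.pi * v1)
            p2 = w2 * math.exp(-0.5 * (x - mu2) ** 2 / v2) / math.sqrt(2 * math.pi * v2)
            r.append(p1 / (p1 + p2))
        # M-step: re-estimate weights, means, variances
        n1 = sum(r)
        n2 = len(data) - n1
        mu1 = sum(ri * x for ri, x in zip(r, data)) / n1
        mu2 = sum((1 - ri) * x for ri, x in zip(r, data)) / n2
        v1 = max(sum(ri * (x - mu1) ** 2 for ri, x in zip(r, data)) / n1, 1e-6)
        v2 = max(sum((1 - ri) * (x - mu2) ** 2 for ri, x in zip(r, data)) / n2, 1e-6)
        w1, w2 = n1 / len(data), n2 / len(data)
    ll = sum(math.log(w1 * math.exp(-0.5 * (x - mu1) ** 2 / v1) / math.sqrt(2 * math.pi * v1)
                      + w2 * math.exp(-0.5 * (x - mu2) ** 2 / v2) / math.sqrt(2 * math.pi * v2))
             for x in data)
    return (mu1, mu2), ll

random.seed(3)
data = [random.gauss(-2, 0.5) for _ in range(150)] + [random.gauss(2, 0.5) for _ in range(150)]
good = em_gmm2(data, -1.0, 1.0)   # init near the true modes: finds both clusters
bad = em_gmm2(data, 0.0, 0.0)     # symmetric init: stuck in a collapsed solution
print(good[1] > bad[1])           # the better init reaches a higher likelihood
```

    This is why practical systems initialize EM with, for example, a k-means pass rather than arbitrary values, and why the project compares initialization methods via the resulting identification rate.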

    Application:

    Optimized speaker models reflect the speech behavior of the speakers in an optimal way. The inter-speaker variability is maximized while the intra-speaker variability is minimized by avoidance of malformed speaker models. The usage of optimal initialization methods improves the robustness and the reliability of automatic speaker identification and verification systems.

  • Effects of subthalamic stimulation on the speech characteristics of Parkinson patients.

  • Introduction:

    As is typical of urban varieties, the varieties of Vienna are predominantly social varieties. Education and social background are the primary factors defining the language behaviour of the speakers.

    The Viennese dialect belongs to the Middle Bavarian dialect group. Around the turn of the century, a sound change arose that monophthongized the diphthongs /aɛ/ and /ɑɔ/ to /æ:/ and /ɒ:/, respectively. This sound change was completed around 1950. As a result of the Viennese monophthongization, the palatal constriction location became overloaded. As early as the 1930s, Kranzmayer observed what he called the "e-confusion": people stopped distinguishing the /e/-vowels, so that "Segen" (blessing) and "sehen" (to see) became homophones: [se:ŋ].

    Method:

    5 female and 5 male speakers of the Viennese dialect were asked to name pictures, to read sentences, and to speak spontaneously.

    Results:

    As a consequence of the Viennese monophthongization and the resulting overcrowding of the palatal constriction location, speakers of the Viennese dialect developed two strategies. One group, as Kranzmayer observed, neutralized /e/ and /ɛ/ to /e/. This neutralization made room for the new palatal vowel /æ/.

    The other group, however, preserved /e/ and /ɛ/, but sometimes applied the two vowels incorrectly, i.e., produced /ɛ/ instead of /e/ and vice versa. Since no neutralization took place, the vowel /i/ shifted to the pre-palatal constriction location. This shift created room on the palatal bar for the new vowel /æ/.

    Group I, consequently, discerns the following vowels:

    • palatal: /i:, i, e:, e, æ:/
    • velar: /u:, u/
    • uvular: /o:, o, ɔ:, ɔ/
    • pharyngeal: /ɑ:, ɑ, ɒ:/

    Group II discerns the vowels as follows:

    • pre-palatal: /i:, i/
    • palatal: /e:, e, ɛ:, ɛ, æ:/
    • velar: /u:, u/
    • uvular: /o:, o, ɔ:, ɔ/
    • pharyngeal: /ɑ:, ɑ, ɒ:/

    Lip rounding and duration are distinctive in each vowel system.

  • Objective:

    In speaker identification and speaker verification, wrong classifications can result from a high similarity between speakers that is reflected in the speaker models. These similarities can be explored by applying cluster analysis.

    Method:

    In speaker detection, every speaker is represented by a Gaussian Mixture Model (GMM). By defining a dissimilarity measure between these models (e.g. cross-entropy), cluster analysis can be applied. Hierarchical agglomerative clustering methods can then reveal structures in the form of a dendrogram.
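
    The clustering step can be sketched as follows. The dissimilarity values below are invented, and average linkage stands in for whichever linkage criterion is actually chosen; in practice the matrix would hold, e.g., symmetrised cross-entropies between speaker GMMs and the merge history would be drawn as a dendrogram:

```python
# Hierarchical agglomerative clustering sketch in pure Python (average
# linkage).  The speaker dissimilarity matrix is invented for illustration.
def agglomerate(labels, dist):
    """Merge the two closest clusters until one remains; return merge history."""
    clusters = [[i] for i in range(len(labels))]
    history = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # average linkage: mean pairwise dissimilarity between clusters
                d = sum(dist[a][b] for a in clusters[i] for b in clusters[j]) \
                    / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        history.append((sorted(labels[k] for k in clusters[i] + clusters[j]), round(d, 2)))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return history

speakers = ["A", "B", "C", "D"]
# symmetric dissimilarity matrix: A and B are very similar, D is an outlier
dist = [[0.0, 0.1, 0.8, 1.6],
        [0.1, 0.0, 0.7, 1.5],
        [0.8, 0.7, 0.0, 1.4],
        [1.6, 1.5, 1.4, 0.0]]
for members, height in agglomerate(speakers, dist):
    print(members, height)
```

    The merge heights are exactly what a dendrogram plots: highly similar speakers (here A and B) fuse at low heights, which flags pairs at risk of misclassification.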

    Application:

    Structures in speech corpora can be visualized and can therefore be used to select groups of highly similar or dissimilar speakers. The investigation of the structures concerning the aspect of misclassification can lead to model generation improvements.

  • Objective:

    Up to now, a thorough phonetic-acoustic and phonological description of the vowels and the vowel system of Standard Austrian German has not been provided.

    Method:

    Approximately 11,000 vowels of three female and three male speakers of Standard Austrian German have been segmented and analyzed acoustically.

    Results:

    Standard Austrian German discerns 13 vowels on five constriction locations:

    • pre-palatal for the /i/ and the /y/ vowels
    • mid-palatal for the /e/ and the /ø/ vowels
    • velar for the /u/ vowels
    • upper pharyngeal for the /o/ vowels
    • lower pharyngeal for /ɑ/

    Each vowel pair consists of a constricted and an unconstricted vowel. The front vowels (pre-palatal and mid-palatal) additionally distinguish rounded and unrounded vowels. The following articulatory features sufficiently discriminate all vowels:

    • [± constricted]
    • [± front]
    • [± prepalatal]
    • [± pharyngeal]
    • [± round]

    Contrary to general assumptions, F1 and F2 do not sufficiently discern the vowels of Standard Austrian German; F3 is necessary as well. Discriminatory ability is maintained over all speaking styles and prosodic positions.

  • Project Part 02 of the special research area German in Austria. Variation - Contact - Perception funded by FWF (FWF6002) in cooperation with the University of Salzburg

    Principal Investigators: Stephan Elspaß, Hannes Scheutz, Sylvia Moosmüller

    Start of the project: 1st of January 2016

    Project description:

    The diversity and dynamics of the various dialects in Austria are the topic of this project. Based on a new survey, different research questions will be addressed in the coming years, such as: What differences and changes (e.g. through processes of convergence and divergence) can be observed within and between the Austrian dialect regions? How does dialect change differ between urban and rural areas? Are there noticeable generational and gender differences with regard to dialect change? What can a comprehensive comparison of ‘real-time’ and ‘apparent-time’ analyses contribute to a general theory of language change?

    To answer these questions, speech samples from a total of 160 dialect speakers, balanced for age and gender, are collected and analysed within the first four years at 40 locations in Austria. Furthermore, samples from selected speakers will be recorded and evaluated under laboratory conditions to determine phonetic peculiarities as precisely as possible. In the second survey phase, complementary recordings are carried out at another 100 locations in Austria in order to analyse differences and changes between the dialect landscapes in more detail. State-of-the-art dialectometric methods will be used to arrive at probabilistic statements regarding dialect variation and change in Austria. The analyses will include all linguistic levels from phonetics to syntax and lexis. These data will be documented in the first visual and ‘talking’ dialect atlas of Austria.

    Project page of the project partners in Salzburg

     

  • Vowel and consonant quantity in Southern German varieties: D - A - CH project granted by DFG, FWF, SNF

    Principal investigators: Felicitas Kleber, Michael Pucher, Sylvia Moosmüller†, Stephan Schmid 

    Start of the project: 1st of June 2015

    Project description:

    Introduction:

    The Central Bavarian varieties, to which the Viennese varieties belong, seem to have changed diachronically. From the first phonetic descriptions (Pfalz 1913) to more current descriptions (Moosmüller & Brandstätter 2014) the diachronic change becomes visible on several levels of the varieties.

    In this project we focus on the (in)stability of the timing system or, more precisely, the quantity relations in vowel + consonant sequences, and compare our results with those of the project partners in Zurich and Munich.

    Aims:

    The aims of this project are two-fold. The first aim is to develop a typology of the vowel + consonant quantities in Southern German varieties (Bavarian (Munich and Vienna) and Alemannic (Zurich)) in C1V1C2V2 contexts (where C2 can be a fricative, a nasal, or a plosive) and in consonant cluster sequences with increasing initial and final consonant cluster complexity. The second aim is to investigate prosodic changes in an apparent-time study and to examine the influence of internal factors (e.g. speech rate) and external factors (language attitudes) on the production of speech.

    Method:

    Recordings and analyses of 40 speakers of the Viennese varieties (balanced for age, gender, and educational background) will be conducted. During the recording sessions the speakers are asked to read and repeat sentences at two speech rates. Furthermore, a subset of speakers is asked to participate in an articulatory recording with an electromagnetic articulograph (EMA). These recordings take place at our project partners’ laboratory in Munich.
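
    One common way to make such quantity relations comparable across speech rates is the vowel's proportional share of the V+C sequence duration. The sketch below uses invented segment durations, not project data, purely to illustrate the measure:

```python
# Sketch of a rate-normalised quantity measure: the vowel's share of the
# V+C sequence, V/(V+C).  All durations below are invented illustrative
# values in milliseconds.
def vc_ratio(v_ms, c_ms):
    """Vowel proportion of the vowel+consonant sequence duration."""
    return v_ms / (v_ms + c_ms)

tokens = [
    # (word, speech rate, vowel ms, consonant ms) -- hypothetical tokens
    ("bitten", "normal", 70, 110),
    ("bieten", "normal", 130, 70),
    ("bitten", "fast",   55, 85),
    ("bieten", "fast",   100, 55),
]
for word, rate, v, c in tokens:
    # the ratio stays near-constant across rates while raw durations shrink
    print(word, rate, round(vc_ratio(v, c), 2))
```

    Comparing such ratios, rather than raw durations, across age groups and speech rates is one way the stability of a timing system can be assessed.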

    Application:

    The results will not only provide insight in the current timing system of speakers of the Viennese varieties but also enable us to draw conclusions about sound changes in progress.

     

  • Objective:

    This project describes and compares the vowel systems of several languages acoustically. Its main interest lies in languages for which sufficient acoustic descriptions have been lacking, e.g. Albanian, Romanian, Ful, Mandinka, or Crioulo.

    Method:

    Selected speakers are asked to perform a reading task and to speak spontaneously. Vowels in all positions are segmented, labeled, and analyzed. Formant frequencies (F1, F2, F3) are extracted and the vowel systems are defined.

    Language specificity affects not only the number of vowels and their features, but also the extent of variability and stability of certain vowels. A given vowel of language A might be quite stable, whereas the same vowel might exhibit high variability in language B. In the same way, vowels might be discerned differently. For example, pre-palatal /i/ and mid-palatal /e/ are discerned by F3 in Standard Austrian German (see diagram on SAG), whereas both mid-palatal /i/ and /e/ are predominantly discerned by F2 in Modern Standard Albanian (see diagram on MSA).
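
    The kind of formant-based discrimination described above can be quantified with a simple per-formant separation index (difference of category means scaled by the pooled spread, a d'-like score). The formant values below are invented for illustration and are not measurements of either language:

```python
# Sketch: which formant best separates two vowel categories?  Uses a
# d'-like index; all formant values (Hz) are hypothetical.
from statistics import mean, pstdev

def separation(a, b):
    """|mean difference| in pooled-standard-deviation units."""
    pooled = ((pstdev(a) ** 2 + pstdev(b) ** 2) / 2) ** 0.5
    return abs(mean(a) - mean(b)) / pooled

# hypothetical F2 and F3 values (Hz) for tokens of /i/ and /e/
i_f2, i_f3 = [2250, 2300, 2280, 2320], [3200, 3350, 3300, 3280]
e_f2, e_f3 = [2200, 2260, 2240, 2290], [2750, 2820, 2780, 2800]
print("F2 separation:", round(separation(i_f2, e_f2), 1))
print("F3 separation:", round(separation(i_f3, e_f3), 1))  # F3 separates far better here
```

    Applied per language, such an index makes statements like "/i/ and /e/ are discerned by F3 in SAG but by F2 in MSA" directly comparable across corpora.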

    Application:

    In forensic speaker identification, thorough descriptions of the languages in question are often needed in order to conduct a sound comparison.