Grosman, Iulia [UCL]
Christodoulides, George [UCL]
Degand, Liesbeth [UCL]
Simon, Anne-Catherine [UCL]
Lexical identical repetitions, in which the speaker produces the same lexical form several times in a row, constitute a substantial part of speech production (e.g. small thing ; is that). Several typologies and potential functions have been proposed for these repetitions, depending on the research paradigm: cognitive, textual, stylistic, argumentative, conversational, interactional, sociolinguistic, etc. (Bazzanella 2011: 252). Previous linguistic studies usually focus almost exclusively on a single pattern of syntagmatic and/or acoustic realisation of repetition, or on a single function (cf. Hieke 1981; Shriberg 1995).

A survey of the literature reveals several ways to model the phenomenon. Identical repetitions, along with several other types of structured disfluencies, can be described as a sequence of three contiguous regions:

(reparandum) * interruption point (interregnum, including optional editing terms) (reparans)

The reparandum is the part of the utterance that is repeated. The interregnum is the region between the reparandum and the reparans; it may optionally include explicit editing terms, i.e. words or phrases used by the speaker to signal the correction (e.g. discourse markers). The reparans (or repair) is the continuation of the message that follows the disfluency, so that if the first two regions are removed the remainder is lexically fluent. The interruption point is the point between the reparandum and the interregnum: this instant in time does not necessarily coincide with the moment the speaker detected the trouble or with their intention to alter the utterance (Shriberg 2001).

On the morpho-syntactic level, there is general agreement that identical repetitions tend to occur on monosyllabic function words, especially articles and pronouns (e.g. Dister 2007; Candea 2000; Henry, Campione & Véronis 2004).
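The three-region description above can be illustrated with a minimal Python sketch. It is not the DisMo detection procedure used in the study: the `Repetition` class, the `find_repetitions` function, the `EDITING_TERMS` set and the example tokens are all hypothetical, introduced only for illustration, and the sketch assumes input that is already tokenised.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Set, Tuple

# Hypothetical, illustrative set of editing terms / filled pauses.
EDITING_TERMS: Set[str] = {"euh", "ben", "enfin"}


@dataclass
class Repetition:
    reparandum: List[str]   # repeated material before the interruption point
    interregnum: List[str]  # optional editing terms (may be empty)
    reparans: List[str]     # the identical form that resumes the message


def _match_at(tokens: Sequence[str], i: int, n: int) -> Optional[Tuple[Repetition, int]]:
    """Try to read an n-token identical repetition starting at index i."""
    first = list(tokens[i:i + n])
    if len(first) < n or any(t in EDITING_TERMS for t in first):
        return None
    j = i + n
    # Editing terms right after the reparandum form the interregnum.
    while j < len(tokens) and tokens[j] in EDITING_TERMS:
        j += 1
    if list(tokens[j:j + n]) != first:
        return None
    return Repetition(first, list(tokens[i + n:j]), list(first)), j


def find_repetitions(tokens: Sequence[str], max_len: int = 3) -> List[Repetition]:
    """Greedy left-to-right search, preferring the longest repeated string."""
    found: List[Repetition] = []
    i = 0
    while i < len(tokens):
        hit = None
        for n in range(max_len, 0, -1):
            hit = _match_at(tokens, i, n)
            if hit is not None:
                break
        if hit is None:
            i += 1
        else:
            repetition, reparans_start = hit
            found.append(repetition)
            # Resume at the reparans so that "le le le" yields two chained repetitions.
            i = reparans_start
    return found


if __name__ == "__main__":
    for rep in find_repetitions(["je", "euh", "je", "pense"]):
        print(rep)
```

In this sketch a repeated editing term (e.g. "euh euh") is deliberately not counted as a repetition, and a triple repetition is reported as two chained pairs; both are simplifying choices of the example rather than claims about the annotation scheme.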
On the syntactic level, like other types of (dis)fluencies (Levelt 1983), identical repetitions are regular in structure, and they tend to co-occur with silent and filled pauses and with editing terms. Acoustically, segments in the reparandum and the interregnum have been shown to be lengthened relative to the reparans, or compared with occurrences of the same units in fluent contexts (Shriberg 1999). Finally, prosodically, some studies have shown that the pitch at the onset (beginning) of the reparans is higher than the pitch at the offset (end) of the reparandum (Savova & Bachenko 2003).

In summary, the effects of identical repetitions have been studied on three levels. While previous studies have focused separately on syntagmatic, morpho-syntactic, or acoustic properties of identical repetition (e.g. left and right periphery, types of interruption point (Shriberg 1995), type of tokens (Candea 2000, Clark & Wasow 1998), pitch and duration effect in reparandum, editing terms and reparans), we aim here to take into account all three levels of analysis to empirically model the phenomenon of identical repetition across speaking styles.

To this end we compiled five phonetically aligned corpora: the LOCAS-F corpus (Martin, Degand, and Simon 2014), the C-Humour corpus (Grosman 2016), the Driving Simulator Cognitive Load corpus (Christodoulides 2016), the C-Phonogenre corpus (Prsir, Goldman, and Auchlin 2014), and the Rhapsodie corpus (Lacheret et al. 2014). The compilation covers 31 speaking styles, includes a total of 276 different samples, and has a total duration of 17.4 hours (186,895 tokens). Identical repetitions were automatically detected using DisMo (Christodoulides, Avanzi, and Goldman 2014) and manually verified. Approximately 3,000 repetition sequences were extracted.

Our study pursues several goals. Firstly, we try to bring an empirical response to the question of typology, i.e.
the categorisation of identical repetitions based on their syntactic structure. We have described each repetition based on: (1) the number of tokens repeated, (2) the number of repetitions of each string of tokens, (3) the presence or absence and the type of interregnum, with (a) the presence of a silent pause and/or (b) of editing term(s), and finally (4) the immediate left and right context, which may be (dis)fluent (including silent and filled pauses, discourse markers, truncations, etc.). Figure 1 gives insight into the association between factors; it also guides the selection of a set of useful features for the acoustic description of different repetition patterns. Overall, 86% of repetitions are not followed by a (silent or filled) pause, and 88% do not include an interregnum; however, there is no significant association between the presence of a pause following the repetition and the presence of an interregnum (χ² = 2.63, df = 1, p = 0.10).

The second part of the analysis focuses on the relationship between repetition types and immediate contexts, on the one hand, and their prosodic properties, on the other hand. With respect to duration, our results indicate that in the case of simple repetitions, those with a pause inside the interregnum tend to have a relatively longer reparans (decrease in local articulation rate) compared to repetitions where the reparandum is not followed by a pause. This confirms Shriberg's (1995) observations on English (Figure 2).

With respect to intonation, we performed mixed-effects modelling (with random effects for speaker) in order to compare the difference between the offset and onset of the reparandum and the reparans for each repetition pattern observed in the corpus.
Different techniques are necessary to model the prosodic properties of monosyllabic and polysyllabic repetitions: while in the former we can readily compare the pitch contour of the reparandum and the reparans at the syllable level, the latter case requires comparisons across stylised pitch contours extending to the lexical level.
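As a worked check of the association test reported above (χ² = 2.63, df = 1, p = 0.10), the following Python sketch recomputes the p-value from the statistic using only the standard library. For one degree of freedom the upper-tail probability of the chi-square distribution equals erfc(√(χ²/2)). The function names `p_value_df1` and `chi2_2x2` are hypothetical, and the 2×2 counts in the usage lines are invented purely to show the call; the abstract does not report the underlying contingency table, nor whether a continuity correction was applied.

```python
import math


def p_value_df1(chi2: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def chi2_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


if __name__ == "__main__":
    # Statistic taken from the abstract; the result is close to the reported p = 0.10.
    print(p_value_df1(2.63))

    # Hypothetical counts (NOT from the study): rows = pause after repetition (yes/no),
    # columns = interregnum present (yes/no).
    stat = chi2_2x2(10, 20, 30, 40)
    print(stat, p_value_df1(stat))
```

The closed form through `math.erfc` only holds for df = 1; tables larger than 2×2 would need a general chi-square survival function from a statistics library.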
Bibliographic reference
Grosman, Iulia ; Christodoulides, George ; Degand, Liesbeth ; Simon, Anne-Catherine. Prosodic Variation of Identical Repetitions as a Function of their Properties and Editing Terms. A Large-Scale Corpus Study on French Speech. International Conference on Fluency and Disfluency Across Languages and Language Varieties (Louvain-la-Neuve, Belgium, from 15/02/2017 to 17/02/2017).
Permanent URL
http://hdl.handle.net/2078.1/182968