class: center, middle, inverse, title-slide

.title[
# Effect of consonants on onset F0
]
.subtitle[
## Evidence from Kansai Japanese
]
.author[
### 张淼 | Miao Zhang
]
.date[
### 12/2/2022
]

---
class: inverse, center, middle

# Introduction

---

# Goals

- To study the consonant-induced F0 perturbation (CF0) in Kansai Japanese.

--

- To investigate to what extent CF0 in Kansai Japanese is motivated by phonetics/phonology.

--

- To explore how discourse factors (focus) influence CF0.

---

# Questions

- How do consonantal voicing contrasts affect CF0 in the following vowel in a **restricted tonal language**?

--

- Is CF0 motivated by phonetics or phonology?

--

- How does the local **pitch context** affect CF0?

--

- Does the effect interact with **vowel-induced F0 perturbation (VF0)**?

--

- Can **focus structure** play a role in affecting F0 perturbation?

---
class: inverse, center, middle

# CF0 has a clear phonetic motivation

---

# Consonant-induced F0 perturbation (CF0)

- **Higher** F0 : **voiceless** consonants
- **Lower** F0 : **voiced** consonants

--

- The phonetic source of tonogenesis (Korean) and tonal split (Sinitic, Hmongic, and Vietnamese languages).

--

- Aspiration also perturbs F0, but let's focus on perturbation due to voicing.

---

# Why is F0 perturbed by consonant voicing?

- Contraction of the **cricothyroid muscle** during the closure.

--

- The contracted cricothyroid muscle **stiffens** the vocal folds.

--

- Stiffened vocal folds raise the frequency of vibration, and therefore also raise the F0.

???

The **contraction** of the **cricothyroid muscle** during the consonant closure extends beyond the production of the consonant into the following vowel: a carryover coarticulatory effect on the vowel.

---

# Cricothyroid muscle (CT muscle)

<img src="https://upload.wikimedia.org/wikipedia/commons/8/8a/Larynx_external_en.svg" width="75%" style="display: block; margin: auto;" />

???
Vocal folds are attached at the back to the arytenoid cartilages, and at the front to the thyroid cartilage.

---

# The movement of the thyroid cartilage when the CT muscles are (de)activated

From Stevens (2000: 7):

<img src="../../figs/slides/thyroid.png" width="65%" style="display: block; margin: auto;" />

???

When the CT muscles contract, they tilt the thyroid cartilage forward and downward. This movement stretches and tenses the vocal folds.

---

# The activation of the CT muscle in voiceless consonants (Löfqvist et al., 1989)

<img src="../../figs/slides/lofqvist.png" width="90%" style="display: block; margin: auto;" />

???

The activation of the CT muscles is measured in microvolts. The activation level is apparently higher in the production of voiceless obstruents.

---

# A secondary cue to consonantal voicing

Ohde (1984):

<img src="../../figs/slides/ohde1984.png" width="70%" style="display: block; margin: auto;" />

--

- However, it is debated in the literature whether the CF0 effect can be considered merely phonetic.

???

The perturbed F0 (henceforth "CF0") often serves as a **secondary cue** to the phonetic implementation of voicing in addition to **VOT**. But previous phonetic studies have shown that the phenomenon is not as simple as it seems.

---
class: inverse, center, middle

# CF0 might not be phonetically determined

---

# CF0 as phonetic knowledge

- Phonetic knowledge: speech patterns that "are not determined entirely by their phonological specification for distinctive features or by constraints imposed by the speech production or perception apparatus" (Kingston & Diehl, 1994; Kingston, 2007).

--

- The CF0 effect may still be observed even when the phonetic context does not facilitate the perturbation.

--

- Caisse (1982): In English, F0 is lower in vowels next to initial as well as intervocalic [+voice] stops, regardless of whether voicing is present during the closure.

---

# CF0 may be motivated by phonology

Dmitrieva et al.
(2015) reported that in English:

.pull-left[
A co-dependency between [+voice] and lower F0.

<img src="../../figs/slides/f0_voicing.png" width="73%" style="display: block; margin: auto;" />
]

.pull-right[
But no co-dependency between short positive VOT and lower F0.

<img src="../../figs/slides/vot_f0.png" width="70%" style="display: block; margin: auto;" />
]

--

**Lead voicing** (negative VOT) and **short lag** (short positive VOT) are subphonemic in English but phonemic in Spanish, hence the differences above.

???

CF0 is partially independent from the glottal articulation.

---

# Larger CF0 when devoicing occurs in Tokyo Japanese (Gao & Arai, 2019)

Devoicing is more frequent in word-initial (WI) positions than in word-medial (WM) positions in Tokyo Japanese.

<img src="../../figs/slides/tokyo.png" width="80%" style="display: block; margin: auto;" />

CF0 might be an accommodation to preserve the contrast when its phonetic identity is hindered by the lack of a VOT contrast.

---

# CF0 can be enhanced under focus

Shanghainese (Chen, 2011): CF0 is larger under focus.

<img src="../../figs/slides/shanghai.png" width="90%" style="display: block; margin: auto;" />

--

That CF0 can be enhanced under focus suggests that it is not merely automatic but under the speaker's control.

---
class: middle

## Is CF0 in Kansai Japanese motivated by phonology or by phonetics?

--

## Is the effect enhanceable under focus?

---
class: inverse, center, middle

# CF0 interacts with the local pitch context

---

# Larger CF0 in high pitch context

- CF0 is suppressed in low pitch context, while larger in high pitch context (Kohler, 1982; Kingston, 2007; Hanson, 2009; Ladd & Kirby, 2016; Gao & Arai, 2019).
--

- A conflict between the gesture of devoicing and low pitch production:
  + **slack** vocal folds for low pitch
  + **stiff** vocal folds for devoicing

--

- Because CF0 is naturally attenuated in low pitch context, a large CF0 observed there may indicate that the speaker is trying to enhance the contrast.

---
class: middle

## How might CF0 in Kansai Japanese interact with local pitch accent?

---
class: inverse, center, middle

# CF0 is in general smaller in tone languages

---

# CF0 effect size as a function of the functional load of pitch?

- Higher functional load of pitch in tone languages.

--

- Pitch is exploited to signal lexical tones, which minimizes the magnitude of CF0 to avoid confusion among tones.

--

- However, as a restricted tonal language (Hyman, 2006), Tokyo Japanese displays a rather large CF0 effect.

???

The functional load of pitch is higher in tone languages than in non-tonal languages.

---

# Tokyo Japanese displays a large effect of CF0 (Gao & Arai, 2019)

<img src="../../figs/slides/gao_arai.png" width="50%" style="display: block; margin: auto;" />

???

The onset F0 following a voiced consonant in high pitch context is as high as the F0 following a voiceless consonant in low pitch context. The contrast between lexical pitch accents is preserved in the latter half of the vowel.

---

# Kansai Japanese has a more complex tonal system

.pull-left[
**Tokyo Japanese** only lexically specifies the location of the H tone. All other tones are derived phrasal tones (東京式, "Tokyo type").

- hási-ga (**H**-L-L)
- hasí-ga (L-**H**-L)
- hasi-ga (L-H-H)
]

--

.pull-right[
**Kansai Japanese** (as spoken in Osaka) specifies the initial tone, the location of the lexical tone, and the height of the lexical tone (京阪式, "Keihan type").

- káze-ga (*H*-H-H)
- 'káwa-ga (***H***-L-L)
- ìto-ga (*L*-L-H)
- hà'ru-ga (*L*-**H**-L)
]

--

Unlike in Tokyo Japanese, the initial H/L in Kansai Japanese is lexically specified rather than post-lexical.

???
(**Bold** letters indicate the location of the nuclear pitch accent, or アクセントの下り目の位置 ("the position of the accentual fall"). *Italic* letters indicate the lexically specified tones.)

---
class: middle

## Might the magnitude of CF0 in Kansai Japanese be restricted compared to that in Tokyo Japanese?

---
class: inverse, center, middle

# Vowels also perturb F0

---

# Vowel-induced F0 perturbation (VF0)

- High vowels = high F0; low vowels = low F0 (Ohala, 1978; Whalen & Levitt, 1995).
  + Tongue pull.
  + Formant attraction.

--

- A dual articulatory mechanism of VF0 concerning tongue and jaw movement in vowel production (Chen et al., 2021):
  + Tongue-pull: high F0 in non-*low* vowels,
  + Jaw-push: low F0 in non-*high* vowels.

???

+ Tongue pull hypothesis: the raising of the tongue raises the larynx, hence shortening the vocal tract and raising the F0.
+ Formant attraction: high vowels are produced with low F1, dragging the F0 toward F1.
+ Tongue-pull contributes to raised F0 in non-low vowels,
+ Jaw-push contributes to lowered F0 in non-high vowels.

---
class: middle

## Does CF0 interact with VF0?

---

# A summary of background

- Consonant voicing perturbs the F0 in the following vowel.

--

- The effect is both phonetically and phonologically motivated.

--

- Tone languages exhibit less perturbation, but Tokyo Japanese displays a large CF0 that persists throughout the following vowel.

--

- Low pitch context attenuates the CF0 effect.

--

- Vowels also perturb F0. How do CF0 and VF0 interact with each other?

---
class: middle, center, inverse

# Experiment

---

# Speaker and speech materials

- Participants: 5 native Kansai Japanese speakers (4 females, 1 male) from Osaka and Kobe living in Buffalo, NY, USA.

--

- Speech material:
  + Nonce disyllabic words with the template /CVma/.
  + C: /p, b, n/; V: /i, a/.
  + Three focus conditions: broad, narrow, and contrastive focus.
  + Three word accent conditions: HH, HL, LH.
  + Carrier sentence: 「今から[______]食べるねん。」 ("I'm going to eat [______] now.")

---

# Elicitation of word accent and focus structure

- Target word accent is indicated by a real word with the same accent type following the target word in parentheses.
  + E.g., 「今からビマ(雨)食べるねん。」
  + Indicating words: HL--犬 (dog), HH--顔 (face), LH--雨 (rain)

--

- The questions to elicit different focus structures:
  + Broad focus: 今から何するん? ("What are you going to do now?")
  + Narrow focus: 今から何食べるん? ("What are you going to eat now?")
  + Contrastive focus: 今からグラタン食べる? ("Are you going to eat gratin now?")

---

# Stimuli presentation

- Trials were repeated 7 times and randomized for each participant.

--

- Each repetition included 54 target trials and 20 filler trials.

--

- Fillers were created by using the same carrier sentence but different food nouns.

--

- The stimuli were presented to the participant on a screen with `OpenSesame`.

--

- Participants were instructed to listen to the audio of the question first and then read aloud the target sentence as if responding to the question.

--

- Each participant was given 15-20 min of training time to get used to the format of the experiment and the target word accent type.

---

# Recording

- The recording took place in the sound booth of the Department of Linguistics, University at Buffalo.

--

- As the recording took place before the COVID-19 pandemic, no special precautions were needed to protect the participants.

--

- Each recording session took approximately 50 minutes.

---

# Data labeling and processing

- The onset and the offset of the consonant VOT and of the following vowel were labeled in `PRAAT`.

--

- The durations of both the VOT and the vowel were extracted.

--

- 5 F0 values were extracted from 5 equidistant intervals in the vowel.

--

- F0 values were z-score normalized for each participant.

---
class: inverse, center, middle

# Results: VOT

---

# The distribution of VOT of /p/

<img src="UTokyo_221202_files/figure-html/vot dist p-1.png" style="display: block; margin: auto;" />

???
- For the voiceless bilabial stop /p/, the VOT is in general longer when the consonant occurs in low pitch context and shorter in high pitch context (LH: 30 ms, HL: 20 ms, HH: 18 ms).
- This is probably due to when the cavity is
- The VOT is also longer when /p/ precedes /i/ than when it precedes /a/: the average VOT is 26 ms before /i/ and 17 ms before /a/.

---

# The distribution of VOT of /b/

<img src="UTokyo_221202_files/figure-html/vot dist b-1.png" style="display: block; margin: auto;" />

???

- The VOT of the voiced bilabial stop /b/ does not change much across pitch conditions (averaging -74 ms).
- The vowel did not have much effect on VOT either, except in HL words, where /b/ preceding /a/ has a longer negative VOT.

---

# Statistical analysis of VOT

- `VOT ~ Accent*Onset*Focus*Vowel+(1+Onset+Vowel|Speaker)`
  + Accent: HH, HL, LH
  + Focus: Broad, Narrow, Contrastive
  + Onset: /p/, /b/
  + Vowel: /i/, /a/

--

- Significant main effects: `Accent` (p < .005), `Onset` (p < .005)

--

- Significant interactions involving `Onset`: `Accent:Onset` (p < .005), `Onset:Vowel` (p < .005), `Accent:Onset:Vowel` (p < .001).

???

- Focus and Vowel did not have a main effect.
- Let's look at the three-way interaction between Accent, Onset, and Vowel.

---

# Estimated marginal means of VOT

<img src="UTokyo_221202_files/figure-html/vot emm-1.png" style="display: block; margin: auto;" />

---

# Findings: VOT

- The high vowel /i/ lengthens the VOT in voiceless consonants, but shortens the VOT in voiced consonants.
  + This is due to a relatively **smaller volume of the intraoral cavity**.
  + When producing a voiceless sound, the intraoral pressure is **too high** for the **vocal folds to adduct**.
  + When producing a voiced sound, the intraoral pressure is **too high** to maintain a **cross-glottal pressure drop**.

--

- Low pitch lengthens the positive VOT but does not affect the negative VOT.
  + The vocal folds have to be **slackened** to vibrate at a lower frequency and reach a low pitch target.
  + Voice onset is **harder** with slackened vocal folds, and therefore later than in non-low pitch contexts.
  + Voicing has **already started during the closure**, so low pitch does not affect negative VOT that much.

---

# Producing lower pitch also slackens vocal folds

Low pitch production also involves the contraction of:

- Sternohyoid muscles (green)
- Sternothyroid muscles (yellow)

to lower the larynx.

<img src="../../figs/slides/extrinsics.jpeg" width="65%" style="display: block; margin: auto;" />

---

# While lowering the larynx

The cricoid cartilage tilts forward and shortens the vocal folds.

<img src="../../figs/slides/cricoid.png" width="65%" style="display: block; margin: auto;" />

This movement also slackens the vocal folds and decreases their tension, making voicing onset harder in low pitch context and therefore lengthening the VOT of the voiceless consonants.

---
class: inverse, center, middle

# Results: F0

---

# Average F0 contours following /p, b/

<img src="UTokyo_221202_files/figure-html/p b-1.png" style="display: block; margin: auto;" />

???

- A very large F0 perturbation between voiceless consonants (/p/) and voiced consonants (/b/).
- The difference can be as large as one standard deviation in high pitch contexts.
- The difference in F0 due to perturbation seems to decrease over time in the vowel but does not disappear at the vowel offset.

---
class: middle

# But is it different following /b, n/?

---

# Average F0 contours following /b, n/

<img src="UTokyo_221202_files/figure-html/b n-1.png" style="display: block; margin: auto;" />

???

- In high pitch context, /b, n/ do not seem to differ much in their influence on the F0. In HL words, the F0 of /a/ following /b/ is slightly lower than in other conditions.
- In the low pitch context, though, the F0 following /n/ is lower than the F0 following /b/.
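---

# Sketch: from raw F0 samples to normalized contours

A minimal sketch, not the script used for this study, of how contours like those on the preceding slides could be computed from five F0 samples per vowel: z-score within speaker, then average by onset and time point. All F0 values and the names `f0`, `F0_z`, and `contours` are invented for illustration, and pooling all of a speaker's samples for the z-score is an assumption.

```r
# Toy values only: invented F0 (Hz) for two hypothetical speakers.
f0 <- data.frame(
  Speaker = rep(c("S1", "S2"), each = 10),
  Onset   = rep(rep(c("p", "b"), each = 5), times = 2),
  Time    = rep(1:5, times = 4),
  F0      = c(250, 245, 240, 236, 232,   # S1, /p/
              225, 224, 223, 222, 221,   # S1, /b/
              140, 137, 134, 132, 130,   # S2, /p/
              122, 121, 121, 120, 120)   # S2, /b/
)

# z-score within speaker: subtract the speaker's mean and divide by the
# speaker's standard deviation (assumed here to pool all of their samples).
f0$F0_z <- ave(f0$F0, f0$Speaker, FUN = function(x) (x - mean(x)) / sd(x))

# Mean normalized contour for each onset at each of the five time points.
contours <- aggregate(F0_z ~ Onset + Time, data = f0, FUN = mean)
contours <- contours[order(contours$Onset, contours$Time), ]
print(contours)
```

Normalizing within speaker puts speakers with different pitch ranges on one scale, so a difference of 1 corresponds to one standard deviation of that speaker's F0.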
---

# Again, F0 does not vary across focus structures

<img src="UTokyo_221202_files/figure-html/f0 focus-1.png" style="display: block; margin: auto;" />

???

- The F0 is only slightly higher in LH words in broad focus, while slightly lower in HH words in contrastive focus.

---

# To quantitatively examine the effects on F0

- Data from time points 1, 3, 5 were submitted to linear mixed effect models.
- Model specification (the maximally complex models that converged and did not result in singular fits):

```r
# Time point 1:
F0 ~ Onset*Vowel*Accent*Focus+(1+Accent|Speaker)
# Time point 3:
F0 ~ Onset*Vowel*Accent*Focus+(1+Vowel|Speaker)
# Time point 5:
F0 ~ Onset*Vowel*Accent*Focus+(1+Accent|Speaker)
```

---

# Time point 1: main effects

- `Onset`, `Accent`, `Focus`, `Vowel` all had a main effect on F0.
  + /p/ `\(>\)` /b/ `\(>\)` /n/
  + /i/ `\(>\)` /a/
  + HH `\(\approx\)` HL `\(>\)` LH
  + broad `\(>\)` contrastive
- `Onset:Vowel` and `Onset:Accent` were also significant.

---

# Time point 1: interaction - `Onset:Accent`

<img src="UTokyo_221202_files/figure-html/1 onset ~ accent-1.png" style="display: block; margin: auto;" />

---

# Time point 1: interaction - `Onset:Vowel`

<img src="UTokyo_221202_files/figure-html/1 onset ~ vowel-1.png" style="display: block; margin: auto;" />

---

# Time point 3: main effect

- `Onset`, `Accent`, `Vowel` all had a main effect on F0.
  + /p/ `\(>\)` /b/ `\(\approx\)` /n/
  + /i/ `\(>\)` /a/
  + HL `\(>\)` HH `\(>\)` LH
- The interaction `Onset:Accent` was significant.

---

# Time point 3: interaction

<img src="UTokyo_221202_files/figure-html/3 onset ~ accent-1.png" style="display: block; margin: auto;" />

---

# Time point 5: main effect

- `Onset`, `Accent`, `Vowel` all had a main effect on F0.
  + /p/ `\(>\)` /b/ `\(\approx\)` /n/
  + /i/ `\(>\)` /a/
  + HL `\(>\)` HH `\(>\)` LH
- No interactions that involved `Onset` were significant at time point 5.
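---

# Sketch: fitting one model per time point

A minimal sketch, under stated assumptions, of how the three model specifications could be organized and fitted. The data are simulated noise rather than the study's measurements, the names `design`, `specs`, `by_time`, and `fits` are hypothetical, and the use of `lme4::lmer` is an assumption: the slides do not name the fitting package. The design grid reproduces the 54 target trials per repetition (3 onsets x 2 vowels x 3 accents x 3 focus conditions).

```r
# Simulated placeholder data: 5 speakers x 54 design cells x 7 repetitions
# x 5 time points. None of these numbers come from the experiment.
set.seed(1)
design <- expand.grid(
  Speaker = paste0("S", 1:5),
  Onset   = c("p", "b", "n"),
  Vowel   = c("i", "a"),
  Accent  = c("HH", "HL", "LH"),
  Focus   = c("broad", "narrow", "contrastive"),
  Rep     = 1:7,
  Time    = 1:5
)
# Fake z-scored F0 with an arbitrary boost after /p/, for illustration only.
design$F0 <- rnorm(nrow(design)) + ifelse(design$Onset == "p", 0.5, 0)

# One formula per analysed time point, copied from the model specification.
specs <- list(
  "1" = F0 ~ Onset * Vowel * Accent * Focus + (1 + Accent | Speaker),
  "3" = F0 ~ Onset * Vowel * Accent * Focus + (1 + Vowel | Speaker),
  "5" = F0 ~ Onset * Vowel * Accent * Focus + (1 + Accent | Speaker)
)

# Subset the data to each analysed time point.
by_time <- lapply(names(specs), function(tp) {
  design[design$Time == as.integer(tp), ]
})
names(by_time) <- names(specs)

# Fit only if lme4 is installed (assumed package, see the note above).
if (requireNamespace("lme4", quietly = TRUE)) {
  fits <- Map(function(f, d) lme4::lmer(f, data = d), specs, by_time)
}
```

On simulated noise the fits may report singular random-effect estimates; the sketch only illustrates the bookkeeping of one specification per time point.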
---

# Summary: findings about F0

- The F0 perturbation due to consonant voicing was significant across the entire following vowel.

--

- Vowel height can interact with voicing-induced F0 perturbation:
  + The CF0 effect is larger when the following vowel is /i/.

--

- Pitch can also interact with the perturbation:
  + CF0 is larger in high pitch context.
  + However, CF0 is not suppressed in low pitch context.

--

- The interactions were significant at the beginning of the vowel but faded toward the end of the vowel.

---
class: inverse, center, middle

# Discussions

---

# CF0 effect is large in Kansai Japanese

- The effect found in Kansai Japanese in this study is large, like that found for Tokyo Japanese in Gao & Arai (2019).

--

- This indicates that the effect is not minimized by the functional load of pitch, although pitch is utilized to distinguish lexical items in both languages.

--

- In fact, the effect is even larger than in some non-tonal languages.

---

# Kansai Japanese exhibits a CF0 effect comparable to that of non-tonal languages

Kirby (2018) found direct evidence that non-tonal languages exhibit more CF0 than tonal languages by comparing CF0 data from both tonal and non-tonal languages in the same linguistic area:

- Non-tonal language: Khmer

<img src="../../figs/slides/khmer.png" width="50%" style="display: block; margin: auto;" />

---

# Kansai Japanese exhibits a CF0 effect comparable to that of non-tonal languages

Kirby (2018) found direct evidence that non-tonal languages exhibit more CF0 than tonal languages by comparing CF0 data from both tonal and non-tonal languages in the same linguistic area, Southeast Asia.
- Tonal language: Central Thai

<img src="../../figs/slides/thai.png" width="65%" style="display: block; margin: auto;" />

---

# CF0 in Kansai Japanese is motivated both by phonetics and phonology

- The basic patterns that voiceless consonants raise the F0 and that the difference between /b/ and /n/ was minimal both indicate that CF0 in Kansai Japanese certainly has a phonetic motivation.

--

- However, the fact that the perturbation is significant even in low pitch context does not look purely phonetic.
  + The natural tendency is for voicing-induced F0 perturbation to shrink in low pitch context.

--

  + It is hard to interpret the significant results at the end of the vowel as a natural carryover effect of the consonant devoicing gesture, which involves the contraction of the cricothyroid muscles.

--

- The F0 difference is large even though the VOT is salient enough to distinguish the voicing contrast.

--

- **Speakers** might have **control** over their pitch production to enhance the voicing contrast in Kansai Japanese (-> "Phonetic knowledge" (Kingston & Diehl, 1994)).

---

# The CF0 is larger in high pitch contexts

- CF0 is larger with high pitch accents and with high vowels, both of which have higher pitch.

--

- The default setting of the vocal folds is either **[+stiff] for [-voice]** or **[+slack] for [+voice]**.

--

- If the default laryngeal setting of the voicing contrast in a language meets a congruent pitch context, the perturbation effect will be larger in that context (Kohler, 1982, 1984).

--

- My data suggest that the default in Japanese is **[+stiff] for [-voice]**.

---

# Surprisingly, influence from focus structure was minimal

- The CF0 did not differ across focus structures, although focus structure did affect the average F0 at the beginning of the vowel.

--

- This might suggest that CF0 in Kansai Japanese does not receive enhancement due to discourse level factors.
--

- **But, there is a caveat!**
  + The stimuli only consisted of nonce words that did not mean much to the speakers.
  + The participants were also computing the pitch target for the nonce words.

--

- The nature of the stimuli used for the experiment may have minimized the speakers' ability to make adjustments to meet the discourse level requirements.

---

# Possible future development

- How do speakers perceptually respond to the CF0 in vowel production?
  + When the VOT difference vanishes, will they be able to tell the consonants apart simply by hearing the difference in pitch?

--

- A bigger question: is this consonant-induced allotony? Might Kansai Japanese's pitch accent system also split into a more complex system?
  + Many languages have consonant series that can only occur with high or low tones, such as in Shanghainese and Zulu (Chen 2011b), or in Korean (Kang 2014).

--

- The result cannot be directly interpreted as a typology of word prosody and the size of the CF0 effect.
  + It might be that languages in Japan are special in this regard.
  + Or pitch in pitch accent systems may carry a much lower functional load than in complex tonal systems.

---
class: inverse, middle, center

# Conclusion

---

# Conclusion

- Consonant-induced F0 perturbation is observed in Kansai Japanese. Its magnitude and longevity are large and are comparable to those reported in non-tonal languages.
- The effect is slightly smaller in low pitch context, but is not suppressed.
- The effect may be both phonetically motivated and controlled by the speaker to enhance the contrast.
- CF0 and VF0 interacted only to a minimal extent.

---
class: center, middle

# Thanks for listening!