Private photo of a mother talking to her infant in a home-environment.

NEWS - NEWS - NEWS

May 2016: We will present the KIDS Corpus at the 8th International Conference on Speech Prosody, which will be hosted at Boston University from Tuesday, May 31 through Friday, June 3, 2016. You can download the poster and the paper here.

May 2016: We are very happy to announce that the KIDS Corpus will soon be available on CHILDES (MacWhinney 2000) and Phon (Rose et al. 2006; Rose & MacWhinney 2014). Thanks to Brian MacWhinney and Yvan Rose for their support!


Description of KIDS

The KIDS Corpus is the first prosodically annotated infant-directed speech corpus in German – a tool for formulating hypotheses and modeling acquisition processes in the prosodic domain and at the prosody-syntax interface. This multi-layered corpus consists of 524 intonation phrases (IPs) directed to infants younger than one year (196 IPs extracted from the CHILDES database; 328 IPs from our own recordings). Pitch accents (n=832) and boundary tones (n=1048) were labeled according to GToBI (Grice, Baumann & Benzmüller 2005; click here for training materials). Furthermore, we annotated the presence of unaccented syllables and pitch targets before and after the accented syllable. Such an additional theory-neutral prosodic annotation is important because we do not know whether infants are more sensitive to the pitch movement leading to the accented syllable (a.k.a. onglide) or to the pitch movement following the accented syllable (a.k.a. offglide). The corpus hence captures the tonal surroundings on both sides of the accented syllable. We also tagged the word-prosodic structure of all accented words (e.g., trochaic, iambic) and the syntactic category of both accented and unaccented words (e.g., noun, verb, adjective).


Materials - Data Selection

Utterances of 16 mothers directed to their infants (< one year) were included:

    Seven mothers recorded at the Baby Speech Lab (BSL) at the University of Konstanz and one mother recorded in a home-environment (henceforth, BSL subset).

    Recordings of mother-infant dyads (in both subsets) took place in natural play situations.

    In total, the KIDS Corpus comprises 10 min 12 s of speech, 2014 words (513 word forms, counting each inflected form of a word as a separate type), 524 IPs, and 832 pitch accents.

    Table 1 shows more detailed information on each mother.

Table 1: Information on data, split by mothers.

Annotation - Data Analysis

Two trained annotators (first and last author of Zahner, Schönhuber, Grijzenhout & Braun (2016)) labeled the corpus together, using Praat (Boersma & Weenink 2014). For each wav-file, a corresponding TextGrid-file consisting of ten tiers was created (see the example annotation below). In the following, we specify the information that is provided on each tier and indicate whether it is annotated on an interval tier or a point tier.

  1. Intended representation of utterance (orthographic transcription in German; interval tier)

  2. Actual realization of utterance (orthographic transcription; e.g., "habn" for "haben"; interval tier)

  3. Word category of both accented and unaccented words (simple word category labels, e.g., "adj" for adjective; point tier)

  4. Word category of both accented and unaccented words (more detailed word category labels following the guidelines of the STTS (Stuttgart-Tübingen Tagset; Schiller, Teufel, Stöckert & Thielen 1999), e.g., "ADJA" for an adjective in attributive position or "ADJD" for an adjective used predicatively or adverbially; point tier)

  5. Accented syllables (orthographic transcription; interval tier)

  6. Word-prosodic structure of the accented word (point tier)
    • S: primary stressed syllable (e.g., S for "Maus")

    • W: unstressed, weak syllable (e.g., SW for "'Mama" or WS for "Mu'sik")

    • s: secondary stressed syllable (typically in compounds, e.g., SWsW for "'Sandel,eimer")

  7. Prosodic domain of accent (indication of availability of unaccented syllables to the left or right of the accented syllable (=a) on which leading or trailing tones could be realized; 1 = unaccented syllables available; 0 = no unaccented syllables available; analysis is performed irrespective of word boundaries; point tier)

    • 0a0: accented syllable is immediately surrounded by other accented syllables or boundary tones (e.g., % "NEIN" %; capitalization indicates the accented syllable; % indicates an IP boundary)

    • 1a1: accented syllable has at least one unaccented syllable to its right and its left (e.g., "geSCHLAfen", "was MACHST du", "der RAsselt")

    • 0a1: accented syllable has at least one unaccented syllable to its right and is preceded by another pitch accent or boundary tone (e.g., % "KAtze", % "SCHAU mal", % "HINsetzen")

    • 1a0: accented syllable has at least one unaccented syllable to its left and is immediately followed by another pitch accent or boundary tone (e.g., "MuSIK" %, "mit SAND" %)

  8. GToBI annotation (pitch accents, IP and ip boundaries are annotated; point tier)

  9. Tritonal pattern analysis (for the 1a1-condition, i.e., unaccented material available on both sides of the accented syllable: indication of the tonal surrounding on both sides of the accented syllable; point tier). For more details on the motivation for this analysis and the precise labeling conventions, see Zahner, Pohl & Braun (2015) (paper) and Zahner, Schönhuber, Grijzenhout & Braun (2016) (paper).

    • Similar to ToBI (Silverman et al. 1992), the tone associated with the accented syllable is marked by an asterisk (e.g., LH*L, HH*L)

    • If the preceding or following tonal target is not associated with a syllable adjacent to the accented syllable, this separation of tonal targets is indicated by ".." (e.g., LH*..L)

  10. Comments (e.g., "overlaid speech", "onomatopoetica", "breathy voice", "extraordinary wide/narrow pitch range"; point tier)
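The labels on tiers 6 and 7 follow simple rules, which can be sketched in Python as shown here. This is only an illustration of the conventions listed above, not the scripts used to build the corpus. The `Syllable` class and both function names are hypothetical, and the sketch assumes that the syllable list covers exactly one phrase, so that its edges count as boundary tones.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Syllable:
    """One syllable of a phrase (hypothetical helper type)."""
    text: str
    accented: bool = False


def domain_label(syllables: List[Syllable], index: int) -> str:
    """Return the tier-7 label (0a0, 0a1, 1a0 or 1a1) for an accented syllable.

    Assumption: `syllables` spans one phrase, so the first and last
    positions are adjacent to a boundary tone. Word boundaries are ignored,
    as in the corpus annotation.
    """
    if not syllables[index].accented:
        raise ValueError("the syllable at this index is not accented")
    # 1 if the direct neighbour exists and is unaccented, otherwise 0
    left = index > 0 and not syllables[index - 1].accented
    right = index < len(syllables) - 1 and not syllables[index + 1].accented
    return f"{int(left)}a{int(right)}"


def word_structure_name(pattern: str) -> str:
    """Name a tier-6 pattern built from S (primary), s (secondary), W (weak)."""
    if not pattern or set(pattern) - set("SsW"):
        raise ValueError(f"unexpected word-prosodic pattern: {pattern!r}")
    names = {"S": "monosyllabic", "SW": "trochaic", "WS": "iambic"}
    return names.get(pattern, "other")


# "geSCHLAfen": unaccented syllables on both sides of the accent
geschlafen = [Syllable("ge"), Syllable("SCHLA", True), Syllable("fen")]
print(domain_label(geschlafen, 1))   # 1a1
print(word_structure_name("SWsW"))   # other
```

With the examples from the list, "KAtze" after an IP boundary yields 0a1 and "MuSIK" before an IP boundary yields 1a0.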

Example Annotation

Figure 1: An example annotation showing a smoothed pitch contour and all ten annotation layers, together with the corresponding sound file.

More details on the data analysis can be found in the paper introducing the KIDS Corpus (Zahner, Schönhuber, Grijzenhout & Braun 2016) (paper).
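Users of the TextGrid-files may want to check that a file has the ten-tier layout described above. The following Python sketch lists the name and type of each tier. It assumes the file is stored in Praat's long text format and has already been read into a string with the correct encoding. The tier names in the sample ("intended", "gtobi") are invented for illustration and are not the names used in the corpus.

```python
import re
from typing import List, Tuple

# In Praat's long text format, each tier starts with a class line
# followed by a name line. "TextTier" is Praat's class for point tiers.
TIER_HEADER = re.compile(r'class = "(IntervalTier|TextTier)"\s+name = "([^"]*)"')


def tier_inventory(textgrid_text: str) -> List[Tuple[str, str]]:
    """Return (tier name, 'interval' or 'point') in file order."""
    kinds = {"IntervalTier": "interval", "TextTier": "point"}
    return [(name, kinds[cls]) for cls, name in TIER_HEADER.findall(textgrid_text)]


# Minimal two-tier sample with hypothetical tier names
SAMPLE = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 1.5
tiers? <exists>
size = 2
item []:
    item [1]:
        class = "IntervalTier"
        name = "intended"
        xmin = 0
        xmax = 1.5
        intervals: size = 1
        intervals [1]:
            xmin = 0
            xmax = 1.5
            text = "schau mal"
    item [2]:
        class = "TextTier"
        name = "gtobi"
        xmin = 0
        xmax = 1.5
        points: size = 1
        points [1]:
            number = 0.4
            mark = "H*"
'''

for name, kind in tier_inventory(SAMPLE):
    print(f"{name}: {kind} tier")
```

For a corpus file, `len(tier_inventory(text)) == 10` would be the expected result. Files saved in Praat's short text format would need a different parser.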


Results

In the following, we present some figures and tables on different distribution frequencies, which are also found in Zahner, Schönhuber, Grijzenhout & Braun (2016).

The most frequent word category among the words used by the mothers in the KIDS Corpus is verbs (23%), followed by pronouns (19%), adverbs (18%) and nouns (12%), see Figure 2. Within the 524 IPs, 832 words are accented. Thus, an IP contains 1.6 pitch accents on average, and 41% of the words carry a pitch accent (832 out of 2014). Of the accented words, 26% are nouns, 25% are verbs, 16% are adverbs, and 10% are adjectives, see Figure 2. Most of the accented words follow a typical Germanic word-prosodic structure: 52% are monosyllabic (S), followed in frequency by trochaic words (SW, 30%). Other structures are considerably less frequent (e.g., WS: 4%, SWW: 4%).
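The two summary figures in this paragraph follow directly from the corpus counts. As a quick check, they can be recomputed in Python. The variable names are chosen here for illustration; the counts are the ones reported on this page.

```python
# Corpus counts as reported on this page
N_IPS = 524             # intonation phrases
N_PITCH_ACCENTS = 832   # accented words (one pitch accent each)
N_WORDS = 2014          # all words

accents_per_ip = N_PITCH_ACCENTS / N_IPS
accented_word_share = N_PITCH_ACCENTS / N_WORDS

print(f"{accents_per_ip:.1f} pitch accents per IP")       # 1.6 pitch accents per IP
print(f"{accented_word_share:.0%} of the words accented")  # 41% of the words accented
```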

Figure 2: Frequency distribution of word categories across accented words (n = 832; grey) and all words (n = 2014; white); y-axis shows absolute counts.

Table 2 shows the distribution of boundary tones in the KIDS Corpus, and Table 3 the distribution of pitch accents. The corpus comprises 524 initial and 524 final boundary tones. In the majority of cases, the utterances start with a low boundary tone (69% of the IPs). The most frequent final boundary tone is L-% (46% of the IPs). The next most frequent patterns are a high plateau (H-%, 25%) and a low rise (L-H%, 13%). Incomplete falls (!H-%, 7%) and high rises (H-^H%, 7%) are least frequent.

Overall, the most frequent accent types are H* and L+H*, each accounting for more than 25% of the accents. Low-pitched monotonal accents (L*) are also common (18%). Note, however, that L* accents are often followed by a high tone, in particular a high boundary tone (see the analysis of three-tone sequences in Table 4). In the CHILDES recordings, L* accents are significantly more frequent than in the utterances recorded in our lab (25% vs. 13%; p=0.003 in a glmer with dataset as fixed factor and mother as crossed-random factor). Accents with a high leading tone (H+L*, H+!H*) are only sparsely represented in the corpus (6% and 2%, respectively).

Table 2: Frequency distribution of boundary tones (GToBI), for the whole KIDS Corpus and the two subsets.

Table 3: Frequency distribution of pitch accent types (GToBI), for the whole KIDS Corpus and the two subsets.

Table 4 shows the results for the three-tone sequences in accents that are surrounded by at least one unaccented syllable on both sides, i.e., the 1a1-cases (see annotation on tier 9 in the TextGrid-file). For the sake of clarity, the results are simplified in two respects. First, Table 4 ignores scaling differences, i.e., an L+H* !H-% is counted as LH*L, see Figure 1. Second, it is not taken into account whether a preceding or following pitch target is associated with a syllable adjacent to the accented syllable or is realized later, i.e., an LH*..L notation is counted as LH*L here, see Figure 1. In total, the relevant 1a1-cases account for more than half of the data (426 accented syllables). By far the most frequent accentual pattern is a rising-falling movement (LH*L), which occurs in 34% of the cases. The second most common accentual pattern is LL*H, i.e., a low accentual tone preceded by a low and followed by a high tone, occurring in 14% of the cases. For the three-tone sequence LL*H in the 1a1-cases, we again observed a distribution difference across the two subsets: LL*H patterns are significantly more frequent in the CHILDES subset than in the BSL recordings (20% vs. 11%, p=0.04 in a glmer with dataset as fixed factor and mother as crossed-random factor).
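The second simplification can be illustrated with a short Python sketch: the ".." marker is dropped from each tier-9 label before the patterns are counted. The function names and the toy label list are hypothetical and do not reproduce corpus data. The first simplification (ignoring scaling differences) is not covered here, because it depends on how scaled tones are written on tier 9.

```python
from collections import Counter
from typing import Dict, Iterable


def simplify(label: str) -> str:
    """Drop the '..' separation marker, so that LH*..L is counted as LH*L."""
    return label.replace("..", "")


def pattern_shares(labels: Iterable[str]) -> Dict[str, float]:
    """Relative frequency of each simplified pattern, most frequent first."""
    counts = Counter(simplify(label) for label in labels)
    total = sum(counts.values())
    return {pattern: n / total for pattern, n in counts.most_common()}


# Toy input for illustration only (not taken from the corpus)
toy_labels = ["LH*L", "LH*..L", "LL*H", "HH*L"]
for pattern, share in pattern_shares(toy_labels).items():
    print(f"{pattern}: {share:.0%}")
```

In the toy input, LH*L and LH*..L fall into the same class and together make up half of the labels.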

Table 4: Frequency distribution of three-tone sequence analysis in condition 1a1; for the whole KIDS Corpus and the two subsets; T stands for "tone" and TT*T comprises both monotonal three-tone-patterns (HH*H and LL*L); LM*H and HM*L stand for staircase-patterns going up and down, respectively.


Access to the KIDS Corpus

The TextGrid-files can be downloaded here (zip). For more information on the annotation and analysis, please contact Katharina Zahner (firstname dot lastname at uni-konstanz dot de).
If you wish to gain access to the wav-files, please fill in the following form (German version or English version) and send it to Katharina Zahner. We will respond to your request as soon as possible.


How to Cite the KIDS Corpus

In case of any publication based on data of the KIDS Corpus, please cite the corpus as indicated below.

Zahner, K., Schönhuber, M., Grijzenhout, J. & Braun, B. (2016). Konstanz prosodically annotated infant-directed speech corpus (KIDS Corpus). Proceedings of the 8th International Conference on Speech Prosody. Boston, USA. (paper)

Further Publications

    Auriga, I. (2016). Intonation in infant-directed-speech - Zur Funktion prosodischer Kategorien [Intonation in infant-directed speech - A functional analysis of prosodic categories]. (BA Thesis), Department of Linguistics, University of Konstanz, Konstanz.
    Zahner, K., Schönhuber, M., Grijzenhout, J. & Braun, B. (in prep). Prosodic constructions in German infant-directed speech.
    Zahner, K., Schönhuber, M. & Braun, B. The limits of metrical segmentation: intonation modulates infants' extraction of embedded trochees. Journal of Child Language, available on CJO 2015 doi:10.1017/S0305000915000744. (manuscript)
    Zahner, K., Pohl, M. & Braun, B. (2015). Pitch accent distribution in German infant-directed speech. Proceedings of Interspeech. Dresden, Germany. (paper)

KIDS Corpus Project Members


Funding

The project was partly funded by the Excellence Initiative of the University of Konstanz (Ling VisAnn-Projekt 663/13).


References Cited on this Page

    Auriga, I. (2016). Intonation in infant-directed-speech - Zur Funktion prosodischer Kategorien [Intonation in infant-directed speech - A functional analysis of prosodic categories]. (BA Thesis), Department of Linguistics, University of Konstanz, Konstanz.
    Boersma, P., & Weenink, D. (2014). Praat: doing phonetics by computer. Version 5.3.84 [Computer program].
    Grice, M., Baumann, S., & Benzmüller, R. (2005). German intonation in autosegmental-metrical phonology. In S.-A. Jun (Ed.), Prosodic Typology. The Phonology of Intonation and Phrasing (pp. 55-83). Oxford: Oxford University Press.
    MacWhinney, B. (2000). The CHILDES project: tools for analyzing talk. (3rd ed. Vol. 2: The Database). Mahwah, NJ: Lawrence Erlbaum Associates.
    Rose, Y., MacWhinney, B., Byrne, R., Hedlund, G., Maddocks, K., O’Brien, P., & Wareham, T. (2006). Introducing Phon: A software solution for the study of phonological acquisition. Proceedings of the 30th Annual Boston University Conference on Language Development (BUCLD), Somerville, MA.
    Rose, Y., & MacWhinney, B. (2014). The PhonBank Project: Data and Software-Assisted Methods for the Study of Phonology and Phonological Development. In J. Durand, U. Gut, & G. Kristoffersen (Eds.), The Oxford Handbook of Corpus Phonology (pp. 308-401). Oxford: Oxford University Press.
    Schiller, A., Teufel, S., Stöckert, Ch., & Thielen, Ch. (1999). Guidelines für das Tagging deutscher Textcorpora mit STTS (Kleines und großes Tagset). Technical Report, Universities of Stuttgart and Tübingen.
    Silverman, K., Beckman, M., Pitrelli, J., Ostendorf, M., Wightman, C., Price, P., Pierrehumbert, J., & Hirschberg, J. (1992). ToBI: A standard for labeling English intonation. Proceedings of the International Conference on Spoken Language Processing. Banff.
    Zahner, K., Pohl, M., & Braun, B. (2015). Pitch accent distribution in German infant-directed speech. Proceedings of Interspeech. Dresden, Germany. (paper)
    Zahner, K., Schönhuber, M., Grijzenhout, J. & Braun, B. (2016). Konstanz prosodically annotated infant-directed speech corpus (KIDS Corpus). Proceedings of the 8th International Conference on Speech Prosody. Boston, USA. (paper)
    Zahner, K., Schönhuber, M. & Braun, B. The limits of metrical segmentation: intonation modulates infants' extraction of embedded trochees. Journal of Child Language, available on CJO 2015 doi:10.1017/S0305000915000744. (manuscript)


Acknowledgements

We thank Isabelle Auriga, Andrea Beeken, Sophie Egger, Angela James and Stephanie Gustedt for help with preparation and analyses of the data. We also thank Clara Huttenlauch for writing PRAAT scripts for TextGrid preparation and analyses as well as designing this homepage. We further appreciate discussion of data at the DIMA (Annotation Guidelines for German Intonation) meeting in Potsdam (March 2015). We owe special thanks to Brian MacWhinney and Yvan Rose from the CHILDES team for their support with data conversion in order to make the data available on the CHILDES online database soon. Last but definitely not least, we thank all mothers and their infants for their participation in our study.


last updated on: 27.05.2016