We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cs.CL

Change to browse by:

References & Citations

DBLP - CS Bibliography

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo

Computer Science > Computation and Language

Title: Computational Induction of Prosodic Structure

Authors: Dafydd Gibbon
Abstract: The present study has two goals relating to the grammar of prosody, understood as the rhythms and melodies of speech. First, an overview is provided of the computable grammatical and phonetic approaches to prosody analysis which use hypothetico-deductive methods and are based on learned hermeneutic intuitions about language. Second, a proposal is presented for an inductive grounding in the physical signal, in which prosodic structure is inferred using a language-independent method from the low-frequency spectrum of the speech signal. The overview includes a discussion of computational aspects of standard generative and post-generative models, and suggestions for reformulating these to form inductive approaches. Also included is a discussion of linguistic phonetic approaches to analysis of annotations (pairs of speech unit labels with time-stamps) of recorded spoken utterances. The proposal introduces the inductive approach of Rhythm Formant Theory (RFT) and the associated Rhythm Formant Analysis (RFA) method are introduced, with the aim of completing a gap in the linguistic hypothetico-deductive cycle by grounding in a language-independent inductive procedure of speech signal analysis. The validity of the method is demonstrated and applied to rhythm patterns in read-aloud Mandarin Chinese, finding differences from English which are related to lexical and grammatical differences between the languages, as well as individual variation. The overall conclusions are (1) that normative language-to-language phonological or phonetic comparisons of rhythm, for example of Mandarin and English, are too simplistic, in view of diverse language-internal factors due to genre and style differences as well as utterance dynamics, and (2) that language-independent empirical grounding of rhythm in the physical signal is called for.
Comments: 29 pages, 10 figures, code appendix, to appear in "Studies in Prosodic Grammar"
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as: arXiv:1912.07050 [cs.CL]
  (or arXiv:1912.07050v1 [cs.CL] for this version)

Submission history

From: Dafydd Gibbon [view email]
[v1] Sun, 15 Dec 2019 14:38:58 GMT (1095kb)

Link back to: arXiv, form interface, contact.