Title: Letters From the Past: Modeling Historical Sound Change Through Diachronic Character Embeddings

Abstract: While a great deal of work has been done on NLP approaches to lexical semantic change detection, other aspects of language change have received less attention from the NLP community. In this paper, we address the detection of sound change through historical spelling. We propose that a sound change can be captured by comparing, through time, the relative distance between the distributions of the characters involved, modeled as PPMI character embeddings. We verify this hypothesis on synthetic data and then test the method's ability to trace the well-known historical change of lenition of plosives in Danish historical sources. We show that the models are able to identify several of the changes under consideration and to uncover meaningful contexts in which they appeared. The methodology has the potential to contribute to the study of open questions such as the relative chronology of sound shifts and their geographical distribution.
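
The idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the window size, the `#` boundary marker, the use of cosine distance, the function names, and the toy word lists are all assumptions made for illustration. Each character is represented by its positive PMI scores against neighbouring characters, and the distance between two characters is then compared across two time periods.

```python
import math
from collections import Counter


def ppmi_char_embeddings(words, window=1):
    """Map each character to a sparse vector {context_char: PPMI}.

    Contexts are the characters within `window` positions in the same
    word; '#' is a hypothetical word-boundary marker.
    """
    pair_counts = Counter()
    for word in words:
        padded = "#" + word + "#"
        for i, ch in enumerate(padded):
            lo = max(0, i - window)
            hi = min(len(padded), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pair_counts[(ch, padded[j])] += 1

    total = sum(pair_counts.values())
    row, col = Counter(), Counter()
    for (ch, ctx), n in pair_counts.items():
        row[ch] += n
        col[ctx] += n

    embeddings = {}
    for (ch, ctx), n in pair_counts.items():
        pmi = math.log2(n * total / (row[ch] * col[ctx]))
        if pmi > 0:  # keep only positive PMI values
            embeddings.setdefault(ch, {})[ctx] = pmi
    return embeddings


def cosine_distance(u, v):
    """Cosine distance between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # treat an empty vector as maximally distant
    return 1.0 - dot / (norm_u * norm_v)


def char_distance(embeddings, a, b):
    """Distance between characters a and b within one period's model."""
    return cosine_distance(embeddings.get(a, {}), embeddings.get(b, {}))


# Invented toy spellings for two periods (not data from the paper).
early = ["bok", "tak", "kat", "gud", "god"]
late = ["bog", "tag", "kat", "gud", "god"]

emb_early = ppmi_char_embeddings(early)
emb_late = ppmi_char_embeddings(late)

d_early = char_distance(emb_early, "k", "g")
d_late = char_distance(emb_late, "k", "g")
print(f"k-g distance, early period: {d_early:.3f}")
print(f"k-g distance, late period:  {d_late:.3f}")
```

A shift in the distance between the two characters from one period to the next is the kind of signal the abstract refers to; on real corpora the choice of window, smoothing, and distance measure would need to follow the paper itself.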
Comments: Accepted as long paper at ACL 2022
Subjects: Computation and Language (cs.CL)
MSC classes: 68T50 (Primary)
ACM classes: I.2.7
Cite as: arXiv:2205.08256 [cs.CL]
  (or arXiv:2205.08256v1 [cs.CL] for this version)

Submission history

From: Patrizia Paggio
[v1] Tue, 17 May 2022 11:57:17 GMT
