Title: Ab Antiquo: Neural Proto-language Reconstruction

Abstract: Historical linguists have identified regularities in the process of historic sound change. The comparative method utilizes those regularities to reconstruct proto-words based on observed forms in daughter languages. Can this process be efficiently automated? We address the task of proto-word reconstruction, in which the model is exposed to cognates in contemporary daughter languages and has to predict the proto-word in the ancestor language. We provide a novel dataset for this task, encompassing over 8,000 comparative entries, and show that neural sequence models outperform conventional methods applied to this task so far. Error analysis reveals variability in the ability of neural models to capture different phonological changes, correlating with the complexity of the changes. Analysis of learned embeddings reveals that the models learn phonologically meaningful generalizations, corresponding to well-attested phonological shifts documented by historical linguistics.
Comments: Accepted as a long paper at NAACL 2021
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:1908.02477 [cs.CL]
  (or arXiv:1908.02477v3 [cs.CL] for this version)

Submission history

From: Shauli Ravfogel
[v1] Wed, 7 Aug 2019 08:03:08 GMT (572kb,D)
[v2] Thu, 11 Mar 2021 19:48:17 GMT (912kb,D)
[v3] Sun, 9 May 2021 18:35:15 GMT (921kb,D)