Title: POS tagging, lemmatization and dependency parsing of West Frisian

Abstract: We present a lemmatizer/POS tagger/dependency parser for West Frisian, built using a corpus of 44,714 words in 3,126 sentences annotated according to the guidelines of Universal Dependencies version 2. POS tags were assigned by applying a Dutch POS tagger either to a literal word-by-word translation or to the sentences of a Dutch parallel text. The best results were obtained with literal translations created by the Frisian translation program Oersetter. Morphological and syntactic annotations were likewise generated on the basis of a literal Dutch translation. We compared the performance of the lemmatizer/tagger/annotator trained with default parameters to its performance when trained with the parameter values used for training on the LassySmall UD 2.5 corpus. A significant improvement was found for 'lemma'. The Frisian lemmatizer/POS tagger/dependency parser is released as a web app and as a web service.
Comments: 6 pages, 2 figures, 6 tables
Subjects: Computation and Language (cs.CL); Machine Learning (stat.ML)
MSC classes: 68U15
ACM classes: J.5
Cite as: arXiv:2107.07974 [cs.CL]
  (or arXiv:2107.07974v1 [cs.CL] for this version)

Submission history

From: Wilbert Heeringa
[v1] Fri, 16 Jul 2021 15:41:37 GMT (92kb,D)
[v2] Wed, 28 Jul 2021 12:38:35 GMT (91kb,D)
