Title: Challenges in Developing LRs for Non-Scheduled Languages: A Case of Magahi

Authors: Ritesh Kumar
Abstract: Magahi is an Indo-Aryan language spoken mainly in the eastern parts of India. Despite its significant number of speakers, virtually no language resources (LRs) or language technologies (LTs) have been developed for it, mainly because of its status as a non-scheduled language. The present paper describes an attempt to develop an annotated corpus of Magahi. The data are taken mainly from a couple of Magahi blogs, some collections of Magahi stories, and recordings of conversations in Magahi, and are annotated at the part-of-speech (POS) level using the BIS tagset.
Subjects: Computation and Language (cs.CL)
Journal reference: Proceedings of the 5th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, Poznań, Poland, pp. 60-64, 2011
Cite as: arXiv:2111.15322 [cs.CL]
  (or arXiv:2111.15322v1 [cs.CL] for this version)

Submission history

From: Ritesh Kumar
[v1] Tue, 30 Nov 2021 12:07:23 GMT (288kb)