Title: Morphological Analysis of Japanese Hiragana Sentences using the BI-LSTM CRF Model

Abstract: This study proposes a method for building a neural morphological analyzer for Japanese Hiragana sentences using the Bi-LSTM CRF model. Morphological analysis divides text into words and assigns each word information such as its part of speech. It plays an essential role in downstream Japanese natural language processing applications because Japanese text has no delimiters between words. Hiragana is a type of Japanese phonetic character used in texts for children and for people who cannot read Chinese characters. Morphological analysis of Hiragana sentences is more difficult than that of ordinary Japanese sentences because Hiragana-only text offers fewer cues for word division. For Hiragana sentences, we demonstrate the effectiveness of fine-tuning a model trained on ordinary Japanese text and examine the influence of training data drawn from texts of various genres.
Comments: 13 pages
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as: arXiv:2201.03366 [cs.CL]
  (or arXiv:2201.03366v1 [cs.CL] for this version)

Submission history

From: Jun Izutsu
[v1] Mon, 10 Jan 2022 14:36:06 GMT (818kb)
