Title: hauWE: Hausa Words Embedding for Natural Language Processing

Abstract: Word embeddings (distributed word vector representations) have become an essential component of many natural language processing (NLP) tasks such as machine translation, sentiment analysis, word analogy, named entity recognition and word similarity. Despite this, the only work that provides word vectors for the Hausa language is that of Bojanowski et al. [1], trained using fastText and covering only a small number of words. This work presents word embedding models trained with Word2Vec's Continuous Bag of Words (CBoW) and Skip Gram (SG) architectures. The models, hauWE (Hausa Words Embedding), are larger and more accurate than the only previous model, making them more useful in NLP tasks. To compare the models, each was used to predict the 10 most similar words to 30 randomly selected Hausa words. hauWE CBoW's 88.7% and hauWE SG's 79.3% prediction accuracy greatly outperformed the 22.3% achieved by Bojanowski et al. [1].
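
The comparison described in the abstract (retrieve the 10 nearest words for each query word, then score how many are genuinely similar) can be sketched roughly as follows. This is a minimal illustration, not the authors' evaluation code: the abstract does not define how "prediction accuracy" is computed, so the sketch assumes it is the fraction of returned neighbours that a human judge marked as similar. The function names, the toy 2-dimensional vectors, and the judgement sets are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def most_similar(word, vectors, k=10):
    """Return the k words closest to `word` by cosine similarity."""
    query = vectors[word]
    scored = [(other, cosine(query, vec))
              for other, vec in vectors.items() if other != word]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [w for w, _ in scored[:k]]

def neighbour_accuracy(queries, vectors, judged_similar, k=10):
    """Fraction of returned neighbours judged similar (assumed metric)."""
    hits = total = 0
    for word in queries:
        neighbours = most_similar(word, vectors, k)
        relevant = judged_similar.get(word, set())
        hits += sum(1 for n in neighbours if n in relevant)
        total += len(neighbours)
    return hits / total if total else 0.0

# Made-up vectors and judgements for illustration only (not from the paper).
toy_vectors = {
    "ruwa": [1.0, 0.1],
    "kogi": [0.9, 0.2],
    "littafi": [0.1, 1.0],
    "takarda": [0.2, 0.9],
}
toy_judgements = {"ruwa": {"kogi"}, "littafi": {"takarda"}}

print(neighbour_accuracy(["ruwa", "littafi"], toy_vectors, toy_judgements, k=1))
```

With real embeddings, `toy_vectors` would be replaced by the vectors of a trained model and `k` set to 10 over the 30 query words, so each model is scored on 300 retrieved neighbours.
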
Comments: In Proceedings of the 2019 2nd International Conference of the IEEE Nigeria Computer Chapter
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Machine Learning (cs.LG)
DOI: 10.1109/NigeriaComputConf45974.2019.8949674
Cite as: arXiv:1911.10708 [cs.CL]
  (or arXiv:1911.10708v1 [cs.CL] for this version)

Submission history

From: Idris Abdulmumin
[v1] Mon, 25 Nov 2019 05:46:56 GMT (318kb)