Title: IITK at the FinSim Task: Hypernym Detection in Financial Domain via Context-Free and Contextualized Word Embeddings

Abstract: In this paper, we present our approaches for the FinSim 2020 shared task on "Learning Semantic Representations for the Financial Domain". The goal of this task is to classify financial terms into the most relevant hypernym (or top-level) concept in an external ontology. We leverage both context-dependent and context-independent word embeddings in our analysis. Our systems use Word2vec embeddings trained from scratch on the corpus (Financial Prospectus in English) along with pre-trained BERT embeddings. We divide the test dataset into two subsets based on a domain rule. For the first subset, we use unsupervised distance measures to classify each term. For the second subset, we use simple supervised classifiers, such as Naive Bayes, on top of the embeddings to arrive at a final prediction. Finally, we combine the results from the two subsets. Our system ranks 1st on both metrics, i.e., mean rank and accuracy.
Comments: 6 pages, 1 figure, 4 tables. Accepted at the Second Workshop on Financial Technology and Natural Language Processing (FinNLP-2020)
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Computational Finance (q-fin.CP)
Cite as: arXiv:2007.11201 [cs.CL]
  (or arXiv:2007.11201v1 [cs.CL] for this version)
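
The unsupervised step described in the abstract (ranking hypernym labels by an embedding distance) and the two evaluation metrics it names (accuracy and mean rank) can be illustrated with a minimal sketch. This is not the authors' implementation: cosine similarity is assumed as the distance measure, the three-dimensional vectors and the label and term names are made up for illustration, and mean rank is assumed to be the average position of the correct label in each ranked list.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_hypernyms(term_vec, label_vecs):
    """Return the labels sorted from most to least similar to the term vector."""
    return sorted(label_vecs,
                  key=lambda label: cosine(term_vec, label_vecs[label]),
                  reverse=True)

def evaluate(rankings, gold):
    """Accuracy = share of terms whose gold label is ranked first.
    Mean rank = average 1-based position of the gold label (assumed definition)."""
    ranks = [ranking.index(g) + 1 for ranking, g in zip(rankings, gold)]
    accuracy = sum(r == 1 for r in ranks) / len(ranks)
    mean_rank = sum(ranks) / len(ranks)
    return accuracy, mean_rank

# Hypothetical toy embeddings; real ones would come from Word2vec or BERT.
LABEL_VECS = {
    "Bonds": [1.0, 0.0, 0.0],
    "Swap":  [0.0, 1.0, 0.0],
    "Funds": [0.0, 0.0, 1.0],
}
TERM_VECS = {
    "green bond":         [0.9, 0.1, 0.2],
    "interest rate swap": [0.2, 0.8, 0.1],
    "hedge fund":         [0.5, 0.1, 0.4],  # deliberately ambiguous toy vector
}
GOLD = {"green bond": "Bonds", "interest rate swap": "Swap", "hedge fund": "Funds"}

terms = list(TERM_VECS)
rankings = [rank_hypernyms(TERM_VECS[t], LABEL_VECS) for t in terms]
accuracy, mean_rank = evaluate(rankings, [GOLD[t] for t in terms])

if __name__ == "__main__":
    for term, ranking in zip(terms, rankings):
        print(term, "->", ranking)
    print("accuracy:", accuracy, "mean rank:", mean_rank)
```

In this toy setup the ambiguous "hedge fund" vector places the correct label second, so accuracy drops while mean rank still credits the near miss. That is the reason a ranking-based metric is useful alongside accuracy.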

Submission history

From: Vishal Keswani
[v1] Wed, 22 Jul 2020 04:56:23 GMT