Title: Utilizing coarse-grained data in low-data settings for event extraction

Authors: Osman Mutlu
Abstract: Annotating text for event information extraction systems is hard, expensive, and error-prone. We investigate whether coarse-grained data (document or sentence labels), which is far cheaper to obtain, can be integrated instead of annotating more documents. We use a multi-task model with two auxiliary tasks, document-level and sentence-level binary classification, in addition to the main task of token classification. We run a series of experiments that apply this integration under varying data regimes. Results show that while introducing extra coarse-grained data offers greater improvement and robustness, a gain is still possible by adding only negative documents, that is, documents that contain no information on any event.
Comments: A Dissertation Submitted to the Graduate School of Sciences and Engineering in Partial Fulfillment of the Requirements for the Degree of Master of Science in Computer Science and Engineering
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as: arXiv:2205.05468 [cs.CL]
  (or arXiv:2205.05468v1 [cs.CL] for this version)
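
The multi-task setup described in the abstract (token classification as the main task, with sentence and document binary classification as auxiliary tasks) can be illustrated with a minimal sketch of a combined loss. Everything below is an assumption made for illustration, not the thesis's implementation: the function names, the weighted-sum combination, the weights `w_doc`, `w_sent`, and `w_token`, and the rule that a task contributes nothing when its labels are missing. That last rule is one plausible way a coarsely labelled or negative document could still contribute a training signal.

```python
import math

EPS = 1e-12  # guards against log(0)


def bce(prob, label):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    prob = min(max(prob, EPS), 1.0 - EPS)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def token_ce(dist, label):
    """Cross-entropy for one token; dist is a probability list over tags."""
    return -math.log(max(dist[label], EPS))


def mean(values):
    return sum(values) / len(values) if values else 0.0


def multitask_loss(pred, gold, w_doc=1.0, w_sent=1.0, w_token=1.0):
    """Weighted sum of the three task losses (hypothetical combination).

    A task is skipped when its gold labels are absent, so an example that
    carries only a document label still yields a loss.
    """
    loss = 0.0
    if gold.get("doc") is not None:
        loss += w_doc * bce(pred["doc"], gold["doc"])
    if gold.get("sents") is not None:
        loss += w_sent * mean(
            [bce(p, y) for p, y in zip(pred["sents"], gold["sents"])]
        )
    if gold.get("tokens") is not None:
        loss += w_token * mean(
            [token_ce(d, y) for d, y in zip(pred["tokens"], gold["tokens"])]
        )
    return loss


# A negative document: only the document label (0 = no event) is known.
negative_doc_loss = multitask_loss({"doc": 0.5}, {"doc": 0})

# A fully annotated document: all three label levels are available.
full_pred = {"doc": 0.5, "sents": [0.5, 0.5], "tokens": [[0.5, 0.5], [0.5, 0.5]]}
full_gold = {"doc": 1, "sents": [1, 0], "tokens": [0, 1]}
full_loss = multitask_loss(full_pred, full_gold)
```

In this sketch the negative document updates only the document-level term, while the fully annotated document contributes to all three terms. How the losses are actually weighted and combined is not stated in the abstract.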

Submission history

From: Osman Mutlu
[v1] Wed, 11 May 2022 13:07:42 GMT (637kb,D)