Title: Text Simplification for Comprehension-based Question-Answering

Abstract: Text simplification is the process of splitting and rephrasing a sentence into a sequence of sentences, making it easier to read and understand while preserving the content and approximating the original meaning. Text simplification has been exploited in NLP applications such as machine translation, summarization, semantic role labeling, and information extraction, opening a broad avenue for its use in comprehension-based question-answering downstream tasks. In this work, we investigate the effect of text simplification on the task of question answering using a comprehension context. We release Simple-SQuAD, a simplified version of the widely used SQuAD dataset.
Firstly, we outline each step in the dataset creation pipeline, including style transfer, thresholding of sentences showing correct transfer, and offset finding for each answer. Secondly, we verify the quality of the transferred sentences through various methodologies involving both automated and human evaluation. Thirdly, we benchmark the newly created corpus and perform an ablation study to examine the effect of the simplification process on the SQuAD-based question-answering task. Our experiments show that simplification leads to an increase of up to 2.04% in Exact Match and up to 1.74% in F1. Finally, we conclude with an analysis of the transfer process, investigating the types of edits made by the model and the effect of sentence length on the transfer model.
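The abstract names two mechanics that are easy to picture in code: finding the character offset of each answer in a rewritten context, and scoring predictions with Exact Match and F1. The sketch below is illustrative only and is not the authors' pipeline. The offset finder assumes the answer string survives simplification verbatim (an assumption, since the abstract does not say how offsets are recovered), and the metrics follow the commonly used SQuAD-style normalization (lowercasing, stripping punctuation and articles). The function names `find_answer_start`, `normalize`, `exact_match`, and `f1_score` are hypothetical.

```python
import re
import string
from collections import Counter


def find_answer_start(context: str, answer: str) -> int:
    """Return the character offset of the first verbatim occurrence of
    `answer` in `context`, or -1 if it is absent.

    Assumption: the answer text is preserved unchanged by simplification.
    """
    return context.find(answer)


def normalize(text: str) -> str:
    """SQuAD-style normalization: lowercase, drop punctuation and
    articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def f1_score(prediction: str, gold: str) -> float:
    """Token-level F1 between the normalized prediction and gold answer."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


# Made-up example context, not taken from Simple-SQuAD.
context = "The tower was completed in 1889. It stands in Paris."
answer = "Paris"
start = find_answer_start(context, answer)
recovered = context[start:start + len(answer)]
```

These scores are per example; a dataset-level figure such as the reported Exact Match and F1 gains would be an average over all questions, typically taking the maximum over the available gold answers for each question.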
Comments: Accepted as a long paper at the W-NUT Workshop, to be held at EMNLP 2021. Also presented at the DeMAL Workshop held at the Web Conference (WWW) 2021.
Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)
Cite as: arXiv:2109.13984 [cs.CL]
  (or arXiv:2109.13984v1 [cs.CL] for this version)

Submission history

From: Kartikey Pant
[v1] Tue, 28 Sep 2021 18:48:00 GMT (286kb,D)
