Title: What do we expect from Multiple-choice QA Systems?

Abstract: The recent success of machine learning systems on various QA datasets could be interpreted as a significant improvement in models' language understanding abilities. However, using various perturbations, multiple recent works have shown that good performance on a dataset does not necessarily correlate well with human expectations of models that "understand" language. In this work we consider a top-performing model on several Multiple Choice Question Answering (MCQA) datasets, and evaluate it against a set of expectations one might have of such a model, using a series of zero-information perturbations of the model's inputs. Our results show that the model clearly falls short of our expectations, and they motivate a modified training approach that forces the model to better attend to the inputs. We show that the new training paradigm leads to a model that performs on par with the original model while better satisfying our expectations.
Comments: Findings of EMNLP 2020
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Journal reference: Findings of the Association for Computational Linguistics: EMNLP 2020, pp. 3547-3553
Cite as: arXiv:2011.10647 [cs.CL]
  (or arXiv:2011.10647v1 [cs.CL] for this version)

Submission history

From: Krunal Shah
[v1] Fri, 20 Nov 2020 21:27:10 GMT (7842kb,D)
