Title: Not to Overfit or Underfit? A Study of Domain Generalization in Question Answering

Abstract: Machine learning models are prone to overfitting their source (training) distributions, which is commonly believed to be why they falter in novel target domains. Here we examine the contrasting view that multi-source domain generalization (DG) is in fact a problem of mitigating source domain underfitting: models not adequately learning the signal in their multi-domain training data. Experiments on a reading comprehension DG benchmark show that as a model gradually learns its source domains better -- using known methods such as knowledge distillation from a larger model -- its zero-shot out-of-domain accuracy improves at an even faster rate. Improved source domain learning also demonstrates superior generalization over three popular domain-invariant learning methods that aim to counter overfitting.
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as: arXiv:2205.07257 [cs.CL]
  (or arXiv:2205.07257v1 [cs.CL] for this version)

Submission history

From: Md Arafat Sultan
[v1] Sun, 15 May 2022 10:53:40 GMT (53kb,D)