We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cs.CL

Change to browse by:

cs

References & Citations

DBLP - CS Bibliography

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo ScienceWISE logo

Computer Science > Computation and Language

Title: Causally Estimating the Sensitivity of Neural NLP Models to Spurious Features

Abstract: Recent work finds modern natural language processing (NLP) models relying on spurious features for prediction. Mitigating such effects is thus important. Despite this need, there is no quantitative measure to evaluate or compare the effects of different forms of spurious features in NLP. We address this gap in the literature by quantifying model sensitivity to spurious features with a causal estimand, dubbed CENT, which draws on the concept of average treatment effect from the causality literature. By conducting simulations with four prominent NLP models -- TextRNN, BERT, RoBERTa and XLNet -- we rank the models against their sensitivity to artificial injections of eight spurious features. We further hypothesize and validate that models that are more sensitive to a spurious feature will be less robust against perturbations with this feature during inference. Conversely, data augmentation with this feature improves robustness to similar perturbations. We find statistically significant inverse correlations between sensitivity and robustness, providing empirical support for our hypothesis.
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2110.07159 [cs.CL]
  (or arXiv:2110.07159v1 [cs.CL] for this version)

Submission history

From: Yunxiang Zhang [view email]
[v1] Thu, 14 Oct 2021 05:26:08 GMT (370kb,D)

Link back to: arXiv, form interface, contact.