
Title: Explaining Prediction Uncertainty of Pre-trained Language Models by Detecting Uncertain Words in Inputs

Abstract: Estimating the predictive uncertainty of pre-trained language models is important for increasing their trustworthiness in NLP. Although many previous works focus on quantifying prediction uncertainty, few attempt to explain it. This paper goes a step further by explaining uncertain predictions of post-calibrated pre-trained language models. We adapt two perturbation-based post-hoc interpretation methods, Leave-one-out and Sampling Shapley, to identify the words in inputs that cause uncertainty in predictions. We test the proposed methods on BERT and RoBERTa with three tasks (sentiment classification, natural language inference, and paraphrase identification) in both in-domain and out-of-domain settings. Experiments show that both methods consistently capture the words in inputs that cause prediction uncertainty.
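
As a rough illustration of the two perturbation-based ideas named in the abstract, the following Python sketch attributes prediction uncertainty to individual input words. It is not the authors' implementation. Everything in it is an assumption made for illustration: the toy classifier (`BIAS`, `WEIGHTS`, `predict_proba`), the use of predictive entropy as the uncertainty measure, deletion of words as the perturbation, and the empty input as the Shapley baseline. The calibration step mentioned in the abstract is omitted.

```python
import math
import random

# Hypothetical toy classifier (an assumption, standing in for BERT/RoBERTa):
# a bias plus per-word weights gives a logit for the positive class.
BIAS = 2.0
WEIGHTS = {"slightly": -1.0, "boring": -1.0}


def predict_proba(tokens):
    """Return [p_positive, p_negative] for a list of tokens."""
    score = BIAS + sum(WEIGHTS.get(t, 0.0) for t in tokens)
    p_pos = 1.0 / (1.0 + math.exp(-score))
    return [p_pos, 1.0 - p_pos]


def uncertainty(tokens):
    """Predictive entropy, used here as an assumed uncertainty measure."""
    return -sum(p * math.log(p) for p in predict_proba(tokens) if p > 0.0)


def leave_one_out(tokens):
    """Score word i by how much uncertainty drops when it is deleted."""
    full = uncertainty(tokens)
    return [full - uncertainty(tokens[:i] + tokens[i + 1:])
            for i in range(len(tokens))]


def sampling_shapley(tokens, num_samples=200, seed=0):
    """Approximate Shapley values of uncertainty by sampling permutations."""
    rng = random.Random(seed)
    n = len(tokens)
    totals = [0.0] * n
    for _ in range(num_samples):
        order = list(range(n))
        rng.shuffle(order)
        present = set()
        prev = uncertainty([])  # baseline: the empty input
        for i in order:
            present.add(i)
            # Keep the original word order within each subset.
            subset = [tokens[j] for j in sorted(present)]
            cur = uncertainty(subset)
            totals[i] += cur - prev  # marginal contribution of word i
            prev = cur
    return [t / num_samples for t in totals]


tokens = ["slightly", "boring", "plot"]
loo = leave_one_out(tokens)
shap = sampling_shapley(tokens)
for tok, a, b in zip(tokens, loo, shap):
    print(f"{tok:10s} leave-one-out={a:.3f} shapley={b:.3f}")
```

In this toy setup, "slightly" and "boring" pull the logit toward zero and so receive positive uncertainty attributions, while "plot" has no weight and receives none. The sampled Shapley values also sum to the difference between the uncertainty of the full input and that of the empty baseline, because the marginal contributions within each permutation telescope.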
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2201.03742 [cs.CL]
  (or arXiv:2201.03742v1 [cs.CL] for this version)

Submission history

From: Hanjie Chen
[v1] Tue, 11 Jan 2022 02:04:50 GMT (289kb,D)