Counterfactual Data Augmentation improves Factuality of Abstractive Summarization

Rajagopal, Dheeraj; Shakeri, Siamak; Santos, Cicero Nogueira dos; Hovy, Eduard; Chang, Chung-Ching

(Submitted on 25 May 2022)

Abstract: Abstractive summarization systems based on pretrained language models often generate coherent but factually inconsistent sentences. In this paper, we present a counterfactual data augmentation approach in which we augment the training data with perturbed summaries that increase its diversity. Specifically, we present three augmentation approaches based on replacing (i) entities with entities from other categories or from the same category and (ii) nouns with their corresponding WordNet hypernyms. We show that augmenting the training data with our approach improves the factual correctness of summaries without significantly affecting the ROUGE score. On two commonly used summarization datasets (CNN/Dailymail and XSum), we improve factual correctness by about 2.5 points on average.
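
The three perturbations named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the entity lists, the hypernym table, and the function names `swap_entity` and `swap_hypernym` are hypothetical stand-ins, and a real pipeline would presumably use a named-entity tagger and WordNet itself. The abstract does not say how the perturbed summaries enter training, so the sketch covers only the perturbation step.

```python
import random
import re

# Hypothetical toy resources standing in for an NER tagger and WordNet.
ENTITIES = {
    "PERSON": ["Alice Smith", "Bob Jones"],
    "LOCATION": ["Paris", "Tokyo"],
}
HYPERNYMS = {"dog": "canine", "car": "motor vehicle"}


def _replace_word(text, old, new):
    """Replace whole-word occurrences of `old` with `new`."""
    pattern = r"\b" + re.escape(old) + r"\b"
    return re.sub(pattern, lambda _match: new, text)


def swap_entity(summary, same_category, rng):
    """Replace the first known entity found in `summary`.

    same_category=True draws the replacement from the entity's own
    category; False draws it from all other categories.
    """
    for category, names in ENTITIES.items():
        for name in names:
            if re.search(r"\b" + re.escape(name) + r"\b", summary):
                if same_category:
                    pool = [n for n in names if n != name]
                else:
                    pool = [
                        n
                        for other, others in ENTITIES.items()
                        if other != category
                        for n in others
                    ]
                return _replace_word(summary, name, rng.choice(pool))
    return summary  # no known entity: leave the summary unchanged


def swap_hypernym(summary):
    """Replace each known noun with its (toy) hypernym."""
    for noun, hypernym in HYPERNYMS.items():
        summary = _replace_word(summary, noun, hypernym)
    return summary


if __name__ == "__main__":
    rng = random.Random(0)
    original = "Alice Smith drove a car to Paris"
    print(swap_entity(original, same_category=True, rng=rng))
    print(swap_entity(original, same_category=False, rng=rng))
    print(swap_hypernym(original))
```

Each call yields one perturbed summary per original, so applying all three to a dataset would add up to three counterfactual variants per training example under these assumptions.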

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2205.12416 [cs.CL]
	(or arXiv:2205.12416v1 [cs.CL] for this version)

Submission history

From: Dheeraj Rajagopal
[v1] Wed, 25 May 2022 00:00:35 GMT (116kb,D)
