Improving the Robustness of Summarization Systems with Dual Augmentation

Chen, Xiuying; Long, Guodong; Tao, Chongyang; Li, Mingzhe; Gao, Xin; Zhang, Chengqi; Zhang, Xiangliang


(Submitted on 1 Jun 2023)

Abstract: A robust summarization system should be able to capture the gist of the document, regardless of the specific word choices or noise in the input. In this work, we first explore the robustness of summarization models against perturbations including word-level synonym substitution and noise. To create semantically consistent substitutes, we propose SummAttacker, an efficient approach to generating adversarial samples based on language models. Experimental results show that state-of-the-art summarization models suffer a significant decrease in performance on adversarial and noisy test sets. Next, we analyze the vulnerability of summarization systems and explore improving their robustness through data augmentation. Specifically, the first brittleness factor we found is the poor understanding of infrequent words in the input. Correspondingly, we feed the encoder with more diverse cases created by SummAttacker in the input space. The other factor lies in the latent space, where attacked inputs bring more variation to the hidden states. Hence, we construct adversarial decoder input and devise a manifold softmixing operation in the hidden space to introduce more diversity. Experimental results on the Gigaword and CNN/DM datasets demonstrate that our approach achieves significant improvements over strong baselines and exhibits higher robustness on noisy, attacked, and clean datasets.
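The two augmentation directions named in the abstract (perturbing words in the input space, and mixing representations in the hidden space) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how SummAttacker selects substitutes or how the manifold softmixing operation is defined. The sketch assumes a hand-written `candidates` table standing in for language-model proposals, and a plain mixup-style linear interpolation standing in for the hidden-space mixing. The names `substitute_words`, `mix_hidden`, `candidates`, `rate`, and `lam` are all hypothetical.

```python
import random
from typing import Dict, List, Sequence


def substitute_words(tokens: Sequence[str],
                     candidates: Dict[str, List[str]],
                     rate: float,
                     rng: random.Random) -> List[str]:
    """Input-space perturbation: swap a token for one of its candidates.

    `candidates` is a hypothetical stand-in for substitutes that a
    language model would propose; each token that has candidates is
    replaced with probability `rate`.
    """
    out = []
    for tok in tokens:
        options = candidates.get(tok)
        if options and rng.random() < rate:
            out.append(rng.choice(options))
        else:
            out.append(tok)
    return out


def mix_hidden(h_clean: Sequence[float],
               h_adv: Sequence[float],
               lam: float) -> List[float]:
    """Hidden-space mixing (assumed form): lam * h_clean + (1 - lam) * h_adv.

    This is generic mixup-style interpolation, used here only as an
    illustration; the paper's softmixing operation may differ.
    """
    if len(h_clean) != len(h_adv):
        raise ValueError("hidden states must have the same length")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [lam * c + (1.0 - lam) * a for c, a in zip(h_clean, h_adv)]
```

With `rate=1.0` every token that has a candidate is replaced, and with `lam=1.0` the mixed state equals the clean hidden state, which makes both functions easy to sanity-check before plugging in real model outputs.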

Comments:	10 pages, 6 figures, ACL 2023 main conference
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2306.01090 [cs.CL]
	(or arXiv:2306.01090v1 [cs.CL] for this version)

Submission history

From: Xiuying Chen
[v1] Thu, 1 Jun 2023 19:04:17 GMT (259kb,D)
