Title: Long-Span Summarization via Local Attention and Content Selection

Abstract: Transformer-based models have achieved state-of-the-art results in a wide range of natural language processing (NLP) tasks, including document summarization. Typically, these systems are trained by fine-tuning a large pre-trained model on the target task. One issue with these transformer-based models is that their memory and compute requirements scale poorly as the input length grows, which makes them challenging to train or fine-tune for long-document summarization. In this work, we exploit large pre-trained transformer-based models and address long-span dependencies in abstractive summarization using two methods: local self-attention and explicit content selection. These approaches are compared on a range of network configurations. Experiments are carried out on standard long-span summarization tasks, including the Spotify Podcast, arXiv, and PubMed datasets. We demonstrate that by combining these methods, we can achieve state-of-the-art ROUGE scores on all three tasks. Moreover, without a large-scale GPU card, our approach can achieve results comparable to or better than those of existing approaches.
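
The scaling issue and the local self-attention idea mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the symmetric window definition, and the toy sizes are all assumptions made for illustration. It only shows that restricting each token to a fixed neighbourhood makes the number of scored token pairs grow roughly linearly with input length, whereas full self-attention grows quadratically.

```python
def local_attention_mask(seq_len, window):
    """Build a seq_len x seq_len boolean mask (hypothetical helper).

    mask[i][j] is True when token i may attend to token j, i.e. when
    j lies within `window` positions of i on either side.
    """
    return [
        [abs(i - j) <= window for j in range(seq_len)]
        for i in range(seq_len)
    ]


def count_attended_pairs(mask):
    """Count the (query, key) pairs that would actually be scored."""
    return sum(sum(row) for row in mask)


# Toy sizes chosen for illustration only.
seq_len, window = 8, 1
mask = local_attention_mask(seq_len, window)

full_pairs = seq_len * seq_len            # full self-attention scores every pair
local_pairs = count_attended_pairs(mask)  # local attention scores only a band

print(full_pairs, local_pairs)
```

Under these assumptions, the count for local attention is bounded by `seq_len * (2 * window + 1)`, so doubling the input length roughly doubles the work instead of quadrupling it.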
Comments: ACL 2021 (camera-ready)
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2105.03801 [cs.CL]
  (or arXiv:2105.03801v2 [cs.CL] for this version)

Submission history

From: Potsawee Manakul
[v1] Sat, 8 May 2021 23:53:03 GMT (5542kb,D)
[v2] Sat, 29 May 2021 11:23:29 GMT (5890kb,D)
