We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cs.CL

Change to browse by:

References & Citations

DBLP - CS Bibliography

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo

Computer Science > Computation and Language

Title: Large-Scale Multi-Document Summarization with Information Extraction and Compression

Abstract: We develop an abstractive summarization framework independent of labeled data for multiple heterogeneous documents. Unlike existing multi-document summarization methods, our framework processes documents telling different stories instead of documents on the same topic. We also enhance an existing sentence fusion method with a uni-directional language model to prioritize fused sentences with higher sentence probability with the goal of increasing readability. Lastly, we construct a total of twelve dataset variations based on CNN/Daily Mail and the NewsRoom datasets, where each document group contains a large and diverse collection of documents to evaluate the performance of our model in comparison with other baseline systems. Our experiments demonstrate that our framework outperforms current state-of-the-art methods in this more generic setting.
Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)
Cite as: arXiv:2205.00548 [cs.CL]
  (or arXiv:2205.00548v1 [cs.CL] for this version)

Submission history

From: Ning Wang [view email]
[v1] Sun, 1 May 2022 19:49:15 GMT (666kb,D)

Link back to: arXiv, form interface, contact.