At Which Level Should We Extract? An Empirical Analysis on Extractive Document Summarization

Zhou, Qingyu; Wei, Furu; Zhou, Ming

(Submitted on 6 Apr 2020 (v1), last revised 26 Oct 2020 (this version, v2))

Abstract: Extractive methods have proven effective in automatic document summarization. Previous work performs this task by identifying informative content at the sentence level. However, it is unclear whether extraction at the sentence level is the best solution. In this work, we show that unnecessity and redundancy issues exist when extracting full sentences, and that extracting sub-sentential units is a promising alternative. Specifically, we propose extracting sub-sentential units based on the constituency parse tree. We present a neural extractive model that leverages sub-sentential information and extracts these units. Extensive experiments and analyses show that extracting sub-sentential units performs competitively compared to full-sentence extraction under both automatic and human evaluation. We hope our work provides some inspiration regarding the basic extraction units in extractive summarization for future research.
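
Illustration (not from the paper): the abstract does not say how sub-sentential units are read off the constituency tree, so the following is only a minimal sketch of one plausible reading. It assumes Penn Treebank-style bracketed parses and assumes that a new unit starts at every subtree labelled SBAR. The names `Node`, `parse`, `split_units`, and `SPLIT_LABELS` are hypothetical and chosen for this example only.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    """A constituency tree node; children are sub-nodes or word strings."""
    label: str
    children: List[Union["Node", str]] = field(default_factory=list)


def parse(bracketed: str) -> Node:
    """Parse a bracketed tree such as '(S (NP (NNP Bob)) (VP (VBD left)))'."""
    tokens = bracketed.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read() -> Node:
        nonlocal pos
        assert tokens[pos] == "(", "expected an opening bracket"
        node = Node(tokens[pos + 1])
        pos += 2
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                node.children.append(read())
            else:
                node.children.append(tokens[pos])  # a word under a POS tag
                pos += 1
        pos += 1  # consume the closing bracket
        return node

    return read()


# Assumption: which labels open a new unit. The paper may use other rules.
SPLIT_LABELS = {"SBAR"}


def split_units(root: Node) -> List[List[str]]:
    """Partition the sentence's words into non-overlapping units."""
    units: List[List[str]] = []
    current: List[str] = []

    def flush() -> None:
        nonlocal current
        if current:
            units.append(current)
            current = []

    def walk(node: Node) -> None:
        for child in node.children:
            if isinstance(child, str):
                current.append(child)
            elif child.label in SPLIT_LABELS:
                flush()      # close the unit collected so far
                walk(child)  # the subtree becomes its own unit
                flush()
            else:
                walk(child)

    walk(root)
    flush()
    return units


tree = parse(
    "(S (NP (DT The) (NN company)) "
    "(VP (VBD said) (SBAR (IN that) (S (NP (NNS profits)) (VP (VBD rose))))))"
)
for unit in split_units(tree):
    print(" ".join(unit))
# prints:
# The company said
# that profits rose
```

Under these assumptions, an extractive model would score and select such units instead of whole sentences. The choice of split labels is the main design decision, and a different label set would give coarser or finer units.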

Comments:	To appear at COLING 2020
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2004.02664 [cs.CL]
	(or arXiv:2004.02664v2 [cs.CL] for this version)

Submission history

From: Qingyu Zhou
[v1] Mon, 6 Apr 2020 13:35:10 GMT (354kb,D)
[v2] Mon, 26 Oct 2020 08:35:19 GMT (351kb,D)
