Advances of Transformer-Based Models for News Headline Generation

Bukhtiyarov, Alexey; Gusev, Ilya

(Submitted on 9 Jul 2020 (v1), last revised 27 Jul 2020 (this version, v2))

Abstract: Pretrained language models based on the Transformer architecture are the reason for recent breakthroughs in many areas of NLP, including sentiment analysis, question answering, and named entity recognition. Headline generation is a special kind of text summarization task. To succeed in it, models need strong natural language understanding that goes beyond the meaning of individual words and sentences, as well as the ability to distinguish essential information. In this paper, we fine-tune two pretrained Transformer-based models (mBART and BertSumAbs) for that task and achieve new state-of-the-art results on the RIA and Lenta datasets of Russian news. BertSumAbs increases ROUGE on average by 2.9 and 2.0 points, respectively, over the previous best scores achieved by Phrase-Based Attentional Transformer and CopyNet.
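
The reported gains are in ROUGE, an n-gram overlap metric between a generated headline and the reference headline. The abstract does not state which ROUGE variants or preprocessing were used, so the following is only a minimal sketch of ROUGE-N under assumed settings (lowercasing, whitespace tokenization, no stemming); the function names `ngrams` and `rouge_n` and the example headlines are hypothetical, not taken from the paper.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list as a multiset."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(reference, candidate, n=1):
    """Return ROUGE-N precision, recall, and F1 for one headline pair.

    Assumption: lowercased, whitespace-split tokens; real evaluations
    typically use a dedicated tokenizer and an established ROUGE package.
    """
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)

    # Clipped overlap: each n-gram counts at most as often as in the reference.
    overlap = sum((ref & cand).values())
    ref_total = sum(ref.values())
    cand_total = sum(cand.values())

    if overlap == 0 or ref_total == 0 or cand_total == 0:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}

    precision = overlap / cand_total
    recall = overlap / ref_total
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    reference = "central bank raises key rate"
    candidate = "bank raises rate"
    print(rouge_n(reference, candidate, n=1))
    print(rouge_n(reference, candidate, n=2))
```

A difference of a few points on this kind of score, as quoted in the abstract, corresponds to a difference of a few hundredths in the F1 values this sketch returns, since ROUGE is usually reported multiplied by 100.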

Comments:	Version 2; Accepted to AINL 2020
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2007.05044 [cs.CL]
	(or arXiv:2007.05044v2 [cs.CL] for this version)

Submission history

From: Alexey Bukhtiyarov
[v1] Thu, 9 Jul 2020 19:34:18 GMT (37kb)
[v2] Mon, 27 Jul 2020 06:54:07 GMT (37kb)
