Transformer-F: A Transformer network with effective methods for learning universal sentence representation

Shi, Yu

(Submitted on 2 Jul 2021)

Abstract: The Transformer model is widely used in natural language processing for sentence representation. However, previous Transformer-based models tend to focus on function words, which carry limited meaning in most cases, and can extract only high-level abstract semantic features. In this paper, two approaches are introduced to improve the performance of Transformers. First, we calculate the attention score by multiplying a part-of-speech weight vector with the correlation coefficient, which helps the model attend to words with more practical meaning. The weight vector is derived from the input text sequence according to the importance of each word's part of speech. Second, we fuse the features of each layer to make the sentence representation more comprehensive and accurate. In experiments, we demonstrate the effectiveness of our model, Transformer-F, on three standard text classification datasets. The results show that the proposed model significantly improves text classification performance compared with the baseline model. Specifically, we obtain a 5.28% relative improvement over the vanilla Transformer on the simple tasks.
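
The two ideas in the abstract can be illustrated with a small NumPy sketch. This is only one possible reading, not the paper's implementation: the abstract does not give the exact formulas, so the row renormalization after weighting, the mean-pool-and-concatenate fusion, the function names, and the part-of-speech importance values below are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def pos_weighted_attention(queries, keys, values, pos_weights):
    """Scaled dot-product attention whose scores are scaled by POS weights.

    queries, keys, values: arrays of shape (seq_len, d)
    pos_weights: array of shape (seq_len,), one weight per attended token
    """
    d_k = queries.shape[-1]
    # Standard attention scores, standing in for the "correlation coefficient".
    corr = softmax(queries @ keys.T / np.sqrt(d_k), axis=-1)
    # Multiply each attended-token column by that token's POS weight.
    weighted = corr * pos_weights[np.newaxis, :]
    # Assumption: renormalize so every row is again a distribution.
    attn = weighted / weighted.sum(axis=-1, keepdims=True)
    return attn @ values, attn

def fuse_layers(layer_outputs):
    """Assumed fusion: mean-pool each layer over tokens, then concatenate."""
    pooled = [h.mean(axis=0) for h in layer_outputs]
    return np.concatenate(pooled, axis=-1)

# Hypothetical POS importance table; the values are illustrative only.
POS_IMPORTANCE = {"DET": 0.2, "NOUN": 1.0, "VERB": 0.8}
tags = ["DET", "NOUN", "VERB"]  # e.g. "the cat sat"
pos_weights = np.array([POS_IMPORTANCE[t] for t in tags])

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(3, 4))  # toy token embeddings

layer1, attn = pos_weighted_attention(embeddings, embeddings, embeddings, pos_weights)
layer2, _ = pos_weighted_attention(layer1, layer1, layer1, pos_weights)
sentence_vec = fuse_layers([layer1, layer2])

print(attn.round(3))       # each row sums to 1
print(sentence_vec.shape)  # (8,)
```

With this reading, a low weight for determiners shrinks the attention that every token pays to "the", while the fused vector keeps information from both the lower and the upper layer instead of the last layer alone.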

Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2107.00653 [cs.CL]
	(or arXiv:2107.00653v1 [cs.CL] for this version)

Submission history

From: Yu Shi
[v1] Fri, 2 Jul 2021 03:20:11 GMT (535kb)
