Keywords lie far from the mean of all words in local vector space

Papagiannopoulou, Eirini; Tsoumakas, Grigorios; Papadopoulos, Apostolos N.

(Submitted on 21 Aug 2020)

Abstract: Keyword extraction is an important document process that aims at finding a small set of terms that concisely describe a document's topics. The most popular state-of-the-art unsupervised approaches belong to the family of the graph-based methods that build a graph-of-words and use various centrality measures to score the nodes (candidate keywords). In this work, we follow a different path to detect the keywords from a text document by modeling the main distribution of the document's words using local word vector representations. Then, we rank the candidates based on their position in the text and the distance between the corresponding local vectors and the main distribution's center. We confirm the high performance of our approach compared to strong baselines and state-of-the-art unsupervised keyword extraction methods, through an extended experimental study, investigating the properties of the local representations.
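The ranking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: the "local" vectors here are plain co-occurrence counts computed from the single document (the abstract does not say how the local representations are trained), the center is taken as the arithmetic mean of those vectors, and the way distance is combined with position (dividing by a logarithm of the first occurrence index) is a hypothetical choice. The function names `local_vectors` and `rank_candidates` are likewise hypothetical.

```python
import math


def local_vectors(tokens, window=2):
    """Build one co-occurrence count vector per distinct word,
    using only this document (an assumed stand-in for local embeddings)."""
    vocab = sorted(set(tokens))
    index = {w: i for i, w in enumerate(vocab)}
    vectors = {w: [0.0] * len(vocab) for w in vocab}
    for i, w in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                vectors[w][index[tokens[j]]] += 1.0
    return vectors


def rank_candidates(tokens, window=2):
    """Return (word, score) pairs, highest score first.

    Score = distance from the mean vector, damped by how late the word
    first appears. The damping formula is an assumption, not the paper's.
    """
    vectors = local_vectors(tokens, window)
    dim = len(vectors)
    # Center of the document's word distribution: mean over distinct words.
    mean = [sum(v[k] for v in vectors.values()) / len(vectors) for k in range(dim)]
    # 1-based index of each word's first occurrence.
    first_pos = {}
    for i, w in enumerate(tokens, start=1):
        first_pos.setdefault(w, i)
    scored = []
    for w, v in vectors.items():
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(v, mean)))
        scored.append((w, dist / (1.0 + math.log(first_pos[w]))))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    text = "keyword extraction finds a small set of terms that describe a document"
    for word, score in rank_candidates(text.split())[:5]:
        print(f"{word}\t{score:.3f}")
```

Words whose vectors sit far from the mean and that appear early receive the highest scores, which mirrors the claim in the title. A realistic pipeline would also filter candidates (for example by part of speech) and use learned embeddings rather than raw counts.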

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2008.09513 [cs.CL]
	(or arXiv:2008.09513v1 [cs.CL] for this version)

Submission history

From: Eirini Papagiannopoulou
[v1] Fri, 21 Aug 2020 14:42:33 GMT (1399kb,D)
