Title: Label-Wise Document Pre-Training for Multi-Label Text Classification

Abstract: A major challenge of multi-label text classification (MLTC) is to simultaneously exploit possible label differences and label correlations. In this paper, we tackle this challenge by developing a Label-Wise Pre-Training (LW-PT) method that produces a document representation with label-aware information. The basic idea is that a multi-label document can be represented as a combination of multiple label-wise representations, and that correlated labels always co-occur in the same or similar documents. LW-PT implements this idea by constructing label-wise document classification tasks and training label-wise document encoders. Finally, the pre-trained label-wise encoder is fine-tuned on the downstream MLTC task. Extensive experimental results validate that the proposed method has significant advantages over previous state-of-the-art models and is able to discover reasonable label relationships. The code is released to facilitate further research.
Comments: Accepted to NLPCC 2020
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2008.06695 [cs.CL]
  (or arXiv:2008.06695v1 [cs.CL] for this version)

Submission history

From: Han Liu
[v1] Sat, 15 Aug 2020 10:34:27 GMT (115kb)
