Title: S-SimCSE: Sampled Sub-networks for Contrastive Learning of Sentence Embedding

Abstract: Contrastive learning has been studied as a way to improve the learning of sentence embeddings. The current state-of-the-art method is SimCSE, which uses dropout as the data augmentation method and feeds the same input sentence to a pre-trained transformer encoder twice. The two resulting outputs, sentence embeddings derived from the same sentence under different dropout masks, form a positive pair. A network with a dropout mask applied can be regarded as a sub-network of itself, whose expected scale is determined by the dropout rate. In this paper, we push sub-networks with different expected scales to learn similar embeddings for the same sentence. SimCSE cannot do so because it fixes the dropout rate to a tuned hyperparameter. We achieve this by sampling the dropout rate from a distribution at each forward pass. As this method may make optimization harder, we also propose a simple sentence-wise mask strategy to sample more sub-networks. We evaluated the proposed S-SimCSE on several popular semantic textual similarity datasets. Experimental results show that S-SimCSE outperforms the state-of-the-art SimCSE by more than $1\%$ on BERT$_{base}$.
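
The idea of sampling a dropout rate per forward pass can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the uniform range `[0.0, 0.2]`, the function names, and the reading of the sentence-wise strategy as "each sentence draws its own rates" are all assumptions made for illustration, and plain lists stand in for encoder activations.

```python
import random


def sample_dropout_rate(rng, low=0.0, high=0.2):
    """Draw one dropout rate. The uniform range is an assumption, not from the abstract."""
    return rng.uniform(low, high)


def dropout(vector, rate, rng):
    """Inverted dropout: zero each unit with probability `rate`, rescale the survivors."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [x / keep if rng.random() >= rate else 0.0 for x in vector]


def positive_pairs(batch, rng):
    """Build one positive pair per sentence vector.

    Each view uses an independently sampled rate, so the two views come from
    sub-networks of different expected scale (hypothetical sentence-wise variant).
    """
    pairs = []
    for vec in batch:
        rate_a = sample_dropout_rate(rng)
        rate_b = sample_dropout_rate(rng)
        view_a = dropout(vec, rate_a, rng)
        view_b = dropout(vec, rate_b, rng)
        pairs.append((view_a, view_b, (rate_a, rate_b)))
    return pairs
```

In a real setup the two views would be encoder outputs fed to a contrastive loss; here the point is only that the rate is drawn anew for every view instead of being fixed to one tuned value.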
Comments: 2 pages
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2111.11750 [cs.CL]
  (or arXiv:2111.11750v2 [cs.CL] for this version)

Submission history

From: Junlei Zhang
[v1] Tue, 23 Nov 2021 09:52:45 GMT (224kb)
[v2] Wed, 24 Nov 2021 09:20:44 GMT (224kb)
