Return of the RNN: Residual Recurrent Networks for Invertible Sentence Embeddings

Wilkerson, Jeremy

Computer Science > Computation and Language

(Submitted on 23 Mar 2023 (v1), last revised 6 Apr 2023 (this version, v2))

Abstract: This study presents a novel model for invertible sentence embeddings using a residual recurrent network trained on an unsupervised encoding task. Rather than the probabilistic outputs common to neural machine translation models, our approach employs a regression-based output layer to reconstruct the input sequence's word vectors. The model achieves high accuracy and fast training with the ADAM optimizer, a significant finding given that RNNs typically require memory units, such as LSTMs, or second-order optimization methods. We incorporate residual connections and introduce a "match drop" technique, where gradients are calculated only for incorrect words. Our approach demonstrates potential for various natural language processing applications, particularly in neural network-based systems that require high-quality sentence embeddings.
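The "match drop" idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how a word is judged correct or which regression loss is used, so the sketch assumes nearest-neighbour decoding by Euclidean distance and a squared-error loss. The names `nearest_word` and `match_drop_gradients` and the toy embedding table are hypothetical and are not taken from the paper or its code.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(vec: Vector, embeddings: Sequence[Vector]) -> int:
    """Index of the embedding closest to vec.

    Euclidean nearest-neighbour decoding is an assumption of this sketch.
    """
    return min(range(len(embeddings)),
               key=lambda i: squared_distance(vec, embeddings[i]))


def match_drop_gradients(predicted: Sequence[Vector],
                         target_ids: Sequence[int],
                         embeddings: Sequence[Vector]) -> List[List[float]]:
    """Gradient of the squared error w.r.t. each predicted word vector.

    Positions whose predicted vector already decodes to the target word
    receive a zero gradient, so only incorrect words drive the update.
    """
    grads: List[List[float]] = []
    for vec, word_id in zip(predicted, target_ids):
        target = embeddings[word_id]
        if nearest_word(vec, embeddings) == word_id:
            # Word already reconstructed correctly: drop its gradient.
            grads.append([0.0] * len(vec))
        else:
            # d/dp of sum((p - t)^2) is 2 * (p - t).
            grads.append([2.0 * (p - t) for p, t in zip(vec, target)])
    return grads


# Toy example: three 2-dimensional word vectors, two output positions.
embeddings = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
predicted = [[0.9, 0.1], [0.2, 0.3]]  # regression outputs of the decoder
target_ids = [1, 2]                   # indices of the words to reconstruct
grads = match_drop_gradients(predicted, target_ids, embeddings)
```

In this toy case the first position already decodes to its target word and contributes no gradient, while the second decodes to the wrong word and keeps its full squared-error gradient.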

Comments:	Adds descriptions of the use of dropout, the use of custom C++ code, the removal of non-English sentences, other minor changes
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2303.13570 [cs.CL]
	(or arXiv:2303.13570v2 [cs.CL] for this version)

Submission history

From: Jeremy Wilkerson
[v1] Thu, 23 Mar 2023 15:59:06 GMT (60kb,D)
[v2] Thu, 6 Apr 2023 00:22:17 GMT (60kb,D)
