Title: TANDA: Transfer and Adapt Pre-Trained Transformer Models for Answer Sentence Selection

Abstract: We propose TANDA, an effective technique for fine-tuning pre-trained Transformer models for natural language tasks. Specifically, we first transfer a pre-trained model into a model for a general task by fine-tuning it with a large and high-quality dataset. We then perform a second fine-tuning step to adapt the transferred model to the target domain. We demonstrate the benefits of our approach for answer sentence selection, which is a well-known inference task in Question Answering. We built a large-scale dataset to enable the transfer step, exploiting the Natural Questions dataset. Our approach establishes the state of the art on two well-known benchmarks, WikiQA and TREC-QA, achieving MAP scores of 92% and 94.3%, respectively, which substantially outperform the previous highest scores of 83.4% and 87.5%, obtained in very recent work. We empirically show that TANDA generates more stable and robust models, reducing the effort required for selecting optimal hyper-parameters. Additionally, we show that the transfer step of TANDA makes the adaptation step more robust to noise. This enables a more effective use of noisy datasets for fine-tuning. Finally, we also confirm the positive impact of TANDA in an industrial setting, using domain-specific datasets subject to different types of noise.
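The MAP (mean average precision) scores quoted in the abstract measure how highly a model ranks the correct answer sentences among the candidates for each question. The sketch below shows one common way to compute MAP. It is not the authors' evaluation code: the function names, the toy scores, and the choice to skip questions that have no correct candidate are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def average_precision(ranked_labels: List[int]) -> float:
    """AP for one question; labels are 1 (correct sentence) or 0, in ranked order."""
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label == 1:
            hits += 1
            # Precision at this rank, counted only where a correct sentence appears.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(candidates: Dict[str, List[Tuple[float, int]]]) -> float:
    """MAP over questions; each value is a list of (model_score, label) pairs."""
    ap_values = []
    for scored in candidates.values():
        # Rank candidate sentences by descending model score.
        ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
        labels = [label for _, label in ranked]
        # Assumption: questions without any correct candidate are skipped.
        if any(labels):
            ap_values.append(average_precision(labels))
    return sum(ap_values) / len(ap_values) if ap_values else 0.0


# Hypothetical scores for two questions (not data from the paper).
toy_candidates = {
    "q1": [(0.9, 1), (0.4, 0), (0.2, 1)],  # ranked labels: 1, 0, 1
    "q2": [(0.3, 1), (0.8, 0)],            # ranked labels: 0, 1
}
toy_map = mean_average_precision(toy_candidates)
print(f"MAP = {toy_map:.4f}")
```

In this toy input, q1 has average precision (1/1 + 2/3) / 2 and q2 has 1/2, so the mean is 2/3. A reported MAP of 92% corresponds to the same quantity averaged over a benchmark's test questions.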
Comments: Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2020), Oral Presentation
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:1911.04118 [cs.CL]
  (or arXiv:1911.04118v2 [cs.CL] for this version)

Submission history

From: Siddhant Garg
[v1] Mon, 11 Nov 2019 07:40:37 GMT (1771kb,D)
[v2] Wed, 20 Nov 2019 05:21:22 GMT (1806kb,D)
