Computer Science > Computation and Language
Title: Cross-Lingual Open-Domain Question Answering with Answer Sentence Generation
(Submitted on 14 Oct 2021 (v1), last revised 19 Dec 2022 (this version, v3))
Abstract: Open-Domain Generative Question Answering has achieved impressive performance in English by combining document-level retrieval with answer generation. These approaches, which we refer to as GenQA, can generate complete sentences, effectively answering both factoid and non-factoid questions. In this paper, we extend GenQA to the multilingual and cross-lingual settings. For this purpose, we first introduce GenTyDiQA, an extension of the TyDiQA dataset with well-formed and complete answers for Arabic, Bengali, English, Japanese, and Russian. Based on GenTyDiQA, we design a cross-lingual generative model that produces full-sentence answers by exploiting passages written in multiple languages, including languages different from that of the question. Our cross-lingual generative system outperforms answer sentence selection baselines for all five languages, and monolingual generative pipelines for three of the five languages studied.
Submission history
From: Luca Soldaini
[v1] Thu, 14 Oct 2021 04:36:29 GMT (483kb,D)
[v2] Sun, 22 May 2022 22:10:07 GMT (155kb,D)
[v3] Mon, 19 Dec 2022 05:53:11 GMT (159kb,D)