
Title: Data-to-text Generation by Splicing Together Nearest Neighbors

Abstract: We propose to tackle data-to-text generation tasks by directly splicing together retrieved segments of text from "neighbor" source-target pairs. Unlike recent work that conditions on retrieved neighbors but generates text token-by-token, left-to-right, we learn a policy that directly manipulates segments of neighbor text, by inserting or replacing them in partially constructed generations. Standard techniques for training such a policy require an oracle derivation for each generation, and we prove that finding the shortest such derivation can be reduced to parsing under a particular weighted context-free grammar. We find that policies learned in this way perform on par with strong baselines in terms of automatic and human evaluation, but allow for more interpretable and controllable generation.
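The two segment-level actions the abstract mentions, inserting a neighbor segment into a partially constructed generation and replacing part of it with one, can be pictured with a toy sketch. This is only an illustration under stated assumptions: the function names `insert_segment` and `replace_span`, the token-list representation, and the example sentences are all hypothetical, and the spans are chosen by hand rather than by the learned policy the paper describes.

```python
from typing import List


def insert_segment(canvas: List[str], segment: List[str], position: int) -> List[str]:
    """Return a new canvas with `segment` inserted before index `position`."""
    return canvas[:position] + segment + canvas[position:]


def replace_span(canvas: List[str], segment: List[str], start: int, end: int) -> List[str]:
    """Return a new canvas with canvas[start:end] replaced by `segment`."""
    return canvas[:start] + segment + canvas[end:]


# Hypothetical neighbor target texts (made up for illustration).
neighbor_a = "the Blue Spice is a pub near the river".split()
neighbor_b = "Aromi is a coffee shop".split()

# Partially constructed generation.
canvas = "Blue Spice is a pub".split()

# Action 1: append the neighbor segment "near the river" at the end.
canvas = insert_segment(canvas, neighbor_a[6:9], len(canvas))

# Action 2: replace "pub" (index 4) with the neighbor segment "coffee shop".
canvas = replace_span(canvas, neighbor_b[3:5], 4, 5)

result = " ".join(canvas)
print(result)
```

In this sketch each action copies a contiguous span from a neighbor's text, so every token in the output can be traced back to the neighbor it came from, which is the kind of interpretability the abstract points to.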
Comments: EMNLP 2021; figures updated/improved
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as: arXiv:2101.08248 [cs.CL]
  (or arXiv:2101.08248v4 [cs.CL] for this version)

Submission history

From: Sam Wiseman
[v1] Wed, 20 Jan 2021 18:43:11 GMT (124kb,D)
[v2] Fri, 29 Jan 2021 18:44:33 GMT (125kb,D)
[v3] Wed, 15 Sep 2021 15:46:16 GMT (250kb,D)
[v4] Thu, 28 Oct 2021 20:19:35 GMT (418kb,D)