Title: Understanding the Properties of Generated Corpora

Abstract: Text generation models have become central to many research tasks, especially the generation of sentence corpora. However, understanding the properties of an automatically generated text corpus remains challenging. We propose a set of tools that examine the properties of generated text corpora. Applying these tools to various generated corpora gave us new insights into the properties of the generative models. As part of our characterization process, we found remarkable differences between the corpora generated by two leading generative technologies.
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2206.11219 [cs.CL]
  (or arXiv:2206.11219v2 [cs.CL] for this version)

Submission history

From: Naama Zwerdling
[v1] Wed, 22 Jun 2022 17:13:52 GMT (7712kb,D)
[v2] Thu, 27 Oct 2022 10:58:08 GMT (7713kb,D)