Title: Generative Synthesis of Insurance Datasets

Authors: Kevin Kuo
Abstract: One impediment to advancing actuarial research and developing open-source assets for insurance analytics is the lack of realistic, publicly available datasets. In this work, we develop a workflow for synthesizing insurance datasets using CTGAN, a recently proposed neural network architecture for generating tabular data. Applying the proposed workflow to publicly available data in the domains of general insurance pricing and life insurance shock lapse modeling, we evaluate the synthesized datasets from three perspectives: machine learning efficacy, distributions of variables, and stability of model parameters. The workflow is implemented via an R interface to promote adoption by researchers and data owners.
Subjects: Applications (stat.AP); Machine Learning (cs.LG); Risk Management (q-fin.RM)
Cite as: arXiv:1912.02423 [stat.AP]
  (or arXiv:1912.02423v2 [stat.AP] for this version)

Submission history

From: Kevin Kuo
[v1] Thu, 5 Dec 2019 07:49:31 GMT (84kb,D)
[v2] Thu, 6 Aug 2020 03:46:00 GMT (100kb,D)
