Metrics That Matter: A Practical Survey on Synthetic Data Evaluation
DOI: https://doi.org/10.31224/6576

Keywords: synthetic data, AI, machine learning, big data in healthcare, synthetic data generation, healthcare AI, synthetic evaluation

Abstract
Assessing the quality of synthetic data (SD) is vital to determine whether it can provide a viable alternative to real data. A wide variety of metrics exist to examine the three archetypal dimensions of SD evaluation: realism (fidelity), task-specific usefulness (utility), and remaining disclosure risk (privacy). Current work in SD generation often relies on ad-hoc selection of evaluation metrics without clear justification, even though the suitability of a metric strongly depends on the dataset and other contextual factors. This paper surveys the field of SD evaluation, offers guidance on metric selection based on four key questions pertaining to the task, goal, data type, and domain of the SD, and gives general practical recommendations on SD evaluation. Finally, experiments on an illustrative dataset of electronic health records show how researchers can bring our insights and recommendations for SD evaluation into practice. By doing so, we aim to support researchers and practitioners seeking to generate and evaluate SD.
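To make the three dimensions concrete, the sketch below computes one illustrative metric for two of them on toy data: a per-feature two-sample Kolmogorov-Smirnov statistic as a fidelity proxy, and distance to the closest real record (DCR) as a simple privacy proxy. These are common example metrics, not the specific metrics surveyed in the paper, and the toy Gaussian data stands in for a real/synthetic dataset pair.

```python
import numpy as np

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(500, 3))    # stand-in for real data
synth = rng.normal(0.1, 1.1, size=(500, 3))   # stand-in for synthetic data

def ks_stat(a, b):
    """Two-sample KS statistic: max gap between empirical CDFs."""
    grid = np.sort(np.concatenate([a, b]))
    cdf_a = np.searchsorted(np.sort(a), grid, side="right") / len(a)
    cdf_b = np.searchsorted(np.sort(b), grid, side="right") / len(b)
    return np.max(np.abs(cdf_a - cdf_b))

# Fidelity proxy: average KS statistic over features (lower = closer marginals)
fidelity = np.mean([ks_stat(real[:, j], synth[:, j])
                    for j in range(real.shape[1])])

# Privacy proxy: distance to closest real record per synthetic row;
# very small values can indicate memorisation of real records
dists = np.linalg.norm(synth[:, None, :] - real[None, :, :], axis=-1)
dcr = dists.min(axis=1)

print(f"mean KS statistic (lower = higher fidelity): {fidelity:.3f}")
print(f"median DCR (higher = lower disclosure risk): {np.median(dcr):.3f}")
```

A utility metric would typically follow a train-on-synthetic, test-on-real protocol with a downstream model, which is omitted here for brevity.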
License
Copyright (c) 2026 Jim Achterberg, Bram van Dijk, Saif Ul Islam, Gregory Epiphaniou, Carsten Maple, Marcel Haas, Marco Spruit

This work is licensed under a Creative Commons Attribution 4.0 International License.