Description

Generative Adversarial Networks (GANs) could democratize access to healthcare data for machine learning (ML) by generating high-quality synthetic data that mimic real data. We generated synthetic free-text medical data using three GAN algorithms (SeqGAN, MaliGAN, and RankGAN) and evaluated them using various performance measures. Synthetic data generated by SeqGAN were superior to those of the other two models and could be used to replicate ML solutions with performance metrics statistically similar to those of decision models trained on real data.

Learning Objective: Fragmentation of Health Information Systems (HIS) and legal restrictions on sharing Patient Health Identifiers (PHI) limit access to healthcare data for research purposes.

The advent of Generative Adversarial Networks (GANs) capable of mimicking free-text medical data presents considerable potential to democratize access to health data for machine learning.

Synthetic data generated by SeqGAN are of superior quality to those generated by other GAN models and can be used to replicate machine learning solutions with performance metrics that are statistically similar to those of models built using real data.

Authors:

Gregory Dexter, Purdue University
Shaun Grannis, Indiana University
Suranga Kasthurirathne (Presenter), Indiana University
