Bt-GAN: Generating Fair Synthetic Healthdata via Bias-transforming Generative Adversarial Networks
Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science & Engineering. (Reasoning and Learning Lab). ORCID iD: 0000-0002-4302-9327
Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science & Engineering. (Reasoning and Learning Lab). ORCID iD: 0000-0001-5307-997X
Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science & Engineering. (Reasoning and Learning Lab). ORCID iD: 0000-0002-9240-4605
Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems. Linköping University, Faculty of Science & Engineering. (Reasoning and Learning Lab). ORCID iD: 0000-0002-9595-2471
2024 (English). In: The journal of artificial intelligence research, ISSN 1076-9757, E-ISSN 1943-5037, Vol. 79, p. 1313-1341. Article in journal (Refereed), Published
Abstract [en]

Synthetic data generation offers a promising way to enhance the usefulness of Electronic Healthcare Records (EHR) by generating realistic de-identified data. However, the existing literature focuses primarily on the quality of synthetic health data and neglects fairness in downstream predictions. Consequently, models trained on synthetic EHR have been criticized for producing biased outcomes in target tasks. These biases can arise from either (i) spurious correlations between features or (ii) the failure of models to accurately represent sub-groups. To address these concerns, we present Bias-transforming Generative Adversarial Networks (Bt-GAN), a GAN-based synthetic data generator specifically designed for the healthcare domain. To tackle spurious correlations (i), we propose an information-constrained Data Generation Process (DGP) that enables the generator to learn a fair deterministic transformation based on a well-defined notion of algorithmic fairness. To overcome the challenge of capturing exact sub-group representations (ii), we incentivize the generator to preserve sub-group densities through score-based weighted sampling. This approach compels the generator to learn from underrepresented regions of the data manifold. To evaluate the effectiveness of our proposed method, we conduct extensive experiments using the Medical Information Mart for Intensive Care (MIMIC-III) database. Our results demonstrate that Bt-GAN achieves state-of-the-art accuracy while significantly improving fairness and minimizing bias amplification. Furthermore, we perform an in-depth explainability analysis to provide additional evidence supporting the validity of our study. In conclusion, our research introduces a novel approach to addressing the limitations of synthetic data generation in the healthcare domain. By incorporating fairness considerations and leveraging advanced techniques such as GANs, we pave the way for more reliable and unbiased predictions in healthcare applications.
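
The idea of weighted sampling that favours underrepresented sub-groups, mentioned in the abstract, can be illustrated with a minimal sketch. This is not the Bt-GAN procedure: the abstract does not say how its scores are computed, so the sketch assumes a simple inverse-frequency rule as a stand-in. The function names `subgroup_weights` and `weighted_batch` and the toy data are hypothetical.

```python
import random
from collections import Counter


def subgroup_weights(groups):
    """Return one normalized sampling weight per record.

    ASSUMPTION: the weight is inversely proportional to the size of the
    record's sub-group. This is a stand-in for the score-based weights
    named in the abstract, whose exact form is not given there.
    """
    counts = Counter(groups)
    raw = [1.0 / counts[g] for g in groups]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_batch(records, groups, batch_size, seed=0):
    """Draw a batch with replacement, using the sub-group weights."""
    rng = random.Random(seed)
    weights = subgroup_weights(groups)
    return rng.choices(records, weights=weights, k=batch_size)


# Toy data (hypothetical): sub-group "b" holds only 10% of the records.
groups = ["a"] * 90 + ["b"] * 10
records = list(range(100))

weights = subgroup_weights(groups)
batch = weighted_batch(records, groups, batch_size=1000)

# Share of the batch drawn from the underrepresented sub-group.
share_b = sum(1 for r in batch if groups[r] == "b") / len(batch)
print(f"share of sub-group b in the batch: {share_b:.2f}")
```

Under this assumed rule each sub-group receives the same total sampling mass, so the minority group appears in a batch far more often than its 10% share of the raw data. A score-based scheme would replace the inverse-frequency rule with weights derived from a learned score.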

Place, publisher, year, edition, pages
AAAI Press, 2024. Vol. 79, p. 1313-1341
Keywords [en]
Fair data generation, Trustworthy AI, Synthetic data generation, MIMIC-III, EHR
National Category
Computer Systems
Identifiers
URN: urn:nbn:se:liu:diva-203151
DOI: 10.1613/jair.1.15317
ISI: 001218386100001
OAI: oai:DiVA.org:liu-203151
DiVA, id: diva2:1855168
Note

Funding agencies: Knut and Alice Wallenberg Foundation; ELLIIT Excellence Center at Linköping-Lund for Information Technology; TAILOR, an EU project

Available from: 2024-04-30. Created: 2024-04-30. Last updated: 2025-03-30. Bibliographically approved.

Authority records

Ramachandranpillai, Resmi; Sikder, Md Fahim; Bergström, David; Heintz, Fredrik
