
Add TabGAN - synthetic tabular data generation with GANs, Diffusion, and LLMs #83

Merged
krzjoa merged 1 commit into krzjoa:master from Diyago:add-tabgan
Apr 13, 2026

Diyago commented Mar 28, 2026

Add TabGAN — Synthetic Tabular Data Generation

TabGAN is a Python library for generating high-quality synthetic tabular data using multiple generative approaches through a unified API:

  • CTGAN (Conditional Tabular GAN) for mixed data types
  • ForestDiffusion (tree-based diffusion) for structured data
  • GReaT (Large Language Models) for semantic dependencies

Key Features

  • Unified API across GANs, Diffusion Models, and LLMs
  • Adversarial filtering ensures distribution consistency
  • Privacy metrics (DCR, NNDR, membership inference)
  • Constraint enforcement (range, uniqueness, formula, regex)
  • HTML quality reports with distribution comparisons
  • sklearn TabGANTransformer for Pipeline integration
  • 100K+ PyPI downloads, 115 tests, Apache 2.0 license
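
For readers unfamiliar with the privacy metrics named above, here is a minimal sketch of how DCR (distance to closest record) and NNDR (nearest-neighbor distance ratio) are commonly defined. It is not TabGAN's implementation: the function names `dcr` and `nndr`, the helper `nearest_two_distances`, and the toy data are hypothetical, and it assumes purely numeric rows compared with Euclidean distance and at least two real rows.

```python
import math


def nearest_two_distances(row, real_rows):
    """Return the two smallest Euclidean distances from `row` to `real_rows`."""
    dists = sorted(math.dist(row, r) for r in real_rows)
    return dists[0], dists[1]


def dcr(synthetic_rows, real_rows):
    """Distance from each synthetic row to its closest real row.

    Values near zero suggest a synthetic row nearly copies a real record.
    """
    return [nearest_two_distances(s, real_rows)[0] for s in synthetic_rows]


def nndr(synthetic_rows, real_rows):
    """Ratio of nearest to second-nearest real-row distance, per synthetic row.

    Ratios near zero mean the row sits much closer to one real record
    than to any other, which is a possible privacy risk.
    """
    ratios = []
    for s in synthetic_rows:
        d1, d2 = nearest_two_distances(s, real_rows)
        # Illustrative convention: treat a zero second distance as ratio 0.0.
        ratios.append(d1 / d2 if d2 > 0 else 0.0)
    return ratios


# Toy data, invented for illustration only.
real = [(0.0, 0.0), (3.0, 4.0), (10.0, 0.0)]
synthetic = [(0.0, 0.0), (3.0, 0.0)]

print(dcr(synthetic, real))   # [0.0, 3.0]
print(nndr(synthetic, real))  # [0.0, 0.75]
```

The first synthetic row duplicates a real record, so both metrics are zero for it. The second sits between two real records and scores higher on both. Mixed-type tables would need encoding and scaling before distances like these are meaningful.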

Paper: Tabular GANs for uneven distribution

@krzjoa krzjoa merged commit e6c1534 into krzjoa:master Apr 13, 2026