GAIN: Missing Data Imputation using Generative Adversarial Nets
Authors: Jinsung Yoon, James Jordon, Mihaela van der Schaar
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We tested our method on various datasets and found that GAIN significantly outperforms state-of-the-art imputation methods. In this section, we validate the performance of GAIN using multiple real-world datasets. In the first set of experiments we qualitatively analyze the properties of GAIN. In the second we quantitatively evaluate the imputation performance of GAIN using various UCI datasets (Lichman, 2013), giving comparisons with state-of-the-art imputation methods. |
| Researcher Affiliation | Academia | 1University of California, Los Angeles, CA, USA 2University of Oxford, UK 3Alan Turing Institute, UK. Correspondence to: Jinsung Yoon <jsyoon0823@gmail.com>. |
| Pseudocode | Yes | Algorithm 1 Pseudo-code of GAIN |
| Open Source Code | No | The paper does not provide an explicit link or statement about open-source code availability for the described methodology. |
| Open Datasets | Yes | We use five real-world datasets from UCI Machine Learning Repository (Lichman, 2013) (Breast, Spam, Letter, Credit, and News) to quantitatively evaluate the imputation performance of GAIN. |
| Dataset Splits | Yes | We conduct each experiment 10 times and within each experiment we use 5-fold cross-validation. (A hedged sketch of this protocol follows the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers). |
| Experiment Setup | Yes | Details of hyper-parameter selection can be found in the Supplementary Materials. We first optimize the discriminator $D$ with a fixed generator $G$ using mini-batches of size $k_D$. Second, we optimize the generator $G$ using the newly updated discriminator $D$ with mini-batches of size $k_G$. $G$ is then trained to minimize the weighted sum of the two losses: $\mathcal{L}_G(\mathbf{m}^{(j)}, \hat{\mathbf{m}}^{(j)}, \mathbf{b}^{(j)}) + \alpha \mathcal{L}_M(\mathbf{x}^{(j)}, \hat{\mathbf{x}}^{(j)})$, where $\alpha$ is a hyper-parameter. (A training-step sketch follows the table.) |
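The "Experiment Setup" row quotes the paper's alternating optimization: update $D$ on a mini-batch of size $k_D$ with $G$ fixed, then update $G$ on a mini-batch of size $k_G$ against the refreshed $D$, minimizing the adversarial loss plus $\alpha$ times a reconstruction loss on observed entries. Below is a minimal PyTorch sketch of one such iteration. The network architectures, `sample_batch`, and the values of `dim`, `alpha`, `hint_rate`, `k_D`, and `k_G` are illustrative assumptions, not the paper's actual configuration; only the loss structure follows the quoted setup.

```python
import torch
import torch.nn as nn

# Illustrative hyper-parameters (assumptions, not the paper's values).
dim, alpha, hint_rate, k_D, k_G = 10, 100.0, 0.9, 128, 128

# Toy MLPs standing in for the paper's generator and discriminator:
# G maps (x_tilde, m) -> imputed values; D maps (x_hat, h) -> P(observed).
G = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim), nn.Sigmoid())
D = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim), nn.Sigmoid())
opt_D = torch.optim.Adam(D.parameters())
opt_G = torch.optim.Adam(G.parameters())
bce = nn.BCELoss()

def sample_batch(k):
    # Hypothetical stand-in for a real data loader: x in [0,1]^dim,
    # m = observation mask (1 = observed), z = noise for missing entries,
    # h = hint vector built from m with hint rate `hint_rate`.
    x = torch.rand(k, dim)
    m = (torch.rand(k, dim) > 0.2).float()
    z = 0.01 * torch.rand(k, dim)
    x_tilde = m * x + (1 - m) * z
    b = (torch.rand(k, dim) < hint_rate).float()
    h = b * m + 0.5 * (1 - b)
    return x, m, x_tilde, h

# Step 1: optimize D with G fixed, mini-batch of size k_D.
x, m, x_tilde, h = sample_batch(k_D)
with torch.no_grad():
    x_bar = G(torch.cat([x_tilde, m], dim=1))
x_hat = m * x_tilde + (1 - m) * x_bar
loss_D = bce(D(torch.cat([x_hat, h], dim=1)), m)  # D tries to recover the mask m
opt_D.zero_grad(); loss_D.backward(); opt_D.step()

# Step 2: optimize G against the newly updated D, mini-batch of size k_G.
x, m, x_tilde, h = sample_batch(k_G)
x_bar = G(torch.cat([x_tilde, m], dim=1))
x_hat = m * x_tilde + (1 - m) * x_bar
d_prob = D(torch.cat([x_hat, h], dim=1))
loss_adv = -((1 - m) * torch.log(d_prob + 1e-8)).mean()  # L_G: fool D on missing entries
loss_rec = (m * (x - x_bar) ** 2).mean()                 # L_M: reconstruct observed entries
loss_G = loss_adv + alpha * loss_rec                     # weighted sum with hyper-parameter alpha
opt_G.zero_grad(); loss_G.backward(); opt_G.step()
```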
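The "Dataset Splits" row reports 10 repeated experiments, each with 5-fold cross-validation. A minimal sketch of that evaluation protocol is below, assuming MCAR missingness at an illustrative 20% rate and a hypothetical imputer interface `impute_fn(X_train, X_miss, mask)`; the paper does not prescribe this exact interface or RMSE aggregation, so treat the details as assumptions.

```python
import numpy as np
from sklearn.model_selection import KFold

def evaluate_imputer(X, impute_fn, miss_rate=0.2, n_repeats=10, n_folds=5):
    """Run 10 repetitions of 5-fold CV, scoring imputation RMSE.

    `impute_fn(X_train, X_miss, mask)` is a hypothetical imputer; `mask`
    is 1 where a value is observed and 0 where it was removed (MCAR).
    """
    rng = np.random.default_rng(0)
    rmses = []
    for rep in range(n_repeats):
        kf = KFold(n_splits=n_folds, shuffle=True, random_state=rep)
        for train_idx, test_idx in kf.split(X):
            X_train, X_test = X[train_idx], X[test_idx]
            # Remove entries completely at random from the test fold.
            mask = (rng.random(X_test.shape) > miss_rate).astype(float)
            X_miss = np.where(mask == 1, X_test, np.nan)
            X_hat = impute_fn(X_train, X_miss, mask)
            # RMSE is computed only on the entries that were removed.
            err = (X_test - X_hat) * (1 - mask)
            rmses.append(np.sqrt((err ** 2).sum() / (1 - mask).sum()))
    return float(np.mean(rmses))
```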