How to deal with missing data in supervised deep learning?
Authors: Niels Bruun Ipsen, Pierre-Alexandre Mattei, Jes Frellsen
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We now evaluate discriminative models trained using the supMIWAE bound on a range of supervised learning tasks. Throughout the experiments, the generative part of the model is pretrained using the MIWAE bound of Mattei & Frellsen (2019). The pretrained DLVM can be used to draw single imputations, referred to as MIWAE single imputation, or it can be used as the generative part in the supMIWAE computational structure. Here, the generative part of the model is kept fixed while updating the discriminative part of the model according to the supMIWAE bound. (A minimal sketch of this two-stage procedure is given below the table.) |
| Researcher Affiliation | Collaboration | Niels Bruun Ipsen (nbip@dtu.dk), Pierre-Alexandre Mattei (pierre-alexandre.mattei@inria.fr), Jes Frellsen (jefr@dtu.dk). Department of Applied Mathematics and Computer Science, Technical University of Denmark, Denmark; Université Côte d'Azur, Inria (Maasai team), Laboratoire J.A. Dieudonné, CNRS, France. Equal contribution. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | REPRODUCIBILITY STATEMENT: Code for reproducing paper experiments is available at https://github.com/nbip/supMIWAE. |
| Open Datasets | Yes | We apply the supMIWAE and single imputation methods to the MNIST dataset (LeCun et al., 1998) and the Fashion-MNIST dataset (Xiao et al., 2017) with three different MCAR missing mechanisms... We now turn to regression in smaller and lower-dimensional datasets from the UCI database (Dua & Graff, 2017)... In order to assess classification of natural images we now turn to the Street View House Numbers dataset (SVHN, Netzer et al., 2011). |
| Dataset Splits | Yes | The datasets are split randomly 20 times, with 90% of the data in a training set and 10% in a test set. A validation set with 10% of the training data is used for early stopping, and a batch size of 256 is used. (A sketch of this split protocol is given below the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, or memory amounts) used for running its experiments. It makes general statements like 'on a GPU' but lacks specific models. |
| Software Dependencies | No | The paper mentions 'scikit-learn (Pedregosa et al., 2011)' but does not provide a specific version number for scikit-learn itself, nor does it list versions for other key software components like Python or deep learning frameworks. |
| Experiment Setup | Yes | All datasets consist of a training set with 3k records and validation and test sets with 1k records. An MCAR missing process is introduced in the horizontal coordinate, where each element becomes missing with probability m = 0.5 (an illustrative MCAR masking sketch follows the table). Further training details are in appendix E... In figure 3, imputations from different imputation models are shown along with the corresponding learnt decision surface... During training, K = 5 importance samples and a batch size of 128 are used. |
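
The two-stage procedure quoted in the Research Type row (pretrain a DLVM with the MIWAE bound, then keep the generative part fixed and update the discriminative part with the supMIWAE bound) can be summarised in code. The sketch below is a minimal PyTorch reconstruction under stated assumptions: the network sizes, the mean-fill of missing entries fed to the classifier, the optimiser settings, and all names are illustrative and not taken from the authors' repository.

```python
# Minimal sketch of the two-stage supMIWAE procedure (not the authors' code).
# Stage 1 pretrains a deep latent variable model with the MIWAE importance-weighted
# bound on the observed entries; stage 2 keeps the generative part fixed and updates
# a classifier head with a supMIWAE-style importance-weighted objective.
import math
import torch
import torch.nn as nn
from torch.distributions import Normal, Categorical

D, Z, H, C, K = 784, 20, 128, 10, 5   # data dim, latent dim, hidden units, classes, importance samples
encoder = nn.Sequential(nn.Linear(D, H), nn.Tanh(), nn.Linear(H, 2 * Z))
decoder = nn.Sequential(nn.Linear(Z, H), nn.Tanh(), nn.Linear(H, 2 * D))
classifier = nn.Sequential(nn.Linear(D, H), nn.ReLU(), nn.Linear(H, C))
prior = Normal(torch.zeros(Z), torch.ones(Z))

def importance_terms(x, mask):
    """Log importance weights and imputed inputs; x is [B, D], mask has 1 = observed."""
    mu, log_sigma = encoder(x * mask).chunk(2, dim=-1)
    q = Normal(mu, log_sigma.exp())
    z = q.rsample((K,))                                                  # [K, B, Z]
    mu_x, log_sigma_x = decoder(z).chunk(2, dim=-1)
    log_px = (Normal(mu_x, log_sigma_x.exp()).log_prob(x) * mask).sum(-1)
    log_w = log_px + prior.log_prob(z).sum(-1) - q.log_prob(z).sum(-1)   # [K, B]
    x_imp = x * mask + mu_x * (1 - mask)          # fill missing entries with the decoder mean
    return log_w, x_imp

def miwae_loss(x, mask):
    """Stage 1: negative MIWAE bound (Mattei & Frellsen, 2019) over K importance samples."""
    log_w, _ = importance_terms(x, mask)
    return -(torch.logsumexp(log_w, dim=0) - math.log(K)).mean()

def sup_miwae_loss(x, mask, y):
    """Stage 2: the classification term p(y | x) enters the importance weights."""
    log_w, x_imp = importance_terms(x, mask)
    log_py = Categorical(logits=classifier(x_imp)).log_prob(y)           # [K, B]
    return -(torch.logsumexp(log_w + log_py, dim=0) - math.log(K)).mean()

# Keeping the generative part fixed in stage 2 is handled by passing only the
# classifier parameters to the stage-2 optimiser.
pretrain_opt = torch.optim.Adam([*encoder.parameters(), *decoder.parameters()], lr=1e-3)
finetune_opt = torch.optim.Adam(classifier.parameters(), lr=1e-3)
```

In the paper the discriminative part sees the sampled imputations through the importance-weighted mixture rather than a single decoder-mean fill; the mean fill above is a simplification to keep the sketch short.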
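The element-wise MCAR mechanism referenced in the Open Datasets and Experiment Setup rows (each value dropped independently with probability m, e.g. m = 0.5) amounts to Bernoulli masking. The NumPy helper below is a sketch under the assumption that missing entries are encoded as NaN; the function name and encoding are not taken from the paper.

```python
import numpy as np

def mcar_mask(X, m=0.5, seed=None):
    """Drop each entry of X independently with probability m (MCAR)."""
    rng = np.random.default_rng(seed)
    observed = rng.uniform(size=X.shape) >= m      # True where the value stays observed
    X_missing = np.where(observed, X, np.nan)      # missing entries encoded as NaN
    return X_missing, observed.astype(np.float32)  # data with missingness + observation mask
```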
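The UCI protocol quoted in the Dataset Splits row (20 random repetitions, 90/10 train/test, 10% of training held out for early stopping) can be reproduced with scikit-learn, which the paper already cites. The seeding scheme and generator-style interface below are assumptions, not the authors' exact code.

```python
from sklearn.model_selection import train_test_split

def uci_splits(X, y, n_repeats=20):
    """Yield (train, val, test) splits: 90/10 train/test, then 10% of train as validation."""
    for seed in range(n_repeats):
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.10, random_state=seed)
        X_tr, X_va, y_tr, y_va = train_test_split(X_tr, y_tr, test_size=0.10, random_state=seed)
        yield (X_tr, y_tr), (X_va, y_va), (X_te, y_te)
```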