How to deal with missing data in supervised deep learning?

Authors: Niels Bruun Ipsen, Pierre-Alexandre Mattei, Jes Frellsen

ICLR 2022

Each entry below pairs a reproducibility variable with its result and the LLM's supporting response.
Research Type: Experimental. "We now evaluate discriminative models trained using the supMIWAE bound on a range of supervised learning tasks. Throughout the experiments the generative part of the model is pretrained using the MIWAE bound of Mattei & Frellsen (2019). The pretrained DLVM can be used to draw single imputations, referred to as MIWAE single imputation, or it can be used as the generative part in the supMIWAE computational structure. Here, the generative part of the model is kept fixed while updating the discriminative part of the model according to the supMIWAE bound." (A hedged code sketch of this two-stage objective appears after this list.)
Researcher Affiliation: Collaboration. Niels Bruun Ipsen (nbip@dtu.dk), Pierre-Alexandre Mattei (pierre-alexandre.mattei@inria.fr), Jes Frellsen (jefr@dtu.dk). Department of Applied Mathematics and Computer Science, Technical University of Denmark, Denmark; Université Côte d'Azur, Inria (Maasai team), Laboratoire J.A. Dieudonné, CNRS, France. Equal contribution.
Pseudocode: No. The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code: Yes. "REPRODUCIBILITY STATEMENT: Code for reproducing paper experiments is available at https://github.com/nbip/supMIWAE."
Open Datasets: Yes. "We apply the supMIWAE and single imputation methods to the MNIST dataset (LeCun et al., 1998) and the Fashion-MNIST dataset (Xiao et al., 2017) with three different MCAR missing mechanisms... We now turn to regression in smaller and lower-dimensional datasets from the UCI database (Dua & Graff, 2017)... In order to assess classification of natural images we now turn to the Street View House Numbers dataset (SVHN, Netzer et al., 2011)." (An element-wise MCAR mask sketch appears after this list.)
Dataset Splits: Yes. "The datasets are split randomly 20 times with 90% of the data in a training set and 10% in a test set. A validation set with 10% of the training data is used for early stopping and a batch size of 256 is used." (See the split-protocol sketch after this list.)
Hardware Specification: No. The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, or memory amounts) used for running its experiments. It makes only general statements such as 'on a GPU' without naming specific models.
Software Dependencies: No. The paper mentions scikit-learn (Pedregosa et al., 2011) but does not give a version number for it, nor versions for other key software components such as Python or the deep learning framework.
Experiment Setup: Yes. "All datasets consist of a training set with 3k records and validation and test sets with 1k records. An MCAR missing process is introduced in the horizontal coordinate, where each element becomes missing with probability m = 0.5. Further training details are in appendix E... In figure 3 imputations from different imputation models are shown along with the corresponding learnt decision surface... During training, K = 5 importance samples and a batch size of 128 are used." (The final sketch after this list wires these settings to the supMIWAE objective.)
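
The Research Type entry above describes the two-stage supMIWAE procedure: the generative model is pretrained with the MIWAE bound, then kept fixed while the discriminative part is trained with the supMIWAE bound. Below is a minimal PyTorch sketch of that objective's structure, assuming hypothetical `encoder`, `decoder`, and `classifier` interfaces; it is an illustration, not the authors' implementation (which lives in the repository linked above).

```python
import math
import torch
from torch.distributions import Normal, Independent, Categorical

def sup_miwae_loss(x_obs, mask, y, encoder, decoder, classifier, K=5):
    """Negative supMIWAE bound for one mini-batch.

    Assumed (hypothetical) interfaces:
      encoder(x_obs, mask) -> Independent Normal q(z | x_obs), batch (B,), event (d,)
      decoder(z)           -> Independent distribution p(x | z) over the D features
      classifier(x)        -> class logits for p(y | x)
    mask is 1 where a feature is observed and 0 where it is missing.
    """
    q_z = encoder(x_obs, mask)
    z = q_z.rsample((K,))                                    # (K, B, d) reparameterised samples
    prior = Independent(Normal(torch.zeros_like(z), torch.ones_like(z)), 1)
    p_x = decoder(z)                                         # distribution over (K, B, D)
    # Importance weights use only the observed coordinates of x.
    log_p_x_obs = (p_x.base_dist.log_prob(x_obs) * mask).sum(-1)
    log_w = log_p_x_obs + prior.log_prob(z) - q_z.log_prob(z)      # (K, B)
    # Complete the input: keep observed values, fill missing ones with decoder samples.
    x_comp = mask * x_obs + (1.0 - mask) * p_x.sample()
    log_p_y = Categorical(logits=classifier(x_comp)).log_prob(y)   # (K, B)
    # log (1/K) sum_k p(y | x_comp_k) w_k: a single lower bound on log p(y, x_obs).
    bound = torch.logsumexp(log_p_y + log_w, dim=0) - math.log(K)
    return -bound.mean()
```

Because the generative part is frozen in this phase, only the `classifier` parameters receive gradients; the importance weights `log_w` reweight the K completions so that plausible imputations of the missing features dominate the classification loss.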
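
The Open Datasets entry mentions three MCAR missing mechanisms for MNIST and Fashion-MNIST. Those mechanisms are not quoted here, so the sketch below shows only the simplest element-wise variant, where each pixel is dropped independently; the names and shapes are illustrative.

```python
import numpy as np

def mcar_mask(shape, miss_prob, rng):
    """Element-wise MCAR mask: each entry is missing independently with
    probability miss_prob. Returns 1 where observed, 0 where missing."""
    return (rng.random(shape) >= miss_prob).astype(np.float32)

rng = np.random.default_rng(0)
x = rng.random((256, 784)).astype(np.float32)   # stand-in for flattened MNIST images
mask = mcar_mask(x.shape, miss_prob=0.5, rng=rng)
x_obs = x * mask                                # zero out missing pixels before encoding
```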
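
For the UCI experiments, the Dataset Splits entry describes 20 random 90/10 train/test splits with a further 10% of each training set held out for early stopping. The following is one plausible reading of that protocol; the authors' own splitting code is in the linked repository.

```python
import numpy as np

def uci_splits(n, n_repeats=20, seed=0):
    """Yield (train, val, test) index arrays: 90/10 train/test, repeated
    n_repeats times, with 10% of the training data held out for early stopping."""
    rng = np.random.default_rng(seed)
    for _ in range(n_repeats):
        perm = rng.permutation(n)
        n_test = n // 10
        test, rest = perm[:n_test], perm[n_test:]
        n_val = len(rest) // 10
        val, train = rest[:n_val], rest[n_val:]
        yield train, val, test
```

Each yielded triple indexes the full dataset, so test metrics can be aggregated over the 20 repetitions.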
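
Finally, the Experiment Setup entry quotes the 2D experiments: 3k/1k/1k splits, MCAR on the horizontal coordinate with m = 0.5, K = 5 importance samples, and batch size 128. The sketch below wires those numbers to the `sup_miwae_loss` function defined above; the data, the learning rate, and the `encoder`/`decoder`/`classifier` modules are placeholders (the generative pair is assumed pretrained and frozen).

```python
import torch

torch.manual_seed(0)

# Placeholder 2D data with the quoted sizes: 3k train, 1k validation, 1k test.
X, y = torch.randn(5000, 2), torch.randint(0, 2, (5000,))
x_tr, y_tr = X[:3000], y[:3000]
x_va, y_va = X[3000:4000], y[3000:4000]   # validation split for early stopping
x_te, y_te = X[4000:], y[4000:]

# MCAR in the horizontal coordinate only: feature 0 is missing with probability 0.5.
mask_tr = torch.ones_like(x_tr)
mask_tr[:, 0] = (torch.rand(len(x_tr)) >= 0.5).float()

# One epoch of discriminative updates: K = 5 importance samples, batch size 128.
# encoder/decoder are the pretrained, frozen DLVM; only classifier is trained.
opt = torch.optim.Adam(classifier.parameters(), lr=1e-3)  # lr is illustrative
for i in range(0, len(x_tr), 128):
    xb, mb, yb = x_tr[i:i+128], mask_tr[i:i+128], y_tr[i:i+128]
    loss = sup_miwae_loss(xb, mb, yb, encoder, decoder, classifier, K=5)
    opt.zero_grad()
    loss.backward()
    opt.step()
```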