What’s a good imputation to predict with missing values?

Authors: Marine Le Morvan, Julie Josse, Erwan Scornet, Gael Varoquaux

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments confirm that joint imputation and regression through NeuMiss is better than various two-step procedures in experiments with a finite number of samples.
Researcher Affiliation | Academia | (1) Université Paris-Saclay, Inria, CEA, Palaiseau, 91120, France; (2) Université Paris-Saclay, CNRS/IN2P3, IJCLab, 91405 Orsay, France; (3) CMAP, UMR7641, École Polytechnique, IP Paris, 91128 Palaiseau, France; (4) Inria Sophia-Antipolis, Montpellier, France
Pseudocode | No | No structured pseudocode or algorithm blocks were found.
Open Source Code | Yes | The code for all experiments is available at https://github.com/marineLM/Impute_then_Regress.
Open Datasets | No | The paper states: 'Data generation: The data X ∈ ℝ^{n×d} are generated according to a multivariate Gaussian distribution N(µ, Σ), where the mean is drawn from a standard Gaussian and the covariance is generated as Σ = BBᵀ + D.' This indicates simulated data rather than a publicly available dataset; see the data-generation sketch below the table.
Dataset Splits | Yes | The experiments use training sets of size n = 100,000 and validation and test sets of size n = 10,000. A validation set is used to choose the MLPs' depth (1, 2 or 5), width (1d, 5d or 10d), initial learning rate (ranging from 5e-4 to 1e-2) and weight decay (ranging from 1e-6 to 1e-3); see the hyperparameter-grid sketch below the table.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts, or other machine specifications) used for running the experiments were provided.
Software Dependencies | No | The paper mentions software such as PyTorch and Scikit-learn but does not provide version numbers for these or any other ancillary software components.
Experiment Setup | Yes | A validation set is used to choose the MLPs' depth (1, 2 or 5), width (1d, 5d or 10d), initial learning rate (ranging from 5e-4 to 1e-2) and weight decay (ranging from 1e-6 to 1e-3). Adam is used with an adaptive learning rate: the learning rate is divided by 5 each time 10 consecutive epochs fail to decrease the training loss by at least 1e-4. Early stopping is triggered when the validation score does not improve by at least 1e-4 for 12 consecutive epochs. The batch size is set to 100, and ReLUs are used as activation functions. Finally, for NeuMiss the depth is set to 20. A training-loop sketch implementing this schedule appears below the table.
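
To make the Open Datasets row concrete, here is a minimal NumPy sketch of the low-rank-plus-diagonal data generation quoted above. It is an illustration, not the authors' code: the dimension d, the latent rank of B, and the distribution of the diagonal entries of D are assumptions, as the excerpt does not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, q = 100_000, 10, 5  # d and the latent rank q are assumed values

# Mean drawn from a standard Gaussian; covariance Sigma = B B^T + D
mu = rng.standard_normal(d)
B = rng.standard_normal((d, q))
D = np.diag(rng.uniform(0.1, 1.0, size=d))  # assumed distribution for the diagonal
Sigma = B @ B.T + D

X = rng.multivariate_normal(mu, Sigma, size=n)  # X in R^{n x d}
```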
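
The model selection described in the Dataset Splits row amounts to a grid search scored on the 10,000-sample validation set. A minimal sketch with scikit-learn's ParameterGrid follows; the intermediate learning-rate and weight-decay values are assumptions, since the paper reports only the range endpoints.

```python
from sklearn.model_selection import ParameterGrid

grid = ParameterGrid({
    "depth": [1, 2, 5],
    "width_factor": [1, 5, 10],                # hidden width = factor * d
    "lr": [5e-4, 1e-3, 5e-3, 1e-2],            # endpoints from the paper, middle values assumed
    "weight_decay": [1e-6, 1e-5, 1e-4, 1e-3],  # endpoints from the paper, middle values assumed
})
for params in grid:
    # Train on the 100,000-sample training set, score each configuration
    # on the 10,000-sample validation set, and keep the best one.
    ...
```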
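
The optimization rules in the Experiment Setup row map closely onto PyTorch's ReduceLROnPlateau: factor=0.2 is a division by 5, and patience=10 with an absolute threshold of 1e-4 matches the stated schedule; early stopping is a hand-rolled counter. This is a sketch under those assumptions, not the authors' training script: train_one_epoch and evaluate are hypothetical helpers, and the lr and weight_decay values are placeholders drawn from within the reported ranges.

```python
import torch
from torch import nn

d = 10  # input dimension (placeholder)
model = nn.Sequential(              # depth and width are chosen on the validation set
    nn.Linear(d, 5 * d), nn.ReLU(),
    nn.Linear(5 * d, 1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-5)
# Divide the learning rate by 5 (factor=0.2) once 10 consecutive epochs
# fail to decrease the training loss by at least 1e-4.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.2, patience=10,
    threshold=1e-4, threshold_mode="abs",
)

best_val, bad_epochs = float("inf"), 0
for epoch in range(1000):
    train_loss = train_one_epoch(model, optimizer, batch_size=100)  # hypothetical helper
    val_loss = evaluate(model)                                      # hypothetical helper
    scheduler.step(train_loss)
    # Early stopping: no 1e-4 improvement on validation for 12 consecutive epochs.
    if val_loss < best_val - 1e-4:
        best_val, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= 12:
            break
```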