What’s a good imputation to predict with missing values?

Authors: Marine Le Morvan, Julie Josse, Erwan Scornet, Gael Varoquaux

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments confirm that joint imputation and regression through NeuMiss is better than various two-step procedures in experiments with a finite number of samples.
Researcher Affiliation | Academia | (1) Université Paris-Saclay, Inria, CEA, Palaiseau, 91120, France; (2) Université Paris-Saclay, CNRS/IN2P3, IJCLab, 91405 Orsay, France; (3) CMAP, UMR7641, École Polytechnique, IP Paris, 91128 Palaiseau, France; (4) Inria Sophia-Antipolis, Montpellier, France
Pseudocode | No | No structured pseudocode or algorithm blocks were found.
Open Source Code | Yes | The code for all experiments is available at https://github.com/marineLM/Impute_then_Regress.
Open Datasets | No | The paper states: 'Data generation: The data X ∈ ℝ^{n×d} are generated according to a multivariate Gaussian distribution N(µ, Σ), where the mean is drawn from a standard Gaussian and the covariance is generated as Σ = BBᵀ + D.' This indicates simulated data rather than a publicly available dataset; see the data-generation sketch below the table.
Dataset Splits | Yes | The experiments use training sets of size n = 100,000 and validation and test sets of size n = 10,000. A validation set is used to choose the MLPs' depth (1, 2 or 5), width (1d, 5d or 10d), initial learning rate (ranging from 5e-4 to 1e-2) and weight decay (ranging from 1e-6 to 1e-3); see the hyperparameter-grid sketch below the table.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts, or other machine specifications) used for running the experiments were provided.
Software Dependencies | No | The paper mentions software such as PyTorch and Scikit-learn but does not provide version numbers for these or any other ancillary software components.
Experiment Setup | Yes | A validation set is used to choose the MLPs' depth (1, 2 or 5), width (1d, 5d or 10d), initial learning rate (ranging from 5e-4 to 1e-2) and weight decay (ranging from 1e-6 to 1e-3). Adam is used with an adaptive learning rate: the learning rate is divided by 5 each time 10 consecutive epochs fail to decrease the training loss by at least 1e-4. Early stopping is triggered when the validation score does not improve by at least 1e-4 for 12 consecutive epochs. The batch size is set to 100, and ReLUs are used as activation functions. Finally, for NeuMiss the depth is set to 20. A training-loop sketch implementing this schedule appears below the table.
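
To make the Open Datasets row concrete, here is a minimal NumPy sketch of the low-rank-plus-diagonal data generation quoted above. It is an illustration, not the authors' code: the dimension d, the latent rank of B, and the distribution of the diagonal entries of D are assumptions, as the excerpt does not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, q = 100_000, 10, 5  # d and the latent rank q are assumed values

# Mean drawn from a standard Gaussian; covariance Sigma = B B^T + D
mu = rng.standard_normal(d)
B = rng.standard_normal((d, q))
D = np.diag(rng.uniform(0.1, 1.0, size=d))  # assumed distribution for the diagonal
Sigma = B @ B.T + D

X = rng.multivariate_normal(mu, Sigma, size=n)  # X in R^{n x d}
```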
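
The model selection described in the Dataset Splits row amounts to a grid search scored on the 10,000-sample validation set. A minimal sketch with scikit-learn's ParameterGrid follows; the intermediate learning-rate and weight-decay values are assumptions, since the paper reports only the range endpoints.

```python
from sklearn.model_selection import ParameterGrid

grid = ParameterGrid({
    "depth": [1, 2, 5],
    "width_factor": [1, 5, 10],                # hidden width = factor * d
    "lr": [5e-4, 1e-3, 5e-3, 1e-2],            # endpoints from the paper, middle values assumed
    "weight_decay": [1e-6, 1e-5, 1e-4, 1e-3],  # endpoints from the paper, middle values assumed
})
for params in grid:
    # Train on the 100,000-sample training set, score each configuration
    # on the 10,000-sample validation set, and keep the best one.
    ...
```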
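
The optimization rules in the Experiment Setup row map closely onto PyTorch's ReduceLROnPlateau: factor=0.2 is a division by 5, and patience=10 with an absolute threshold of 1e-4 matches the stated schedule; early stopping is a hand-rolled counter. This is a sketch under those assumptions, not the authors' training script: train_one_epoch and evaluate are hypothetical helpers, and the lr and weight_decay values are placeholders drawn from within the reported ranges.

```python
import torch
from torch import nn

d = 10  # input dimension (placeholder)
model = nn.Sequential(              # depth and width are chosen on the validation set
    nn.Linear(d, 5 * d), nn.ReLU(),
    nn.Linear(5 * d, 1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-5)
# Divide the learning rate by 5 (factor=0.2) once 10 consecutive epochs
# fail to decrease the training loss by at least 1e-4.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.2, patience=10,
    threshold=1e-4, threshold_mode="abs",
)

best_val, bad_epochs = float("inf"), 0
for epoch in range(1000):
    train_loss = train_one_epoch(model, optimizer, batch_size=100)  # hypothetical helper
    val_loss = evaluate(model)                                      # hypothetical helper
    scheduler.step(train_loss)
    # Early stopping: no 1e-4 improvement on validation for 12 consecutive epochs.
    if val_loss < best_val - 1e-4:
        best_val, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= 12:
            break
```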