NeuMiss networks: differentiable programming for supervised learning with missing values.

Authors: Marine Le Morvan, Julie Josse, Thomas Moreau, Erwan Scornet, Gael Varoquaux

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section "4 Empirical results"; Figure 2: "Performance as a function of capacity across architectures. Empirical evolution of the performance for a linear generating mechanism in MCAR settings."
Researcher Affiliation | Academia | (1) Université Paris-Saclay, Inria, CEA, Palaiseau, 91120, France; (2) Université Paris-Saclay, CNRS/IN2P3, IJCLab, 91405 Orsay, France; (3) CMAP, UMR7641, Ecole Polytechnique, IP Paris, 91128 Palaiseau, France; (4) Mila, McGill University, Montréal, Canada
Pseudocode | No | The paper describes mathematical derivations and the network architecture, but does not contain any structured pseudocode or algorithm block.
Open Source Code | Yes | The code to reproduce the experiments is available on GitHub: https://github.com/marineLM/NeuMiss
Open Datasets | No | The data are generated according to a multivariate Gaussian distribution, with a covariance matrix $\Sigma = UU^\top + \mathrm{diag}(\epsilon)$, $U \in \mathbb{R}^{d \times \lfloor d/2 \rfloor}$, and the entries of $U$ drawn from a standard normal distribution. The noise $\epsilon$ is a vector of entries drawn uniformly to make $\Sigma$ full rank. The mean is drawn from a standard normal distribution. The response $Y$ is generated as a linear function of the complete data $X$ as in Equation (1). (A data-generation sketch follows the table.)
Dataset Splits | No | The paper mentions that the MLP's hidden layer width and the NeuMiss network depth were "chosen using a validation set", but it does not provide specific details on the size or percentage of this validation split.
Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU types, or cloud computing resources) used to run its experiments.
Software Dependencies | No | The paper mentions using PyTorch [24] for the implementation and scikit-learn's [25] IterativeImputer, but does not specify version numbers for these software dependencies. (An imputation usage sketch follows the table.)
Experiment Setup | Yes | The MLP is trained using ADAM and a batch size of 200. The learning rate is initialized to $10^{-2}/d$ and decreased by a factor of 0.2 when the loss stops decreasing for 2 epochs. The training finishes when either the learning rate goes below $5 \times 10^{-6}$ or the maximum number of epochs is reached. NeuMiss: ...optimized using stochastic gradient descent and a batch size of 10. The learning rate schedule and stopping criterion are the same as for the MLP. (A training-schedule sketch follows the table.)
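As a point of reference for the "Open Datasets" row, here is a minimal NumPy sketch of the described simulation. The values of n, d, the noise interval, and the regression coefficients are placeholders (none are quoted above); this is not the authors' script.

import numpy as np

rng = np.random.default_rng(0)
n, d = 10_000, 20  # placeholder sample size and dimension

# Covariance Sigma = U U^T + diag(eps), with U in R^{d x floor(d/2)}
U = rng.standard_normal((d, d // 2))
eps = rng.uniform(0.1, 0.5, size=d)  # placeholder interval; the paper's exact range is not quoted here
Sigma = U @ U.T + np.diag(eps)
mu = rng.standard_normal(d)  # mean drawn from a standard normal distribution

# Complete data X ~ N(mu, Sigma); response Y linear in X (cf. Equation (1)), with placeholder coefficients
X = rng.multivariate_normal(mu, Sigma, size=n)
beta = rng.standard_normal(d)
Y = X @ beta + rng.standard_normal(n)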
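For the "Software Dependencies" row, a minimal usage sketch of scikit-learn's IterativeImputer, which is an experimental estimator that must be enabled explicitly. The toy array is illustrative, and no version is pinned since the paper reports none.

import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401  (required to expose IterativeImputer)
from sklearn.impute import IterativeImputer

X_missing = np.array([[1.0, 2.0], [np.nan, 4.0], [5.0, np.nan]])
imputer = IterativeImputer(random_state=0)
X_imputed = imputer.fit_transform(X_missing)  # conditional imputation of the NaN entries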
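For the "Experiment Setup" row, a hedged PyTorch sketch of the quoted MLP training schedule. The model, the data loaders, max_epochs, and the exact form of the initial learning rate (written here as 10^-2/d, following the reconstruction above) are assumptions, not the authors' code.

import torch

def train_mlp(model, train_loader, val_loader, d, max_epochs=1000):
    criterion = torch.nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-2 / d)  # initial learning rate as reconstructed above
    # Decrease the learning rate by a factor of 0.2 when the loss stops decreasing for 2 epochs.
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min", factor=0.2, patience=2)
    for epoch in range(max_epochs):
        model.train()
        for xb, yb in train_loader:  # batch size (200 for the MLP) is set in the DataLoader
            optimizer.zero_grad()
            loss = criterion(model(xb), yb)
            loss.backward()
            optimizer.step()
        model.eval()
        with torch.no_grad():
            val_loss = sum(criterion(model(xb), yb).item() for xb, yb in val_loader)
        scheduler.step(val_loss)
        if optimizer.param_groups[0]["lr"] < 5e-6:  # stop once the learning rate falls below 5e-6
            break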