NeuMiss networks: differentiable programming for supervised learning with missing values.
Authors: Marine Le Morvan, Julie Josse, Thomas Moreau, Erwan Scornet, Gaël Varoquaux
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 4, "Empirical results"; Figure 2 ("Performance as a function of capacity across architectures"): empirical evolution of the performance for a linear generating mechanism in MCAR settings. |
| Researcher Affiliation | Academia | (1) Université Paris-Saclay, Inria, CEA, Palaiseau, 91120, France; (2) Université Paris-Saclay, CNRS/IN2P3, IJCLab, 91405 Orsay, France; (3) CMAP, UMR 7641, École Polytechnique, IP Paris, 91128 Palaiseau, France; (4) Mila, McGill University, Montréal, Canada |
| Pseudocode | No | The paper describes mathematical derivations and network architecture, but does not contain any structured pseudocode or algorithm block. |
| Open Source Code | Yes | The code to reproduce the experiments is available on GitHub: https://github.com/marineLM/NeuMiss |
| Open Datasets | No | The data are generated according to a multivariate Gaussian distribution with covariance matrix Σ = UU⊤ + diag(ε), where U ∈ ℝ^{d×⌊d/2⌋} has entries drawn from a standard normal distribution and the noise ε is a vector of entries drawn uniformly so that Σ is full rank. The mean is drawn from a standard normal distribution. The response Y is generated as a linear function of the complete data X as in Equation 1 (see the data-generation sketch after this table). |
| Dataset Splits | No | The paper mentions that the MLP's hidden layer width and the NeuMiss network depth were 'chosen using a validation set', but it does not provide specific details on the size or percentage of this validation split. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU types, or cloud computing resources) used to run its experiments. |
| Software Dependencies | No | The paper mentions using 'PyTorch [24]' for implementation and 'scikit-learn's [25] IterativeImputer', but does not specify version numbers for these software dependencies. |
| Experiment Setup | Yes | The MLP is trained using ADAM and a batch size of 200. The learning rate is initialized to 10⁻²/d and decreased by a factor of 0.2 when the loss stops decreasing for 2 epochs. Training finishes when either the learning rate goes below 5·10⁻⁶ or the maximum number of epochs is reached. NeuMiss: ... optimized using stochastic gradient descent and a batch size of 10. The learning rate schedule and stopping criterion are the same as for the MLP. (See the training-loop sketch after this table.) |
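
For reference, the quoted data-generating process can be written as a short NumPy sketch. This is not the authors' code: the function name, the noise interval for ε (the extracted text does not give one), and the example coefficients are illustrative assumptions.

```python
import numpy as np

def generate_gaussian_data(n, d, beta, eps_range=(0.01, 0.1), seed=None):
    """Sketch of the paper's synthetic data: X ~ N(mu, Sigma) with
    Sigma = U U^T + diag(eps) and a linear response Y = beta_0 + X beta.
    `eps_range` is a placeholder; the paper's exact interval is not quoted."""
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((d, d // 2))       # entries of U drawn from N(0, 1)
    eps = rng.uniform(*eps_range, size=d)      # noise vector making Sigma full rank
    Sigma = U @ U.T + np.diag(eps)
    mu = rng.standard_normal(d)                # mean drawn from a standard normal
    X = rng.multivariate_normal(mu, Sigma, size=n)
    Y = beta[0] + X @ beta[1:]                 # linear response, as in the paper's Eq. (1)
    return X, Y

# Example usage (beta has length d + 1: intercept plus d slopes):
# X, Y = generate_gaussian_data(n=1000, d=10, beta=np.ones(11), seed=0)
```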
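
Similarly, the reported MLP training setup translates into roughly the following PyTorch sketch: ADAM, batch size 200, learning rate 10⁻²/d reduced by a factor of 0.2 when the loss plateaus for 2 epochs, and training stopped once the learning rate drops below 5·10⁻⁶ or the maximum number of epochs is reached. The function name, the squared loss, and the epoch budget are assumptions, not taken from the authors' repository.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def train_mlp(model, X, y, d, max_epochs=1000):
    """Hedged sketch of the reported training procedure for the MLP baseline."""
    loader = DataLoader(TensorDataset(X, y), batch_size=200, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-2 / d)
    # Multiply the learning rate by 0.2 when the loss stops decreasing for 2 epochs.
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, mode="min", factor=0.2, patience=2)
    criterion = torch.nn.MSELoss()             # squared loss assumed for the regression task
    for epoch in range(max_epochs):
        epoch_loss = 0.0
        for xb, yb in loader:
            optimizer.zero_grad()
            loss = criterion(model(xb).squeeze(-1), yb)
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item() * len(xb)
        scheduler.step(epoch_loss / len(X))
        # Stop once the learning rate falls below 5e-6.
        if optimizer.param_groups[0]["lr"] < 5e-6:
            break
    return model
```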