Simultaneous Missing Value Imputation and Structure Learning with Groups
Authors: Pablo Morales-Alvarez, Wenbo Gong, Angus Lamb, Simon Woodhead, Simon Peyton Jones, Nick Pawlowski, Miltiadis Allamanis, Cheng Zhang
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we conduct extensive experiments on synthetic, semi-synthetic, and real-world education data sets. |
| Researcher Affiliation | Collaboration | Pablo Morales-Alvarez (University of Granada); Wenbo Gong (Microsoft Research); Angus Lamb (G-Research); Simon Woodhead (Eedi); Simon Peyton Jones (Epic Games); Nick Pawlowski (Microsoft Research); Miltiadis Allamanis (Google); Cheng Zhang (Microsoft Research) |
| Pseudocode | Yes | Algorithm 1 Generative process |
| Open Source Code | Yes | We will provide the main model code in the supplemental material. The full running code will be released after acceptance. |
| Open Datasets | Yes | We evaluate our method using a benchmark in healthcare applications [53]. |
| Dataset Splits | No | For each simulated dataset, we simulate 5000 training and 1000 test samples. The train and test sets have 1000 and 500 patients, respectively. There is no explicit mention of a validation set or its split percentage. |
| Hardware Specification | Yes | All experiments were conducted on NVIDIA V100 GPU. |
| Software Dependencies | No | The paper mentions general training details such as the Adam optimizer with a learning rate of 0.001, but does not list specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | We use Adam with a learning rate of 0.001 and a batch size of 128. For the synthetic and neuropathic pain datasets, the models are trained for 200 epochs; for the Eedi data, we train for 500 epochs. For the VISL model, the dimension of the latent variable for each group is 1. The GNN in the decoder has 3 layers, and we run 3 message passing steps. The MLPs have 2 layers with 64 hidden units each. The DAG regulariser strength λ is 0.01 for the synthetic dataset and 0.001 for the Neuropathic Pain and Eedi datasets. |
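The hyperparameters quoted in the "Experiment Setup" row can be collected into a small configuration sketch for anyone attempting a reproduction. This is a minimal sketch only: the names `VISLConfig`, `EPOCHS`, `DAG_LAMBDA`, and `config_for` are illustrative assumptions, not identifiers from the authors' code.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class VISLConfig:
    """Shared hyperparameters reported in the paper (names are illustrative)."""
    learning_rate: float = 0.001     # Adam optimizer
    batch_size: int = 128
    latent_dim_per_group: int = 1    # latent variable dimension per group
    gnn_layers: int = 3              # decoder GNN depth
    message_passing_steps: int = 3
    mlp_layers: int = 2
    mlp_hidden_units: int = 64

# Dataset-specific settings reported in the paper.
EPOCHS = {"synthetic": 200, "neuropathic_pain": 200, "eedi": 500}
DAG_LAMBDA = {"synthetic": 0.01, "neuropathic_pain": 0.001, "eedi": 0.001}

def config_for(dataset: str) -> tuple[VISLConfig, int, float]:
    """Return the shared config plus dataset-specific epochs and DAG penalty."""
    return VISLConfig(), EPOCHS[dataset], DAG_LAMBDA[dataset]
```

A reproduction script would read, for example, `cfg, epochs, lam = config_for("eedi")` to retrieve the 500-epoch / λ = 0.001 setting for the Eedi data.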