Simultaneous Missing Value Imputation and Structure Learning with Groups
Authors: Pablo Morales-Alvarez, Wenbo Gong, Angus Lamb, Simon Woodhead, Simon Peyton Jones, Nick Pawlowski, Miltiadis Allamanis, Cheng Zhang
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we conduct extensive experiments on synthetic, semi-synthetic, and real-world education data sets. |
| Researcher Affiliation | Collaboration | Pablo Morales-Alvarez (University of Granada); Wenbo Gong (Microsoft Research); Angus Lamb (G-Research); Simon Woodhead (Eedi); Simon Peyton Jones (Epic Games); Nick Pawlowski (Microsoft Research); Miltiadis Allamanis (Google); Cheng Zhang (Microsoft Research) |
| Pseudocode | Yes | Algorithm 1 Generative process |
| Open Source Code | Yes | We will provide the main model code in the supplemental material. The full running code will be released after acceptance. |
| Open Datasets | Yes | We evaluate our method using a benchmark in healthcare applications [53]. |
| Dataset Splits | No | For each simulated dataset, we simulate 5000 training and 1000 test samples. The train and test sets have 1000 and 500 patients, respectively. There is no explicit mention of a validation set or its split percentage. |
| Hardware Specification | Yes | All experiments were conducted on NVIDIA V100 GPU. |
| Software Dependencies | No | The paper mentions general training details such as the Adam optimizer with a learning rate of 0.001, but does not list specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | We use Adam with a learning rate of 0.001 and a batch size of 128. For the synthetic and neuropathic pain datasets, the models are trained for 200 epochs; for the Eedi data, we train for 500 epochs. For the VISL model, the dimension of the latent variable for each group is 1. The GNN in the decoder has 3 layers, and we run 3 message passing steps. The MLPs have 2 layers with 64 hidden units each. The DAG regulariser strength λ is 0.01 for the synthetic dataset and 0.001 for the Neuropathic Pain and Eedi datasets. |
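The hyperparameters quoted in the "Experiment Setup" row can be collected into a small configuration sketch for anyone attempting a reproduction. This is a minimal sketch only: the names `VISLConfig`, `EPOCHS`, `DAG_LAMBDA`, and `config_for` are illustrative assumptions, not identifiers from the authors' code.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class VISLConfig:
    """Shared hyperparameters reported in the paper (names are illustrative)."""
    learning_rate: float = 0.001     # Adam optimizer
    batch_size: int = 128
    latent_dim_per_group: int = 1    # latent variable dimension per group
    gnn_layers: int = 3              # decoder GNN depth
    message_passing_steps: int = 3
    mlp_layers: int = 2
    mlp_hidden_units: int = 64

# Dataset-specific settings reported in the paper.
EPOCHS = {"synthetic": 200, "neuropathic_pain": 200, "eedi": 500}
DAG_LAMBDA = {"synthetic": 0.01, "neuropathic_pain": 0.001, "eedi": 0.001}

def config_for(dataset: str) -> tuple[VISLConfig, int, float]:
    """Return the shared config plus dataset-specific epochs and DAG penalty."""
    return VISLConfig(), EPOCHS[dataset], DAG_LAMBDA[dataset]
```

A reproduction script would read, for example, `cfg, epochs, lam = config_for("eedi")` to retrieve the 500-epoch / λ = 0.001 setting for the Eedi data.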