Identifiable Generative Models for Missing Not at Random Data Imputation

Authors: Chao Ma, Cheng Zhang

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We study the empirical performance of the proposed algorithm of Section 4 with both synthetic data (Section 6.1) and two real-world datasets with music recommendation (Section 6.2) and personalized education (Section 6.3)."
Researcher Affiliation | Collaboration | Chao Ma (1,2), Cheng Zhang (2); 1: University of Cambridge, 2: Microsoft Research Cambridge; cm905@cam.ac.uk, cheng.zhang@microsoft.com
Pseudocode | No | The paper describes the GINA algorithm textually but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "The code is released at https://github.com/microsoft/project-azua."
Open Datasets | Yes | "We apply our models to recommendation systems on Yahoo! R3 dataset [30, 60] for user-song ratings... Finally, we apply our methods to the Eedi education dataset [61]..."
Dataset Splits | Yes | "In this experiment, we randomly split the data in a 90% train / 10% test / 10% validation ratio, and train our models on the response outcome data."
Hardware Specification | No | The paper does not provide specific details on the hardware used to run the experiments, such as CPU or GPU models.
Software Dependencies | No | The paper does not specify version numbers for the software dependencies or libraries used in the experiments.
Experiment Setup | Yes | "We use a 3-layer neural network (512, 512, 20) with ReLU activation for the encoder and decoder, with latent dimension 20. We use a 2-layer neural network for the missingness prediction network with dimension (512, 1). We train with batch size 128, for 500 epochs with Adam optimizer with learning rate 0.001."
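To make the quoted architecture dimensions concrete, here is a minimal NumPy sketch of the layer shapes described above: a 3-layer (512, 512, 20) encoder, a mirrored decoder, and a 2-layer (512, 1) missingness prediction network, applied to one batch of 128 rows. The input dimensionality (`INPUT_DIM`) and the exact layer-by-layer wiring are assumptions for illustration; the authors' actual implementation lives in the project-azua repository and differs in detail (e.g. it is a trained PyTorch model, not a random-weight forward pass).

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(in_dim, out_dim):
    # He-style initialization, reasonable for ReLU layers (illustrative only)
    return rng.normal(0.0, np.sqrt(2.0 / in_dim), (in_dim, out_dim)), np.zeros(out_dim)

def relu(x):
    return np.maximum(x, 0.0)

INPUT_DIM = 100   # hypothetical data dimensionality; not stated in the quote
LATENT_DIM = 20   # latent dimension from the quoted setup
BATCH_SIZE = 128  # batch size from the quoted setup

# Encoder: 3-layer network with hidden sizes (512, 512) and a 20-d output
enc = [dense(INPUT_DIM, 512), dense(512, 512), dense(512, LATENT_DIM)]
# Decoder: assumed to mirror the encoder back to data space
dec = [dense(LATENT_DIM, 512), dense(512, 512), dense(512, INPUT_DIM)]
# Missingness prediction network: 2-layer, dimensions (512, 1)
mis = [dense(INPUT_DIM, 512), dense(512, 1)]

def forward(layers, x):
    # ReLU on all layers except the last (linear) output layer
    for i, (W, b) in enumerate(layers):
        x = x @ W + b
        if i < len(layers) - 1:
            x = relu(x)
    return x

x = rng.normal(size=(BATCH_SIZE, INPUT_DIM))  # one synthetic batch
z = forward(enc, x)        # latent codes, shape (128, 20)
x_hat = forward(dec, z)    # reconstructions, shape (128, 100)
m_logit = forward(mis, x)  # missingness logits, shape (128, 1)
```

In the paper's setup these networks would be trained jointly with Adam (learning rate 0.001) for 500 epochs; the sketch only checks that the stated dimensions compose.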