End-to-End Learning for the Deep Multivariate Probit Model

Authors: Di Chen, Yexiang Xue, Carla Gomes

ICML 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present both theoretical and empirical analysis of the convergence behavior of DMVP's sampling process with respect to the resolution of the correlation structure. We provide convergence guarantees for DMVP, and our empirical analysis demonstrates the advantages of DMVP's sampling compared with standard MCMC-based methods. We also show that when applied to multi-entity modelling problems, which are natural DMVP applications, DMVP trains at least an order of magnitude faster than classical MVP, captures rich correlations among entities, and further improves the joint likelihood of entities compared with several competitive models. (A hedged sketch of sampling-based MVP likelihood estimation follows the table.)
Researcher Affiliation | Academia | Computer Science Department, Cornell University, Ithaca, NY, US 14850. Correspondence to: Di Chen <di@cs.cornell.edu>.
Pseudocode | No | The paper includes a diagram (Figure 1) illustrating the learning framework, but it does not provide any formal pseudocode or algorithm blocks.
Open Source Code | Yes | Code to reproduce the experiments can be found at https://bitbucket.org/DiChen9412/icml2018_dmvp
Open Datasets | Yes | eBird is a crowd-sourced bird observation dataset collected from the successful citizen science project eBird (Munson et al., 2012). Amazon is the Amazon rainforest landscape satellite image dataset collected for Amazon rainforest landscape analysis. NUS-WIDE-LITE is a light version of the NUS-WIDE dataset collected by the National University of Singapore (Chua et al., 2009).
Dataset Splits | Yes | We randomly split the datasets into three parts for training, validation, and testing. The details of the three datasets are listed in Table 1. For the eBird data: 40759 training / 5095 validation / 5095 test examples. (A hedged split sketch follows the table.)
Hardware Specification | Yes | All the training and testing processes of our DMVP and other baseline models, which are compatible with GPUs, are performed on one NVIDIA Quadro P4000 GPU with 8GB memory. The training and testing process for the DMSE model is performed on an Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz with 8 cores.
Software Dependencies | No | The paper mentions machine learning packages like 'TensorFlow or PyTorch' and the 'Adam optimizer' but does not provide specific version numbers for these software components.
Experiment Setup | Yes | The whole training process lasts 200 epochs, using a batch size of 128, the Adam optimizer (Kingma & Ba, 2014) with a learning rate of 10^-4, and utilizing batch normalization (Ioffe & Szegedy, 2015), a 0.5 dropout rate (Srivastava et al., 2014), and early stopping to accelerate the training process and to prevent overfitting, for not only DMVP but all baseline models. (A hedged training-configuration sketch follows the table.)
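
To make the "Research Type" row concrete: in a multivariate probit (MVP) model the label vector y is generated by thresholding a latent Gaussian at zero, so the likelihood P(y | mu, Sigma) is an orthant probability that can be estimated by sampling. The sketch below is a naive Monte Carlo estimator of that probability, not the paper's parallel sampling scheme or convergence analysis; the function name, sample count, and toy values are illustrative assumptions.

```python
# Minimal sketch: naive Monte Carlo estimate of a multivariate probit (MVP)
# likelihood P(y | mu, Sigma). In an MVP, y_j = 1 iff the latent Gaussian
# z ~ N(mu, Sigma) has z_j > 0, so the likelihood is an orthant probability.
# This is NOT the paper's sampling procedure, only an illustration of the idea.
import numpy as np

def mvp_likelihood_mc(y, mu, Sigma, n_samples=10000, rng=None):
    """Estimate P(y | mu, Sigma) by sampling the latent Gaussian.

    y     : (d,) array of 0/1 labels for the d entities
    mu    : (d,) latent mean (in DMVP this would come from a neural net)
    Sigma : (d, d) latent covariance/correlation matrix
    """
    rng = np.random.default_rng() if rng is None else rng
    z = rng.multivariate_normal(mu, Sigma, size=n_samples)   # (n_samples, d)
    hits = np.all((z > 0) == (np.asarray(y) == 1), axis=1)   # sample lands in y's orthant?
    return hits.mean()

# Toy usage: two positively correlated entities, both observed present.
y = np.array([1, 1])
mu = np.array([0.3, -0.1])
Sigma = np.array([[1.0, 0.6],
                  [0.6, 1.0]])
print(mvp_likelihood_mc(y, mu, Sigma))
```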
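For the "Dataset Splits" row, the paper only states that the data were split randomly into training, validation, and test sets with the sizes quoted above; it does not give the splitting code or seed. The sketch below shows one way to produce such a split, with the seed and indexing scheme as assumptions.

```python
# Hedged sketch of a random train/validation/test split consistent with the
# reported eBird sizes (40759 / 5095 / 5095). The seed and procedure are
# assumptions, not the authors' code.
import numpy as np

def random_split(n_examples, n_val, n_test, seed=0):
    """Return disjoint index arrays (train, val, test) covering range(n_examples)."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_examples)
    val_idx = perm[:n_val]
    test_idx = perm[n_val:n_val + n_test]
    train_idx = perm[n_val + n_test:]
    return train_idx, val_idx, test_idx

train_idx, val_idx, test_idx = random_split(40759 + 5095 + 5095, 5095, 5095)
print(len(train_idx), len(val_idx), len(test_idx))  # 40759 5095 5095
```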
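For the "Experiment Setup" row, the reported hyperparameters (200 epochs, batch size 128, Adam with learning rate 10^-4, batch normalization, 0.5 dropout, early stopping) can be expressed as a short PyTorch training loop. The network body, data loaders, loss function, and early-stopping patience below are placeholders and assumptions, not the authors' implementation.

```python
# Hedged sketch of the reported training setup: 200 epochs, batch size 128,
# Adam with learning rate 1e-4, batch normalization, 0.5 dropout, early stopping.
# The encoder, loaders, loss, and patience value are placeholders.
import torch
import torch.nn as nn

def make_encoder(in_dim, hidden_dim, out_dim):
    # Batch normalization and 0.5 dropout, as stated in the paper's setup.
    return nn.Sequential(
        nn.Linear(in_dim, hidden_dim),
        nn.BatchNorm1d(hidden_dim),
        nn.ReLU(),
        nn.Dropout(0.5),
        nn.Linear(hidden_dim, out_dim),
    )

def train(model, train_loader, val_loader, loss_fn, patience=10):
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    best_val, epochs_without_improvement = float("inf"), 0
    for epoch in range(200):
        model.train()
        for x, y in train_loader:          # loaders assumed to use batch size 128
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
        # Early stopping on validation loss (the patience value is an assumption).
        model.eval()
        with torch.no_grad():
            val_loss = sum(loss_fn(model(x), y).item() for x, y in val_loader)
        if val_loss < best_val:
            best_val, epochs_without_improvement = val_loss, 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return model
```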