Comparing Distributions by Measuring Differences that Affect Decision Making
Authors: Shengjia Zhao, Abhishek Sinha, Yutong He, Aidan Perreault, Jiaming Song, Stefano Ermon
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply our approach to two-sample tests, and on various benchmarks, we achieve superior test power compared to competing methods. We demonstrate the effectiveness of H-divergence in two-sample tests. |
| Researcher Affiliation | Academia | Department of Computer Science Stanford University |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. Methodological steps are described in prose and mathematical formulations. |
| Open Source Code | Yes | The code to reproduce our experiments can be found here. [footnote] |
| Open Datasets | Yes | We follow Liu et al. (2020) and consider four datasets: Blob (Liu et al., 2020), HDGM (Liu et al., 2020), HIGGS (Adam-Bourdarios et al., 2014) and MNIST (LeCun & Cortes, 2010). We use the NOAA database, which contains daily weather from thousands of weather stations at different geographical locations. We obtain the crop yield dataset from FAOSTAT et al. (2006). |
| Dataset Splits | Yes | We split each dataset into two equal partitions: a training set to tune hyper-parameters, and a validation set to compute the final test output. |
| Hardware Specification | No | The paper does not specify any particular hardware used for running the experiments, such as CPU or GPU models. |
| Software Dependencies | No | The paper mentions implementing methods and using various models (e.g., mixture of Gaussian distributions, Parzen density estimator, Variational Autoencoder, Kernel Ridge regression), but does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We choose φ(θ, λ) = ((θ^s + λ^s) / 2)^(1/s) for s > 1... We define l(x, a) as the negative log likelihood of x under distribution a, where a is in a certain model family A. We experiment with mixtures of Gaussian distributions, a Parzen density estimator, and a Variational Autoencoder (Kingma & Welling, 2013). Our hyper-parameters consist of the best parameter s and also the best generative model family. We use α = 0.05 in all two-sample test experiments. Each permutation test uses 100 permutations, and we run each test 100 times to compute the test power. (A minimal sketch of this procedure follows the table.) |
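
To make the Experiment Setup row concrete, below is a minimal Python sketch of an H-divergence permutation two-sample test, assuming a single-Gaussian model family A (the paper also uses Gaussian mixtures, Parzen estimators, and VAEs). The function names `h_entropy`, `h_divergence`, and `permutation_test`, the MLE Gaussian fit, the covariance regularizer, and the clamping of θ and λ at zero are illustrative assumptions for this sketch, not the authors' implementation.

```python
# Sketch of an H-divergence permutation two-sample test (see Experiment Setup
# row above). Assumes model family A = single Gaussians; names are illustrative.
import numpy as np
from scipy.stats import multivariate_normal

def h_entropy(x):
    """H_l(p) = inf_{a in A} E_p[l(x, a)], with l the negative log likelihood.
    For Gaussian A, the MLE Gaussian attains the infimum."""
    mu = x.mean(axis=0)
    cov = np.cov(x, rowvar=False) + 1e-6 * np.eye(x.shape[1])  # regularize
    return -multivariate_normal.logpdf(x, mean=mu, cov=cov).mean()

def h_divergence(x, y, s=2.0):
    """D_phi(p || q) with phi(theta, lambda) = ((theta^s + lambda^s) / 2)^(1/s)."""
    mix = np.concatenate([x, y])  # samples from the mixture (p + q) / 2
    theta = max(h_entropy(mix) - h_entropy(x), 0.0)  # clamp: sketch simplification
    lam = max(h_entropy(mix) - h_entropy(y), 0.0)
    return ((theta**s + lam**s) / 2.0) ** (1.0 / s)

def permutation_test(x, y, n_perm=100, alpha=0.05, rng=None):
    """Reject H0: p = q when the observed statistic is large among permutations."""
    rng = rng or np.random.default_rng()
    observed = h_divergence(x, y)
    pooled = np.concatenate([x, y])
    n = len(x)
    null_stats = []
    for _ in range(n_perm):
        perm = rng.permutation(len(pooled))
        null_stats.append(h_divergence(pooled[perm[:n]], pooled[perm[n:]]))
    p_value = (1 + sum(t >= observed for t in null_stats)) / (1 + n_perm)
    return p_value <= alpha

# Test power = fraction of rejections over repeated draws (100 in the paper;
# 20 here to keep the example fast).
rng = np.random.default_rng(0)
rejections = [
    permutation_test(rng.normal(0.0, 1.0, (200, 2)),
                     rng.normal(0.3, 1.0, (200, 2)), rng=rng)
    for _ in range(20)
]
print("estimated power:", np.mean(rejections))
```

In an actual run, the hyper-parameters (s and the generative model family) would be tuned on the training partition and the final statistic computed on the validation partition, per the Dataset Splits row; the zero-clamping of θ and λ here is only a guard against finite-sample noise in this simplified sketch.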