Learning Unnormalized Statistical Models via Compositional Optimization

Authors: Wei Jiang, Jiayu Qin, Lingyu Wu, Changyou Chen, Tianbao Yang, Lijun Zhang

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We illustrate the better performance of our method on different tasks, namely, density estimation, out-of-distribution detection, and real image generation.
Researcher Affiliation | Academia | (1) National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; (2) Department of Computer Science and Engineering, University at Buffalo, New York, USA; (3) Department of Computer Science and Engineering, Texas A&M University, College Station, USA. Correspondence to: Changyou Chen <changyou@buffalo.edu>, Tianbao Yang <tianbaoyang@tamu.edu>, Lijun Zhang <zhanglj@lamda.nju.edu.cn>.
Pseudocode | Yes | Algorithm 1 (MECO). Input: time step T, initial points (θ1, u1, v1), sequences {ηt, γt, βt}. For t = 1 to T: sample zt from {x1, ..., xn} and z̃t from q(x); update estimator ut according to equation (3); update estimator vt according to equation (4); update the weight θt+1 = θt - ηt vt. End for. Choose τ uniformly at random from {1, ..., T} and return θτ. (A hedged code sketch of this loop appears after the table.)
Open Source Code | No | The paper does not provide a direct link to the source code for the methodology, nor does it explicitly state that the code is released or available.
Open Datasets | Yes | We choose CIFAR-10 (Krizhevsky, 2009) as the in-distribution data.
Dataset Splits | No | The paper mentions using training and testing data but does not explicitly provide details about validation splits or percentages for any of the datasets used.
Hardware Specification | Yes | Experiments on MNIST in Section 6.3 are trained on four NVIDIA Tesla V100 GPUs, and the training time is around 2.8 hours.
Software Dependencies | No | The paper mentions using `numpy` and the `Adam` optimizer, but it does not specify version numbers for these or other software dependencies.
Experiment Setup | Yes | For our method, we set the parameters γ = 0.1 and β = 0.9. For MCMC training, the number of sampling steps is searched from the set {20, 50, 100} and we use Langevin dynamics (Welling & Teh, 2011) as the sampling approach. For all tasks, we tune the learning rates from {1e-1, 1e-2, 1e-3, 1e-4} and pick the best one. (A minimal Langevin-dynamics sketch appears after the table.)
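
The Algorithm 1 loop quoted in the Pseudocode row can be made concrete on a toy problem. The sketch below is a hedged illustration only: the paper's estimator updates (its equations (3) and (4)) are not reproduced in this report, so the moving-average forms, the 1-D energy model p_theta(x) ∝ exp(-theta·x²), the standard-normal noise distribution q, and the hyperparameter values are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_q(x):
    # Log density of the noise distribution q = N(0, 1).
    return -0.5 * x**2 - 0.5 * np.log(2.0 * np.pi)

def meco_toy(data, T=5000, eta=1e-2, gamma=0.1, beta=0.9):
    # Toy unnormalized model p_theta(x) proportional to exp(-theta * x^2).
    # We minimize the negative log-likelihood; its log-partition term gives the
    # compositional structure handled by the moving-average estimators u and v.
    theta, u, v = 1.0, 1.0, 0.0
    iterates = []
    for _ in range(T):
        z = data[rng.integers(len(data))]   # z_t sampled from {x_1, ..., x_n}
        z_tilde = rng.standard_normal()     # z~_t sampled from q(x)

        # One-sample importance-weight estimate of the partition function
        # Z(theta) = E_q[exp(-theta x^2) / q(x)]; u tracks it with a moving
        # average (a stand-in for the paper's equation (3)).
        w = np.exp(-theta * z_tilde**2 - log_q(z_tilde))
        u = (1.0 - gamma) * u + gamma * w

        # Stochastic gradient of  theta * x^2 + log Z(theta), with u plugged in
        # for the inner expectation (a stand-in for the paper's equation (4)).
        grad = z**2 - (z_tilde**2 * w) / u
        v = (1.0 - beta) * v + beta * grad

        theta = theta - eta * v             # theta_{t+1} = theta_t - eta_t * v_t
        iterates.append(theta)

    # Return theta_tau with tau chosen uniformly at random from {1, ..., T}.
    return iterates[rng.integers(T)]

# Usage: on data drawn from N(0, 1) the estimate should approach theta = 0.5,
# since exp(-0.5 * x^2) matches the standard-normal kernel.
data = rng.standard_normal(10_000)
print(meco_toy(data))
```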
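
For the MCMC training mentioned in the Experiment Setup row, the paper uses Langevin dynamics (Welling & Teh, 2011) as its sampling approach. Below is a minimal, generic sketch of that sampler; the target gradient `grad_log_p`, the step size, and the default number of steps are illustrative assumptions (the paper searches the step count over {20, 50, 100}).

```python
import numpy as np

def langevin_sample(x0, grad_log_p, n_steps=50, step_size=1e-2, rng=None):
    # Unadjusted Langevin dynamics:
    # x_{k+1} = x_k + (eps / 2) * grad log p(x_k) + sqrt(eps) * N(0, I).
    rng = rng if rng is not None else np.random.default_rng(0)
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x + 0.5 * step_size * grad_log_p(x) + np.sqrt(step_size) * noise
    return x

# Usage: drawing an approximate sample from a standard normal,
# for which grad log p(x) = -x.
print(langevin_sample(np.zeros(3), lambda x: -x, n_steps=100))
```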