Any-scale Balanced Samplers for Discrete Space

Authors: Haoran Sun, Bo Dai, Charles Sutton, Dale Schuurmans, Hanjun Dai

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On various synthetic and real distributions, the proposed sampler substantially outperforms existing approaches. We conducted an experimental evaluation on three types of target distributions: 1) quadratic synthetic distributions, 2) non-quadratic synthetic distributions, and 3) real distributions.
Researcher Affiliation | Collaboration | Haoran Sun hsun349@gatech.edu; Bo Dai bodai@google.com; Charles Sutton charlessutton@google.com; Dale Schuurmans schuurmans@google.com; Hanjun Dai hadai@google.com. Work done during an internship at Google. Affiliations: Georgia Tech; Google Research, Brain Team; University of Alberta.
Pseudocode | Yes | Algorithm 1: AB sampling algorithm; Algorithm 2: AB M-H step; Algorithm 3: Adapting Algorithm; Algorithm 4: Adapting Algorithm Block
Open Source Code | No | No explicit statement or link to open-source code for the methodology is provided.
Open Datasets | Yes | For real distributions, we compare against baseline samplers on challenging inference problems in deep energy-based models trained on the MNIST, Omniglot, and Caltech datasets.
Dataset Splits | No | The paper mentions 'T=100,000 steps, with T1=20,000 burn-in steps to make sure the chain mixes,' which refers to MCMC chain length and burn-in, not explicit dataset splits (train/validation/test) with percentages or counts. For EBMs, it mentions a training framework and the number of steps used to obtain samples, but no explicit dataset splits.
Hardware Specification | Yes | All experiments are run on a virtual machine with CPU: Intel Haswell; GPU: 4x Nvidia V100; System: Debian 10.
Software Dependencies | Yes | In this work, we use the academic version of Mosek (ApS, 2019).
Experiment Setup | Yes | Input: initial σ = 0.1, α = 0.5, W = 0, D = 0; initial x0... For each setting and sampler, we run 100 chains for T = 100,000 steps, with T1 = 20,000 burn-in steps to make sure the chain mixes. Algorithm 3 (Adapting Algorithm) input: initial σ = 0.1, α = 0.5, update rate γ = 0.2, decay rate β = 0.9, initial state x0, buffer size N = 100.
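For reference, the hyperparameters quoted above can be collected into a single configuration sketch. This is a minimal illustration, not code from the paper: the class and field names are assumptions, but the values match the reported setup (σ = 0.1, α = 0.5, γ = 0.2, β = 0.9, N = 100, 100 chains, T = 100,000, T1 = 20,000).

```python
from dataclasses import dataclass


@dataclass
class AdaptiveSamplerConfig:
    # Values quoted from the paper's reported setup; names are illustrative.
    sigma_init: float = 0.1      # initial step-size scale sigma
    alpha_init: float = 0.5      # initial balancing parameter alpha
    update_rate: float = 0.2     # gamma
    decay_rate: float = 0.9      # beta
    buffer_size: int = 100       # N
    num_chains: int = 100        # chains per setting and sampler
    total_steps: int = 100_000   # T
    burn_in_steps: int = 20_000  # T1


def kept_samples_per_chain(cfg: AdaptiveSamplerConfig) -> int:
    """Samples retained per chain after discarding burn-in."""
    return cfg.total_steps - cfg.burn_in_steps


cfg = AdaptiveSamplerConfig()
print(kept_samples_per_chain(cfg))  # 80000
```

Under this setup, each of the 100 chains would contribute 80,000 post-burn-in samples.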