Gradient-Guided Importance Sampling for Learning Binary Energy-Based Models
Authors: Meng Liu, Haoran Liu, Shuiwang Ji
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform experiments on density modeling over synthetic discrete data, graph generation, and training Ising models to evaluate our proposed method. The experimental results demonstrate that our method can significantly alleviate the limitations of ratio matching, perform more effectively in practice, and scale to high-dimensional problems. |
| Researcher Affiliation | Academia | Meng Liu, Haoran Liu, Shuiwang Ji; Department of Computer Science & Engineering, Texas A&M University, College Station, TX 77843, USA; {mengliu,liuhr99,sji}@tamu.edu |
| Pseudocode | Yes | Algorithm 1: Ratio Matching with Gradient-Guided Importance Sampling (RMwGGIS). 1: Input: observed dataset D = {x^(m)}, m = 1, ..., |D|; parameterized energy function E_θ(·); number of samples s for Monte Carlo estimation with importance sampling. 2: for x ∈ D do (batch training is applied in practice). 3: Compute E_θ(x). 4: Compute ∇_x E_θ(x). 5: Compute the proposal distribution over the bit-flip neighbors (Eq. (10)). 6: Sample s terms, denoted x_i^(1), ..., x_i^(s), according to this proposal. 7: Compute the importance-sampling estimate of J_RM(θ, x) under the proposal (Eq. (6) or Eq. (11)). 8: Update θ based on the gradient of this estimate with respect to θ. 9: end for. (A runnable sketch of this procedure is given below the table.) |
| Open Source Code | Yes | Our implementation is available at https://github.com/divelab/RMwGGIS. |
| Open Datasets | Yes | We further evaluate our RMwGGIS on graph generation using the Ego-small dataset (You et al., 2018). ... We firstly draw 2D data points from 2D continuous space according to some unknown distribution p̂, which can be naturally visualized. Then, we convert each 2D data point x̂ ∈ R^2 to a discrete data point x ∈ {0, 1}^d... We follow the experimental setting of Dai et al. (2020) for density modeling on synthetic discrete data. (A binarization sketch follows the table.) |
| Dataset Splits | No | For graph generation, the paper states that "80% of the graphs are used for training and the rest for testing" but does not explicitly mention a validation split. |
| Hardware Specification | No | The paper mentions general hardware like "modern GPUs with limited memory" but does not specify any exact GPU models, CPU models, or other detailed hardware specifications used for experiments. |
| Software Dependencies | No | The paper does not provide specific software names with version numbers for reproducibility, only mentioning libraries or optimizers by name without version details. |
| Experiment Setup | Yes | The energy function is parameterized by an MLP with the Swish (Ramachandran et al., 2017) activation and 256 hidden dimensions. The number of samples s, involved in the objective functions of our RMwGGIS method, is set to be 10. ... We use Adam optimizer (Kingma & Ba, 2015) with a learning rate of 1e-4 and a batch size of 100. ℓ1 penalty with strength 0.01 is used to encourage sparsity. (A configuration sketch follows the table.) |
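The Pseudocode row quotes Algorithm 1 at a high level. Below is a minimal PyTorch sketch of one RMwGGIS update, written as an illustration rather than a transcription of the authors' released code: it uses the standard ratio-matching terms over bit-flip neighbors, approximates the gradient-guided proposal of Eq. (10) with a softmax over first-order energy differences, and the names `EnergyMLP` and `rmwggis_step` are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EnergyMLP(nn.Module):
    """Hypothetical MLP energy E_theta: {0,1}^d -> R (Swish activation, 256 hidden units)."""
    def __init__(self, d, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)  # (batch,)

def rmwggis_step(energy, x, s=10):
    """One ratio-matching update with a gradient-guided proposal (sketch of Algorithm 1).

    energy: module mapping (N, d) binary inputs to (N,) energies.
    x:      (B, d) batch of observed binary vectors in {0, 1}.
    Returns a scalar surrogate loss; backpropagating it updates theta.
    """
    B, d = x.shape
    x = x.float()

    # Steps 3-4: E_theta(x) and its gradient with respect to the input bits.
    with torch.enable_grad():
        x_req = x.clone().requires_grad_(True)
        grad_x = torch.autograd.grad(energy(x_req).sum(), x_req)[0]
    e_x = energy(x)  # recomputed so the graph over theta is kept for the update

    # Step 5 (assumed form of Eq. (10)): first-order estimate of
    # E_theta(x) - E_theta(x_{-i}) for every bit flip i, normalized by a softmax.
    delta_hat = (2.0 * x - 1.0) * grad_x        # (B, d)
    proposal = F.softmax(delta_hat, dim=-1)     # q(i | x); no gradient flows through grad_x

    # Step 6: sample s flip positions per example from the proposal.
    idx = torch.multinomial(proposal, s, replacement=True)   # (B, s)
    flips = F.one_hot(idx, num_classes=d).float()            # (B, s, d)
    x_neg = (x.unsqueeze(1) + flips).remainder(2.0)          # neighbors with one bit flipped

    # Step 7: importance-weighted estimate of the ratio-matching objective
    # sum_i sigmoid(E(x) - E(x_{-i}))^2  ~=  mean_j term_{i_j} / q(i_j | x).
    e_neg = energy(x_neg.reshape(B * s, d)).reshape(B, s)
    terms = torch.sigmoid(e_x.unsqueeze(1) - e_neg) ** 2     # (B, s)
    q_sel = proposal.gather(1, idx)                          # (B, s)
    j_hat = (terms / q_sel).mean(dim=1)                      # per-example estimate

    # Step 8 is performed by the caller: j_hat.mean().backward(); optimizer.step().
    return j_hat.mean()
```

Only the s sampled neighbors are evaluated per example, which is the point of the importance-sampling estimator: the cost no longer grows with the data dimension d.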
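The Experiment Setup row lists the quoted hyperparameters (Swish MLP with 256 hidden units, s = 10, Adam with learning rate 1e-4, batch size 100, ℓ1 penalty of strength 0.01). The fragment below wires them together by reusing the hypothetical `EnergyMLP` and `rmwggis_step` from the previous sketch; the data dimension d and the target of the ℓ1 penalty (here all model weights) are assumptions, since the quoted text does not specify them.

```python
import torch

d = 32                                   # assumed data dimensionality (not quoted above)
energy = EnergyMLP(d, hidden=256)        # Swish MLP with 256 hidden units
opt = torch.optim.Adam(energy.parameters(), lr=1e-4)
l1_strength = 0.01

def train_epoch(loader):
    """loader yields (100, d) binary batches, matching the quoted batch size of 100."""
    for x in loader:
        loss = rmwggis_step(energy, x, s=10)
        # Assumed placement of the l1 sparsity penalty: all learnable weights.
        loss = loss + l1_strength * sum(p.abs().sum() for p in energy.parameters())
        opt.zero_grad()
        loss.backward()
        opt.step()
```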
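The Open Datasets row describes drawing 2D points from an unknown continuous distribution p̂ and converting each point x̂ ∈ R^2 into a binary vector x ∈ {0, 1}^d, following Dai et al. (2020). The exact encoding is not quoted, so the sketch below uses a plain fixed-point quantization of each coordinate (Dai et al. reportedly use a Gray-code variant); `binarize_2d` and the value range are illustrative.

```python
import numpy as np

def binarize_2d(points, bits_per_dim=16, low=-4.0, high=4.0):
    """Map 2D continuous points in [low, high]^2 to binary vectors in {0,1}^(2 * bits_per_dim).

    Assumed fixed-point encoding, for illustration only.
    """
    points = np.asarray(points, dtype=np.float64)
    levels = 2 ** bits_per_dim
    # Quantize each coordinate to an integer code in [0, levels - 1].
    codes = np.clip(((points - low) / (high - low) * levels).astype(np.int64), 0, levels - 1)
    # Unpack each code into its binary digits (most significant bit first).
    shifts = np.arange(bits_per_dim - 1, -1, -1)
    bits = (codes[..., None] >> shifts) & 1                     # (N, 2, bits_per_dim)
    return bits.reshape(len(points), -1).astype(np.float32)     # (N, 2 * bits_per_dim)

# Example: a toy 2D "unknown" distribution, binarized to d = 32 bits.
rng = np.random.default_rng(0)
points = rng.normal(loc=[2.0, -1.0], scale=0.2, size=(500, 2))
x_binary = binarize_2d(points)   # training data for the binary EBM
```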