Gradient Estimation for Binary Latent Variables via Gradient Variance Clipping
Authors: Russell Z. Kunes, Mingzhang Yin, Max Land, Doron Haviv, Dana Pe'er, Simon Tavaré
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we observe that UGC achieves the optimal value of the optimization objectives in toy experiments, discrete VAE training, and in a best subset selection problem. |
| Researcher Affiliation | Academia | (1) Department of Statistics, Columbia University; (2) Computational and Systems Biology, Memorial Sloan Kettering Cancer Center; (3) Howard Hughes Medical Institute; (4) Irving Institute of Cancer Dynamics, Columbia University; (5) Warrington College of Business, University of Florida |
| Pseudocode | No | The paper describes algorithmic steps and equations, but does not include a formally labeled pseudocode block or algorithm. |
| Open Source Code | No | The paper does not provide any links to open-source code or explicitly state that the code for their methodology is available. |
| Open Datasets | Yes | We replicate the discrete VAE architecture and experimental setup on binarized Dynamic MNIST, Omniglot and Fashion MNIST datasets (Yin and Zhou 2019; Dong, Mnih, and Tucker 2020). |
| Dataset Splits | No | The paper mentions using standard datasets and replicating experimental setups from other papers, but it does not explicitly provide the specific training, validation, or test splits used within the text. |
| Hardware Specification | No | The paper does not specify the hardware (e.g., GPU, CPU models, or cloud resources) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | p = 200, n = 60, Σ = I, |S| = 3. Top: SNR = β⊤β/σ² = 3.8125, parameterization by ϕ = log(θ/(1−θ)). Bottom: SNR = β⊤β/σ² = 1.694, parameterization by θ, with projected gradient descent onto [0, 1]. |
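
The "Experiment Setup" row above describes the synthetic best subset selection problem. A minimal sketch of how such data might be generated under those settings is given below; the nonzero coefficient values, the initialization of θ, and the random seed are not specified in the section and are assumptions made here for illustration only.

```python
import numpy as np

# Hedged sketch of the best-subset-selection setup from the Experiment Setup row:
# p = 200 features, n = 60 samples, Sigma = I, |S| = 3 nonzero coefficients,
# SNR = beta^T beta / sigma^2 set to 3.8125 (top) or 1.694 (bottom).
rng = np.random.default_rng(0)          # seed is an assumption

p, n = 200, 60
support_size = 3
target_snr = 3.8125                     # use 1.694 for the bottom setting

# Design matrix with identity covariance (Sigma = I)
X = rng.standard_normal((n, p))

# Sparse coefficient vector; the nonzero values (1.0) are assumed
beta = np.zeros(p)
support = rng.choice(p, size=support_size, replace=False)
beta[support] = 1.0

# Pick the noise variance so that SNR = beta^T beta / sigma^2 hits the target
sigma2 = beta @ beta / target_snr
y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)

# Logit parameterization of inclusion probabilities, phi = log(theta / (1 - theta));
# the alternative setting optimizes theta directly with projection onto [0, 1]
theta0 = np.full(p, 0.5)                # initialization is an assumption
phi0 = np.log(theta0 / (1.0 - theta0))
```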