Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Implicit Reparameterization Gradients
Authors: Mikhail Figurnov, Shakir Mohamed, Andriy Mnih
NeurIPS 2018 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that the proposed approach is faster and more accurate than the existing gradient estimators for these distributions. |
| Researcher Affiliation | Industry | Mikhail Figurnov, Shakir Mohamed, Andriy Mnih (DeepMind, London, UK) |
| Pseudocode | Yes | Table 1: Comparison of the two reparameterization types. While they provide the same result, the implicit version is easier to implement for distributions such as Gamma because it does not require inverting the standardization function S_φ(z). Forward pass: sample ε ~ q(ε) and set z = S_φ⁻¹(ε) (explicit), or sample z ~ q_φ(z) directly (implicit). Backward pass: set ∇_φ z = ∇_φ S_φ⁻¹(ε), then ∇_φ f(z) = ∇_z f(z) ∇_φ z |
| Open Source Code | Yes | Implicit reparameterization for Gamma, Student's t, Beta, Dirichlet and von Mises distributions is available in TensorFlow Probability [11]. |
| Open Datasets | Yes | We use the 20 Newsgroups (11,200 documents, 2,000-word vocabulary) and RCV1 [29] (800,000 documents, 10,000-word vocabulary) datasets with the same preprocessing as in [47]. |
| Dataset Splits | No | The paper mentions using 20 Newsgroups, RCV1, and MNIST datasets but does not explicitly provide training, validation, or test dataset splits (e.g., percentages or counts) within its text. |
| Hardware Specification | No | No specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running experiments are provided. The paper only mentions using 'TensorFlow [1] for our experiments'. |
| Software Dependencies | No | The paper mentions software like TensorFlow, TensorFlow Probability, C++, and PyTorch but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | For Gamma, we use a sparse Gamma(0.3, 0.3) prior and a bell-shaped prior Gamma(10, 10). For Beta and von Mises, instead of a sparse prior we choose a uniform prior over the corresponding domain. |
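The pseudocode row above summarizes the paper's key identity: instead of inverting the standardization function S_φ(z), the implicit gradient follows from differentiating S_φ(z) = ε at fixed noise, giving ∇_φ z = -(∇_z S_φ(z))⁻¹ ∇_φ S_φ(z). A minimal sketch for a Gamma(α, 1) sample, where S_φ is the Gamma CDF: the `implicit_grad_gamma` helper and its finite-difference approximation of ∂F/∂α are illustrative assumptions (the paper uses closed-form/forward-mode derivative expressions, not finite differences).

```python
from scipy import stats

def implicit_grad_gamma(z, alpha, eps=1e-5):
    """Implicit reparameterization gradient dz/dalpha for z ~ Gamma(alpha, 1).

    Uses dz/dalpha = -(dF/dalpha) / (dF/dz), where F(z; alpha) is the
    Gamma CDF (the standardization function) and dF/dz is the Gamma pdf.
    dF/dalpha is approximated here by a central finite difference; this
    is a sketch-level stand-in for the paper's analytic derivative.
    """
    dF_dalpha = (stats.gamma.cdf(z, alpha + eps) -
                 stats.gamma.cdf(z, alpha - eps)) / (2 * eps)
    dF_dz = stats.gamma.pdf(z, alpha)
    return -dF_dalpha / dF_dz

# Sanity check against the explicit route (differentiating the inverse
# CDF at fixed uniform noise u), which requires inverting F:
alpha, u, h = 2.0, 0.7, 1e-5
z = stats.gamma.ppf(u, alpha)
explicit = (stats.gamma.ppf(u, alpha + h) -
            stats.gamma.ppf(u, alpha - h)) / (2 * h)
implicit = implicit_grad_gamma(z, alpha)
assert abs(explicit - implicit) < 1e-4
```

Both routes give the same gradient, but the implicit one never calls `ppf`, which is exactly the advantage the quoted Table 1 describes for distributions whose standardization function is hard to invert.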