Amortized Variational Inference for Simple Hierarchical Models
Authors: Abhinav Agrawal, Justin Domke
NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate these methods on a synthetic model where exact inference is possible, and on a user-preference model for the MovieLens dataset with 162K users who make 25M ratings of different movies. |
| Researcher Affiliation | Academia | Abhinav Agrawal, College of Information and Computer Sciences, University of Massachusetts Amherst, aagrawal@cs.umass.edu; Justin Domke, College of Information and Computer Sciences, University of Massachusetts Amherst, domke@cs.umass.edu |
| Pseudocode | Yes | Figure 3: Pseudocode for ELBO estimation with different variational methods; Figure 4: Pseudocode for netu for locally i.i.d. symmetric HBD. |
| Open Source Code | No | The paper neither states that its source code is released nor links to a repository implementing the described methodology. |
| Open Datasets | Yes | We test our method on the MovieLens 25M [14], a dataset of 25 million movie ratings for over 62,000 movies, rated by 162,000 users, along with a set of features (tag relevance scores [38]) for each movie. |
| Dataset Splits | No | We used a train-test split such that, for each user, one-tenth of the ratings are in the test set. This gives us 18M ratings for training (and 2M ratings for testing). No explicit mention of a separate validation split was found. (A sketch of the per-user split appears below the table.) |
| Hardware Specification | No | The paper mentions 'GPU hardware' for efficiency, but does not provide specific details such as GPU models, CPU types, or memory configurations used for the experiments. |
| Software Dependencies | No | The paper mentions using Jax for autodifferentiation and numpy for numerical operations, but it does not specify version numbers for these software dependencies or other key libraries. |
| Experiment Setup | Yes | All parameters were initialized with 0.01 random noise. For the MovieLens experiments, we use a batch size of 200 users. Adam optimizer with default parameters (β1 = 0.9, β2 = 0.999, ε = 10−8). Learning rate = 10−3. For the MovieLens dataset, we train for 200,000 iterations. For the synthetic data, we train for 100,000 iterations. Amortization networks were two-layer MLPs with 100 hidden units per layer with ELU non-linearity. (A JAX sketch of this configuration appears below the table.) |
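
The per-user 90/10 split described in the "Dataset Splits" row can be made concrete with a short sketch. This is not the authors' code (none is released); the variable names `ratings` and `user_ids` are assumptions for illustration.

```python
# Minimal sketch of a per-user 90/10 train-test split: for each user,
# one-tenth of their ratings are placed in the test set, as the paper states.
import numpy as np

def per_user_split(ratings, user_ids, test_frac=0.1, seed=0):
    """Return boolean masks (train, test) over the ratings array."""
    rng = np.random.default_rng(seed)
    test_mask = np.zeros(len(ratings), dtype=bool)
    for uid in np.unique(user_ids):
        idx = np.flatnonzero(user_ids == uid)          # this user's ratings
        n_test = max(1, int(round(test_frac * len(idx))))
        test_mask[rng.choice(idx, size=n_test, replace=False)] = True
    return ~test_mask, test_mask
```

With 162K users this plain loop is slow but adequate as a specification; a grouped or vectorized variant would be used in practice.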
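As a concrete reading of the "Experiment Setup" row, below is a minimal JAX sketch of the reported configuration: a two-layer MLP amortization network with 100 hidden units per layer and ELU non-linearities, all parameters initialized with 0.01-scale random noise, and Adam with the stated defaults at learning rate 10−3. The use of optax and the placeholder `neg_elbo` objective are assumptions; the paper names Jax but neither an optimizer library nor its actual loss implementation.

```python
import jax
import jax.numpy as jnp
import optax  # assumed optimizer library; the paper only names Jax

def init_mlp(key, in_dim, hidden=100, out_dim=2):
    # All parameters initialized with 0.01-scale random noise, per the paper.
    names = ["W1", "b1", "W2", "b2", "W3", "b3"]
    shapes = [(in_dim, hidden), (hidden,), (hidden, hidden), (hidden,),
              (hidden, out_dim), (out_dim,)]
    keys = jax.random.split(key, len(names))
    return {n: 0.01 * jax.random.normal(k, s)
            for n, k, s in zip(names, keys, shapes)}

def mlp(params, x):
    # Two hidden layers of 100 units each with ELU non-linearities.
    h = jax.nn.elu(x @ params["W1"] + params["b1"])
    h = jax.nn.elu(h @ params["W2"] + params["b2"])
    return h @ params["W3"] + params["b3"]

def neg_elbo(params, batch):
    # Hypothetical placeholder: the real objective is the paper's Monte Carlo
    # negative-ELBO estimator (its Figure 3), which is not reproduced here.
    x, y = batch
    return jnp.mean((mlp(params, x) - y) ** 2)

# Adam with the stated defaults (β1 = 0.9, β2 = 0.999, ε = 1e-8), lr = 1e-3.
optimizer = optax.adam(learning_rate=1e-3, b1=0.9, b2=0.999, eps=1e-8)

@jax.jit
def train_step(params, opt_state, batch):
    loss, grads = jax.value_and_grad(neg_elbo)(params, batch)
    updates, opt_state = optimizer.update(grads, opt_state)
    params = optax.apply_updates(params, updates)
    return params, opt_state, loss
```

Under the paper's reported setup, `train_step` would be iterated for 200,000 steps on MovieLens (100,000 on the synthetic data) with batches of 200 users.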