Transformation Autoregressive Networks
Authors: Junier Oliva, Avinava Dubey, Manzil Zaheer, Barnabas Poczos, Ruslan Salakhutdinov, Eric Xing, Jeff Schneider
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4. Experiments We now present empirical studies for our TAN framework in order to establish (i) the superiority of TANs over one-prong approaches (Sec. 4.1), (ii) that TANs are accurate on real world datasets (Sec. 4.2), (iii) the importance of various components of TANs (Sec. 4.3), (iv) that TANs are easily amenable to various tasks (Sec. 4.4), such as learning a parametric family of distributions and being able to generalize over unseen parameter values (Sec. 4.5). |
| Researcher Affiliation | Academia | 1Computer Science Department, University of North Carolina, Chapel Hill, NC 27599 (Work completed while at CMU.) 2Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA 15213. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | 1See https://github.com/lupalab/tan. |
| Open Datasets | Yes | We carefully followed (Papamakarios et al., 2017) and code (MAF Git Repository) to ensure that we operated over the same instances and covariates for each of the datasets considered in (Papamakarios et al., 2017). Specifically we performed unconditional density estimation on four datasets from the UCI machine learning repository2 (POWER: d=6, N=2,049,280; GAS: d=8, N=1,052,065; HEPMASS: d=21, N=525,123; MINIBOONE: d=43, N=36,488) and on BSDS300 natural image patches (d=63, N=1,300,000)... 2http://archive.ics.uci.edu/ml/ |
| Dataset Splits | Yes | After training, the best iteration according to the validation set loss was used to produce the test set results. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running experiments. |
| Software Dependencies | No | Models were implemented in Tensorflow (Abadi et al., 2016). While TensorFlow is mentioned, no specific version number for it or other software libraries is provided. |
| Experiment Setup | Yes | We take the mixture models of conditionals (2) to be mixtures of 40 Gaussians. We optimize all models using the Adam Optimizer (Kingma & Ba, 2014) with an initial learning rate of 0.005. Training consisted of 30000 iterations, with mini-batches of size 256. The learning rate was decreased by a factor of 0.1, or 0.5 (chosen via a validation set) every 5000 iterations. Gradient clipping with a norm of 1 was used. Hedged sketches of the mixture likelihood and of this optimization schedule appear below the table. |
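
The Experiment Setup row states that the autoregressive conditionals (Eq. 2 in the paper) are mixtures of 40 Gaussians. Below is a minimal sketch of one such conditional density using TensorFlow Probability. The paper only says it used TensorFlow, so the `tfp.distributions` API, the `conditional_mog` helper, and the parameter layout here are illustrative assumptions, not the authors' implementation.

```python
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

def conditional_mog(params):
    """p(x_d | x_{<d}) as a mixture of 40 Gaussians (per the paper).

    `params` has shape [batch, 3 * 40]: mixture logits, means, and
    log-scales, assumed to be produced by an autoregressive network
    from x_{<d} (that network is elided here).
    """
    logits, means, log_scales = tf.split(params, 3, axis=-1)
    return tfd.MixtureSameFamily(
        mixture_distribution=tfd.Categorical(logits=logits),
        components_distribution=tfd.Normal(loc=means,
                                           scale=tf.exp(log_scales)))

# Example: per-dimension negative log-likelihood for scalar x_d values.
params = tf.random.normal([256, 3 * 40])      # stand-in network output
x_d = tf.random.normal([256])
nll = -conditional_mog(params).log_prob(x_d)  # shape [256]
```

The per-dimension negative log-likelihoods would then be summed across dimensions d to give the full density objective.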
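Similarly, the stated optimization details (Adam at 0.005, decay by a factor of 0.1 or 0.5 every 5000 iterations, 30000 iterations at batch size 256, gradient clipping at norm 1, and test evaluation at the best validation iteration noted in the Dataset Splits row) can be approximated in modern TensorFlow as follows. This is a sketch under those stated hyperparameters: `model`, `train_batches`, and `validation_loss` are hypothetical stand-ins, and the paper does not say whether "norm" means per-tensor or global norm (global is assumed here).

```python
import tensorflow as tf

# Step-wise schedule: start at 0.005 and multiply by 0.1 (or 0.5,
# chosen via the validation set in the paper) every 5000 iterations.
lr = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.005, decay_steps=5000,
    decay_rate=0.1, staircase=True)
optimizer = tf.keras.optimizers.Adam(learning_rate=lr)

@tf.function
def train_step(model, batch):
    with tf.GradientTape() as tape:
        loss = -tf.reduce_mean(model.log_prob(batch))  # NLL objective
    grads = tape.gradient(loss, model.trainable_variables)
    grads, _ = tf.clip_by_global_norm(grads, 1.0)      # clip at norm 1
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

# 30000 iterations with mini-batches of size 256; the test set is
# scored at the iteration with the best validation loss.
# `model`, `train_batches`, and `validation_loss` are hypothetical
# stand-ins, not part of the paper's released code.
best_val = float("inf")
for step in range(30000):
    train_step(model, next(train_batches))
    if (step + 1) % 1000 == 0:
        val = validation_loss(model)
        if val < best_val:
            best_val = val  # checkpoint the model here
```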