Latent Variable Modelling with Hyperbolic Normalizing Flows
Authors: Joey Bose, Ariella Smofsky, Renjie Liao, Prakash Panangaden, Will Hamilton
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our TC-flow and WHC-flow on three tasks: structured density estimation, graph reconstruction, and graph generation. Throughout our experiments, we rely on three main baselines. In Euclidean space, we use Gaussian latent variables and affine coupling flows (Dinh et al., 2017), denoted N and NC, respectively. In the Lorentz model, we use Wrapped Normal latent variables, H-VAE, as an analogous baseline (Nagano et al., 2019). Since all model parameters are defined on Euclidean tangent spaces, models can be trained with conventional optimizers like Adam (Kingma & Ba, 2014). Following previous work, we also consider the curvature K as a learnable parameter with a warmup of 10 epochs, and we clamp the max norm of vectors to 40 before any logarithmic or exponential map (Skopek et al., 2019). Appendix E contains model architecture and implementation details. |
| Researcher Affiliation | Academia | McGill University, Mila, University of Toronto, Vector Institute. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (clearly labeled algorithm sections or code-like formatted procedures). |
| Open Source Code | Yes | https://github.com/joeybose/HyperbolicNF |
| Open Datasets | Yes | We test the approaches on a branching diffusion process (BDP) and dynamically binarized MNIST (Mathieu et al., 2019; Skopek et al., 2019). |
| Dataset Splits | No | The paper mentions training and test sets, and for structured density estimation it states: 'To estimate the log likelihood we perform importance sampling using 500 samples from the test set (Burda et al., 2015).' However, it does not specify how the data are split into training, validation, and test sets, nor does it define a validation split with percentages or counts, which limits reproducibility. (See the importance-sampling sketch below this table.) |
| Hardware Specification | No | The paper does not provide any specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'conventional optimizers like Adam (Kingma & Ba, 2014)' and 'GRevNets (Liu et al., 2019a)', but it does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, CUDA versions) needed to replicate the experiments. |
| Experiment Setup | Yes | Following previous work, we also consider the curvature K as a learnable parameter with a warmup of 10 epochs, and we clamp the max norm of vectors to 40 before any logarithmic or exponential map (Skopek et al., 2019). (See the clamping sketch below this table.) |
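
The Research Type and Experiment Setup rows quote the paper's two key training details: a learnable curvature K with a 10-epoch warmup, and clamping the max norm of tangent vectors to 40 before any logarithmic or exponential map. The sketch below illustrates one way to implement that clamping for the exponential map at the origin of the Lorentz model. This is not the authors' code: the function names (`clamp_norm`, `exp_map_at_origin`), the convention that tangent vectors omit the time-like coordinate, and the curvature parameterization are illustrative assumptions.

```python
import torch

MAX_NORM = 40.0  # clamp threshold quoted in the paper

def clamp_norm(v: torch.Tensor, max_norm: float = MAX_NORM) -> torch.Tensor:
    """Rescale each vector so its Euclidean norm is at most max_norm."""
    norm = v.norm(dim=-1, keepdim=True).clamp_min(1e-15)
    return v * (max_norm / norm).clamp(max=1.0)

def exp_map_at_origin(v: torch.Tensor, K: torch.Tensor) -> torch.Tensor:
    """Exponential map at the origin of the Lorentz model with curvature -K.

    v: (..., n) tangent vectors at the origin, time-like coordinate omitted.
    K: positive scalar tensor; learnable when wrapped in nn.Parameter.
    Returns points of shape (..., n+1) on the hyperboloid <x, x>_L = -1/K.
    """
    v = clamp_norm(v)    # clamp before the map, as described in the paper
    r = 1.0 / K.sqrt()   # "radius" of the hyperboloid
    vn = v.norm(dim=-1, keepdim=True).clamp_min(1e-15)
    time = r * torch.cosh(vn / r)              # time-like coordinate
    space = r * torch.sinh(vn / r) * (v / vn)  # space-like coordinates
    return torch.cat([time, space], dim=-1)

# Hypothetical usage with a learnable curvature, as cited above.
K = torch.nn.Parameter(torch.tensor(1.0))
z = exp_map_at_origin(torch.randn(8, 2), K)
```

One plausible reading of the 10-epoch warmup (following Skopek et al., 2019) is to keep K fixed for the first 10 epochs before allowing gradient updates to it, e.g. by toggling `K.requires_grad`.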
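
The Dataset Splits row quotes the paper's test-set log-likelihood protocol: importance sampling with 500 samples (Burda et al., 2015). A minimal sketch of such an importance-weighted estimator follows; the `model` interface (`encode_and_sample`, `log_joint`) is a hypothetical stand-in for whatever the paper's VAE exposes, not its actual API.

```python
import math
import torch

@torch.no_grad()
def iw_log_likelihood(model, x: torch.Tensor, num_samples: int = 500) -> torch.Tensor:
    """Importance-weighted estimate of log p(x) (Burda et al., 2015).

    Draws z_k ~ q(z|x) and averages the importance weights
    p(x, z_k) / q(z_k|x) in log space for numerical stability.
    Returns one estimate per element in the batch x.
    """
    log_w = []
    for _ in range(num_samples):
        z, log_qz = model.encode_and_sample(x)  # z ~ q(z|x) and log q(z|x)
        log_pxz = model.log_joint(x, z)         # log p(x|z) + log p(z)
        log_w.append(log_pxz - log_qz)
    log_w = torch.stack(log_w, dim=0)           # (num_samples, batch)
    # log (1/K) * sum_k exp(log_w_k), computed stably with logsumexp
    return torch.logsumexp(log_w, dim=0) - math.log(num_samples)
```

As the number of samples grows, this bound tightens toward log p(x), which is why a 500-sample estimate is sharper than the single-sample ELBO.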