Latent Variable Modelling with Hyperbolic Normalizing Flows

Authors: Joey Bose, Ariella Smofsky, Renjie Liao, Prakash Panangaden, Will Hamilton

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our TC-flow and WHC-flow on three tasks: structured density estimation, graph reconstruction, and graph generation. Throughout our experiments, we rely on three main baselines. In Euclidean space, we use Gaussian latent variables and affine coupling flows (Dinh et al., 2017), denoted N and NC, respectively. In the Lorentz model, we use Wrapped Normal latent variables, H-VAE, as an analogous baseline (Nagano et al., 2019). Since all model parameters are defined on Euclidean tangent spaces, models can be trained with conventional optimizers like Adam (Kingma & Ba, 2014). Following previous work, we also consider the curvature K as a learnable parameter with a warmup of 10 epochs, and we clamp the max norm of vectors to 40 before any logarithmic or exponential map (Skopek et al., 2019). Appendix E contains details on model architectures and implementation details. (The tangent-space clamping is sketched after this table.)
Researcher Affiliation | Academia | McGill University; Mila; University of Toronto; Vector Institute.
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (clearly labeled algorithm sections or code-formatted procedures).
Open Source Code | Yes | https://github.com/joeybose/HyperbolicNF
Open Datasets | Yes | We test the approaches on a branching diffusion process (BDP) and dynamically binarized MNIST (Mathieu et al., 2019; Skopek et al., 2019). (Dynamic binarization is sketched after this table.)
Dataset Splits | No | The paper mentions training and testing sets, and for structured density estimation it states: 'To estimate the log likelihood we perform importance sampling using 500 samples from the test set (Burda et al., 2015).' However, it does not specify how the data are split into training, validation, and test sets, nor does it give percentages or counts for a validation split. (The importance-sampling estimator is sketched after this table.)
Hardware Specification | No | The paper does not report the hardware used for its experiments (exact GPU/CPU models, processor speeds, or memory amounts).
Software Dependencies | No | The paper mentions 'conventional optimizers like Adam (Kingma & Ba, 2014)' and 'GRevNets (Liu et al., 2019a)', but it does not pin any software dependencies to version numbers (e.g., Python, PyTorch, CUDA) needed to replicate the experiments.
Experiment Setup | Yes | Following previous work, we also consider the curvature K as a learnable parameter with a warmup of 10 epochs, and we clamp the max norm of vectors to 40 before any logarithmic or exponential map (Skopek et al., 2019). (A warmup sketch follows this table.)
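
The quoted setup keeps all parameters in Euclidean tangent spaces (which is why Adam applies directly) and clamps vector norms to 40 before any exponential or logarithmic map. Below is a minimal sketch of that clamping on the Lorentz model, assuming PyTorch and unit curvature (the paper additionally learns K); it is an illustration, not the authors' implementation.

```python
import torch

MAX_NORM = 40.0  # clamp value quoted in the paper (Skopek et al., 2019)

def lorentz_inner(x, y):
    # Minkowski inner product <x, y>_L = -x_0 * y_0 + sum_i x_i * y_i
    return -x[..., :1] * y[..., :1] + (x[..., 1:] * y[..., 1:]).sum(-1, keepdim=True)

def clamp_norm(u, max_norm=MAX_NORM):
    # Rescale u so its Euclidean norm never exceeds max_norm.
    norm = u.norm(dim=-1, keepdim=True).clamp_min(1e-9)
    return u * (norm.clamp(max=max_norm) / norm)

def expmap(mu, u, eps=1e-9):
    # Exponential map at mu on the unit-curvature hyperboloid; u is a
    # tangent vector at mu, clamped *before* the map as described above.
    u = clamp_norm(u)
    n = lorentz_inner(u, u).clamp_min(eps).sqrt()
    return torch.cosh(n) * mu + torch.sinh(n) * (u / n)
```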
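
For the MNIST result above, "dynamically binarized" conventionally means each batch is re-binarized by Bernoulli sampling of the grayscale pixel intensities rather than using one fixed binarization, so the model sees a fresh binarization every epoch. A hedged sketch, assuming torchvision (not the authors' data pipeline):

```python
import torch
from torchvision import datasets, transforms

mnist = datasets.MNIST("data/", train=True, download=True,
                       transform=transforms.ToTensor())
loader = torch.utils.data.DataLoader(mnist, batch_size=128, shuffle=True)

for images, _ in loader:
    binary = torch.bernoulli(images)  # fresh {0, 1} sample every batch
    # ... feed `binary` to the model
```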
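
The 500-sample log-likelihood estimate quoted under Dataset Splits is the importance-weighted estimator of Burda et al. (2015). A sketch follows, reading it as K = 500 importance samples per test point; `encode` and `log_joint` are hypothetical placeholders for the inference network and the model's joint density.

```python
import math
import torch

def iwae_log_likelihood(x, encode, log_joint, K=500):
    # log p(x) ~= logsumexp_k [log p(x, z_k) - log q(z_k | x)] - log K
    q = encode(x)                            # variational posterior q(z | x)
    z = q.rsample((K,))                      # K latent samples per test point
    log_w = log_joint(x, z) - q.log_prob(z)  # log importance weights
    # (assumes q.log_prob returns a joint log-density per sample)
    return torch.logsumexp(log_w, dim=0) - math.log(K)
```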
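
Finally, one plausible reading of the 10-epoch curvature warmup is to hold the learnable curvature K fixed at first and then train it jointly with the other parameters. The authors' exact schedule may differ; this fragment only illustrates the idea.

```python
import torch

K = torch.nn.Parameter(torch.tensor(-1.0))  # learnable curvature, K < 0
optimizer = torch.optim.Adam([K], lr=1e-3)  # other model parameters omitted
WARMUP_EPOCHS, NUM_EPOCHS = 10, 100

for epoch in range(NUM_EPOCHS):
    K.requires_grad_(epoch >= WARMUP_EPOCHS)  # frozen during warmup
    # ... compute the flow/VAE loss with curvature K and step `optimizer`
```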