A Rotated Hyperbolic Wrapped Normal Distribution for Hierarchical Representation Learning

Authors: Seunghyuk Cho, Juyong Lee, Jaesik Park, Dongwoo Kim

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we first explain the two applications of the distribution defined on hyperbolic space: hyperbolic VAE and probabilistic hyperbolic word embedding model. We then conduct three different experiments to compare the performance of RoWN with four baselines, including the Gaussian distribution in the Euclidean space, the isotropic HWN (Nagano et al., 2019), the diagonal HWN, and the full covariance HWN. We also provide an additional study on a variant of RoWN with learnable rotation direction y in Algorithm 1, and the results are in Table 9.
Researcher Affiliation | Academia | Seunghyuk Cho (1), Juyong Lee (1), Jaesik Park (1,2), Dongwoo Kim (1,2); (1) CSED, POSTECH; (2) GSAI, POSTECH
Pseudocode | Yes | Algorithm 1: Sampling process with the rotated hyperbolic wrapped normal distribution.
Input: mean µ ∈ L^n, diagonal covariance matrix Σ ∈ R^{n×n}. Output: sample z ∈ L^n.
1: x = [±1, 0, ..., 0] ∈ R^n, y = µ_{1:}/‖µ_{1:}‖ (the sign ± is determined by the sign of µ_0)
2: R = I + (yx^T − xy^T) + (yx^T − xy^T)^2 / (1 + ⟨x, y⟩)
3: Rotate the covariance: Σ̂ = RΣR^T
4: Sample v ∼ N(0, Σ̂)
5: return z = f_µ(v)
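A minimal NumPy sketch of the sampling procedure above. The wrapping step f_µ is implemented here as the standard Lorentz-model parallel transport from the origin to µ followed by the exponential map (as in Nagano et al., 2019); function names and the sign convention in step 1 (chosen so that ⟨x, y⟩ ≥ 0, which keeps the rotation formula away from its singularity at ⟨x, y⟩ = −1) are my own assumptions, not the authors' code:

```python
import numpy as np

def lorentz_inner(a, b):
    # Minkowski inner product <a, b>_L = -a0*b0 + <a_1:, b_1:>
    return -a[0] * b[0] + a[1:] @ b[1:]

def aligning_rotation(x, y):
    # Rotation R with R @ x = y for unit vectors x, y in R^n
    # (the Rodrigues-style formula of step 2; singular at <x, y> = -1).
    A = np.outer(y, x) - np.outer(x, y)
    return np.eye(len(x)) + A + A @ A / (1.0 + x @ y)

def rown_sample(mu, sigma_diag, rng):
    # mu: point on the Lorentz model L^n embedded in R^{n+1} (mu[0] is the
    # time component); sigma_diag: the n diagonal covariance entries.
    n = len(mu) - 1
    y = mu[1:] / np.linalg.norm(mu[1:])
    x = np.zeros(n)
    x[0] = 1.0 if y[0] >= 0 else -1.0          # keeps <x, y> >= 0
    R = aligning_rotation(x, y)
    cov = R @ np.diag(sigma_diag) @ R.T        # Sigma_hat = R Sigma R^T
    v = np.concatenate([[0.0], rng.multivariate_normal(np.zeros(n), cov)])
    # f_mu: parallel-transport v from the origin to mu, then exponential map.
    origin = np.zeros(n + 1)
    origin[0] = 1.0
    alpha = -lorentz_inner(origin, mu)         # equals mu[0]
    u = v + lorentz_inner(mu - alpha * origin, v) / (alpha + 1.0) * (origin + mu)
    r = np.sqrt(lorentz_inner(u, u))           # PT preserves the norm of v
    return np.cosh(r) * mu + np.sinh(r) * u / r
```

Parallel transport keeps the sampled tangent vector's norm, so the returned point satisfies the Lorentz constraint ⟨z, z⟩_L = −1 by construction.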
Open Source Code | Yes | The code is available at https://github.com/ml-postech/RoWN.
Open Datasets | Yes | We train a probabilistic word embedding model with the WordNet dataset (Fellbaum, 1998), which consists of 82,115 nouns and 743,241 hypernymy relationships. ... The images of Breakout are collected by using a pre-trained Deep Q-network (Mnih et al., 2015) and divided into a training set and a test set with 90,000 and 10,000 images, respectively.
Dataset Splits | No | The paper mentions training and test sets for the Atari Breakout dataset, but it does not specify a validation split for any of the datasets used, nor does it cite predefined splits that include validation data.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. The self-evaluation checklist also states this information is not included.
Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers, needed to replicate the experiments.
Experiment Setup | Yes | We set the latent dimension the same as the depth. We have initialized the embeddings from N(0, 0.01I), which are then moved to the Lorentz model using the exponential map. We use the learning rate warm-up proposed in (Nagano et al., 2019).
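The quoted initialization can be sketched as follows. This is a hedged illustration, assuming the standard exponential map at the Lorentz-model origin; the embedding dimension is illustrative (the paper sets it equal to the tree depth), and N(0, 0.01I) corresponds to a standard deviation of 0.1:

```python
import numpy as np

def exp_map_origin(v):
    # Exponential map at the Lorentz-model origin [1, 0, ..., 0]:
    # lifts tangent vectors v in R^n onto the manifold L^n in R^{n+1}.
    r = np.linalg.norm(v, axis=-1, keepdims=True)
    safe = np.where(r > 0, r, 1.0)  # avoid 0/0 for the zero vector
    return np.concatenate([np.cosh(r), np.sinh(r) * v / safe], axis=-1)

rng = np.random.default_rng(0)
num_nouns, dim = 82115, 10          # 82,115 WordNet nouns; dim is illustrative
tangent = rng.normal(0.0, 0.1, size=(num_nouns, dim))  # N(0, 0.01 I)
embeddings = exp_map_origin(tangent)
# every row now satisfies the Lorentz constraint <z, z>_L = -1
```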