Riemannian Adaptive Optimization Methods

Authors: Gary Bécigneul, Octavian-Eugen Ganea

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimentally, we show faster convergence and to a lower train loss value for Riemannian adaptive methods over their corresponding baselines on the realistic task of embedding the WordNet taxonomy in the Poincaré ball.
Researcher Affiliation | Academia | Gary Bécigneul, Octavian-Eugen Ganea, Department of Computer Science, ETH Zürich, Switzerland
Pseudocode | Yes | Figure 1: Comparison of the Riemannian and Euclidean versions of AMSGRAD. (a) RAMSGRAD in M_1 × ... × M_n. (b) AMSGRAD in R^n. (A hedged sketch of such an update appears after the table.)
Open Source Code | No | The paper does not provide an explicit statement or link for open-source code availability.
Open Datasets | Yes | For this, we follow (Nickel & Kiela, 2017) and embed the transitive closure of the WordNet noun hierarchy (Miller et al., 1990) in the n-dimensional Poincaré model D^n of hyperbolic geometry. (See the Poincaré distance sketch after the table.)
Dataset Splits | Yes | For link prediction we sample a validation set of 2% edges from the set of transitive closure edges that contain no leaf node or root. (A sampling sketch appears after the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers.
Experiment Setup | Yes | For all methods we use the same burn-in phase described in (Nickel & Kiela, 2017) for 20 epochs, with a fixed learning rate of 0.03 and using RSGD with retraction as explained in Sec. 2.2. ... We always use β1 = 0.9 and β2 = 0.999 for these methods as these achieved the lowest training loss. (A burn-in sketch appears after the table.)
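
The Pseudocode row refers to the paper's Figure 1, which compares RAMSGRAD with Euclidean AMSGRAD. As a rough illustration only, here is a minimal sketch of one RAMSGRAD-style step on the Poincaré ball. It assumes a retraction in place of the exponential map and omits the parallel transport of momentum; the function names and the `EPS` clipping constant are illustrative and do not come from the authors' (unreleased) code.

```python
import numpy as np

EPS = 1e-5  # illustrative clipping margin, not a value from the paper

def project(x):
    """Clip a point back inside the open unit ball (Poincaré model)."""
    norm = np.linalg.norm(x)
    return x * ((1.0 - EPS) / norm) if norm >= 1.0 - EPS else x

def riemannian_gradient(x, euclidean_grad):
    """Rescale the Euclidean gradient by the inverse of the Poincaré metric."""
    return ((1.0 - np.dot(x, x)) ** 2 / 4.0) * euclidean_grad

def ramsgrad_step(x, euclidean_grad, m, v, v_hat, lr=1e-3, b1=0.9, b2=0.999):
    """One adaptive step: AMSGrad bookkeeping plus a retraction-based move."""
    g = riemannian_gradient(x, euclidean_grad)
    m = b1 * m + (1.0 - b1) * g                        # momentum (vector transport omitted)
    v = b2 * v + (1.0 - b2) * float(np.dot(g, g))      # squared gradient norm (Euclidean, for simplicity)
    v_hat = max(v_hat, v)                              # AMSGrad "max" trick
    x = project(x - lr * m / (np.sqrt(v_hat) + EPS))   # retraction: Euclidean step, then clip
    return x, m, v, v_hat
```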
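For the WordNet task in the Open Datasets row, the paper follows Nickel & Kiela (2017). Below is a hedged sketch of the Poincaré distance and the softmax ranking loss used in that setup; `negatives`, a list of sampled non-neighbour embeddings, is an assumed interface rather than the authors' code.

```python
import numpy as np

def poincare_distance(u, v):
    """Geodesic distance between two points of the open unit ball."""
    sq_diff = float(np.dot(u - v, u - v))
    denom = (1.0 - float(np.dot(u, u))) * (1.0 - float(np.dot(v, v)))
    return float(np.arccosh(1.0 + 2.0 * sq_diff / denom))

def ranking_loss(u, v_pos, negatives):
    """Negative log-likelihood of the true edge against sampled negatives."""
    scores = np.array([-poincare_distance(u, v_pos)] +
                      [-poincare_distance(u, n) for n in negatives])
    return float(np.log(np.sum(np.exp(scores))) - scores[0])
```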
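The Dataset Splits row describes sampling 2% of the transitive-closure edges as a validation set while excluding edges that touch a leaf or the root. One possible reading of that procedure, with assumed data structures (an edge list plus precomputed leaf and root node sets), is:

```python
import random

def sample_validation_edges(edges, leaf_nodes, root_node, frac=0.02, seed=0):
    """Sample a validation split from transitive-closure edges, avoiding leaves and the root."""
    eligible = [(u, v) for (u, v) in edges
                if u != root_node and v != root_node
                and u not in leaf_nodes and v not in leaf_nodes]
    k = int(frac * len(eligible))  # assumption: the 2% is taken from the eligible edges
    return random.Random(seed).sample(eligible, k)
```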
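Finally, the Experiment Setup row describes a 20-epoch burn-in at a fixed learning rate of 0.03 using RSGD with retraction, after which the adaptive methods (β1 = 0.9, β2 = 0.999) take over. A minimal, self-contained sketch of that schedule and of one retraction-based RSGD step, under the same unit-ball clipping assumption as above:

```python
import numpy as np

BURN_IN_EPOCHS = 20   # from the quoted setup
BURN_IN_LR = 0.03     # fixed burn-in learning rate
EPS = 1e-5            # illustrative clipping margin

def rsgd_retraction_step(x, euclidean_grad, lr):
    """One RSGD step: rescale the gradient, take a Euclidean step, clip back into the ball."""
    g = ((1.0 - np.dot(x, x)) ** 2 / 4.0) * euclidean_grad  # Riemannian gradient
    x = x - lr * g                                           # retraction = straight step
    norm = np.linalg.norm(x)
    return x * ((1.0 - EPS) / norm) if norm >= 1.0 - EPS else x

def learning_rate(epoch, adaptive_lr):
    """Fixed rate during burn-in, then whatever rate the adaptive method was tuned with."""
    return BURN_IN_LR if epoch < BURN_IN_EPOCHS else adaptive_lr
```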