Trivializations for Gradient-Based Optimization on Manifolds

Authors: Mario Lezcano Casado

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we assess the effectiveness of dynamic trivializations (DTRIV) in the context of orthogonal optimization. We test the framework with the basis changed every K = 1, 100, ∞ steps. We compare it against the most performant previous approaches presented for this task in the context of orthogonal optimization and a vanilla LSTM. These approaches are the orthogonal exponential trivialization (EXPRNN; Lezcano-Casado and Martínez-Rubio, 2019), the orthogonal and unitary Cayley trivializations (SCORNN/SCURNN; Helfrich et al., 2018; Maduranga et al., 2018), and Riemannian gradient descent (RGD; Wisdom et al., 2016). Table 1: Best test accuracy at MNIST and P-MNIST. Table 2: Test MSE at the end of the epoch with the lowest validation MSE for the TIMIT task.
Researcher Affiliation | Academia | Mario Lezcano-Casado, Department of Mathematics, University of Oxford (mario.lezcanocasado@maths.ox.ac.uk)
Pseudocode | Yes | Algorithm 5.1 (Dynamic trivialization through retractions). Given a retraction r, an integer K > 0 or K = ∞, and a starting point p_0, the dynamic trivialization induced by r is defined as the sequence of problems, indexed by i = 0, 1, ..., min_{y ∈ T_{p_i} M} f(r_{p_i}(y)), where p_{i+1} := r_{p_i}(y_{i,K}) ∈ M, and y_{i,k} ∈ T_{p_i} M for k = 1, ..., K is the sequence of approximations given by a Euclidean optimization algorithm (e.g., SGD, ADAM, ADAGRAD, RMSPROP, ...) applied to the i-th problem with starting point y_{i,0} = 0. We say that p_i is the basis at step i. (A minimal sketch of this update loop is given after the table.)
Open Source Code | Yes | An implementation can be found at: https://github.com/Lezcano/expRNN
Open Datasets | Yes | MNIST dataset [LeCun and Cortes, 2010] and TIMIT dataset [Garofolo et al., 1992]
Dataset Splits | No | The paper mentions using well-known datasets (MNIST, TIMIT) and refers to a 'validation MSE' but does not explicitly provide the specific percentages or sample counts for training, validation, and test splits in the main text.
Hardware Specification | No | The paper does not provide specific details on the hardware used for experiments, such as CPU or GPU models, memory, or cloud computing specifications.
Software Dependencies | No | The paper mentions optimization algorithms like ADAM, ADAGRAD, and RMSPROP, but does not specify any software libraries, frameworks, or their version numbers used for implementation.
Experiment Setup | No | The paper states: 'We detail all the hyperparameters and set-up in Appendix F.', indicating that specific experimental setup details are not provided in the main text.
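
The dynamic trivialization loop quoted in the Pseudocode row can be illustrated with a short PyTorch sketch. This is a minimal, illustrative example only, assuming the special orthogonal group SO(n) as the manifold and the matrix exponential of a skew-symmetric matrix as the retraction r_B(A) = B exp(A); the toy objective, function names, and hyperparameters below are assumptions and do not reproduce the author's expRNN implementation.

# A minimal sketch of the dynamic trivialization loop from Algorithm 5.1, written
# for SO(n) with the exponential map as retraction, r_B(A) = B @ expm(A) for
# skew-symmetric A. The objective, hyperparameters, and function names are
# illustrative assumptions, not the paper's code.
import torch

def skew(a):
    # Project a square matrix onto the skew-symmetric matrices
    # (the tangent space of SO(n) at the identity).
    return (a - a.transpose(-1, -2)) / 2

def dynamic_trivialization(f, n, K=100, n_bases=5, inner_lr=0.01):
    # Minimize f over SO(n): take K Euclidean optimizer steps in the tangent
    # space at the current basis p_i, then set p_{i+1} = r_{p_i}(y_{i,K}) and
    # restart the tangent variable at zero, as in Algorithm 5.1.
    base = torch.eye(n)                                 # p_0 = I
    for i in range(n_bases):                            # outer loop over bases p_i
        y = torch.zeros(n, n, requires_grad=True)       # y_{i,0} = 0
        opt = torch.optim.SGD([y], lr=inner_lr)         # any Euclidean optimizer
        for k in range(K):                              # K steps on the i-th problem
            opt.zero_grad()
            q = base @ torch.matrix_exp(skew(y))        # r_{p_i}(y) = p_i exp(y)
            loss = f(q)
            loss.backward()
            opt.step()
        with torch.no_grad():                           # p_{i+1} := r_{p_i}(y_{i,K})
            base = base @ torch.matrix_exp(skew(y))
    return base

# Toy usage: recover a random rotation by minimizing a Frobenius-norm distance.
if __name__ == "__main__":
    torch.manual_seed(0)
    target = torch.matrix_exp(skew(torch.randn(8, 8)))
    result = dynamic_trivialization(lambda q: ((q - target) ** 2).sum(), n=8, K=100)
    print("final squared error:", ((result - target) ** 2).sum().item())

In this sketch, K = 1 would move the basis after every optimizer step (close in spirit to Riemannian gradient descent), while K = ∞ would never move the basis and roughly recovers a static trivialization such as EXPRNN; the K = 1, 100, ∞ settings reported in the Research Type row correspond to these regimes.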