Riemannian Residual Neural Networks

Authors: Isay Katsman, Eric Chen, Sidhanth Holalkere, Anna Asch, Aaron Lou, Ser-Nam Lim, Christopher M. De Sa

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We find that our Riemannian ResNets mirror these desirable properties: when compared to existing manifold neural networks designed to learn over hyperbolic space and the manifold of symmetric positive definite matrices, we outperform both kinds of networks in terms of relevant testing metrics and training dynamics.
Researcher Affiliation | Academia | Isay Katsman (Yale University, isay.katsman@yale.edu); Eric M. Chen, Sidhanth Holalkere (Cornell University, {emc348, sh844}@cornell.edu); Anna Asch (Cornell University, aca89@cornell.edu); Aaron Lou (Stanford University, aaronlou@stanford.edu); Ser-Nam Lim (University of Central Florida, sernam@ucf.edu); Christopher De Sa (Cornell University, cdesa@cs.cornell.edu)
Pseudocode | No | No section or figure explicitly labeled 'Pseudocode' or 'Algorithm' was found.
Open Source Code | No | The paper does not contain an explicit statement about releasing the source code or a link to a code repository for the methodology described.
Open Datasets | Yes | We test on the very hyperbolic Disease (δ = 0) [8] and Airport (δ = 1) [8] datasets. We also test on the considerably less hyperbolic PubMed (δ = 3.5) [47] and CoRA (δ = 11) [46] datasets.
Dataset Splits | No | The paper mentions 'validation accuracies' in Table 2, but does not explicitly provide the specific percentages or counts for the training, validation, and test splits of the datasets used.
Hardware Specification | No | The paper does not explicitly describe the specific hardware used to run its experiments (e.g., specific GPU/CPU models, memory, or cloud instance types).
Software Dependencies | No | The paper does not specify software dependencies with version numbers (e.g., 'PyTorch 1.9' or 'Python 3.8').
Experiment Setup | Yes | Appendix C.2, Model and Training Details: We use Adam [30] with a learning rate of 0.001 and weight decay 0.0001. We found that the MLP with 2 layers and a hidden dimension of 256 works best. We use a batch size of 256.
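For reference, the hyperparameters quoted in the Experiment Setup row can be written out as a short training-configuration sketch. This is a minimal illustration only, assuming PyTorch; the input/output dimensions, the synthetic data, and the training loop are hypothetical and are not taken from the paper, which only fixes the optimizer settings, depth, hidden dimension, and batch size.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical input/output sizes and synthetic data; the quoted setup only
# fixes the hidden dimension (256), depth (2 layers), optimizer, and batch size.
in_dim, out_dim, n_samples = 128, 10, 1024
dataset = TensorDataset(torch.randn(n_samples, in_dim),
                        torch.randint(0, out_dim, (n_samples,)))

# Two-layer MLP with a hidden dimension of 256 (Appendix C.2).
model = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, out_dim))

# Adam with learning rate 0.001 and weight decay 0.0001; batch size 256.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
loader = DataLoader(dataset, batch_size=256, shuffle=True)

# One illustrative training epoch (loss choice is an assumption).
loss_fn = nn.CrossEntropyLoss()
for x, y in loader:
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
```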