Riemannian Residual Neural Networks
Authors: Isay Katsman, Eric Chen, Sidhanth Holalkere, Anna Asch, Aaron Lou, Ser-Nam Lim, Christopher M. De Sa
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We find that our Riemannian ResNets mirror these desirable properties: when compared to existing manifold neural networks designed to learn over hyperbolic space and the manifold of symmetric positive definite matrices, we outperform both kinds of networks in terms of relevant testing metrics and training dynamics. (A hedged sketch of a Riemannian residual update appears after the table.) |
| Researcher Affiliation | Academia | Isay Katsman (Yale University, isay.katsman@yale.edu); Eric M. Chen, Sidhanth Holalkere (Cornell University, {emc348, sh844}@cornell.edu); Anna Asch (Cornell University, aca89@cornell.edu); Aaron Lou (Stanford University, aaronlou@stanford.edu); Ser-Nam Lim (University of Central Florida, sernam@ucf.edu); Christopher De Sa (Cornell University, cdesa@cs.cornell.edu) |
| Pseudocode | No | No section or figure explicitly labeled 'Pseudocode' or 'Algorithm' was found. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing the source code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | We test on the very hyperbolic Disease (δ = 0) [8] and Airport (δ = 1) [8] datasets. We also test on the considerably less hyperbolic PubMed (δ = 3.5) [47] and CoRA (δ = 11) [46] datasets. |
| Dataset Splits | No | The paper mentions 'validation accuracies' in Table 2, but does not explicitly provide the specific percentages or counts for training, validation, and test splits for the datasets used. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used to run its experiments (e.g., specific GPU/CPU models, memory, or cloud instance types). |
| Software Dependencies | No | The paper does not specify software dependencies with version numbers (e.g., 'PyTorch 1.9' or 'Python 3.8'). |
| Experiment Setup | Yes | Appendix C.2 (Model and Training Details): We use Adam [30] with a learning rate of 0.001 and weight decay 0.0001. We found that the MLP with 2 layers and a hidden dimension of 256 works best. We use a batch size of 256. (A hedged configuration sketch appears after the table.) |
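To make the "Research Type" quote concrete: a Riemannian residual update replaces the Euclidean step x + f(x) with a geodesic step x_{k+1} = exp_{x_k}(f(x_k)), where f outputs a tangent vector at x. Below is a minimal sketch of that idea on the Poincaré ball; the block name `RiemannianResidualBlock`, the choice of the `geoopt` library, and the MLP vector field are illustrative assumptions, not the authors' released code (the paper does not link an implementation).

```python
import torch
import geoopt  # assumed here for manifold operations; not cited in the report


class RiemannianResidualBlock(torch.nn.Module):
    """Hypothetical block: a Euclidean MLP produces a vector, which is
    projected onto the tangent space at x and followed along the geodesic."""

    def __init__(self, dim: int, hidden: int = 256, c: float = 1.0):
        super().__init__()
        self.manifold = geoopt.PoincareBall(c=c)
        self.vector_field = torch.nn.Sequential(
            torch.nn.Linear(dim, hidden),
            torch.nn.ReLU(),
            torch.nn.Linear(hidden, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        v = self.vector_field(x)           # raw MLP output in ambient coordinates
        v = self.manifold.proju(x, v)      # project onto the tangent space at x
        return self.manifold.expmap(x, v)  # residual step x_{k+1} = exp_{x_k}(v)


ball = geoopt.PoincareBall()
x = ball.expmap0(0.1 * torch.randn(8, 16))  # sample points near the origin
y = RiemannianResidualBlock(dim=16)(x)      # outputs remain on the ball
```

Because the tangent-space map and exponential map are the only manifold-specific pieces, the same block structure would apply to other manifolds (e.g., SPD matrices) by swapping the `geoopt` manifold object.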
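The "Experiment Setup" row quotes concrete hyperparameters from Appendix C.2 (Adam, learning rate 0.001, weight decay 0.0001, a 2-layer MLP with hidden dimension 256, batch size 256). Below is a minimal PyTorch sketch wiring those values together; the dataset, input/output dimensions, and cross-entropy loss are placeholder assumptions, as the report does not specify them.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

in_dim, out_dim = 16, 2          # hypothetical dimensions, not from the paper
model = torch.nn.Sequential(     # "MLP with 2 layers and a hidden dimension of 256"
    torch.nn.Linear(in_dim, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, out_dim),
)
# Quoted hyperparameters from Appendix C.2
optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=0.0001)

# Placeholder synthetic data; the paper evaluates on graph/manifold datasets
loader = DataLoader(
    TensorDataset(torch.randn(1024, in_dim), torch.randint(0, out_dim, (1024,))),
    batch_size=256,  # quoted batch size
    shuffle=True,
)

for xb, yb in loader:
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(xb), yb)
    loss.backward()
    optimizer.step()
```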