Principles of Riemannian Geometry in Neural Networks

Authors: Michael Hauser, Asok Ray

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "This study deals with neural networks in the sense of geometric transformations acting on the coordinate representation of the underlying data manifold which the data is sampled from. It forms part of an attempt to construct a formalized general theory of neural networks in the setting of Riemannian geometry. From this perspective, the following theoretical results are developed and proven for feedforward networks. [...] Toy experiments were run to confirm parts of the proposed theory, as well as to provide intuitions as to how neural networks operate on data." Section 7 ("Numerical experiments") states: "This section presents the results of numerical experiments used to understand the proposed theory." (A one-line formulation of this layerwise picture is sketched after the table.)
Researcher Affiliation | Academia | Michael Hauser, Department of Mechanical Engineering, Pennsylvania State University, State College, PA 16801, mzh190@psu.edu; Asok Ray, Department of Mechanical Engineering, Pennsylvania State University, State College, PA 16801, axr2@psu.edu
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. It mentions using Theano, a third-party library, but does not release its own implementation.
Open Datasets | Yes | "In [4], a deep residual convolution network was trained on ImageNet in the usual fashion except parameter weights between residual blocks at the same dimension were shared, at a cost to the accuracy of only 0.2%. (...) The C^0 network is a standard network, while the C^1 network is a residual network and the C^2 network also exhibits smooth layerwise transformations. All networks achieve 0.0% error rates. (...) Figure 2: Untangling the same spiral with 2-dimensional neural networks with different constraints on smoothness." (The update rules behind the C^0, C^1, and C^2 networks are sketched in code after the table.)
Dataset Splits | No | The paper mentions training batch sizes (e.g., "A batch size of 300 for untangling data", "A batch size of 1000 for untangling data") but does not give train/validation/test splits (percentages, sample counts, or an explicit splitting methodology).
Hardware Specification | No | The paper does not give specific hardware details (GPU/CPU models, memory amounts, or machine specifications) for its experiments; it mentions only "GPU implementations" in general terms.
Software Dependencies | No | The paper states that "GPU implementations of the neural networks are written in the Python library Theano [2, 16]". Theano is a named software dependency, but no version number is given, and the citations [2, 16] point to papers about Theano rather than to a specific release.
Experiment Setup | Yes | "The C^∞ hyperbolic tangent has been used for all experiments, with weights initialized according to [5]. (...) A batch size of 300 for untangling data. (...) A batch size of 1000 for untangling data. (...) The 10 layer network is unable to properly separate the data and achieves a 1% error rate, whereas the 20 and 40 layer networks both achieve 0% error rates." (An initialization sketch follows the table.)
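
The layerwise-transformation picture quoted under Research Type can be condensed into one formula. As a minimal sketch in generic notation (the symbols below are ours, not necessarily the paper's): if x^(l) denotes the coordinate representation of the data at layer l and f_l the layer's learned map, a residual (C^1) layer reads as a unit-step forward-Euler discretization of a flow on the data manifold:

    % x^{(l)}: coordinates of the data at layer l; f_l: learned layer map
    % residual update = forward-Euler step of a flow, with step size 1
    \[
      x^{(l+1)} = x^{(l)} + f_l\bigl(x^{(l)}\bigr)
      \quad\Longleftrightarrow\quad
      x^{(l+1)} - x^{(l)} = f_l\bigl(x^{(l)}\bigr),
    \]

whereas a C^0 (standard) network drops the identity term, x^(l+1) = f_l(x^(l)), and so performs unconstrained coordinate changes rather than small smooth ones.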
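
To make the C^0/C^1/C^2 comparison from the Open Datasets row concrete, here is a minimal forward-pass sketch in Python/NumPy. The C^0 (standard) and C^1 (residual) updates follow directly from the quoted text; the C^2 update is written as a second-order finite difference, which is our assumption about how "smooth layerwise transformations" are enforced, not something the excerpt confirms. Function and variable names are ours.

    import numpy as np

    def layer(x, W, b):
        # One learned layer map with the tanh nonlinearity used in the paper.
        return np.tanh(x @ W + b)

    def c0_forward(x, params):
        # C^0 "standard" network: each layer replaces the coordinates outright.
        for W, b in params:
            x = layer(x, W, b)
        return x

    def c1_forward(x, params):
        # C^1 residual network: x_{l+1} = x_l + f_l(x_l) (forward-Euler step).
        for W, b in params:
            x = x + layer(x, W, b)
        return x

    def c2_forward(x, params):
        # Assumed C^2 update: x_{l+1} = 2 x_l - x_{l-1} + f_l(x_l), a
        # second-order finite difference; with x_{-1} = x_0 the first step
        # reduces to a plain residual step.
        x_prev = x
        for W, b in params:
            x, x_prev = 2.0 * x - x_prev + layer(x, W, b), x
        return x

Each function expects params as a list of (W, b) pairs with matching shapes, e.g. 2-by-2 weight matrices for the 2-dimensional spiral-untangling experiments.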
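
The Experiment Setup row cites "[5]" for weight initialization without resolving the reference. For tanh networks the customary choice is Glorot/Xavier uniform initialization, so the sketch below assumes that is what [5] denotes; the 2-by-2 shapes match the 2-dimensional untangling experiments quoted above, and the resulting parameters feed directly into the forward passes sketched previously.

    import numpy as np

    rng = np.random.default_rng(0)

    def glorot_uniform(fan_in, fan_out):
        # Glorot/Xavier uniform: U(-a, a) with a = sqrt(6 / (fan_in + fan_out)).
        # Assumes citation [5] is Glorot & Bengio (2010); the excerpt does not say.
        a = np.sqrt(6.0 / (fan_in + fan_out))
        return rng.uniform(-a, a, size=(fan_in, fan_out))

    # Ten 2-D layers, matching the 10-layer spiral experiment quoted above.
    params = [(glorot_uniform(2, 2), np.zeros(2)) for _ in range(10)]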