Diffusion Curvature for Estimating Local Curvature in High Dimensional Data

Authors: Dhananjay Bhaskar, Kincaid MacDonald, Oluwadamilola Fasina, Dawson Thomas, Bastian Rieck, Ian Adelstein, Smita Krishnaswamy

NeurIPS 2022

Reproducibility Variable | Result | LLM Response

Research Type | Experimental
  "We show applications of both estimations on toy data, single-cell data and on estimating local Hessian matrices of neural network loss landscapes."

Researcher Affiliation | Academia
  Dhananjay Bhaskar (Department of Genetics, Yale University, dhananjay.bhaskar@yale.edu); Kincaid MacDonald (Department of Mathematics, Yale University, kincaid.macdonald@yale.edu); Oluwadamilola Fasina (Applied Mathematics Program, Yale University, dami.fasina@yale.edu); Dawson Thomas (Department of Mathematics and Department of Physics, Yale University, dawson.thomas@yale.edu); Bastian Rieck (Institute of AI for Health, Helmholtz Pioneer Campus, Helmholtz Munich, bastian.rieck@helmholtz-muenchen.de); Ian Adelstein (Department of Mathematics, Yale University, ian.adelstein@yale.edu); Smita Krishnaswamy (Department of Computer Science, Department of Genetics, Applied Mathematics Program, and Program for Computational Biology and Bioinformatics, Yale University, smita.krishnaswamy@yale.edu)

Pseudocode | No
  The paper does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks or figures.

Open Source Code | Yes
  "3. If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] See Section 4, additional details are provided in the Supplementary Materials. ... 4. If you are using existing assets... (c) Did you include any new assets either in the supplemental material or as a URL? [N/A] We make our data available as part of the code we release."

Open Datasets | Yes
  "We also estimated the curvature of a publicly available single-cell point cloud dataset... [23] Eli R. Zunder, Ernesto Lujan, Yury Goltsev, Marius Wernig, and Garry P. Nolan. A continuous molecular roadmap to iPSC reprogramming through progression analysis of single-cell mass cytometry. Cell Stem Cell, 16(3):323-337, March 2015. ... for a feed-forward neural network trained to classify MNIST digits."

Dataset Splits | Yes
  "3. If you ran experiments... (b) Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] See Section 4, additional details are provided in the Supplementary Materials."

Hardware Specification | Yes
  "Training and testing were done on 8 core Tesla K80 GPUs with 24 GB memory/chip."

Software Dependencies | No
  The paper describes a neural network implementation and its training process (e.g., "CurveNet takes as input points") and states that training details are in the supplementary materials, but it does not explicitly list specific software versions (e.g., Python, PyTorch, or TensorFlow versions) in the main text.

Experiment Setup | Yes
  "We trained CurveNet on N = 1000 samples from idealized quadratic surfaces. Later, we sampled 1000 points in the local neighborhood of the model parameters to estimate curvature of the loss landscape via the diffusion map embedding. ... We used intrinsic dimensions of k = {2, 3, ..., 20} to generate idealized surfaces to ensure that CurveNet can approximate the Hessian within a reasonable range of intrinsic dimensions. We trained on 5000 different randomly generated quadrics, all sampled using N = 1000 points, for each intrinsic dimension."
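The experiment-setup quote above describes training on N = 1000 points sampled from randomly generated idealized quadratic surfaces of a chosen intrinsic dimension k. A minimal sketch of such a generator follows; the function name, coefficient distribution, and sampling ranges are illustrative assumptions, not taken from the authors' released code.

```python
import numpy as np

def sample_quadric(n_points=1000, intrinsic_dim=2, rng=None):
    """Sample points from an idealized quadratic surface z = x^T A x.

    Hypothetical generator: the paper trains on randomly generated
    quadrics of intrinsic dimension k, each sampled with N = 1000
    points; the exact parameterization used in the paper may differ.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Random symmetric matrix of quadratic coefficients (assumed Gaussian).
    B = rng.standard_normal((intrinsic_dim, intrinsic_dim))
    A = (B + B.T) / 2.0
    # Intrinsic coordinates drawn uniformly from a local neighborhood.
    X = rng.uniform(-1.0, 1.0, size=(n_points, intrinsic_dim))
    # Height of the quadric at each point: z_i = x_i^T A x_i.
    z = np.einsum("ni,ij,nj->n", X, A, X)
    # Embed as a hypersurface in R^(k+1).
    return np.column_stack([X, z])

# One training surface at intrinsic dimension k = 3, as in k = {2, 3, ..., 20}.
pts = sample_quadric(n_points=1000, intrinsic_dim=3)
print(pts.shape)  # (1000, 4)
```

Repeating this for 5000 random coefficient matrices per intrinsic dimension would mirror the scale of the training set described in the quote.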