reproducibilityindex.ai

Subsampling Methods for Persistent Homology

Authors: Frederic Chazal, Brittany Fasy, Fabrizio Lecci, Bertrand Michel, Alessandro Rinaldo, Larry Wasserman

ICML 2015

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In Section 6, we apply our methods to two examples. Since computing the persistent homology of the Vietoris-Rips (VR) filtrations built on top of large samples is infeasible, we resort to the subsampling strategy described in Section 3. More formally, let X_N = {x_1, ..., x_N} be a large point cloud. We draw n subsamples, each of size m ≪ N points, from µ, the discrete uniform measure on X_N. First, we use a toy example to compare the time complexity of computing the persistent homology of the entire point cloud with the complexity of the subsampling approach.
Researcher Affiliation	Academia	Frederic Chazal, FREDERIC.CHAZAL@INRIA.FR, INRIA Saclay, Palaiseau, 91120, France; Brittany Terese Fasy, BRITTANY@FASY.US, Computer Science Department, Tulane University, New Orleans, LA 70118; Fabrizio Lecci, LECCI@CMU.EDU, Department of Statistics, Carnegie Mellon University, Pittsburgh, PA 15213; Bertrand Michel, BERTRAND.MICHEL@UPMC.FR, LSTA, Université Pierre et Marie Curie (UPMC), Paris, 75005, France; Alessandro Rinaldo, ARINALDO@CMU.EDU, Department of Statistics, Carnegie Mellon University, Pittsburgh, PA 15213; Larry Wasserman, LARRY@CMU.EDU, Department of Statistics, Carnegie Mellon University, Pittsburgh, PA 15213
Pseudocode	No	The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code	Yes	Software. The computations in this paper were done using the R package TDA (Fasy et al., 2014a). The package includes a series of tools for the statistical analysis of persistent homology, including the methods described in Fasy et al. (2014b), Chazal et al. (2014b), Chazal et al. (2014a), and this paper.
Open Datasets	Yes	We use the publicly available database of triangulated shapes (Sumner & Popović, 2004). [...] The dataset is publicly available at the UCI Machine Learning Repository (footnote 1) and is described in Barshan & Yüksek (2013), where it is used to classify 19 activities performed by eight people wearing sensor units on the chest, arms, and legs. For ease of illustration, we report here the results on four activities (walking, stepper, cross trainer, jumping) performed by a single person (#1). (Footnote 1: http://archive.ics.uci.edu/ml/datasets/Daily+and+Sports+Activities)
Dataset Splits	No	The paper does not explicitly provide details about training, validation, or test dataset splits. It only mentions general data usage in experiments.
Hardware Specification	Yes	[...] required 28.34 seconds on a MacBook Pro with 2.8 GHz processor and 16 GB RAM.
Software Dependencies	No	The computations in this paper were done using the R package TDA (Fasy et al., 2014a). However, specific version numbers for the R package TDA or R itself are not provided.
Experiment Setup	Yes	More formally, let X_N = {x_1, ..., x_N} be a large point cloud. We draw n subsamples, each of size m ≪ N points, from µ, the discrete uniform measure on X_N. The average landscape on the right plot is computed using n = 10 subsamples of size m = 100. For n = 100 times we subsample m = 300 points from each shape. For n = 80 times, we subsample m = 200 points from the point cloud of each activity.
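
The subsampling step quoted in the Experiment Setup row can be illustrated with a short sketch. This is not the authors' code (the paper used the R package TDA); it is a minimal Python illustration under stated assumptions. The function names `draw_subsamples`, `average_summary`, and `toy_summary` are hypothetical, the draws are taken with replacement on the assumption that "from µ" means i.i.d. draws from the uniform measure, and `toy_summary` is only a stand-in for a per-subsample persistence landscape, which is not computed here.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]


def draw_subsamples(cloud: Sequence[Point], n: int, m: int,
                    seed: int = 0) -> List[List[Point]]:
    """Draw n subsamples of m points each from the uniform measure on cloud."""
    rng = random.Random(seed)
    # Assumption: i.i.d. draws from the discrete uniform measure, i.e.
    # sampling with replacement. random.choices does exactly that.
    return [rng.choices(cloud, k=m) for _ in range(n)]


def average_summary(subsamples: List[List[Point]],
                    summary: Callable[[List[Point]], List[float]]) -> List[float]:
    """Pointwise average of a vector-valued summary over all subsamples."""
    values = [summary(s) for s in subsamples]
    return [sum(column) / len(values) for column in zip(*values)]


def toy_summary(sample: List[Point]) -> List[float]:
    """Stand-in summary (NOT a persistence landscape): mean and max radius."""
    radii = [math.hypot(x, y) for x, y in sample]
    return [sum(radii) / len(radii), max(radii)]


# Toy point cloud: N = 2000 points on the unit circle.
N = 2000
cloud = [(math.cos(2 * math.pi * i / N), math.sin(2 * math.pi * i / N))
         for i in range(N)]

# n = 10 subsamples of size m = 100, matching one setting quoted above.
subsamples = draw_subsamples(cloud, n=10, m=100)
avg = average_summary(subsamples, toy_summary)
```

In an actual analysis, `toy_summary` would be replaced by a function that builds a filtration on the subsample and evaluates its persistence landscape on a fixed grid; the pointwise averaging step would stay the same.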