Community Recovery in Graphs with Locality

Authors: Yuxin Chen, Govinda Kamath, Changho Suh, David Tse

ICML 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To verify the practical applicability of the proposed algorithms, we have conducted simulations in various settings. All these experiments focused on graphs with n = 100,000 vertices, and used an error rate of θ = 10% unless otherwise noted. For each point, the empirical success rates averaged over 10 Monte Carlo runs are reported. To evaluate the performance of our algorithm on real data, we ran Spectral-Stitching for Chromosomes 1-22 on the NA12878 data-set made available by 10x-Genomics (10x Genomics, 2015).
Researcher Affiliation | Academia | Yuxin Chen (YXCHEN@STANFORD.EDU); Govinda M. Kamath (GKAMATH@STANFORD.EDU); Changho Suh (CHSUH@KAIST.AC.KR); David Tse + (DNTSE@STANFORD.EDU). Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA; Department of Electrical Engineering, KAIST, Daejeon 305-701, Korea; + Department of EECS, University of California, Berkeley, CA 94720, USA.
Pseudocode | Yes | Algorithm 1: Spectral-Expanding and Algorithm 2: Spectral-Stitching.
Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the described methodology, nor a link to a repository.
Open Datasets | Yes | To evaluate the performance of our algorithm on real data, we ran Spectral-Stitching for Chromosomes 1-22 on the NA12878 data-set made available by 10x-Genomics (10x Genomics, 2015). The nominal error rate per read is p = 1%, and the average number of SNPs touched by each sample is L ∈ [6, 7]. The number of SNPs n ranges from 34240 to 191829, with the sample size m from 102633 to 574189.
Dataset Splits | No | The paper describes experiments and simulations using n = 100,000 vertices and Monte Carlo runs, but does not provide specific details on dataset splits (e.g., train/validation/test percentages or counts) or a cross-validation setup.
Hardware Specification | Yes | The time taken to run Spectral-Expanding on a MacBook Pro equipped with a 2.9 GHz Intel Core i5 and 8GB of memory over rings R_r, where n = 100,000, θ = 10% and m = 1.5m.
Software Dependencies | No | The paper does not name specific software dependencies or libraries, with version numbers, needed to replicate the experiments.
Experiment Setup | Yes | All these experiments focused on graphs with n = 100,000 vertices, and used an error rate of θ = 10% unless otherwise noted. For each point, the empirical success rates averaged over 10 Monte Carlo runs are reported. The nominal error rate per read is p = 1%, and the average number of SNPs touched by each sample is L ∈ [6, 7]. The number of SNPs n ranges from 34240 to 191829, with the sample size m from 102633 to 574189. Here, we split all vertices into overlapping subsets of size W = 100.
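
The setup quoted above mentions two mechanical steps: splitting the vertices into overlapping subsets of size W = 100, and averaging success over 10 Monte Carlo runs. The Python sketch below illustrates both under stated assumptions and is not the authors' code. The function names `overlapping_windows` and `empirical_success_rate` are hypothetical, the overlap of W/2 between adjacent subsets is an assumption (the summary above gives only the subset size), and `trial` stands in for whatever recovery experiment is being evaluated.

```python
import random


def overlapping_windows(n, w, stride=None):
    """Return half-open index ranges (start, end) of length w covering 0..n-1.

    Assumption: adjacent windows share w // 2 vertices unless a stride is given.
    """
    if stride is None:
        stride = max(1, w // 2)
    if n <= w:
        # Fewer vertices than one window: a single window covers everything.
        return [(0, n)]
    starts = list(range(0, n - w + 1, stride))
    if starts[-1] + w < n:
        # Add a final window flush with the end so no vertex is left out.
        starts.append(n - w)
    return [(s, s + w) for s in starts]


def empirical_success_rate(trial, runs=10, seed=0):
    """Average the boolean outcome of `trial(rng)` over independent runs.

    `trial` is a hypothetical callable that runs one experiment with the
    supplied random generator and returns True on success.
    """
    rng = random.Random(seed)
    successes = sum(1 for _ in range(runs) if trial(rng))
    return successes / runs


if __name__ == "__main__":
    windows = overlapping_windows(100_000, 100)
    print(len(windows), windows[0], windows[-1])

    # Placeholder trial: succeeds with probability 0.9, for illustration only.
    print(empirical_success_rate(lambda rng: rng.random() < 0.9))
```

With only 10 runs, each reported success rate is a multiple of 0.1, so such estimates are fairly coarse.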