reproducibilityindex.ai

Provable Bounds for Learning Some Deep Representations

Authors: Sanjeev Arora, Aditya Bhaskara, Rong Ge, Tengyu Ma

ICML 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We did a basic implementation of our algorithms and ran experiments on synthetic data. The results show that our analysis has the right asymptotics, and that the constants involved are all small. As is to be expected, the most expensive step is that of ﬁnding the negative edges (since the algorithm is to eliminate non-edges ). However, stopping the algorithm after a certain number of iterations gives a reasonable over-estimate for the negative edges. The following are the details of one experiment: a network with two hidden layers and one observed layer (ℓ= 2, both layers use thresholds) with n = 5000, d = 16 and ρℓn = 4; the weights are random { 1} but we made sure that each node has exactly 8 positive edges and 8 negative edges. Using 500, 000 samples, in less than one hour (on a single 2GHz processor) the algorithm correctly learned the ﬁrst layer, and the positive edges of the second layer. Learning the negative edges of second layer (as per our analysis) requires many more samples. However using 5 106 samples, the algorithm makes only 10 errors in learning negative edges.
Researcher Affiliation	Collaboration	Sanjeev Arora ARORA@CS.PRINCETON.EDU Princeton University, Computer Science Department and Center for Computational Intractability, Princeton 08540, USA Aditya Bhaskara BHASKARA@CS.PRINCETON.EDU Google Research, New York, NY 10011, USA Rong Ge RONGGE@MICROSOFT.COM Microsoft Research, Cambridge, MA 02142, USA Tengyu Ma TENGYU@CS.PRINCETON.EDU Princeton University, Computer Science Department and Center for Computational Intractability, Princeton 08540, USA
Pseudocode	Yes	Algorithm 1 High Level Algorithm ... Algorithm 2 Pairwise Graph ... Algorithm 3 Partial Encoder ... Algorithm 4 Learning Graph ... Algorithm 5 Recover Graph
Open Source Code	No	The paper does not provide any explicit statements or links to open-source code for the methodology described.
Open Datasets	No	The paper mentions running experiments on "synthetic data" but does not provide concrete access information (e.g., link, citation, generation script) for this data to be publicly available.
Dataset Splits	No	The paper does not provide specific dataset split information (e.g., percentages, sample counts, or references to predefined splits) for training, validation, or testing.
Hardware Specification	Yes	Using 500, 000 samples, in less than one hour (on a single 2GHz processor) the algorithm correctly learned the ﬁrst layer, and the positive edges of the second layer.
Software Dependencies	No	The paper does not provide specific software dependencies with version numbers (e.g., library names with versions) needed to replicate the experiments.
Experiment Setup	Yes	The following are the details of one experiment: a network with two hidden layers and one observed layer (ℓ= 2, both layers use thresholds) with n = 5000, d = 16 and ρℓn = 4; the weights are random { 1} but we made sure that each node has exactly 8 positive edges and 8 negative edges.