Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Efficient Label Propagation

Authors: Yasuhiro Fujiwara, Go Irie

ICML 2014

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments demonstrate the significant superiority of our algorithm over existing label propagation methods. We performed experiments to compare the proposed approach to the optimal solution and the power method in terms of efficiency and effectiveness. The experiments used the following standard datasets. Reuters-21578, COIL-100.
Researcher Affiliation	Industry	Yasuhiro Fujiwara EMAIL NTT Software Innovation Center, 3-9-11 Midori-cho Musashino-shi, Tokyo, Japan Go Irie EMAIL NTT Media Intelligence Laboratories, 1-1 Hikarinooka Yokosuka-shi, Kanagawa, Japan
Pseudocode	Yes	Algorithm 1 Proposed algorithm
Open Source Code	No	No, the paper does not include any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets	Yes	Reuters-21578 [1]: This dataset contains documents released by the Reuters newswire. Documents with multiple category labels were discarded. As a result, it contained 8,293 documents of 65 categories. tf-idf was used as the document feature; it has 18,933 dimensions. [1] http://www.daviddlewis.com/resources/testcollections/reuters21578/ COIL-100 [2]: This dataset contains images of 100 objects; the number of object labels is 100. Images of the objects were taken at pose intervals of 5 degrees; 72 poses per object resulting in 7,200 images. We resized all images to 32 × 32 and used RGB pixel values as the feature vector, resulting in 3,048 dimensions. [2] http://www.cs.columbia.edu/CAVE/software/softlib/coil-100.php
Dataset Splits	No	No, the paper mentions that “10 data points in each category/object were initially labeled,” but it does not provide specific train/validation/test split percentages or counts for the entire dataset needed for full reproducibility of data partitioning in a conventional sense.
Hardware Specification	Yes	All experiments were conducted on a Linux 2.70 GHz Intel Xeon server.
Software Dependencies	No	No, the paper mentions general software context but does not provide specific names of libraries, solvers, or packages with their version numbers.
Experiment Setup	Yes	Following previous papers (Zhou et al., 2003; Xu et al., 2011), we set α = 0.99 and stop iterating the power method when the residual drops below 10^-4. In the experiments, 10 data points in each category/object were initially labeled. 100 nearest neighbors were used to construct each graph. We set the parameter λ = 10 on sparse L1 graph.
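
The setup quoted in the Experiment Setup row can be illustrated with a minimal sketch of the power-method baseline it mentions, following the standard label propagation iteration of Zhou et al. (2003). This is not the paper's proposed algorithm. The function names, the 0/1 edge weights, the Frobenius-norm residual, and the toy data are all assumptions made for illustration; only α = 0.99 and the 10^-4 stopping threshold come from the quoted text.

```python
import numpy as np

def knn_affinity(X, k):
    """Symmetric 0/1 k-nearest-neighbour affinity matrix.
    Assumption: the edge weighting is illustrative, not taken from the paper."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(d2, np.inf)                      # exclude self-loops
    idx = np.argsort(d2, axis=1)[:, :k]               # k nearest per row
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(W, W.T)                         # symmetrise

def propagate_labels(W, Y, alpha=0.99, tol=1e-4, max_iter=10000):
    """Power-method label propagation: F <- alpha * S @ F + (1 - alpha) * Y.
    Stops when the Frobenius norm of the update drops below tol
    (assumption: the paper's exact residual definition is not quoted)."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # D^{-1/2} W D^{-1/2}
    F = Y.astype(float).copy()
    for _ in range(max_iter):
        F_next = alpha * (S @ F) + (1.0 - alpha) * Y
        residual = np.linalg.norm(F_next - F)
        F = F_next
        if residual < tol:
            break
    return F

# Toy usage (hypothetical data): two well-separated blobs, one seed label each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (20, 2)), rng.normal(100.0, 1.0, (20, 2))])
Y = np.zeros((40, 2))
Y[0, 0] = 1.0
Y[20, 1] = 1.0
F = propagate_labels(knn_affinity(X, k=10), Y)
predicted = F.argmax(axis=1)
```

With α = 0.99 the update contracts by roughly a factor of 0.99 per step, so reaching a 10^-4 residual takes on the order of a thousand iterations, which is the cost the paper's faster algorithm is compared against.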