Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..

A Fast Adaptive Randomized PCA Algorithm

Authors: Xu Feng, Wenjian Yu

IJCAI 2023 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental results show that far PCA is much faster than the baseline methods (rand QB EI, rand UBV and svds) in practical setting of multi-thread computing, while producing nearly optimal results of adpative PCA.
Researcher Affiliation	Academia	Xu Feng and Wenjian Yu Department of Computer Science and Technology, BNRist, Tsinghua University, Beijing, China EMAIL, EMAIL
Pseudocode	Yes	Algorithm 1 Basic randomized PCA with power iteration ... Algorithm 5 Fast adaptive randomized PCA (far PCA)
Open Source Code	Yes	The codes of far PCA are shared on Git Hub (https://github.com/XuFengthucs/farPCA).
Open Datasets	Yes	Four real-world matrices are considered for testing. ... The other three are sparse matrices: an 82, 168 82, 168 social network matrix from SNAP [Leskovec and Krevl, 2014] with 948,464 nonzero elements, a 138, 493 26, 744 matrix from Movielens dataset [Harper and Konstan, 2016] named Movielens-20m with 20,000,263 nonzero elements, and a 270, 896 45, 115 matrix from Movielens dataset named Movielens with 26,024,289 nonzero elements, which is larger than Movielens-20m.
Dataset Splits	No	The paper refers to 'real-world test cases' but does not specify explicit training, validation, or test dataset splits (e.g., percentages, sample counts, or predefined splits with citations).
Hardware Specification	Yes	All experiments are carried out on a Ubuntu server with two 8-core Intel Xeon CPU (at 2.10 GHz) and 512 GB RAM.
Software Dependencies	Yes	The proposed algorithms are implemented both in Matlab and in C with MKL2 and Open MP directives for multi-thread parallel computing. ... svds in Matlab 2020b is used for computing the accurate results.
Experiment Setup	Yes	The error tolerance is set ε = 0.1 A F for the case Image, and ε = 0.5 A F for the rest cases. The block parameter b is set to min(m, n)/100. ... we vary power parameter p while keeping b = 20 and k = 200