Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
On Convergence of Distributed Approximate Newton Methods: Globalization, Sharper Bounds and Beyond
Authors: Xiao-Tong Yuan, Ping Li
JMLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical evidence is provided to confirm the theoretical and practical advantages of our methods. Numerical evaluation results are presented and discussed in Section 4. Finally, we conclude this article in Section 5. All the technical proofs of theoretical results are deferred to the appendix section. |
| Researcher Affiliation | Industry | Xiao-Tong Yuan (EMAIL), Cognitive Computing Lab, Baidu Research, Beijing 100193, China; Ping Li (EMAIL), Cognitive Computing Lab, Baidu Research, Bellevue, WA 98004, USA |
| Pseudocode | Yes | Algorithm 1: DANE with backtracking line search, DANE-LS(γ, ρ, ν); Algorithm 2: DANE with Heavy-Ball acceleration, DANE-HB(γ, β); Algorithm 3: Distributed Doubly Approximate Newton, D2ANE(γ, β, ℓ) |
| Open Source Code | No | The paper does not contain an explicit statement about the release of their own source code, nor does it provide a link to a code repository for the described methodology. It mentions using 'SGDLibrary (Kasai, 2017)' as a third-party solver. |
| Open Datasets | Yes | Next, we evaluate the convergence performance of the considered algorithms on two real data sets gisette (Guyon et al., 2005) (p = 5000, N = 6000) and rcv1.binary (Lewis et al., 2004) (p = 47236, N = 20242). |
| Dataset Splits | No | We replicate each experiment 10 times over random splits of the data and report the mean along with error bars. |
| Hardware Specification | Yes | We simulate the distributed environment on a single server powered by a dual Intel(R) Xeon(R) E5-2630V4@2.2GHz CPU, with multiple logical processors simulating multiple machines. |
| Software Dependencies | Yes | All the considered methods are implemented in Matlab R2018b on Microsoft Windows 10. The local subproblems on the master machine are solved by an SVRG solver from SGDLibrary (Kasai, 2017) |
| Experiment Setup | Yes | We initialize w(0) = 0 throughout our numerical study. For our simulation study, we test with feature dimensions p ∈ {200, 500}. We fix N = 10p, µ = 1/√N, and study the impact of varying the number of machines m and regularization γ = O(1/√n) on the needed rounds of communication to reach sub-optimality ε = 10⁻⁶. For each data set, we fix the regularization parameter µ = 10⁻⁵ and test with m ∈ {4, 16, 32}. |
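The table above mentions a heavy-ball-accelerated variant (DANE-HB) and an experiment setup with initialization w(0) = 0 and sub-optimality tolerance 10⁻⁶. The sketch below is not the paper's distributed DANE-HB; it only illustrates the classical Polyak heavy-ball update w_{t+1} = w_t − γ∇f(w_t) + β(w_t − w_{t−1}) on a tiny ridge-regularized least-squares problem, with an assumed step size and momentum, to show how such an iteration is driven below a small tolerance.

```python
# Illustrative heavy-ball (Polyak momentum) iteration on
#   f(w) = 0.5*||A w - b||^2 + 0.5*mu*||w||^2
# with a 2x2 toy matrix. Step size and momentum are hypothetical choices;
# mu = 1e-5 and w(0) = 0 mirror the paper's reported experiment setup.

A = [[2.0, 0.0], [0.0, 1.0]]
b = [1.0, 1.0]
mu = 1e-5               # regularization parameter, as in the real-data experiments
step, beta = 0.3, 0.5   # hypothetical step size and momentum coefficient

def grad(w):
    # gradient of f: A^T (A w - b) + mu * w
    r = [A[i][0] * w[0] + A[i][1] * w[1] - b[i] for i in range(2)]
    return [A[0][j] * r[0] + A[1][j] * r[1] + mu * w[j] for j in range(2)]

w_prev = [0.0, 0.0]     # w(0) = 0, matching the paper's initialization
w = [0.0, 0.0]
for _ in range(200):
    g = grad(w)
    # heavy-ball update: gradient step plus momentum term beta*(w_t - w_{t-1})
    w_next = [w[j] - step * g[j] + beta * (w[j] - w_prev[j]) for j in range(2)]
    w_prev, w = w, w_next

print(max(abs(g) for g in grad(w)) < 1e-6)  # prints True: gradient below tolerance
```

In the paper's actual algorithms the gradient step is replaced by an inexactly solved local subproblem on the master machine; this toy loop only shows the momentum mechanics and the stopping tolerance.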