Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Adversarial Monte Carlo Meta-Learning of Optimal Prediction Procedures
Authors: Alex Luedtke, Incheoul Chung, Oleg Sofrygin
JMLR 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 5, we apply our algorithm in two settings and learn estimators that outperform standard approaches in numerical experiments. In Section 6, we also evaluate the performance of these learned estimators in data experiments. |
| Researcher Affiliation | Collaboration | Alex Luedtke EMAIL Incheoul Chung EMAIL Department of Statistics University of Washington Seattle, WA 98195-4322, USA Oleg Sofrygin EMAIL Division of Research Kaiser Permanente Northern California Oakland, CA 94612-2304, USA |
| Pseudocode | Yes | Algorithm 1 Adversarially learn an estimator. 1: Initialize estimator T_t, generator G_g, step sizes η₁, η₂. 2: for K iterations do 3: for j = 1, 2 do 4: Independently draw U ∼ ν_u and V₀, …, V_p iid ∼ ν_v. 5: Let P = G_g(U). ... Algorithm 2 Use data d to obtain prediction at x₀. 1: Preprocess: Let x₀′ := (x₀ − x̄)/s(x) and define d′ ∈ ℝ^{n×(p+2)} so that d′_{i1} = (x_i − x̄)/s(x) for all i = 1, …, n and d′_{j2} = (y_j − ȳ)/s(y) for all j = 1, …, p. 2: Module 1: d¹ := m₁(d′). d¹ ∈ ℝ^{n×p×o₁} |
| Open Source Code | Yes | All experiments were run in Pytorch 1.0.1 on Tesla V100 GPUs using Amazon Web Services. The code used to conduct the experiments can be found at https://github.com/alexluedtke12/amc-meta-learning-of-optimal-prediction-procedures. |
| Open Datasets | Yes | Our experiments make use of ten datasets. Six of these datasets are available through the University of California, Irvine (UCI) Machine Learning Repository (Dua and Graff, 2017), three were used to illustrate supervised learning machines in popular statistical learning textbooks (Friedman et al., 2001; James et al., 2013), and one was used as an illustrative example in the paper that introduced FLAM (Petersen et al., 2016). |
| Dataset Splits | Yes | The first includes only the AMC Linear and AMC FLAM estimators as base learners. The second only includes the OLS, lasso, and FLAM estimators. The third includes all five of these estimators. Predictions of the base learners were combined using 10-fold cross-validation. ... We evaluated the performance of AMC Linear and AMC FLAM in the 5 datasets that have 10 or more features by randomly selecting 100 observations and 10 features from each dataset and evaluating MSE on the held-out observations. This and all other Monte Carlo evaluations of MSE described in what follows were repeated 200 times and averaged across the replications. |
| Hardware Specification | Yes | All experiments were run in Pytorch 1.0.1 on Tesla V100 GPUs using Amazon Web Services. |
| Software Dependencies | Yes | All gradients in the algorithm can be computed via backpropagation using standard software; in our experiments, we used Pytorch for this purpose (Paszke et al., 2019). ... All experiments were run in Pytorch 1.0.1 on Tesla V100 GPUs using Amazon Web Services. ... we compared AMC's performance to ordinary least squares (OLS) and lasso (Tibshirani, 1996) with tuning parameter selected by 10-fold cross-validation, as implemented in scikit-learn (Pedregosa et al., 2011). |
| Experiment Setup | Yes | In each example, the collection of estimators T is parameterized as the network architecture introduced in Section 4.2 with o1 = o2 = 50, o3 = 10, h1 = h3 = 10, h2 = h4 = 3, and, for k = 1, 2, 3, 4, wk = 100. For each module, we use the leaky ReLU activation q(z) := max{z, 0} + 0.01 min{z, 0}. ... In all settings, we set (β2, ϵ) = (0.999, 10^(-8)). Whenever we were updating the prior network, we set the momentum parameter β1 to 0, and whenever we were updating the estimator network, we set the momentum parameter to 0.25. The parameter α differed across settings. In the sparse linear regression setting with s = 1, we found that choosing α small helped to improve stability. Specifically, we let α = 0.0002 when updating both the estimator and prior networks. In the sparse linear regression setting with s = 5, we used the more commonly chosen parameter setting of α = 0.001 for both networks. In the FLAM example, we chose α = 0.001 and α = 0.005 for the estimator and prior networks, respectively. The learning rates of the estimator and prior networks were decayed at rates t^(-0.15) and t^(-0.25), respectively. ... In all settings, the prior and estimator were updated over 10^6 iterations using batches of 100 datasets. For each dataset, performance is evaluated at 100 values of x0. |
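The Pseudocode row quotes Algorithm 1, which alternates updates of an estimator network (minimizing risk) and a generator/prior network (maximizing it). The sketch below shows only that alternation pattern on a toy quadratic game with a unique saddle point at (0, 0); the objective, function name, and step sizes are illustrative stand-ins, not the paper's actual risk or networks.

```python
# Toy gradient-descent-ascent loop in the spirit of Algorithm 1's
# alternating updates: parameter t (estimator) descends the objective,
# parameter g (generator/prior) ascends it. Stand-in objective:
#   R(t, g) = t**2 - g**2 + t*g, saddle point at (0, 0).
def alternating_minimax(t, g, eta1=0.05, eta2=0.05, iters=2000):
    for _ in range(iters):
        grad_t = 2 * t + g      # dR/dt
        grad_g = -2 * g + t     # dR/dg
        t -= eta1 * grad_t      # estimator step: minimize R
        g += eta2 * grad_g      # generator step: maximize R
    return t, g

t_final, g_final = alternating_minimax(1.0, 1.0)
# both parameters converge toward the saddle point (0, 0)
```

Because this stand-in game is strongly convex in t and strongly concave in g, plain gradient descent-ascent converges; the paper's actual objective is nonconvex and uses Adam updates instead.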
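The Dataset Splits row describes combining base-learner predictions via 10-fold cross-validation on 100 randomly selected observations. A minimal sketch of the fold construction that procedure relies on (`kfold_indices` is a hypothetical helper, not the authors' code, which uses scikit-learn):

```python
import random

def kfold_indices(n, k=10, seed=0):
    """Partition indices 0..n-1 into k shuffled folds and yield
    (train, validation) index lists, as in 10-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [j for f in folds if f is not folds[i] for j in f]
        yield train, val

# 100 observations, as in the paper's subsampled evaluation:
splits = list(kfold_indices(100, k=10))  # 10 (train, val) pairs of size (90, 10)
```

Each observation appears in exactly one validation fold, so out-of-fold predictions from the base learners can be stacked without leakage.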