Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Debiasing a First-order Heuristic for Approximate Bi-level Optimization
Authors: Valerii Likhosherstov, Xingyou Song, Krzysztof Choromanski, Jared Q Davis, Adrian Weller
ICML 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the utility of UFOM in a synthetic experiment, data hypercleaning on MNIST (LeCun et al., 2010), and few-shot-learning on CIFAR100 (Krizhevsky et al., 2009) as well as Omniglot (Lake et al., 2011). Full proofs are provided in Appendix E in the Supplement. and 5 Experiments: We illustrate our theoretical findings on a synthetic experiment and then evaluate Adaptive UFOM on data hypercleaning and few-shot learning. |
| Researcher Affiliation | Collaboration | 1University of Cambridge 2Google Research, Brain Team 3Columbia University 4Deepmind 5Stanford University 6The Alan Turing Institute. |
| Pseudocode | Yes | Algorithm 1 Outer SGD., Algorithm 2 Inner GD (exact)., Algorithm 3 Inner GD (FOM)., Algorithm 4 Inner GD (UFOM). |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository for the described methodology. |
| Open Datasets | Yes | data hypercleaning on MNIST (LeCun et al., 2010), and few-shot-learning on CIFAR100 (Krizhevsky et al., 2009) as well as Omniglot (Lake et al., 2011). |
| Dataset Splits | Yes | For that, we define $\theta \in \mathbb{R}^{5000}$, $\lvert \Omega_T \rvert = 1$ and the inner loss $L^{in}$ has the form $L^{in}(\theta, \phi, T) = \sum_{i=1}^{5000} \sigma(\theta^{(i)}) \, l_{CCE}(g(\phi, X_i), Y_i)$, where $\sigma(\cdot)$ is a sigmoid function, $\theta^{(i)}$ is the $i$th element of $\theta$ and $l_{CCE}(\cdot, Y)$ is a categorical cross entropy (CCE) with respect to a label $Y_i \in \{0, \dots, 9\}$. $L^{out}$ is defined as a cross entropy on the validation set. and To sample from $p(T)$ in the $K$-shot $m$-way setting, $m$ classes are chosen randomly and $K + 1$ examples are drawn from each class: $K$ examples for training and 1 for testing, i.e. $s = mK$, $t = m$. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or cloud computing specifications) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or programming languages used in the experiments. |
| Experiment Setup | Yes | As in (Shaban et al., 2018), we set $r = 100$ and $\alpha = 1$. and We modify a setup of (Shaban et al., 2018) by using a two- instead of one-layer feedforward network, with ReLU nonlinearity... and For Adaptive UFOM, on a validation score comparison we find that $q_{min} = 0.05$, $\beta = 0.99$ performs reasonably well. Further, we empirically find that Adaptive UFOM works best when (22) is modified so that $D_k^2 = 0.1 \, D_{sm,k}^2 / (1 - \beta^{k_{upd}})$ and We reuse convolutional architectures for $g(\phi, X)$ from (Finn et al., 2017) and set inner-loop length to $r = 10$, as in (Nichol et al., 2018). |
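
The K-shot m-way sampling quoted in the Dataset Splits row can be illustrated with a short sketch. This is not the authors' code (the paper releases none). The function name `sample_episode`, its arguments, and the use of a flat list of integer labels are assumptions made for illustration only. It draws m classes at random, then K + 1 examples per class, keeping K for training and 1 for testing, so the training set has s = mK items and the test set has t = m.

```python
import random
from collections import defaultdict


def sample_episode(labels, m, K, rng=None):
    """Sample one K-shot m-way task (hypothetical helper, not from the paper).

    labels: sequence of class labels, one per dataset example.
    Returns (train_idx, test_idx) with len(train_idx) == m * K
    and len(test_idx) == m.
    """
    rng = rng or random.Random()

    # Group example indices by class label.
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    # Only classes with at least K + 1 examples can supply a full draw.
    eligible = [c for c, idxs in by_class.items() if len(idxs) >= K + 1]
    if len(eligible) < m:
        raise ValueError("not enough classes with K + 1 examples")

    train_idx, test_idx = [], []
    for c in rng.sample(eligible, m):           # m classes chosen at random
        drawn = rng.sample(by_class[c], K + 1)  # K + 1 examples, no repeats
        train_idx.extend(drawn[:K])             # K examples for training
        test_idx.append(drawn[K])               # 1 example for testing
    return train_idx, test_idx


# Toy usage: 10 classes with 20 examples each, 3-shot 5-way.
toy_labels = [i // 20 for i in range(200)]
train_idx, test_idx = sample_episode(toy_labels, m=5, K=3, rng=random.Random(0))
```

Passing a seeded `random.Random` keeps episodes reproducible across runs, which is useful when comparing methods on the same task sequence.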