reproducibilityindex.ai

Debiasing a First-order Heuristic for Approximate Bi-level Optimization

Authors: Valerii Likhosherstov, Xingyou Song, Krzysztof Choromanski, Jared Q Davis, Adrian Weller

ICML 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	"We demonstrate the utility of UFOM in a synthetic experiment, data hypercleaning on MNIST (LeCun et al., 2010), and few-shot-learning on CIFAR100 (Krizhevsky et al., 2009) as well as Omniglot (Lake et al., 2011). Full proofs are provided in Appendix E in the Supplement." and (Section 5, Experiments) "We illustrate our theoretical findings on a synthetic experiment and then evaluate Adaptive UFOM on data hypercleaning and few-shot learning."
Researcher Affiliation	Collaboration	1 University of Cambridge; 2 Google Research, Brain Team; 3 Columbia University; 4 DeepMind; 5 Stanford University; 6 The Alan Turing Institute.
Pseudocode	Yes	Algorithm 1: Outer SGD; Algorithm 2: Inner GD (exact); Algorithm 3: Inner GD (FOM); Algorithm 4: Inner GD (UFOM).
Open Source Code	No	The paper does not provide any explicit statement about releasing source code or a link to a code repository for the described methodology.
Open Datasets	Yes	data hypercleaning on MNIST (LeCun et al., 2010), and few-shot-learning on CIFAR100 (Krizhevsky et al., 2009) as well as Omniglot (Lake et al., 2011).
Dataset Splits	Yes	"For that, we define $\theta \in \mathbb{R}^{5000}$, $\lvert \Omega_T \rvert = 1$ and the inner loss $L^{in}$ has the form $L^{in}(\theta, \phi, T) = \sum_{i=1}^{5000} \sigma(\theta^{(i)}) \, l_{CCE}(g(\phi, X_i), Y_i)$, where $\sigma(\cdot)$ is a sigmoid function, $\theta^{(i)}$ is the $i$th element of $\theta$ and $l_{CCE}(\cdot, Y)$ is a categorical cross entropy (CCE) with respect to a label $Y_i \in \{0, \dots, 9\}$. $L^{out}$ is defined as a cross entropy on the validation set." and "To sample from $p(T)$ in the K-shot m-way setting, m classes are chosen randomly and K + 1 examples are drawn from each class: K examples for training and 1 for testing, i.e. $s = mK$, $t = m$."
Hardware Specification	No	The paper does not provide specific hardware details (e.g., CPU, GPU models, or cloud computing specifications) used for running the experiments.
Software Dependencies	No	The paper does not provide specific version numbers for any software dependencies, libraries, or programming languages used in the experiments.
Experiment Setup	Yes	"As in (Shaban et al., 2018), we set r = 100 and α = 1." and "We modify a setup of (Shaban et al., 2018) by using a two- instead of one-layer feedforward network, with ReLU nonlinearity..." and "For Adaptive UFOM, on a validation score comparison we find that $q_{min} = 0.05$, $\beta = 0.99$ performs reasonably well. Further, we empirically find that Adaptive UFOM works best when (22) is modified so that $D_k^2 = 0.1 \, D_{sm,k}^2 / (1 - \beta^{k_{upd}})$" and "We reuse convolutional architectures for g(φ, X) from (Finn et al., 2017) and set inner-loop length to r = 10, as in (Nichol et al., 2018)."
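
The two data setups quoted in the Dataset Splits row can be illustrated with a small sketch. This is not the authors' code (the table reports that none was released). It only mirrors the quoted definitions: a hypercleaning inner loss that weights each example's categorical cross entropy by the sigmoid of a per-example parameter, and K-shot m-way task sampling that draws K + 1 examples from each of m random classes. The function names, the dictionary-of-lists dataset layout, and the use of precomputed class probabilities in place of the network output g(φ, X_i) are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def weighted_inner_loss(theta, predicted_probs, labels):
    """Sum over examples of sigmoid(theta[i]) * CCE(prediction_i, label_i).

    Assumption: predicted_probs[i] stands in for g(phi, X_i) and is already
    a list of class probabilities; the paper's network is not reproduced.
    """
    total = 0.0
    for weight_logit, probs, label in zip(theta, predicted_probs, labels):
        cce = -math.log(probs[label])  # categorical cross entropy, one example
        total += sigmoid(weight_logit) * cce
    return total


def sample_task(examples_by_class, m, K, rng=None):
    """Draw one K-shot m-way task.

    Picks m classes at random, then K + 1 distinct examples per class:
    K go to the training set and 1 to the test set, so the training set
    has m * K items and the test set has m items.
    Assumption: examples_by_class maps a class name to a list of examples.
    """
    rng = rng or random.Random()
    classes = rng.sample(sorted(examples_by_class), m)
    train, test = [], []
    for label, cls in enumerate(classes):
        picked = rng.sample(examples_by_class[cls], K + 1)
        train.extend((x, label) for x in picked[:K])
        test.append((picked[K], label))
    return train, test
```

For example, calling sample_task(data, m=5, K=1) on a dictionary with at least 5 classes and at least 2 examples per class returns 5 training pairs and 5 test pairs, matching s = mK and t = m in the quoted text. Labels are re-indexed from 0 to m - 1 within each task, which is a common convention in few-shot code but is an assumption here.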