Amortized Proximal Optimization
Authors: Juhan Bae, Paul Vicol, Jeff Z. HaoChen, Roger B. Grosse
NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically test APO for online adaptation of learning rates and structured preconditioning matrices for regression, image reconstruction, image classification, and natural language translation tasks. |
| Researcher Affiliation | Academia | Juhan Bae (1,2), Paul Vicol (1,2), Jeff Z. HaoChen (3), Roger Grosse (1,2); 1 University of Toronto, 2 Vector Institute, 3 Stanford University |
| Pseudocode | Yes | Algorithm 1: Amortized Proximal Optimization (APO), meta-learning optimization parameters ϕ (see the sketch after this table). |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] We include code in the supplementary material. |
| Open Datasets | Yes | UCI Regression. Next, we validated APO-Precond on the Slice, Protein, and Parkinsons datasets from the UCI regression collection [18]. Citation [18] is 'UCI machine learning repository, 2017. URL http://archive.ics.uci.edu/ml.' |
| Dataset Splits | Yes | Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] We provide the training details for all of our experiments in Appendix C. |
| Hardware Specification | Yes | Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] We provide these details in Appendix C.1. Appendix C.1 states: 'Compute Infrastructure. All experiments were performed on Google Cloud Platform with NVIDIA Tesla V100 GPUs and TPUs.' |
| Software Dependencies | No | The paper mentions using PyTorch [67], JAX [14], and fairseq [65] but does not specify version numbers for these software components. |
| Experiment Setup | Yes | We trained LeNet [40], AlexNet [37], VGG-16 [71] (w/o batch norm [32]), ResNet-18, and ResNet-32 [29] architectures for 200 epochs on batches of 128 images. For ResNet-32, we trained for 400 epochs, and the decayed baseline used a step schedule with 10× decay at epochs 150 and 250, following [49]. |
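
The pseudocode row above quotes the caption of the paper's Algorithm 1, which meta-learns the optimization parameters ϕ (e.g., a learning rate or a structured preconditioner) by descending a one-step proximal meta-objective. The snippet below is a minimal, hedged sketch of that idea for a single scalar learning rate on a toy regression problem. The toy task, the weight-space-only proximity term, and all values (lam_prox, the Adam meta-learning rate) are illustrative assumptions rather than the authors' released code, and the paper's function-space dissimilarity term is omitted for brevity.

```python
# Sketch of an APO-style amortized meta-update for a log learning rate.
# Assumptions: toy linear regression, weight-space proximity penalty only.
import torch

torch.manual_seed(0)

# Toy linear-regression data: y = X w* + noise (illustrative only).
X = torch.randn(256, 10)
w_true = torch.randn(10, 1)
y = X @ w_true + 0.1 * torch.randn(256, 1)

w = torch.zeros(10, 1)                            # base parameters theta
log_lr = torch.tensor(-3.0, requires_grad=True)   # optimization parameter phi (log learning rate)
meta_opt = torch.optim.Adam([log_lr], lr=1e-2)    # optimizer over phi (assumed choice)
lam_prox = 1.0                                    # weight-space proximity weight (assumed value)

def loss_fn(weights, xb, yb):
    return ((xb @ weights - yb) ** 2).mean()

for step in range(500):
    idx = torch.randint(0, X.shape[0], (32,))
    xb, yb = X[idx], y[idx]

    # Gradient of the base loss at the current parameters.
    w.requires_grad_(True)
    g = torch.autograd.grad(loss_fn(w, xb, yb), w)[0]
    w = w.detach()

    # One-step lookahead update, differentiable with respect to phi only.
    w_prop = w - torch.exp(log_lr) * g

    # Proximal meta-objective: loss after the step plus a penalty on how far it moves.
    # (The paper also uses a function-space dissimilarity term, omitted here.)
    meta_obj = loss_fn(w_prop, xb, yb) + lam_prox * ((w_prop - w) ** 2).sum()
    meta_opt.zero_grad()
    meta_obj.backward()
    meta_opt.step()

    # Actual base-optimizer step using the freshly updated learning rate.
    with torch.no_grad():
        w = w - torch.exp(log_lr) * g
```

The same amortized scheme described in the paper extends to structured preconditioning matrices; only the way the lookahead update `w_prop` is parameterized by ϕ would change in this sketch.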