Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Bilevel Programming for Hyperparameter Optimization and Meta-Learning
Authors: Luca Franceschi, Paolo Frasconi, Saverio Salzo, Riccardo Grazzi, Massimiliano Pontil
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The aim of the following experiments is threefold. First, we investigate the impact of the number of iterations of the optimization dynamics on the quality of the solution on a simple multiclass classification problem. Second, we test our hyper-representation method in the context of few-shot learning on two benchmark datasets. Finally, we contrast the bilevel ML approach against classical approaches to learn shared representations. |
| Researcher Affiliation | Academia | ¹Computational Statistics and Machine Learning, Istituto Italiano di Tecnologia, Genoa, Italy ²Department of Computer Science, University College London, London, UK ³Department of Information Engineering, Università degli Studi di Firenze, Florence, Italy. |
| Pseudocode | Yes | Algorithm 1. Reverse-HG for Hyper-representation |
| Open Source Code | Yes | The code for reproducing the experiments, based on the package FAR-HO (https://bit.ly/far-ho), is available at https://bit.ly/hyper-repr |
| Open Datasets | Yes | OMNIGLOT (Lake et al., 2015), a dataset that contains examples of 1623 different handwritten characters from 50 alphabets. ... MINIIMAGENET (Vinyals et al., 2016), a subset of ImageNet (Deng et al., 2009), that contains 60000 downsampled images from 100 different classes. |
| Dataset Splits | Yes | A training set $D_{\text{tr}}$ and a validation set $D_{\text{val}}$, each consisting of three randomly drawn examples per class, were sampled to form the HO problem. ... each meta-dataset consists of a pool of samples belonging to different (non-overlapping between separate meta-dataset) classes, which can be combined to form ground classification datasets $D^j = D^j_{\text{tr}} \cup D^j_{\text{val}}$ with 5 or 20 classes (for Omniglot). |
| Hardware Specification | Yes | Table 2. Execution times on a NVidia Tesla M40 GPU. |
| Software Dependencies | No | The paper mentions using a package 'FAR-HO' but does not specify version numbers for this or any other software dependencies (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | The optimization of $H$ is performed with gradient descent with momentum, with same initialization, step size and momentum factor for each run. ... We initialize ground models parameters $w^j$ to 0 and... we perform $T$ gradient descent steps, where $T$ is treated as a ML hyperparameter that has to be validated. ... We compute a stochastic approximation of $f_T(\lambda)$ with Algorithm 1 and use Adam with decaying learning rate to optimize $\lambda$. |
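
The Experiment Setup row describes running T gradient descent steps on the ground model from a zero initialization and then differentiating the resulting validation objective with respect to the hyperparameter λ. The sketch below illustrates that idea on a scalar toy problem, using a hand-written reverse pass through the unrolled steps. It is not the authors' Algorithm 1 or the FAR-HO package: the data values, step size, the scalar "representation" `lam * x`, and all function names are assumptions chosen for illustration.

```python
# Toy scalar bilevel problem. All names and values are illustrative assumptions.
X_TR, Y_TR = 1.0, 2.0    # one training example
X_VAL, Y_VAL = 1.5, 3.0  # one validation example
ETA = 0.1                # inner (ground model) step size


def inner_step(w, lam):
    """One gradient descent step on the training loss 0.5 * (w*lam*x - y)**2."""
    residual = w * lam * X_TR - Y_TR
    return w - ETA * residual * lam * X_TR


def outer_objective(lam, T):
    """Validation loss after T inner steps started from w = 0."""
    w = 0.0
    for _ in range(T):
        w = inner_step(w, lam)
    return 0.5 * (w * lam * X_VAL - Y_VAL) ** 2


def reverse_hypergradient(lam, T):
    """Derivative of outer_objective w.r.t. lam via a stored forward pass
    (w_0, ..., w_T) followed by a reverse pass over the inner steps."""
    ws = [0.0]
    for _ in range(T):
        ws.append(inner_step(ws[-1], lam))

    val_residual = ws[-1] * lam * X_VAL - Y_VAL
    alpha = val_residual * lam * X_VAL    # derivative of validation loss w.r.t. w_T
    grad = val_residual * ws[-1] * X_VAL  # direct dependence of validation loss on lam

    for t in range(T, 0, -1):
        w_prev = ws[t - 1]
        # Partial derivatives of the step map w_t = inner_step(w_{t-1}, lam).
        dstep_dlam = -ETA * X_TR * (2.0 * w_prev * lam * X_TR - Y_TR)
        dstep_dw = 1.0 - ETA * (lam * X_TR) ** 2
        grad += alpha * dstep_dlam
        alpha *= dstep_dw
    return grad


def finite_difference(lam, T, eps=1e-6):
    """Central-difference check of the hypergradient."""
    return (outer_objective(lam + eps, T) - outer_objective(lam - eps, T)) / (2 * eps)
```

Comparing `reverse_hypergradient(0.7, 5)` with `finite_difference(0.7, 5)` is a quick sanity check on the reverse pass. In the setup quoted above, the hypergradient is approximated stochastically and passed to Adam with a decaying learning rate; this sketch stops at computing the exact gradient for a single toy task.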