A Gradient Method for Multilevel Optimization
Authors: Ryo Sato, Mirai Tanaka, Akiko Takeda
NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments show that a trilevel hyperparameter learning model considering data poisoning produces more stable prediction results than an existing bilevel hyperparameter learning model in noisy data settings. To validate the effectiveness of our proposed method, we conducted numerical experiments on an artificial problem and a hyperparameter optimization problem arising from real data. |
| Researcher Affiliation | Academia | Ryo Sato (The University of Tokyo); Mirai Tanaka (The Institute of Statistical Mathematics / RIKEN); Akiko Takeda (The University of Tokyo / RIKEN) |
| Pseudocode | Yes | Algorithm 1 (Computation of ∇_{x_1} F_1(x_1)). Input: x_1: current value of the 1st-level variable; {x_i^{(0)}}_{i=2}^{n}: initial values of the lower-level iterations. Output: the exact value of ∇_{x_1} F_1(x_1). 1: g := (0, ..., 0)^⊤. 2: for i := 2, ..., n do 3: Z_i := O. 4: for t := 1, ..., T_i do 5: x_i^{(t)} := Φ_i^{(t)}(x_1, x_2^{(T_2)}, ..., x_{i-1}^{(T_{i-1})}, x_i^{(t-1)}). 6: B̄_i^{(t)} := Σ_{l=2}^{i-1} Z_l C_{il}^{(t)} + B_i^{(t)}. 7: Z_i := Z_i A_i^{(t)} + B̄_i^{(t)}. 8: for i := 2, ..., n do 9: g := g + Z_i ∇_{x_i} f_1. 10: g := g + ∇_{x_1} f_1. 11: return g. (A runnable JAX sketch of this recursion is given below the table.) |
| Open Source Code | No | The paper does not provide concrete access information for source code (e.g., a repository link or an explicit statement of code release). |
| Open Datasets | Yes | We compared these methods on regression tasks with the following datasets: the diabetes dataset [10], the (red and white) wine quality datasets [8], and the Boston dataset [14]. [10] D. Dua and C. Graff. UCI Machine Learning Repository, 2017. http://archive.ics.uci.edu/ml. |
| Dataset Splits | Yes | For each dataset, we standardized each feature and the objective variable; we randomly chose 40 rows for training data (X_train, y_train), another 100 rows for validation data (X_valid, y_valid), and used the remaining rows for test data. (A minimal split sketch follows the table.) |
| Hardware Specification | Yes | We executed all codes on a computer with 12 cores of Intel Core i7-7800X CPU 3.50 GHz, 64 GB RAM, Ubuntu 20.04.2 LTS. |
| Software Dependencies | Yes | We implemented all codes with Python 3.9.2 and JAX 0.2.10 for automatic differentiation. |
| Experiment Setup | Yes | We set T_2 = 30 and T_3 = 3 for the trilevel model and T_2 = 30 for the bilevel model. For each dataset, we used the same initialization and step sizes in the updates of λ and θ in the trilevel and bilevel models. We set an early-stopping condition on the learned parameters: after every 1000 updates of the model parameter, if a single update of the hyperparameter λ did not improve the test error, we terminated the iteration and returned the parameters at that time. (A sketch of this stopping rule follows the table.) |
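
Algorithm 1 builds Z_i = d x_i^{(T_i)} / d x_1 with a forward-mode recursion over the lower-level updates and then assembles the hypergradient ∇_{x_1} F_1 by the chain rule. The sketch below illustrates that recursion for the bilevel case (n = 2) in Python with JAX, the stack the paper reports; the toy objectives `phi` and `f1`, the step size, and all variable names are our own assumptions, since no official code is released. The paper's row-vector convention Z_i := Z_i A_i^{(t)} + B̄_i^{(t)} becomes Z := A Z + B in the column convention used here.

```python
import jax
import jax.numpy as jnp

def phi(x1, x2):
    # One lower-level update Phi_2: a gradient step on an assumed
    # inner objective f2(x1, x2) = 0.5 * ||x2 - x1||^2 (toy choice).
    f2 = lambda z: 0.5 * jnp.sum((z - x1) ** 2)
    return x2 - 0.1 * jax.grad(f2)(x2)

def f1(x1, x2):
    # Upper-level objective (toy choice).
    return jnp.sum(x1 ** 2) + jnp.sum(x2 ** 2)

def grad_F1(x1, x2, T2=30):
    # Z tracks d x2^(t) / d x1, built forward as in Algorithm 1.
    Z = jnp.zeros((x2.size, x1.size))
    for _ in range(T2):
        A = jax.jacobian(phi, argnums=1)(x1, x2)  # dPhi/dx2^(t-1)
        B = jax.jacobian(phi, argnums=0)(x1, x2)  # dPhi/dx1
        x2 = phi(x1, x2)
        Z = A @ Z + B                             # lines 5-7 of Algorithm 1
    # Assemble the hypergradient (lines 8-10 of Algorithm 1).
    return jax.grad(f1, argnums=0)(x1, x2) + Z.T @ jax.grad(f1, argnums=1)(x1, x2)

def F1(x1, x2, T2=30):
    # Unrolled objective, used only to sanity-check grad_F1.
    for _ in range(T2):
        x2 = phi(x1, x2)
    return f1(x1, x2)

x1, x2 = jnp.array([1.0, -2.0]), jnp.zeros(2)
print(grad_F1(x1, x2))
print(jax.grad(F1)(x1, x2))  # should match: same derivative via unrolling
```

The two printed gradients should agree, since differentiating through the unrolled inner loop computes the same total derivative that the Z recursion accumulates explicitly.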
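
The split protocol quoted under Dataset Splits is straightforward to reproduce. Below is a minimal sketch on the diabetes dataset, assuming scikit-learn and NumPy; the random seed and variable names are illustrative, not the paper's.

```python
import numpy as np
from sklearn.datasets import load_diabetes

X, y = load_diabetes(return_X_y=True)
# Standardize each feature and the objective variable.
X = (X - X.mean(axis=0)) / X.std(axis=0)
y = (y - y.mean()) / y.std()

# Random split: 40 rows train, 100 rows validation, the rest test.
rng = np.random.default_rng(0)  # seed is an assumption
perm = rng.permutation(len(X))
train, valid, test = perm[:40], perm[40:140], perm[140:]
X_train, y_train = X[train], y[train]
X_valid, y_valid = X[valid], y[valid]
X_test, y_test = X[test], y[test]
```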
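
The early-stopping rule in the Experiment Setup row can be read as an outer loop over hyperparameter updates. This hedged sketch captures that reading; all names (`update_theta`, `update_lam`, `test_error`) are hypothetical stand-ins, not the paper's code.

```python
def fit_with_early_stopping(update_theta, update_lam, test_error, theta, lam,
                            max_rounds=100, inner_steps=1000):
    """Stop when one hyperparameter update fails to improve test error."""
    best = test_error(theta, lam)
    for _ in range(max_rounds):
        for _ in range(inner_steps):        # 1000 model-parameter updates
            theta = update_theta(theta, lam)
        new_lam = update_lam(theta, lam)    # one hyperparameter update
        err = test_error(theta, new_lam)
        if err >= best:                     # no improvement: stop and
            return theta, lam               # return parameters at that time
        lam, best = new_lam, err
    return theta, lam
```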