Value Function based Difference-of-Convex Algorithm for Bilevel Hyperparameter Selection Problems

Authors: Lucy L Gao, Jane Ye, Haian Yin, Shangzhi Zeng, Jin Zhang

ICML 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experiments confirm our theoretical findings and show that the proposed VF-iDCA yields superior performance when applied to tune hyperparameters. |
| Researcher Affiliation | Academia | (1) Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada; (2) Department of Mathematics and Statistics, University of Victoria, Victoria, British Columbia, Canada; (3) Department of Mathematics, SUSTech International Center for Mathematics, Southern University of Science and Technology, Shenzhen, Guangdong, China; (4) National Center for Applied Mathematics Shenzhen, Shenzhen, Guangdong, China. |
| Pseudocode | Yes | Algorithm 1: VF-iDCA |
| Open Source Code | Yes | All algorithms were implemented in Python and the software package used to reproduce our experiments is available at https://github.com/SUSTech-Optimization/VF-iDCA. |
| Open Datasets | Yes | All datasets are from the LIBSVM repository (Chang & Lin, 2011). |
| Dataset Splits | Yes | For the datasets gisette, duke breast-cancer, and sensit, we randomly extracted 50, 11, and 25 examples, respectively, as the training set; 50, 11, and 25 examples, respectively, as the validation set; and used the remaining examples for testing. |
| Hardware Specification | Yes | All experiments were run on a computer with an Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz and 16.00 GB memory. |
| Software Dependencies | No | All algorithms were implemented in Python and the software package used to reproduce our experiments is available at https://github.com/SUSTech-Optimization/VF-iDCA. The CVXPY package is applied with the open source solvers ECOS and SCS only. Specific version numbers for these software dependencies are not provided. |
| Experiment Setup | Yes | As for the parameters $\delta_\alpha$ and $c_\alpha$ in VF-iDCA, we used $\delta_\alpha = 5$ for all the experiments, and we adopted $c_\alpha = 0.1$ in sparse group lasso while we used $c_\alpha = 1$ for the other applications. ... VF-iDCA was stopped when $\left( \frac{\lVert z^{k+1} - z^k \rVert}{\sqrt{1 + \lVert z^k \rVert^2}},\ t^{k+1} \right) < \mathrm{tol}$, and we set $\mathrm{tol} = 0.1$. ... For VF-iDCA, the initial guesses for $r_1$ and $r_2$ were 10 and 5, respectively. |
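
The dataset split and the stopping test quoted in the table can be illustrated with a minimal Python sketch. It is not taken from the authors' package: the function names `split_indices` and `should_stop`, the `seed` argument, the use of the Euclidean norm, and the reading of the two-component test as "both components are below tol" are all assumptions made for illustration.

```python
import math
import random


def split_indices(n_examples, n_train, n_val, seed=0):
    """Randomly split range(n_examples) into train / validation / test indices.

    Hypothetical helper: the first n_train shuffled indices form the training
    set, the next n_val the validation set, and the rest the test set.
    """
    if n_train + n_val > n_examples:
        raise ValueError("n_train + n_val exceeds the number of examples")
    rng = random.Random(seed)  # local generator, so the split is reproducible
    idx = list(range(n_examples))
    rng.shuffle(idx)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def should_stop(z_next, z_prev, t_next, tol=0.1):
    """Stopping test sketched from the quoted criterion.

    Assumption: the test passes when both the relative step
    ||z_next - z_prev|| / sqrt(1 + ||z_prev||^2) and t_next are below tol,
    with ||.|| taken as the Euclidean norm.
    """
    step = math.sqrt(sum((a - b) ** 2 for a, b in zip(z_next, z_prev)))
    scale = math.sqrt(1.0 + sum(b * b for b in z_prev))
    return max(step / scale, t_next) < tol


if __name__ == "__main__":
    # Illustrative sizes only; 100 is not the size of any dataset in the table.
    train, val, test = split_indices(n_examples=100, n_train=11, n_val=11)
    print(len(train), len(val), len(test))
    print(should_stop([1.0, 2.0], [1.0, 2.01], t_next=0.05))
```

The default `tol=0.1` mirrors the value reported in the table, and a fixed seed keeps a random split repeatable across runs.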