Efficient Hyper-parameter Optimization with Cubic Regularization

Authors: Zhenqian Shen, Hansi Yang, Yong Li, James Kwok, Quanming Yao

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments on synthetic and real-world data demonstrate the effectiveness of our proposed method."
Researcher Affiliation | Academia | (1) Department of Electronic Engineering, Tsinghua University, Beijing, China; (2) Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Hong Kong SAR, China
Pseudocode | Yes | "Algorithm 1: Hyper-parameter optimization with cubic regularization."
Open Source Code | No | The paper does not include an explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | "We use CIFAR-10 dataset for experiments with 50k, 5k, 5k image samples as training, validation, test data, respectively. Two well-known knowledge graph datasets, FB15k237 [38] and WN18RR [39], are used in experiments and their statistics are in Appendix E.2."
Dataset Splits | Yes | "We use CIFAR-10 dataset for experiments with 50k, 5k, 5k image samples as training, validation, test data, respectively."
Hardware Specification | Yes | "Experiments are conducted on a 24GB NVIDIA GeForce RTX 3090 GPU."
Software Dependencies | No | The paper does not provide specific software names with version numbers for reproducibility.
Experiment Setup | Yes | "For the mask of the i-th dimension z_i, we use the sigmoid function to represent the probability of masking that dimension, i.e., p_{θ_i}(z_i = 1) = 1/(1 + exp(-θ_i)) and p_{θ_i}(z_i = 0) = 1 - p_{θ_i}(z_i = 1). As all hyper-parameters considered in this application are discrete, we choose softmax-like distributions to represent the probability of selecting a specific value for each hyper-parameter. The hyper-parameter z is divided into two parts: z ≜ (α, {β_i}), and R_z(t) is parameterized as follows: R_z(t) = Σ_{i=1}^{I} α_i r_i(t; β_i)."
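
The experiment-setup excerpt above describes a stochastic relaxation: each mask dimension gets a sigmoid (Bernoulli) probability, each discrete hyper-parameter a softmax-like distribution, and the regularizer R_z(t) is a weighted sum of basis terms. Below is a minimal NumPy sketch of that parameterization, assuming hypothetical function names (mask_prob, choice_probs, regularizer) and generic basis functions r_i; it is an illustration under those assumptions, not the authors' code.

```python
import numpy as np

def mask_prob(theta):
    """Sigmoid relaxation of a binary mask: p(z_i = 1) = 1 / (1 + exp(-theta_i))."""
    p1 = 1.0 / (1.0 + np.exp(-theta))
    return np.stack([1.0 - p1, p1], axis=-1)  # columns: p(z_i = 0), p(z_i = 1)

def choice_probs(logits):
    """Softmax-like distribution over the candidate values of one discrete hyper-parameter."""
    e = np.exp(logits - logits.max())  # shift for numerical stability
    return e / e.sum()

def regularizer(t, alpha, betas, basis_fns):
    """R_z(t) = sum_{i=1}^{I} alpha_i * r_i(t; beta_i), with z = (alpha, {beta_i})."""
    return sum(a * r(t, b) for a, r, b in zip(alpha, basis_fns, betas))

# Toy usage with two hypothetical basis terms r_1, r_2.
alpha = np.array([0.7, 0.3])
betas = [1.0, 2.0]
basis = [lambda t, b: b * t**2, lambda t, b: np.abs(b * t)]
print(mask_prob(np.array([0.0, 2.0])))
print(choice_probs(np.array([1.0, 0.5, -0.2])))
print(regularizer(0.5, alpha, betas, basis))
```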
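Algorithm 1 in the paper pairs this parameterization with cubic regularization. For reference, the generic cubic-regularized Newton step solves min_s g^T s + (1/2) s^T H s + (M/6) ||s||^3. The sketch below solves that sub-problem approximately by gradient descent on s, which is one common strategy; the paper's exact sub-problem solver and update schedule may differ.

```python
import numpy as np

def cubic_reg_step(grad, hess, M, sub_iters=200, lr=0.05):
    """Approximately solve  min_s g^T s + 0.5 s^T H s + (M/6) ||s||^3
    by gradient descent on s; assumes hess is symmetric, so the
    sub-problem gradient is g + H s + (M/2) ||s|| s."""
    s = np.zeros_like(grad)
    for _ in range(sub_iters):
        s = s - lr * (grad + hess @ s + 0.5 * M * np.linalg.norm(s) * s)
    return s

def optimize(loss_grad_hess, z0, M=1.0, steps=50):
    """Outer loop: repeatedly take a cubic-regularized step on the
    hyper-parameter vector z."""
    z = z0.copy()
    for _ in range(steps):
        g, H = loss_grad_hess(z)
        z = z + cubic_reg_step(g, H, M)
    return z

# Toy non-convex objective f(z) = z0^2 - z1^2 + 0.25 * z1^4,
# which has a saddle point at the origin.
def toy(z):
    g = np.array([2.0 * z[0], -2.0 * z[1] + z[1]**3])
    H = np.array([[2.0, 0.0], [0.0, -2.0 + 3.0 * z[1]**2]])
    return g, H

print(optimize(toy, np.array([0.5, 0.1])))  # z1 moves toward a minimum at ±sqrt(2)
```

The cubic term (M/6) ||s||^3 is what lets the step exploit negative curvature while keeping its length bounded, which is the standard argument for why cubic regularization escapes saddle points that plain gradient descent stalls near.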