A General Representation Learning Framework with Generalization Performance Guarantees

Authors: Junbiao Cui, Jianqing Liang, Qin Yue, Jiye Liang

ICML 2023

Reproducibility Variable Result LLM Response
Research Type Experimental Finally, extensive experiments verify the effectiveness of the proposed methods.
Researcher Affiliation Academia 1Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, School of Computer and Information Technology, Shanxi University, Taiyuan 030006, Shanxi, China. Correspondence to: Jiye Liang <ljy@sxu.edu.cn>.
Pseudocode Yes Algorithm 1: Solving g1(φ) in Formula (10); Algorithm 2: Solving g2(φ) in Formula (10); Algorithm 3: VC Dimension based Kernel Selection; Algorithm 4: VC Dimension based DNN Boosting Framework
Open Source Code Yes The codes of the proposed methods are available at https://github.com/JunbiaoCui/GRLF_GPG.
Open Datasets Yes There are 15 UCI binary classification data sets and their basic information is given in Table 5 (https://archive.ics.uci.edu/ml/index.php); MNIST is a handwritten digits classification data set (http://yann.lecun.com/exdb/mnist/); and CIFAR10 is a visual objects classification data set (http://www.cs.toronto.edu/~kriz/cifar.html).
Dataset Splits Yes 5-fold cross validation is used to estimate the generalization performance of each candidate kernel function. and For each data set, 80% samples are selected randomly as training data set and the remaining samples are selected as test data set.
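The evaluation protocol quoted above (a random 80%/20% train/test split, then 5-fold cross validation on the training portion to score each candidate kernel) can be sketched as below. This is a minimal illustration, not the paper's pipeline: the synthetic data set, the SVM settings, and the single candidate kernel parameter delta are all stand-ins.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold, train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for one of the UCI binary classification data sets.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# 80% of samples selected randomly as training data, the rest as test data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# 5-fold cross validation over the training set to estimate the
# generalization performance of one candidate Gaussian kernel.
delta = -1.0  # one hypothetical candidate parameter (delta < 0)
scores = []
for tr_idx, va_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X_train):
    # With K = exp(delta * ||x - x'||^2) and delta < 0, this matches
    # scikit-learn's RBF kernel with gamma = -delta.
    clf = SVC(kernel="rbf", gamma=-delta)
    clf.fit(X_train[tr_idx], y_train[tr_idx])
    scores.append(clf.score(X_train[va_idx], y_train[va_idx]))
print(f"mean CV accuracy: {np.mean(scores):.3f}")
```

The test-set portion is held out entirely; only the five validation scores would feed the kernel-selection step.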
Hardware Specification Yes The experiments of the proposed method, implemented in PyTorch, are conducted on an NVIDIA GeForce RTX 3090.
Software Dependencies No The basic learner SVM is implemented by Scikit-learn; all parameters adopt the default settings except for the kernel function. The Adam optimizer (Kingma & Ba, 2015) used in Algorithms 1 and 2 is implemented by PyTorch (https://pytorch.org/). However, neither Scikit-learn nor PyTorch is pinned to a specific version.
Experiment Setup Yes Candidate functions: the Gaussian kernel φ(xi; δ)ᵀφ(xj; δ) = exp(δ‖xi − xj‖²₂), δ < 0, is used, with 2000 kernel parameters δ_k = δ_min + (k − 1)(δ_max − δ_min)/2000, k = 1, 2, …, 2000, where δ_min = −200 and δ_max = −10⁻⁵. In the whole experiment, T = 300 and ϵ = 10⁻¹⁰ are adopted for Algorithms 1 and 2. The set of candidate learning rates for Adam is {5×10⁻⁴, 10⁻³, 5×10⁻³}, the rest of the parameters adopt the default settings, and the number of epochs is 500. For the proposed method, the set of candidate trade-off parameters γVC is {10⁻², 10⁻¹, 1}.
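The candidate-kernel grid described in the setup can be reproduced directly from the stated formula. The sketch below builds the 2000 uniformly spaced parameters δ_k and evaluates the Gaussian kernel for one pair of points; the endpoint values δ_min = −200 and δ_max = −10⁻⁵ are taken from the quoted setup, while the variable names are illustrative.

```python
import numpy as np

# Grid of 2000 candidate kernel parameters:
#   delta_k = delta_min + (k - 1) * (delta_max - delta_min) / 2000,  k = 1..2000
delta_min, delta_max, n_candidates = -200.0, -1e-5, 2000
deltas = delta_min + (np.arange(1, n_candidates + 1) - 1) * \
    (delta_max - delta_min) / n_candidates

def gaussian_kernel(xi, xj, delta):
    """K(xi, xj) = exp(delta * ||xi - xj||_2^2), with delta < 0."""
    return np.exp(delta * np.sum((xi - xj) ** 2))

# Every candidate is negative, as required by the constraint delta < 0.
print(deltas[0], deltas[-1], gaussian_kernel(np.zeros(3), np.zeros(3), deltas[0]))
```

Note that with k running from 1 to 2000, the first candidate equals δ_min exactly while the last stops one step short of δ_max.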