Generalization Bound of Gradient Descent for Non-Convex Metric Learning

Authors: Mingzhi Dong, Xiaochen Yang, Rui Zhu, Yujiang Wang, Jing-Hao Xue

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We propose a novel metric learning method as a concrete example of using the generalization PAC bound. ... The method is evaluated on 12 datasets and shows competitive performance against existing methods.
Researcher Affiliation | Academia | (1) School of Computer Science, Fudan University, Shanghai, China; (2) Shanghai Key Laboratory of Data Science, Fudan University, Shanghai, China; (3) Department of Statistical Science, University College London, London, UK; (4) The Business School (formerly Cass), City, University of London, London, UK; (5) Department of Computing, Imperial College London, London, UK
Pseudocode | No | The paper describes algorithms (e.g., the GD update rule and the SMILE optimization) in text but does not provide structured pseudocode or an algorithm block.
Open Source Code | Yes | Code for the proposed method is available at http://github.com/xyang6/SMILE.
Open Datasets | Yes | The experiments focus on binary classification on 12 publicly available datasets from the UCI [13] and Delve [41] websites. Sample sizes and feature dimensions are listed in Table 1 of Appendix D. All datasets are pre-processed by first subtracting the mean and dividing by the standard deviation, and then normalizing the L2-norm of each instance to 1. (A Python sketch of this preprocessing follows the table.)
Dataset Splits | Yes | For each dataset, we randomly select 60% of instances to form a training set and the rest are used for testing. This process is repeated 10 times and we report the mean accuracy and the standard deviation. 10-fold cross-validation is used to select the trade-off parameters in the compared algorithms... (See the evaluation-protocol sketch after the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory, or cluster specifications) used for running the experiments.
Software Dependencies | No | NCA is implemented using the drtoolbox [45]; LMNN and ITML are implemented using the metric-learn toolbox [11]; and R2LML, RVML, GMML, DMLMJ, and SNC are implemented using the authors' code. ... (by using the MATLAB kmeans function with random initial values)
Experiment Setup | Yes | For the proposed SMILE, the parameters are set as follows: L is initialized as the identity matrix; the representative instances r_m are initialized as the k-means clustering centers of the positive and negative classes (using the MATLAB kmeans function with random initial values); the number of representative instances per class is set to 2; the trade-off parameter λ is set to 1; and the learning rate α is set to 0.001. The maximum number of iterations is set to 5000, and the final result is based on the parameters at time t, the earliest time at which the smallest training error is obtained, conforming to the early stopping suggested by Theorem 3. (See the initialization and early-stopping sketch after the table.)
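
For concreteness, here is a minimal Python sketch of the preprocessing described in the Open Datasets row. The paper's experiments were run in MATLAB; the function name `preprocess` and the `eps` guard against zero variance are our own additions.

```python
import numpy as np

def preprocess(X, eps=1e-12):
    """Standardize each feature, then scale each instance to unit L2-norm.

    Mirrors the preprocessing quoted from the paper: subtract the
    per-feature mean, divide by the per-feature standard deviation,
    then normalize each row (instance) to unit L2-norm.
    The eps guard against division by zero is our addition.
    """
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + eps)
    return X / (np.linalg.norm(X, axis=1, keepdims=True) + eps)
```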
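The split-and-repeat protocol in the Dataset Splits row can be sketched as follows. `fit_predict` is a hypothetical stand-in for any of the compared metric-learning classifiers, and scikit-learn's `train_test_split` is assumed for the random 60/40 split.

```python
import numpy as np
from sklearn.model_selection import train_test_split

def evaluate(X, y, fit_predict, n_repeats=10, train_frac=0.6, seed=0):
    """60/40 random split, repeated 10 times; report mean and std of accuracy."""
    rng = np.random.RandomState(seed)
    accs = []
    for _ in range(n_repeats):
        # Randomly select 60% of instances for training; the rest for testing.
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, train_size=train_frac, random_state=rng)
        y_pred = fit_predict(X_tr, y_tr, X_te)  # hypothetical classifier hook
        accs.append(np.mean(y_pred == y_te))
    return float(np.mean(accs)), float(np.std(accs))
```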
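The Experiment Setup row pins down the initialization and the early-stopping rule from Theorem 3, but not the SMILE objective itself (see the paper or the released code for that). Below is a hedged sketch under those assumptions; `grad_fn` and `train_error_fn` are hypothetical placeholders for the objective's gradient and the training error, not the authors' API.

```python
import numpy as np
from sklearn.cluster import KMeans

def init_smile(X, y, n_rep=2):
    """Initialization from the paper: L = identity; representatives r_m =
    k-means cluster centers of each class (2 per class)."""
    L = np.eye(X.shape[1])
    reps = [KMeans(n_clusters=n_rep, n_init=10).fit(X[y == c]).cluster_centers_
            for c in np.unique(y)]
    return L, np.vstack(reps)

def train_smile(X, y, grad_fn, train_error_fn, lam=1.0, alpha=1e-3,
                max_iter=5000):
    """Gradient descent with the paper's early-stopping rule: return the
    parameters from the earliest iteration achieving the smallest training
    error. grad_fn and train_error_fn are hypothetical hooks."""
    L, R = init_smile(X, y)
    best_err, best = np.inf, (L.copy(), R.copy())
    for _ in range(max_iter):
        g_L, g_R = grad_fn(L, R, X, y, lam)  # gradient of the SMILE objective
        L -= alpha * g_L
        R -= alpha * g_R
        err = train_error_fn(L, R, X, y)
        if err < best_err:  # strict '<' keeps the earliest best iterate
            best_err, best = err, (L.copy(), R.copy())
    return best
```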