Robust Distance Metric Learning via Simultaneous L1-Norm Minimization and Maximization
Authors: Hua Wang, Feiping Nie, Heng Huang
ICML 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We performed extensive empirical evaluations, where our new distance metric learning method outperforms related state-of-the-art methods in a variety of experimental settings. In this section, we evaluate the proposed method on data clustering tasks, where our goal is to examine the robustness of our new method under conditions where data outliers or feature outliers are present. |
| Researcher Affiliation | Academia | Colorado School of Mines, Department of Electrical Engineering and Computer Science, Golden, Colorado 80401 and Computer Science and Engineering Department, University of Texas at Arlington, Arlington, TX, 76019 |
| Pseudocode | Yes | Algorithm 1 An efficient iterative algorithm to solve the general ℓ1-norm minmax problem with orthogonal constraint in Eq. (6). Algorithm 2 The algorithm to solve Eq. (9). Algorithm 3 An efficient iterative algorithm to solve the general ℓ1-norm minmax problem in Eq. (8). |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | We experiment with four benchmark data sets downloaded from the UCI machine learning data repository, including the Breast, Diabetes, Iris and Protein data sets, and one image data set downloaded from the ORL database. |
| Dataset Splits | No | The paper does not provide specific dataset split information (e.g., percentages, sample counts, or explicit train/validation splits) needed to reproduce the data partitioning. It uses entire datasets for clustering experiments after learning the metric. |
| Hardware Specification | Yes | We experiment on a Dell PowerEdge 2900 server, which has two quad-core Intel Xeon 5300 CPUs at 3.0 GHz and 48 GB of memory. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | Empirically, we select r = min(d, 2c) in all our subsequent experiments, where d is the dimensionality of the original data space and c is the cluster number of the input data. For each different value of r, we repeat the experiment 100 times to eliminate the difference caused by the constraint pickup and the initialization of K-means clustering. |
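The evaluation protocol in the last row can be sketched in code. Note this is a minimal sketch of the *clustering repetition* step only: the paper's metric learning algorithms (Algorithms 1–3) are not available from this report, so the sketch runs plain K-means on the raw data. The function names (`kmeans`, `evaluation_protocol`) and the use of NumPy are assumptions, not part of the original work.

```python
import numpy as np

def kmeans(X, c, rng, n_iter=100):
    """Plain Lloyd's K-means; returns a label per row of X (n x d).
    Assumed stand-in for the clustering step described in the report."""
    centers = X[rng.choice(len(X), size=c, replace=False)]
    for _ in range(n_iter):
        # assign each point to its nearest current center
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # recompute centers; keep the old center if a cluster goes empty
        new = np.array([X[labels == k].mean(axis=0) if np.any(labels == k)
                        else centers[k] for k in range(c)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

def evaluation_protocol(X, c, n_repeats=100, seed=0):
    """Repeat K-means n_repeats times (100 in the paper) to average out
    initialization effects. r = min(d, 2c) is the empirical subspace
    dimensionality reported in the Experiment Setup row."""
    d = X.shape[1]
    r = min(d, 2 * c)  # empirical choice from the paper
    rng = np.random.default_rng(seed)
    all_labels = [kmeans(X, c, rng) for _ in range(n_repeats)]
    return r, all_labels
```

In the paper, each of the 100 runs would first learn the metric and project the data to r dimensions before clustering; here that step is omitted, so the sketch shows only the repetition-and-averaging structure of the experiments.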