Optimal Margin Distribution Clustering
Authors: Teng Zhang, Zhi-Hua Zhou
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on UCI data sets show that ODMC is significantly better than compared methods, which verifies the superiority of optimal margin distribution learning. In this section, we empirically evaluate the proposed method on 24 UCI data sets. Table 1 summarizes the statistics of these data sets. |
| Researcher Affiliation | Academia | Teng Zhang, Zhi-Hua Zhou National Key Laboratory for Novel Software Technology Nanjing University, Nanjing 210023, China Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 210023, China {zhangt, zhouzh}@lamda.nju.edu.cn |
| Pseudocode | Yes | Algorithm 1 Stochastic mirror descent for ODMC |
| Open Source Code | No | The paper does not provide any explicit statement about making the source code for their methodology publicly available, nor does it include links to a code repository. |
| Open Datasets | Yes | Extensive experiments on UCI data sets show that ODMC is significantly better than compared methods. In this section, we empirically evaluate the proposed method on 24 UCI data sets. Table 1 summarizes the statistics of these data sets. |
| Dataset Splits | No | The paper mentions parameter selection for the models (e.g., 'C or λ is selected from {1, 10, 100, 1000}', 'ν and θ are selected from [0.2, 0.4, 0.6, 0.8]') but does not explicitly describe the use of a validation dataset or specific data splitting methodology (like k-fold cross-validation or specific train/validation percentages) for this parameter selection. |
| Hardware Specification | Yes | All the experiments are performed with MATLAB 2017b on a machine with 8 2.60 GHz CPUs and 32GB main memory. |
| Software Dependencies | Yes | All the experiments are performed with MATLAB 2017b |
| Experiment Setup | Yes | For GMMC, IterSVR, CPMMC, LG-MMC, ODMC, the parameter C or λ is selected from {1, 10, 100, 1000}. For ODMC, ν and θ are selected from [0.2, 0.4, 0.6, 0.8]. For all data sets, both the linear and Gaussian kernels are used. In particular, the width σ of Gaussian kernel is picked from {0.25γ, 0.5γ, γ, 2γ, 4γ}, where γ is the average distance between instances. The parameter of normalized cut is chosen from the same range of σ. The balance constraint is set in the same manner as in (Zhang, Tsang, and Kwok 2007), i.e., 0.03m for balanced data set and 0.3m for imbalanced data set. |
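
The search space in the Experiment Setup row can be sketched in a few lines of Python. This is only an illustration of the quoted ranges, not the authors' MATLAB code: the function names are hypothetical, γ is assumed to be the mean Euclidean distance over all distinct pairs of instances, and the Gaussian kernel is assumed to take the common form exp(-‖x - y‖² / (2σ²)), which the quoted text does not spell out.

```python
import itertools
import math

# Ranges quoted in the Experiment Setup row.
LAMBDAS = [1, 10, 100, 1000]               # C or lambda
NUS = [0.2, 0.4, 0.6, 0.8]                 # nu (ODMC only)
THETAS = [0.2, 0.4, 0.6, 0.8]              # theta (ODMC only)
SIGMA_FACTORS = [0.25, 0.5, 1.0, 2.0, 4.0] # multiples of gamma


def average_pairwise_distance(X):
    """Mean Euclidean distance over all distinct pairs of instances.

    Assumption: the source only says "average distance between instances";
    averaging over distinct pairs is one plausible reading.
    """
    pairs = list(itertools.combinations(X, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def gaussian_kernel(x, y, sigma):
    """Gaussian kernel with width sigma (assumed parameterization)."""
    return math.exp(-math.dist(x, y) ** 2 / (2.0 * sigma ** 2))


def odmc_parameter_grid(X):
    """All (lambda, nu, theta, sigma) combinations for the Gaussian kernel."""
    gamma = average_pairwise_distance(X)
    sigmas = [factor * gamma for factor in SIGMA_FACTORS]
    return list(itertools.product(LAMBDAS, NUS, THETAS, sigmas))
```

Under these assumptions the Gaussian-kernel grid for ODMC has 4 × 4 × 4 × 5 = 320 combinations. How the best combination is chosen is not described in the quoted text, so the sketch stops at enumerating candidates.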