Optimal Margin Distribution Clustering

Authors: Teng Zhang, Zhi-Hua Zhou

AAAI 2018

Reproducibility assessment (each item lists the variable, the result, and the supporting LLM response):
Research Type: Experimental. "Extensive experiments on UCI data sets show that ODMC is significantly better than compared methods, which verifies the superiority of optimal margin distribution learning." "In this section, we empirically evaluate the proposed method on 24 UCI data sets. Table 1 summarizes the statistics of these data sets."
Researcher Affiliation: Academia. Teng Zhang, Zhi-Hua Zhou. National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China; Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 210023, China. {zhangt, zhouzh}@lamda.nju.edu.cn
Pseudocode: Yes. "Algorithm 1 Stochastic mirror descent for ODMC"
Open Source Code: No. The paper does not provide any explicit statement about making the source code for its methodology publicly available, nor does it include a link to a code repository.
Open Datasets: Yes. "Extensive experiments on UCI data sets show that ODMC is significantly better than compared methods." "In this section, we empirically evaluate the proposed method on 24 UCI data sets. Table 1 summarizes the statistics of these data sets."
Dataset Splits: No. The paper mentions parameter selection for the models (e.g., "C or λ is selected from {1, 10, 100, 1000}", "ν and θ are selected from {0.2, 0.4, 0.6, 0.8}") but does not explicitly describe the use of a validation set or a specific data-splitting methodology (such as k-fold cross-validation or fixed train/validation percentages) for this parameter selection.
Hardware Specification: Yes. "All the experiments are performed with MATLAB 2017b on a machine with 8 2.60 GHz CPUs and 32GB main memory."
Software Dependencies: Yes. "All the experiments are performed with MATLAB 2017b"
Experiment Setup: Yes. "For GMMC, IterSVR, CPMMC, LG-MMC, and ODMC, the parameter C or λ is selected from {1, 10, 100, 1000}. For ODMC, ν and θ are selected from {0.2, 0.4, 0.6, 0.8}. For all data sets, both the linear and Gaussian kernels are used. In particular, the width σ of the Gaussian kernel is picked from {0.25γ, 0.5γ, γ, 2γ, 4γ}, where γ is the average distance between instances. The parameter of normalized cut is chosen from the same range as σ. The balance constraint is set in the same manner as in (Zhang, Tsang, and Kwok 2007), i.e., 0.03m for balanced data sets and 0.3m for imbalanced data sets."