Multidimensional Fractional Programming for Normalized Cuts

Authors: Yannan Chen, Beichen Huang, Licheng Zhao, Kaiming Shen

NeurIPS 2024

Reproducibility assessment (variable, result, and supporting LLM response):
Research Type: Experimental (5 experiments). "We validate the performance of the proposed FPC algorithm on 8 common datasets as summarized in Table 1. The benchmarks are SC [4], FINC [7], and FCD [17]. We use the Gaussian kernel to generate the similarity matrix, i.e., w_ij = exp(−‖v_i − v_j‖₂²), where v_i and v_j are the feature vectors of data points i and j. All the tests were carried out on a desktop equipped with a 2.10 GHz CPU × 12. Throughout the tables, we highlight the best performance in bold font."
Researcher Affiliation: Academia. Yannan Chen¹, Beichen Huang², Licheng Zhao³, Kaiming Shen¹. ¹School of Science and Engineering, The Chinese University of Hong Kong (Shenzhen), China; ²McMaster University, Canada; ³Shenzhen Research Institute of Big Data, China.
Pseudocode: Yes. "Algorithm 1: Proposed fractional programming-based clustering (FPC)."
Open Source Code: Yes. "Codes available at https://github.com/zhanchendao/FPC. [...] We have submitted the source codes as an anonymized zip file."
Open Datasets: Yes. "Table 1: Datasets used for the task of dividing N data points into K clusters." The datasets are Breast, Thyroid, Splice, Rice, Landsat, and Epileptic (UCI datasets [26]); Office+Caltech10 (GitHub transfer-learning [27]); and USPS (LIBSVM [28]).
Dataset Splits: No. The paper does not state training, validation, or test splits by percentage or count. It mentions initializing X to a feasible value and running the algorithms from random starting points, but gives no split ratios.
Hardware Specification: No. "All the tests were carried out on a desktop equipped with a 2.10 GHz CPU × 12." This is not specific enough to identify the CPU model, core count, or other relevant components such as RAM or GPU, which are crucial for reproducibility.
Software Dependencies: No. The paper states that the code is available and well commented, implying a programming language (likely Python, given the GitHub link), but it names no software with version numbers (e.g., Python 3.x, PyTorch x.x, scikit-learn x.x), which are essential for a reproducible software environment.
Experiment Setup: Yes. "We use the Gaussian kernel to generate the similarity matrix, i.e., w_ij = exp(−‖v_i − v_j‖₂²), where v_i and v_j are the feature vectors of data points i and j. [...] We run each algorithm 10 times with a random starting point generated for each trial, and then pick the best one."
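The Gaussian-kernel similarity matrix quoted above can be sketched in NumPy. This is a reconstruction under the assumption that the garbled formula is w_ij = exp(−‖v_i − v_j‖₂²); the paper may additionally scale the squared distance by a bandwidth parameter, which is omitted here. The function name `gaussian_similarity` is illustrative, not from the released code.

```python
import numpy as np

def gaussian_similarity(V):
    """Similarity matrix W with w_ij = exp(-||v_i - v_j||_2^2).

    V is an (N, d) array whose rows are the feature vectors v_i.
    """
    # Squared Euclidean distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq = np.sum(V ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (V @ V.T)
    d2 = np.maximum(d2, 0.0)  # clip tiny negatives from floating-point error
    return np.exp(-d2)
```

The resulting W is symmetric with ones on the diagonal and entries in (0, 1], as required of a similarity matrix for normalized cuts.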