Self-Adapted Multi-Task Clustering

Authors: Xianchao Zhang, Xiaotong Zhang, Han Liu

IJCAI 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on several real data sets show the superiority of the proposed algorithm over traditional single-task clustering methods and existing multi-task clustering methods.
Researcher Affiliation | Academia | School of Software, Dalian University of Technology; Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province, Dalian 116620, China; xczhang@dlut.edu.cn, zxt.dut@hotmail.com, liu.han.dut@gmail.com
Pseudocode | Yes | Algorithm 1 SAMTC
Open Source Code | No | The paper does not provide concrete access to source code for the methodology. Footnote 1 provides a link to text data, not code.
Open Datasets | Yes | We use data sets WebKB4, 20NewsGroups and Reuters^1 to construct the multi-task data sets in three typical cases (Table 1). Footnote 1: http://www.cad.zju.edu.cn/home/dengcai/Data/TextData.html
Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits. It describes the datasets used for clustering but not how they were partitioned for model training or validation in a supervised learning sense.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory, or cloud instance types) used for running experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | For SAMTC, the number of nearest neighbors $k^{(t)}_s$ ($s \neq t$) is set by searching the grid $\{\mathrm{ceil}(n_s/(2h_s)),\ \mathrm{ceil}(n_s/h_s),\ \min(\mathrm{ceil}(2n_s/h_s),\ n_s)\}$... For SAMTC and Ncut-SNN, the number of nearest neighbors $k^{(t)}_t$ is set by searching the grid $\{\mathrm{ceil}(n_t/(2h_t)),\ \mathrm{ceil}(n_t/h_t),\ \min(\mathrm{ceil}(2n_t/h_t),\ n_t - 1)\}$. For S-MKC and SAMTC, the Gaussian kernel bandwidth is the median Euclidean distance between the data points. For S-MBC, the Bregman divergence we choose is Euclidean distance. For LSSMTC, the parameter $\lambda$ is searched from $\{0.1, 0.2, \ldots, 0.9\}$, and the dimensionality of the shared subspace is searched from $\{2, 4, 6, 8, 10\}$. For MTFC and MTRC, $\lambda_1$ and $\lambda_2$ are both searched from $\{2^{-10}, 2^{-8}, \ldots, 2^{-2}\}$. For S-MBC and S-MKC, $\lambda$ is searched from $\{0.1, 0.2, \ldots, 1\}$.
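
The nearest-neighbor grids and the median-distance bandwidth quoted in the Experiment Setup row can be sketched in a few lines of Python. This is only an illustration, not the authors' code. It assumes that n_s is the number of data points in task s and h_s is the number of clusters in task s (the quoted text does not define these symbols). The function names and the kernel parametrization exp(-d^2 / (2 * sigma^2)) are also assumptions made for this sketch.

```python
import math
from itertools import combinations
from statistics import median


def ceil_div(a, b):
    """Integer ceiling of a / b, avoiding floating-point rounding."""
    return -(-a // b)


def cross_task_k_grid(n_s, h_s):
    """Candidate values for k^{(t)}_s with s != t.

    Assumption: n_s = number of points in task s, h_s = number of clusters.
    """
    return [
        ceil_div(n_s, 2 * h_s),
        ceil_div(n_s, h_s),
        min(ceil_div(2 * n_s, h_s), n_s),
    ]


def within_task_k_grid(n_t, h_t):
    """Candidate values for k^{(t)}_t; the last entry is capped at n_t - 1."""
    return [
        ceil_div(n_t, 2 * h_t),
        ceil_div(n_t, h_t),
        min(ceil_div(2 * n_t, h_t), n_t - 1),
    ]


def median_bandwidth(points):
    """Median of all pairwise Euclidean distances between the points."""
    return median(math.dist(p, q) for p, q in combinations(points, 2))


def gaussian_kernel(p, q, sigma):
    """Gaussian kernel; the 2 * sigma^2 scaling is an assumed convention."""
    return math.exp(-math.dist(p, q) ** 2 / (2 * sigma ** 2))


if __name__ == "__main__":
    print(cross_task_k_grid(100, 4))   # [13, 25, 50]
    print(within_task_k_grid(10, 1))   # [5, 10, 9]
    sigma = median_bandwidth([(0, 0), (3, 4), (6, 8)])
    print(sigma)                       # 5.0
```

The within-task grid differs from the cross-task grid only in its cap: n_t - 1 instead of n_s. One plausible reading is that a point is not counted among its own neighbors inside its own task.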