Scalable Algorithm for Higher-Order Co-Clustering via Random Sampling
Authors: Daisuke Hatano, Takuro Fukunaga, Takanori Maehara, Ken-ichi Kawarabayashi
AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we evaluate the performance of the proposal algorithms through experiments. We conducted experiments on an Ubuntu server with an Intel Xeon E5-2690, 2.9GHz processor and 512GB memory, and implemented our algorithm in Java 1.7.0 79. We verify the clustering qualities of the proposal algorithms through synthetic data, which is created as follows. This data set consists of tensors A, which has n dimensions in each order, and includes k ground truth co-clusters for some integers k, n, and m. |
| Researcher Affiliation | Academia | Daisuke Hatano, Takuro Fukunaga, Takanori Maehara, Ken-ichi Kawarabayashi National Institute of Informatics, Japan, JST, ERATO, Kawarabayashi Large Graph Project, Japan {hatano, takuro, k keniti}@nii.ac.jp Shizuoka University, Japan maehara.takanori@shizuoka.ac.jp |
| Pseudocode | Yes | Algorithm 1 Hypergraph version of Karger and Stein s algorithm; Algorithm 2 Co-clustering algorithm |
| Open Source Code | No | The paper mentions downloading a Matlab implementation of *other* algorithms from a URL but does not provide a link or statement for the authors' own code. |
| Open Datasets | Yes | To confirm the quality of co-clusterings produced by the proposed algorithm, we applied it to a tensor generated from data of sharing bike system in New York City available at https://www.citibikenyc.com/system-data. |
| Dataset Splits | No | The paper describes the generation of synthetic data and the use of real-world datasets, but it does not specify explicit training, validation, or test dataset splits (e.g., percentages or counts) for any of these datasets. |
| Hardware Specification | Yes | We conducted experiments on an Ubuntu server with an Intel Xeon E5-2690, 2.9GHz processor and 512GB memory, and implemented our algorithm in Java 1.7.0 79. |
| Software Dependencies | Yes | We conducted experiments on an Ubuntu server with an Intel Xeon E5-2690, 2.9GHz processor and 512GB memory, and implemented our algorithm in Java 1.7.0 79. |
| Experiment Setup | Yes | As for the threshold θ in Algorithm 2, we compute the minimum weight W of k-way cuts computed by iterating KS(G, w, k) 1000 times, and define θ as αW for some parameter α 1. The number of iterations l of the proposed algorithms set to 1000 and α = 1.00, and k is fixed to the number of co-clusters in the ground truth co-clustering. The runtime of the proposed algorithm is obtained by setting l = 1 and k = 3. The result described in Table 3 and Figure 3 is obtained by setting l = 1000 and k = 25. As a preprocessing, we omitted elements (j1, j2, j3) whose weight A(j1, j2, j3) is less than 50, that is we reduce the weight of the elements to 0. |