Continuous Partitioning for Graph-Based Semi-Supervised Learning
Authors: Chester Holtz, Pengwen Chen, Zhengchao Wan, Chung-Kuan Cheng, Gal Mishne
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that CutSSL significantly surpasses the current state-of-the-art on k-nearest neighbor graphs and large real-world graph benchmarks across a variety of label rates, class imbalance, and label imbalance regimes. [Sec. 5, Experiments] We evaluate our method on three image datasets: MNIST [12], Fashion-MNIST [33] and CIFAR-10 [23], using pretrained autoencoders as feature extractors as in Calder et al. [10]; see appendix for details. We also evaluate CutSSL on various real-world networks, including the Cora citation network and the OGB [18] Arxiv and Products networks. |
| Researcher Affiliation | Academia | Chester Holtz, Halicioğlu Data Science Institute, University of California San Diego, La Jolla, CA, chholtz@ucsd.edu; Pengwen Chen, Department of Applied Mathematics, National Chung Hsing University, South District, Taichung, Taiwan, pengwen@email.nchu.edu.tw; Zhengchao Wan, Department of Mathematics, University of Missouri, Columbia, MO, zwan@missouri.edu; Chung-Kuan Cheng, Department of Computer Science, University of California San Diego, La Jolla, CA, ckcheng@ucsd.edu; Gal Mishne, Halicioğlu Data Science Institute, University of California San Diego, La Jolla, CA, gmishne@ucsd.edu |
| Pseudocode | Yes | Algorithm 1 CutSSL. Input: Laplacian L, labels B, initialization X_0. Output: one-hot label predictions X. 1: function ADMM(L_{u,s}, X_0) 2: X ← X_0 3: while not converged do 4: X_{t+1} = L_{u,s}^{-1}{B + T_t − 1_n μ_1^⊤ − μ_2 1 1_k^⊤ − Λ_t}, with μ_1, μ_2 given by (15) 5: T_{t+1} ← max(Λ_t + X_{t+1}, 0) ▷ projection of T 6: Λ_{t+1} = Λ_t + X_{t+1} − T_{t+1} ▷ update multipliers 7: end while 8: return X_t 9: end function |
| Open Source Code | Yes | Our implementation is available on GitHub¹. ¹https://github.com/Mishne-Lab/CutSSL |
| Open Datasets | Yes | We evaluate our method on three image datasets: MNIST [12], Fashion-MNIST [33] and CIFAR-10 [23]... We also evaluate CutSSL on various real-world networks, including the Cora citation network and the OGB [18] Arxiv and Products networks. |
| Dataset Splits | No | The paper describes using 'partially-labeled training set consisting of labeled examples and unlabeled examples' and evaluates 'label rates' (e.g., 1 label per class, 3 labels per class). It also states 'We used all available data to construct the graph...'. However, it does not explicitly provide percentages or counts for distinct train/validation/test dataset splits, nor does it refer to predefined splits for these purposes. |
| Hardware Specification | Yes | All experiments were evaluated on Colab instances. These instances were equipped with 2-core AMD EPYC 7B12 CPUs and 13 GB of RAM. |
| Software Dependencies | No | The paper mentions 'an implementation of the core algorithm and experimental setup in Python and NumPy and the graphlearning package [9]'. However, it does not provide specific version numbers for Python, NumPy, or the graphlearning package, which are required for reproducible software dependencies. |
| Experiment Setup | Yes | In practice, we set s = 0.1 for all experiments. The procedure presented in Alg. 1 is run for 100 iterations. In Table 1 we present the accuracy of our method on CIFAR-10 across various label rates... The graph was constructed as a k-nearest neighbor graph with Gaussian edge weights given by w_ij = exp(−4‖v_i − v_j‖² / d_k(v_i)²), where v_i are the latent variables for image i, and d_k(v_i) is the distance in the latent space between v_i and its k-th nearest neighbor. We used k = 10 in all experiments and symmetrize W... The autoencoder was trained for 100 epochs on each dataset. |
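The ADMM loop quoted in the Pseudocode row can be sketched in NumPy. This is a minimal illustration, not the authors' implementation: the simplex/balance correction terms (the paper's μ₁, μ₂ from Eq. 15) are omitted, so only the X-update, nonnegative projection, and multiplier update are shown, and the function name is our own.

```python
import numpy as np

def cutssl_admm_sketch(L_us, B, X0, n_iter=100):
    """Simplified sketch of the ADMM iteration in Alg. 1 (CutSSL).

    L_us : (n, n) shifted Laplacian-type matrix (assumed invertible)
    B    : (n, k) label-indicator term
    X0   : (n, k) initialization

    NOTE: the mu_1, mu_2 terms from the paper's Eq. (15) are omitted
    here; this only illustrates the X / T / Lambda update structure.
    """
    X = X0.copy()
    T = np.maximum(X0, 0.0)
    Lam = np.zeros_like(X0)
    L_inv = np.linalg.inv(L_us)      # use a linear solve in practice
    for _ in range(n_iter):
        X = L_inv @ (B + T - Lam)    # X-update (simplified form)
        T = np.maximum(Lam + X, 0.0) # projection of T onto the nonnegatives
        Lam = Lam + X - T            # multiplier update
    return X
```

For actual experiments one would replace the explicit inverse with a sparse factorization or iterative solver, since the Laplacians in the paper's benchmarks are large and sparse.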
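The graph construction quoted in the Experiment Setup row can likewise be sketched directly from the formula w_ij = exp(−4‖v_i − v_j‖² / d_k(v_i)²). This is a brute-force illustration under our own naming; the paper uses the graphlearning package for this step, and the symmetrization choice below (elementwise max) is one common convention, not necessarily the authors'.

```python
import numpy as np

def knn_gaussian_weights(V, k=10):
    """Sketch of the paper's k-NN graph with Gaussian edge weights:
    w_ij = exp(-4 * ||v_i - v_j||^2 / d_k(v_i)^2),
    where d_k(v_i) is the distance to v_i's k-th nearest neighbor.
    Brute-force O(n^2) distances; fine for small n only.
    """
    # pairwise Euclidean distances between latent vectors
    D = np.linalg.norm(V[:, None, :] - V[None, :, :], axis=-1)
    # row-sorted distances put the self-distance (0) first, so index k
    # gives the distance to the k-th nearest neighbor
    d_k = np.sort(D, axis=1)[:, k]
    W = np.exp(-4.0 * D**2 / d_k[:, None] ** 2)
    # keep only k-nearest-neighbor edges, drop self-loops
    W = np.where(D <= d_k[:, None], W, 0.0)
    np.fill_diagonal(W, 0.0)
    # symmetrize (elementwise max is one common choice)
    return np.maximum(W, W.T)
```

With k = 10 and autoencoder latents as V, this reproduces the construction described in the excerpt up to the unspecified symmetrization convention.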