Weighted Theta Functions and Embeddings with Applications to Max-Cut, Clustering and Summarization

Authors: Fredrik D. Johansson, Ankani Chattoraj, Chiranjib Bhattacharyya, Devdatt Dubhashi

NeurIPS 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 5 Experiments
Researcher Affiliation | Academia | Fredrik D. Johansson, Computer Science & Engineering, Chalmers University of Technology, Göteborg, SE-412 96, Sweden (frejohk@chalmers.se); Ankani Chattoraj, Brain & Cognitive Sciences, University of Rochester, Rochester, NY 14627-0268, USA (achattor@ur.rochester.edu); Chiranjib Bhattacharyya, Computer Science and Automation, Indian Institute of Science, Bangalore 560012, Karnataka, India (chiru@csa.iisc.ernet.in); Devdatt Dubhashi, Computer Science & Engineering, Chalmers University of Technology, Göteborg, SE-412 96, Sweden (dubhashi@chalmers.se)
Pseudocode | Yes | Algorithm 1: ϑ-means clustering
1: Input: Graph G, with weight matrix S and node weights σ.
2: Compute kernel K ← K(G, σ, S)
3: α* ← arg max_α f(α; K), as in (4)
4: Sort the indices j_1, ..., j_n such that α_{j_1} ≥ α_{j_2} ≥ ... ≥ α_{j_n}
5: Let k = ⌈ϑ̂⌉, where ϑ̂ = ω(K) = f(α*; K)
6: either a)
7:   Initialize labels Z_i = arg max_{j ∈ {j_1, ..., j_k}} K_ij
8:   Output: result of kernel k-means with kernel K, k = ⌈ϑ̂⌉ and Z as initial labels
9: or b)
10:   Compute U : K = U^T U, with columns U_i, and let C ← {U_{j_i} : i ≤ k}
11:   Output: result of k-means with k = ⌈ϑ̂⌉ and C as initial cluster centroids
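For readers who want to experiment with the algorithm, the sketch below is a minimal Python rendering of ϑ-means (variant b). It assumes the SVM-style objective f(α; K) = 2 Σ_i α_i − αᵀKα from the SVM-ϑ line of work, treats the kernel K(G, σ, S) as a given positive semidefinite matrix, and solves the non-negative maximization by projected gradient ascent; function names and solver choices are ours, not the authors'.

import numpy as np
from sklearn.cluster import KMeans

def svm_theta(K, n_iter=2000):
    """Maximize f(alpha; K) = 2*sum(alpha) - alpha^T K alpha over alpha >= 0
    by projected gradient ascent (assumed SVM-theta formulation)."""
    n = K.shape[0]
    alpha = np.ones(n) / n
    step = 1.0 / (2.0 * np.linalg.norm(K, 2))         # 1/L, L = Lipschitz constant of grad f
    for _ in range(n_iter):
        grad = 2.0 - 2.0 * (K @ alpha)                # gradient of f at alpha
        alpha = np.maximum(alpha + step * grad, 0.0)  # project onto the nonnegative orthant
    return alpha

def theta_means(K):
    alpha = svm_theta(K)
    theta_hat = 2.0 * alpha.sum() - alpha @ K @ alpha  # omega(K) = f(alpha*; K)
    k = int(np.ceil(theta_hat))                        # number of clusters
    order = np.argsort(-alpha)                         # j_1, ..., j_n by decreasing alpha
    w, V = np.linalg.eigh(K)                           # factor K = U^T U (K is PSD)
    U = (V * np.sqrt(np.maximum(w, 0.0))).T            # columns U_i satisfy K_ij = U_i . U_j
    centroids = U[:, order[:k]].T                      # C = {U_{j_i} : i <= k}
    km = KMeans(n_clusters=k, init=centroids, n_init=1).fit(U.T)
    return km.labels_

Variant (a) would instead initialize each node's label by the top-α index it is most similar to under K and run kernel k-means on K directly; the embedding route shown here only needs standard k-means.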
Open Source Code | No | The paper neither links to source code nor states that code is included in supplementary materials. Reference [16] mentions 'Supplementary material, 2015', but this is a bibliography entry, not an access statement for their code.
Open Datasets | Yes | Elsner & Schudy [7] constructed five affinity matrices for a subset of the classical 20-newsgroups dataset.
Dataset Splits | No | The paper reports 'Average (and std. deviation) over 5 splits' for the newsgroup dataset but does not specify exact train/validation/test percentages or sample counts, so the data partitioning cannot be reproduced.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library or solver names with versions such as Python 3.8 or CPLEX 12.4) needed to replicate the experiments.
Experiment Setup | Yes | To reduce complexity, while preserving accuracy [9], we use a rank d = ⌈√(2n)⌉ truncated SVD (see Section 3.1). We apply the Goemans-Williamson random hyperplane rounding [9] to partition the embedding into two sets of points, representing the cut. The rounding was repeated 5000 times, and the maximum cut is reported.
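As a concrete illustration of the setup described above, the sketch below combines a rank-d truncated spectral embedding with Goemans-Williamson random hyperplane rounding, keeping the best of n_rounds cuts. The cut-value identity cut(s) = (1/4)(Σ_ij W_ij − sᵀWs) for s ∈ {±1}ⁿ is standard; the function names and parameters are ours, not the paper's.

import numpy as np

def embed(K, d):
    """Rank-d embedding U (d x n) with K approx U^T U and unit-norm columns."""
    w, V = np.linalg.eigh(K)
    top = np.argsort(w)[::-1][:d]                     # keep the d largest eigenvalues
    U = (V[:, top] * np.sqrt(np.maximum(w[top], 0.0))).T
    norms = np.linalg.norm(U, axis=0, keepdims=True)
    return U / np.maximum(norms, 1e-12)               # guard against zero columns

def gw_rounding(U, W, n_rounds=5000, seed=0):
    """Goemans-Williamson rounding: cut the embedding with random hyperplanes
    and return the best cut value with the corresponding +/-1 labels."""
    rng = np.random.default_rng(seed)
    best_val, best_signs = -np.inf, None
    for _ in range(n_rounds):
        r = rng.standard_normal(U.shape[0])           # random hyperplane normal
        signs = np.where(r @ U >= 0, 1.0, -1.0)       # side of the hyperplane per node
        cut = 0.25 * (W.sum() - signs @ W @ signs)    # total weight crossing the cut
        if cut > best_val:
            best_val, best_signs = cut, signs
    return best_val, best_signs

Repeating the rounding (5000 times in the paper) and keeping the maximum makes the reported value the best cut over many random draws rather than a single trial.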