The Cluster Description Problem - Complexity Results, Formulations and Approximations
Authors: Ian Davidson, Antoine Gourru, S. S. Ravi
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental Results with Twitter Election Data |
| Researcher Affiliation | Academia | Ian Davidson Department of Computer Science University of California Davis davidson@cs.ucdavis.edu Antoine Gourru Universite de Lyon (ERIC, Lyon 2) antoine.gourru@univ-lyon2.fr S. S. Ravi Biocomplexity Institute University of Virginia ssravi0@gmail.com |
| Pseudocode | Yes | Algorithm 1: Description of our Algorithm for (α, β)-CONS-DESC |
| Open Source Code | Yes | Supplementary material and source code available at www.cs.ucdavis.edu/~davidson/description-clustering. |
| Open Datasets | No | The Twitter data was collected from 01/01/16 until 08/22/16 and covers the political primary season of the United States 2016 Presidential Election. The Twitter data was provided by the ERIC lab at the University of Lyon 2 and was prepared by one of the authors (Antoine Gourru). This text describes the dataset but does not provide concrete access information (link, DOI, repository, or formal citation for public access). |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, or test sets. |
| Hardware Specification | Yes | Runtime of basic intlinprog MATLAB solver on a single core of a 2016 MacBook Air |
| Software Dependencies | No | The paper mentions 'MATLAB solver (intlinprog)' but does not provide a specific version number for MATLAB or the solver. |
| Experiment Setup | Yes | In this first experiment we use spectral clustering to divide X into just two communities and we found two natural (and obvious) communities amongst follower information: pro-Democratic and pro-Republican. Attempting to find two distinct explanations (with no overlap) from the 136 hash tags yields no feasible solution using our basic formulation. Instead we used our cover-or-forget formulation setting I1 = I2 = 5 so that some instances could be ignored. |
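
The experiment setup above reports that disjoint descriptions were infeasible until some instances were allowed to be ignored. The following is a minimal sketch of that idea, not the authors' implementation: it replaces the integer linear program solved with intlinprog by exhaustive search over a toy instance. The function name `describe_clusters`, the tag names, and the exact reading of the formulation are assumptions made for illustration: each tag may describe at most one cluster, an instance is covered if it carries at least one tag from its cluster's description, and cluster j may leave at most `max_forgotten[j]` instances uncovered.

```python
from itertools import product


def describe_clusters(clusters, tags, max_forgotten):
    """Brute-force search for disjoint tag descriptions of each cluster.

    clusters: list of clusters; each cluster is a list of tag sets,
        one set per instance (hypothetical toy representation).
    tags: the universe of available tags.
    max_forgotten: per-cluster limit on uncovered instances.
    Returns (cost, descriptions, forgotten) with the fewest tags used,
    or None if no assignment satisfies the limits.
    """
    tags = sorted(tags)
    k = len(clusters)
    best = None
    # Each tag gets one owner: -1 (unused) or a cluster index.
    # Giving every tag a single owner makes the descriptions disjoint.
    for assignment in product(range(-1, k), repeat=len(tags)):
        descriptions = [
            {t for t, owner in zip(tags, assignment) if owner == j}
            for j in range(k)
        ]
        # An instance is "forgotten" if none of its tags is in the description.
        forgotten = [
            [i for i, inst in enumerate(cluster) if not (inst & descriptions[j])]
            for j, cluster in enumerate(clusters)
        ]
        if any(len(f) > limit for f, limit in zip(forgotten, max_forgotten)):
            continue
        cost = sum(len(d) for d in descriptions)
        if best is None or cost < best[0]:
            best = (cost, descriptions, forgotten)
    return best


# Toy data (invented for illustration): both clusters contain an
# instance whose only tag is "c", so they compete for that tag.
toy_clusters = [
    [{"a"}, {"a", "c"}, {"c"}],
    [{"b"}, {"b", "c"}, {"c"}],
]
strict = describe_clusters(toy_clusters, {"a", "b", "c"}, [0, 0])
relaxed = describe_clusters(toy_clusters, {"a", "b", "c"}, [1, 1])
print("no instances ignored:", strict)
print("one ignored per cluster:", relaxed)
```

The search is exponential in the number of tags, so it only suits tiny examples; a problem of the size reported in the paper (136 hashtags) would need an integer programming solver as the authors used.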