Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Coresets for Time Series Clustering
Authors: Lingxiao Huang, K. Sudhir, Nisheeth K. Vishnoi
NeurIPS 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically assess the performance of our coresets with synthetic data. |
| Researcher Affiliation | Academia | Lingxiao Huang Tsinghua University K. Sudhir Yale University Nisheeth K. Vishnoi Yale University |
| Pseudocode | Yes | Algorithm 1: CRGMM: Coreset construction for GMM time series clustering |
| Open Source Code | No | The paper does not provide any explicit statement or link indicating that the source code for the methodology is openly available. |
| Open Datasets | No | We generate two sets of synthetic data, each with 250K observations but different numbers of entities N and observations per individual Ti: (i) N = Ti = 500 for all i ∈ [N] and (ii) N = 200, Ti = 1250 for all i ∈ [N]. ... Given these draws of parameters, we generate a GMM time-series dataset as follows: For each i ∈ [N], draw l ∈ [k] given α. Then, for all t ∈ [Ti] draw eit ∈ Rd with covariance matrix Σ(l) and autocorrelation matrix Λ(l) and compute xit = µ(l) + eit ∈ Rd. |
| Dataset Splits | No | The paper does not specify exact training, validation, or test splits for its synthetic datasets. It refers to evaluating the model on the 'full dataset' and 'coresets', which are sampled subsets. |
| Hardware Specification | No | The experiments are conducted with PyCharm IDE on a computer with 8-core CPU and 32 GB RAM. |
| Software Dependencies | No | The paper mentions 'PyCharm IDE' but does not provide specific version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | No | The paper describes the model-generation parameters (d, k, λ) and the EM algorithm used for optimization, but it does not provide specific training hyperparameters such as learning rates, batch sizes, number of epochs, or detailed convergence criteria. |
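The data-generation recipe quoted in the Open Datasets row can be sketched in a few lines of NumPy. This is a hedged illustration, not the paper's code: the cluster assignment (l drawn given mixing weights α) and the observation model (x_it = µ(l) + e_it) follow the quoted description, but the concrete error process here is an assumed stationary AR(1) with per-cluster coefficient and scale, standing in for the paper's "covariance matrix Σ(l) and autocorrelation matrix" specification; all parameter ranges are invented for the sketch.

```python
import numpy as np

def generate_gmm_time_series(N=500, T=500, d=2, k=3, seed=0):
    """Sketch of the synthetic GMM time-series recipe (assumptions noted above).

    Each entity i is assigned a cluster l ~ Categorical(alpha); its T
    observations are x_it = mu[l] + e_it, where the errors e_it follow an
    assumed AR(1) process as a concrete stand-in for the paper's
    covariance-plus-autocorrelation error description.
    """
    rng = np.random.default_rng(seed)
    alpha = rng.dirichlet(np.ones(k))          # mixing weights over k clusters
    mu = rng.normal(0.0, 5.0, size=(k, d))     # cluster means mu^(l) (assumed range)
    rho = rng.uniform(0.1, 0.9, size=k)        # assumed AR(1) autocorrelation coefficients
    sigma = rng.uniform(0.5, 2.0, size=k)      # assumed innovation scales

    data, labels = [], []
    for _ in range(N):
        l = rng.choice(k, p=alpha)             # draw cluster l given alpha
        e = np.zeros((T, d))
        e[0] = rng.normal(0.0, sigma[l], size=d)
        for t in range(1, T):                  # autocorrelated error sequence
            e[t] = rho[l] * e[t - 1] + rng.normal(0.0, sigma[l], size=d)
        data.append(mu[l] + e)                 # x_it = mu^(l) + e_it
        labels.append(l)
    return np.stack(data), np.array(labels)

X, y = generate_gmm_time_series(N=20, T=50)
print(X.shape)  # (20, 50, 2)
```

With the paper's settings, N = Ti = 500 (or N = 200, Ti = 1250) yields the 250K total observations mentioned in the quote; the smaller call above just keeps the demo fast.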