Coresets for Time Series Clustering
Authors: Lingxiao Huang, K. Sudhir, Nisheeth K. Vishnoi
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically assess the performance of our coresets with synthetic data. |
| Researcher Affiliation | Academia | Lingxiao Huang (Tsinghua University); K. Sudhir (Yale University); Nisheeth K. Vishnoi (Yale University) |
| Pseudocode | Yes | Algorithm 1: CRGMM: Coreset construction for GMM time series clustering |
| Open Source Code | No | The paper does not provide any explicit statement or link indicating that the source code for the methodology is openly available. |
| Open Datasets | No | We generate two sets of synthetic data all with 250K observations with different number of entities $N$ and observations per individual $T_i$: (i) $N = T_i = 500$ for all $i \in [N]$ and (ii) $N = 200$, $T_i = 1250$ for all $i \in [N]$. ... Given these draws of parameters, we generate a GMM time-series dataset as follows: For each $i \in [N]$, draw $l \in [k]$ given $\alpha$. Then, for all $t \in [T_i]$ draw $e_{it} \in \mathbb{R}^d$ with covariance matrix $\Sigma^{(l)}$ and autocorrelation matrix $\Lambda^{(l)}$ and compute $x_{it} = \mu^{(l)} + e_{it} \in \mathbb{R}^d$. |
| Dataset Splits | No | The paper does not specify exact training, validation, or test splits for its synthetic datasets. It refers to evaluating the model on the 'full dataset' and 'coresets', which are sampled subsets. |
| Hardware Specification | No | The experiments are conducted with PyCharm IDE on a computer with 8-core CPU and 32 GB RAM. |
| Software Dependencies | No | The paper mentions 'PyCharm IDE' but does not provide specific version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | No | The paper describes the model generation parameters (d, k, λ) and the EM algorithm used for optimization, but it does not provide specific training hyperparameters such as learning rates, batch sizes, number of epochs, or detailed convergence criteria. |
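
Since no dataset or code is released, the generation procedure quoted in the Open Datasets row can be approximated with a minimal sketch like the one below. It is not the authors' implementation. The function name `generate_gmm_time_series`, the priors used to draw $\alpha$, $\mu^{(l)}$, $\Sigma^{(l)}$ and $\Lambda^{(l)}$, the choice of a diagonal $\Lambda^{(l)}$, and the AR(1) form of the error process are all assumptions made for illustration; the table only states that errors have covariance matrix $\Sigma^{(l)}$ and autocorrelation matrix $\Lambda^{(l)}$.

```python
import numpy as np


def generate_gmm_time_series(N, T, d, k, lam, seed=0):
    """Hypothetical generator for a GMM time-series dataset.

    Assumptions (not stated in the summary above): mixing weights come from
    a flat Dirichlet, means from a standard normal, covariances are random
    SPD matrices, and errors follow an AR(1) process with a diagonal
    autocorrelation matrix whose entries lie in [0, lam), with lam < 1.
    """
    rng = np.random.default_rng(seed)

    alpha = rng.dirichlet(np.ones(k))              # mixing weights, sum to 1
    mu = rng.normal(size=(k, d))                   # cluster means mu^(l)
    A = rng.normal(size=(k, d, d))
    Sigma = A @ A.transpose(0, 2, 1) + np.eye(d)   # SPD innovation covariances
    Lambda = rng.uniform(0.0, lam, size=(k, d))    # diagonal of Lambda^(l)

    labels = rng.choice(k, size=N, p=alpha)        # draw l in [k] given alpha
    X = np.empty((N, T, d))
    for i in range(N):
        l = labels[i]
        chol = np.linalg.cholesky(Sigma[l])
        e = np.zeros(d)
        for t in range(T):
            innovation = chol @ rng.standard_normal(d)
            e = Lambda[l] * e + innovation         # assumed AR(1) recursion
            X[i, t] = mu[l] + e                    # x_it = mu^(l) + e_it
    return X, labels, alpha
```

Under these assumptions, calling the function with `N=500, T=500` or `N=200, T=1250` would reproduce the two dataset shapes described in the table (250K observations each). Note that in this sketch $\Sigma^{(l)}$ is the covariance of the AR(1) innovations, not the stationary covariance of $e_{it}$; the paper may define it differently.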