Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Learning Mixtures of Gaussian Processes through Random Projection
Authors: Emmanuel Akeweje, Mimi Zhang
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on synthetic and real datasets confirm the superiority of our method over existing techniques. |
| Researcher Affiliation | Academia | ¹School of Computer Science and Statistics, Trinity College Dublin, Ireland; ²I-Form Advanced Manufacturing Research Centre, Science Foundation Ireland, Ireland. |
| Pseudocode | Yes | The pseudo code of the clustering method is given in Algorithm 1. |
| Open Source Code | Yes | Our Python package¹ GPmix offers a range of options. ¹https://github.com/EAkeweje/GPmix |
| Open Datasets | Yes | We evaluated the seven algorithms on 10 real datasets from the UEA & UCR Time Series Classification Repository... Details of these simulation scenarios are provided in Appendix G.2. |
| Dataset Splits | No | The paper mentions using "training and testing datasets" for real data but does not provide specific details on dataset splits (e.g., percentages, sample counts, or explicit cross-validation methodology) for training, validation, or testing. |
| Hardware Specification | Yes | All experiments were conducted on a PC with a 3.20GHz processor, 16 CPU cores, and 32GB of RAM. |
| Software Dependencies | No | The paper mentions the Python package GPmix and several R packages (funFEM, funHDDC, Funclustering, fdapace, fdasrvf, FADPclust) but does not specify version numbers for any programming language or library, which are required for reproducibility. |
| Experiment Setup | Yes | For each algorithm, the argument for the number of clusters is set to the true cluster number in the dataset. FEM (from R package funFEM): The other arguments are set to model = "all", crit = "bic", init = "kmeans", maxit = 50, eps = 1e-06. Table 5 gives the configuration of the GPmix algorithm for the ten real datasets and 12 simulation scenarios. |