Clustering Financial Time Series: How Long Is Enough?
Authors: Gautier Marti, Sébastien Andler, Frank Nielsen, Philippe Donnat
IJCAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Then, we also give a first empirical answer to the much debated question: How long should the time series be? If too short, the clusters found can be spurious; if too long, dynamics can be smoothed out. (Section 5: Empirical rates of convergence) |
| Researcher Affiliation | Collaboration | Gautier Marti (Hellebore Capital Ltd; Ecole Polytechnique), Sébastien Andler (ENS de Lyon; Hellebore Capital Ltd), Frank Nielsen (Ecole Polytechnique, LIX UMR 7161), Philippe Donnat (Hellebore Capital Ltd, Michelin House, London) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | For the simulations, implementation and tutorial available at www.datagrapple.com/Tech, we will consider two models: |
| Open Datasets | No | The paper describes using 'simulated time series' based on models and does not provide concrete access information (link, DOI, repository, formal citation) for a publicly available or open dataset. |
| Dataset Splits | No | The paper describes generating 'L = 10^3 datasets of N = 265 time series with length T' for simulation, but does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits) for a fixed dataset for training, validation, or testing. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, processor types, memory amounts, or detailed computer specifications) used for running experiments are mentioned in the paper. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | For each model, for every T ranging from 10 to 500, we sample L = 10^3 datasets of N = 265 time series with length T from the model. We count how many times the clustering methodology (here, the choice of an algorithm and a correlation coefficient) is able to recover the underlying clusters defined by the correlation matrix. |
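
The experiment setup in the last row can be illustrated with a minimal sketch. This is not the authors' implementation: the paper's two models are not reproduced on this page, so the sketch assumes a Gaussian model with a block correlation matrix, Pearson correlation, and average-linkage hierarchical clustering. The function names, the correlation levels `rho_in` and `rho_out`, and the reduced sizes (`n_series=30`, `n_datasets=50` instead of N = 265 and L = 10^3) are hypothetical choices made to keep the example fast.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def block_correlation(n_series, n_clusters, rho_in=0.7, rho_out=0.1):
    """Assumed model: correlation rho_in within a cluster, rho_out across clusters."""
    labels = np.arange(n_series) % n_clusters
    same = labels[:, None] == labels[None, :]
    corr = np.where(same, rho_in, rho_out)
    np.fill_diagonal(corr, 1.0)
    return corr, labels


def same_partition(a, b):
    """Two labelings describe the same clusters iff their co-membership matrices match."""
    return bool(np.array_equal(a[:, None] == a[None, :], b[:, None] == b[None, :]))


def recovery_rate(T, n_series=30, n_clusters=3, n_datasets=50, seed=0):
    """Fraction of simulated datasets of length T whose true clusters are recovered."""
    rng = np.random.default_rng(seed)
    corr, truth = block_correlation(n_series, n_clusters)
    chol = np.linalg.cholesky(corr)  # used to draw correlated Gaussian series
    hits = 0
    for _ in range(n_datasets):
        x = chol @ rng.standard_normal((n_series, T))  # n_series rows, T observations
        est = np.corrcoef(x)  # Pearson correlation estimated from the sample
        dist = np.sqrt(np.clip(2.0 * (1.0 - est), 0.0, None))  # correlation distance
        np.fill_diagonal(dist, 0.0)
        tree = linkage(squareform(dist, checks=False), method="average")
        found = fcluster(tree, t=n_clusters, criterion="maxclust")
        hits += same_partition(found, truth)
    return hits / n_datasets


if __name__ == "__main__":
    for T in (10, 50, 100, 250, 500):
        print(T, recovery_rate(T))
```

Plotting the recovery rate against T for each choice of clustering algorithm and correlation coefficient gives an empirical view of how long the series must be before the clusters stop being spurious, which is the kind of comparison the setup describes.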