Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Sparse Graph Learning from Spatiotemporal Time Series
Authors: Andrea Cini, Daniele Zambon, Cesare Alippi
JMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The paper states: "Empirical results demonstrate that the techniques introduced here enable the use of score-based estimators to learn graphs from spatiotemporal time series; furthermore, experiments on time series forecasting benchmarks show that our approach compares favorably w.r.t. the state of the art." The empirical evaluation is given in Section 8, titled 'Experiments', which contains subsections such as '8.1 Datasets', '8.2 Controlled Environment Experiments', '8.3 Real-World Datasets', and '8.4 Scalability', with figures and tables reporting 'Validation MAE' and 'MAE' values. |
| Researcher Affiliation | Academia | Andrea Cini (The Swiss AI Lab IDSIA, Università della Svizzera italiana, Lugano, CH); Daniele Zambon (The Swiss AI Lab IDSIA, Università della Svizzera italiana, Lugano, CH); Cesare Alippi (The Swiss AI Lab IDSIA, Università della Svizzera italiana, Lugano, CH; Politecnico di Milano, Milan, IT). All listed institutions (The Swiss AI Lab IDSIA, Università della Svizzera italiana, Politecnico di Milano) are academic institutions, and the email domains (.ch, .it) correspond to academic affiliations. |
| Pseudocode | No | The paper describes methods and architectures using mathematical formulations and descriptive text, but does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks, nor structured steps formatted like code. |
| Open Source Code | Yes | "The code to reproduce the experiments of the paper is available online": https://github.com/andreacini/sparse-graph-learning |
| Open Datasets | Yes | We consider one synthetic dataset and 3, openly available, real-world benchmarks. [...] AQI (Yi et al., 2016; Cini et al., 2022; Marisca et al., 2022). METR-LA and PEMS-BAY datasets from (Jagadish et al., 2014; Li et al., 2018). |
| Dataset Splits | Yes | GPVAR: We use 70/10/20% data split for training, validation, and testing, respectively. AQI: We use the same preprocessing and data splits of previous works (Yi et al., 2016). METR-LA and PEMS-BAY: We use the same preprocessing and data splits of previous works (Wu et al., 2019). |
| Hardware Specification | Yes | Experiments were run on a cluster equipped with Nvidia Titan V and GTX 1080 GPUs. |
| Software Dependencies | No | All the code for the experiments has been developed in Python using the following open-source libraries: PyTorch (Paszke et al., 2019); PyTorch Geometric (Fey and Lenssen, 2019); Torch Spatiotemporal (Cini and Marisca, 2022); PyTorch Lightning (Falcon and The PyTorch Lightning team, 2019); numpy (Harris et al., 2020); furthermore, the authors relied on Neptune (neptune.ai, 2021) for logging experiments. The paper lists these libraries and tools but does not provide specific version numbers for any of them, nor a version for Python itself. |
| Experiment Setup | Yes | For the graph identification experiments, we simply trained the different graph identification modules using the Adam optimizer with a learning rate of 0.05 to minimize the absolute error. For the joint graph identification and forecasting experiment, we train on the generated dataset a GPVAR filter with L = 3 and Q = 4 with parameters randomly initialized and fitted with Adam using the same learning rate for the parameters of both graph filter and graph generator. To avoid numeric instability, scores Φ were soft-clipped to the interval (-5, 5) by using the tanh function. (Appendix C.1) For the experiments on AQI we use a simple TTS model with a GRU encoder with 2 hidden layers, followed by a GNN decoder with 2 graph convolutional layers... All layers have a hidden size of 64 units. We use an input window size of 24 steps and train for 100 epochs the models with the Adam optimizer with an initial learning rate of 0.005 and a multi-step learning rate scheduler... For the graph module, we use SNS with K = 5 and 4 dummy nodes and train with Adam with a learning rate of 0.01 for 200 epochs. (Appendix C.2) As reported in the paper, we use the same architecture and hyperparameters of the full graph model of Satorras et al. (2022)... We train the models for a maximum of 200 epochs with Adam and an initial learning rate of 0.003 and a multi-step scheduler... In each epoch, we used 200 mini-batches of size 64 for all the model variations, except for the full-attention model for which on PEMS-BAY we had to limit the batch size to 16 due to GPU memory limitations. For the graph learning module, we used SNS with K = 30 and 10 dummy nodes. We also used a temperature τ = 0.5 to make the sampler more deterministic. (Appendix C.3) |