Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Effective Latent Differential Equation Models via Attention and Multiple Shooting

Authors: Germán Abrevaya, Mahta Ramezanian-Panahi, Jean-Christophe Gagnon-Audet, Pablo Polosecki, Irina Rish, Silvina Ponce Dawson, Guillermo Cecchi, Guillaume Dumas

TMLR 2024 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | These modifications have led to a significant increase in its performance in both reconstruction and forecast tasks, as demonstrated by our evaluation on simulated and empirical data. Specifically, GOKU-UI outperformed all baseline models on synthetic datasets even with a training set 16-fold smaller, underscoring its remarkable data efficiency. Furthermore, when applied to empirical human brain data, while incorporating stochastic Stuart-Landau oscillators into its dynamical core, our proposed enhancements markedly increased the model's effectiveness in capturing complex brain dynamics.
Researcher Affiliation | Collaboration | (1) Universidad de Buenos Aires, FCEyN, Departamento de Física, Buenos Aires, Argentina. (2) Mila - Quebec AI Institute, Montréal, Québec, Canada. (3) Université de Montréal, Montréal, Québec, Canada. (4) IBM Research, T.J. Watson Research Center, Yorktown Heights, New York, USA. (5) CONICET - Universidad de Buenos Aires, Instituto de Física de Buenos Aires (IFIBA), Buenos Aires, Argentina. (6) CHU Sainte-Justine Research Center, Montréal, Québec, Canada. (7) Department of Psychiatry and Addictology, Université de Montréal, Montréal, Québec, Canada.
Pseudocode | No | The paper describes the methodology in narrative text and figures (such as Figure 1, a schematic representation), but it does not contain explicit pseudocode or algorithm blocks.
Open Source Code | No | The original GOKU-net model was limited to handling only ODEs. In this work, we expand its capabilities by implementing the model in the Julia Programming Language (Bezanson et al., 2017), leveraging its potent SciML Ecosystem (Rackauckas & Nie, 2017), which enables us to utilize a wide spectrum of differential equation classes (including SDEs, DDEs, DAEs) and a diverse suite of advanced solvers and sensitivity algorithms (Rackauckas et al., 2019; Ma et al., 2021a). There is no explicit statement or link indicating that the source code for the GOKU-UI model implemented by the authors is openly available.
Open Datasets | Yes | We used the resting-state fMRI data from 153 subjects, sourced from the Track-On HD study (Klöppel et al., 2015).
Dataset Splits | Yes | A validation set of another 200 samples was used for the training termination criteria, and a separate testing set containing 900 different samples was employed for the evaluation. All details of the implementation and hyperparameters can be found in the Supplementary Information.
Hardware Specification | No | The computational resources used in this work were provided (in part) by the HPC center DIRAC, funded by Instituto de Física de Buenos Aires (UBA-CONICET) and part of the SNCAD-MinCyT initiative, Argentina. This mention of the "HPC center DIRAC" is too general and does not provide specific hardware details such as GPU/CPU models or memory.
Software Dependencies | Yes | The original GOKU-net model was limited to handling only ODEs. In this work, we expand its capabilities by implementing the model in the Julia Programming Language (Bezanson et al., 2017), leveraging its potent SciML Ecosystem (Rackauckas & Nie, 2017), which enables us to utilize a wide spectrum of differential equation classes (including SDEs, DDEs, DAEs) and a diverse suite of advanced solvers and sensitivity algorithms (Rackauckas et al., 2019; Ma et al., 2021a). The models were defined and trained within the deep learning framework of the Flux.jl package (Innes et al., 2018). The experiments were managed using the DrWatson.jl package (Datseris et al., 2020).
Experiment Setup | Yes | The input sequence length for all the models was 46 time steps, and the batch size was set at 64. As described above, the full length of each sample in the training sets was 600 time steps for the synthetic dataset and 114 for the fMRI dataset. ... The model was trained with Adam with a weight decay of 10⁻¹⁰, and the learning rate was dynamically determined by the following schedule. The learning rate begins with a linear growth (also referred to as learning-rate warm-up) from 10⁻⁷, escalating up to 0.005251 across 20 epochs. Afterwards, it maintains that value until the validation loss stagnates (has not achieved a lower value for 50 epochs), at which point it starts a sinusoidal schedule with an exponentially decreasing amplitude. For the multiple shooting training, all the presented experiments used a time window length of 10, therefore partitioning 46-time-step-long sequences into 5 windows with their endpoints overlapping. The regularization coefficient in the loss function for the continuity constraint had a value of 2.
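The multiple-shooting partition quoted above (46-time-step sequences split into 5 windows of length 10 with shared endpoints) can be sketched as follows. This is an illustrative reconstruction, not the authors' Julia implementation; the function name and the 0-based index convention are our own assumptions:

```python
def multiple_shooting_windows(seq_len, window_len):
    """Partition a sequence of seq_len time steps into windows of
    window_len steps whose endpoints overlap, as used in multiple
    shooting training (hypothetical helper, 0-based indices)."""
    # Consecutive windows share one endpoint, so the stride is window_len - 1.
    stride = window_len - 1
    starts = range(0, seq_len - window_len + 1, stride)
    # Each tuple is a half-open (start, end) slice covering window_len steps.
    return [(s, s + window_len) for s in starts]

# 46 time steps with window length 10 -> 5 overlapping windows:
print(multiple_shooting_windows(46, 10))
# [(0, 10), (9, 19), (18, 28), (27, 37), (36, 46)]
```

The continuity constraint mentioned in the quote (with regularization coefficient 2) would then penalize the mismatch between the latent state at the end of one window and the initial state of the next, at each of these shared endpoints.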