Causal Recurrent Variational Autoencoder for Medical Time Series Generation

Authors: Hongming Li, Shujian Yu, Jose Principe

AAAI 2023

Reproducibility assessment. Each variable below is listed with its result, followed by the LLM response supporting that result.
Research Type: Experimental
LLM Response: "Empirically, we evaluate the behavior of our model on synthetic data and two real-world human brain datasets involving, respectively, electroencephalography (EEG) signals and functional magnetic resonance imaging (fMRI) data. Our model consistently outperforms state-of-the-art time series generative models both qualitatively and quantitatively."
Researcher Affiliation: Academia
LLM Response: Hongming Li (1), Shujian Yu (2*), Jose Principe (1). Affiliations: (1) University of Florida; (2) UiT The Arctic University of Norway.
Pseudocode: Yes
LLM Response: The paper provides Algorithm 1, "Training pipeline of CR-VAE". A hedged sketch of what such a training step can look like is given below.
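To make the pipeline concrete, here is a minimal sketch of one training step for a recurrent VAE with a sparsity penalty used to read off Granger-causal structure, in the spirit of CR-VAE. The GRU encoder/decoder sizes, the ELBO-plus-L1 objective, the placement of the penalty on the encoder's input-to-hidden weights, and the 'train_step' helper are all illustrative assumptions, not a verbatim transcription of the authors' Algorithm 1.

    import torch
    import torch.nn as nn

    class RecurrentVAE(nn.Module):
        """Sketch of a CR-VAE-style model: GRU encoder/decoder, Gaussian latent."""
        def __init__(self, n_vars: int, hidden: int = 64, latent: int = 16):
            super().__init__()
            self.encoder = nn.GRU(n_vars, hidden, batch_first=True)
            self.to_mu = nn.Linear(hidden, latent)
            self.to_logvar = nn.Linear(hidden, latent)
            self.decoder = nn.GRU(latent, hidden, batch_first=True)
            self.readout = nn.Linear(hidden, n_vars)

        def forward(self, x):  # x: (batch, time, n_vars)
            h, _ = self.encoder(x)
            mu, logvar = self.to_mu(h), self.to_logvar(h)
            z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
            out, _ = self.decoder(z)
            return self.readout(out), mu, logvar

    def train_step(model, x, optimizer, lam=1e-3):
        """One ELBO + L1-sparsity update; 'lam' is an assumed penalty weight."""
        optimizer.zero_grad()
        x_hat, mu, logvar = model(x)
        recon = nn.functional.mse_loss(x_hat, x)                       # reconstruction
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())  # KL to N(0, I)
        # L1 on the encoder's input-to-hidden weights: a near-zero column i is
        # read as "variable i exerts no Granger-causal influence" in this sketch.
        sparsity = model.encoder.weight_ih_l0.abs().sum()
        loss = recon + kl + lam * sparsity
        loss.backward()
        optimizer.step()
        return loss.item()

A full run would loop 'train_step' over lagged mini-batches (see the sampling sketch further below) and then threshold the learned input-weight norms to obtain a causal adjacency matrix.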
Open Source Code: Yes
LLM Response: "Code of CR-VAE is publicly available at https://github.com/hongmingli1995/CR-VAE."
Open Datasets: Yes
LLM Response:
- fMRI: a benchmark for causal discovery consisting of realistic simulations of blood-oxygen-level-dependent (BOLD) time series (Smith et al. 2011), generated with the dynamic causal modelling fMRI forward model (https://www.fmrib.ox.ac.uk/datasets/netsim/). The paper selects simulation no. 3 of the original dataset, which has 10 variables, and randomly selects 2,048 observations.
- EEG: a dataset of real intracranial EEG recordings from a patient with drug-resistant epilepsy (Kramer, Kolaczyk, and Kirsch 2008; http://math.bu.edu/people/kolaczyk/datasets.html). The paper selects 12 EEG time series from 76 contacts because they are recorded at deeper brain structures than the cortical level. Note, however, that this dataset has no ground truth for causal relations.
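For reference, a minimal loading sketch for the fMRI benchmark follows. The filename sim3.mat and the 'ts'/'Nsubjects'/'Ntimepoints' fields follow the usual NetSim .mat layout but should be treated as assumptions, and since the paper does not say how its 2,048 observations were drawn, a contiguous slice is used purely for illustration.

    from scipy.io import loadmat

    # Assumed NetSim layout: 'ts' stacks all subjects' series row-wise,
    # with shape (n_subjects * n_timepoints, n_nodes).
    mat = loadmat("sim3.mat")
    ts = mat["ts"]
    n_sub = int(mat["Nsubjects"].item())
    n_tp = int(mat["Ntimepoints"].item())
    series = ts.reshape(n_sub, n_tp, -1)   # one BOLD series per subject

    # The paper reports 2,048 observations; a contiguous slice of the
    # stacked rows is shown here as one possible way to draw them.
    train = ts[:2048]
    print(series.shape, train.shape)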
Dataset Splits: No
LLM Response: The paper does not explicitly state training, validation, and test splits with specific percentages or sample counts. For the synthetic data, it reports generating 2,048 samples as "training data"; for the real datasets, it states that "all methods are trained only on one sequence that is stochastically sampled based on lag." How these sequences are partitioned into training, validation, and test sets is not detailed. One plausible reading of the lag-based sampling is sketched below.
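The phrase "stochastically sampled based on lag" is not defined in the paper; one plausible reading, sketched below under that assumption, is to draw random fixed-length windows (length equal to the model lag) from the single training sequence, with the following time step as the prediction target. The function name and all shapes are illustrative.

    import numpy as np

    def sample_lagged_batch(seq, lag, batch_size, rng):
        """seq: (T, n_vars). Returns (batch, lag, n_vars) windows and next-step targets."""
        T = seq.shape[0]
        starts = rng.integers(0, T - lag, size=batch_size)
        x = np.stack([seq[s:s + lag] for s in starts])   # past windows
        y = np.stack([seq[s + lag] for s in starts])     # steps to predict
        return x, y

    rng = np.random.default_rng(0)
    seq = rng.standard_normal((2048, 10))                # placeholder series
    x, y = sample_lagged_batch(seq, lag=5, batch_size=32, rng=rng)
    print(x.shape, y.shape)                              # (32, 5, 10) (32, 10)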
Hardware Specification: No
LLM Response: The paper does not provide hardware details such as GPU models, CPU types, or memory specifications used to run the experiments.
Software Dependencies: No
LLM Response: The paper mentions model components such as gated recurrent units (GRUs) but provides no version numbers for any library, framework, or programming language used in the experiments.
Experiment Setup: No
LLM Response: The paper states, "Relevant hyper-parameters of all learnable models are tuned to minimize the loss function. Details can be found in supplementary material." Specific hyperparameter values (e.g., learning rate, batch size, number of epochs, optimizer settings) are not provided in the main text.