CADDA: Class-wise Automatic Differentiable Data Augmentation for EEG Signals

Authors: Cédric Rommel, Thomas Moreau, Joseph Paillard, Alexandre Gramfort

ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In the class-agnostic setting, results show that our new relaxation leads to optimal performance with faster training than competing gradient-based methods, while also outperforming gradient-free methods in the class-wise setting. Finally, in Section 6, we use the EEG sleep staging task in the class-agnostic setting to evaluate our approach against previously proposed gradient-based methods. We used the public dataset MASS Session 3 (O'Reilly et al., 2014).
Researcher Affiliation | Academia | Cédric Rommel, Thomas Moreau, Joseph Paillard & Alexandre Gramfort, Université Paris-Saclay, Inria, CEA, Palaiseau, 91120, France ({firstname.lastname}@inria.fr)
Pseudocode | Yes | Algorithm 1: (C)ADDA
    Input: ξ, ε > 0, datasets Dtrain, Dvalid, trainable policy Tα, model θ
    Result: policy parameters α
    while not converged do
        // Compute the unrolled model
        gθ = L(θ | Tα(Dtrain)).backward(θ)
        θ′ := θ − ξ · gθ
        // Estimate ∇α L(θ′ | Dvalid)
        gθ′ = L(θ′ | Dvalid).backward(θ′)
        gα⁺ = L(θ + ε · gθ′ | Tα(Dtrain)).backward(α)
        gα⁻ = L(θ − ε · gθ′ | Tα(Dtrain)).backward(α)
        gα = (gα⁺ − gα⁻) / (2ε)
        // Update the policy parameters α
        α := α − ξ · gα
        // Update the model parameters θ
        gθ = L(θ | Tα(Dtrain)).backward(θ)
        θ := θ − ξ · gθ
    end
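The unrolled step and finite-difference gradient above follow a DARTS-style bilevel scheme. Below is a minimal, hedged PyTorch sketch of one such outer iteration; it is not the authors' released implementation, and `model`, `policy`, the batch format and the cross-entropy loss are assumptions made for illustration.

```python
# Sketch of one (C)ADDA outer iteration (illustrative, not the authors' code).
# Assumptions: `model` is an nn.Module with parameters theta, `policy` is a
# differentiable augmentation module T_alpha with parameters alpha, and each
# batch is an (x, y) pair trained with cross-entropy.
import copy
import torch
import torch.nn.functional as F


def loss_fn(model, policy, batch, augment=True):
    x, y = batch
    if augment:
        x = policy(x)                      # differentiable augmentation T_alpha
    return F.cross_entropy(model(x), y)


def cadda_step(model, policy, train_batch, valid_batch, xi=1e-3, eps=1e-2):
    theta = list(model.parameters())
    alpha = list(policy.parameters())

    # Unrolled model: theta' = theta - xi * grad_theta L(theta | T_alpha(D_train))
    g_theta = torch.autograd.grad(loss_fn(model, policy, train_batch), theta)
    unrolled = copy.deepcopy(model)
    for p, p0, g in zip(unrolled.parameters(), theta, g_theta):
        p.data.copy_(p0.data - xi * g)

    # Validation gradient w.r.t. the unrolled weights theta'
    g_theta_prime = torch.autograd.grad(
        loss_fn(unrolled, policy, valid_batch, augment=False),
        list(unrolled.parameters()),
    )

    # Finite-difference estimate: evaluate grad_alpha of the training loss
    # at theta +/- eps * g_theta' and take the symmetric difference.
    def grad_alpha_at(sign):
        shifted = copy.deepcopy(model)
        for p, p0, v in zip(shifted.parameters(), theta, g_theta_prime):
            p.data.copy_(p0.data + sign * eps * v)
        return torch.autograd.grad(loss_fn(shifted, policy, train_batch), alpha)

    g_plus, g_minus = grad_alpha_at(+1.0), grad_alpha_at(-1.0)
    g_alpha = [(gp - gm) / (2 * eps) for gp, gm in zip(g_plus, g_minus)]

    # Update the policy parameters alpha, then the model parameters theta
    with torch.no_grad():
        for a, g in zip(alpha, g_alpha):
            a -= xi * g
    g_theta = torch.autograd.grad(loss_fn(model, policy, train_batch), theta)
    with torch.no_grad():
        for p, g in zip(theta, g_theta):
            p -= xi * g
```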
Open Source Code | Yes | Their implementation in Python is provided in the supplementary material (braindecode-wip folder).
Open Datasets | Yes | We used the public dataset MASS Session 3 (O'Reilly et al., 2014). We also used the standard Sleep Physionet data (Goldberger et al., 2000).
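As a side note, the Sleep Physionet recordings quoted above can be fetched through MNE-Python; the snippet below is an illustrative assumption about data access, not part of the paper's pipeline.

```python
# Illustrative download of two Sleep Physionet subjects via MNE-Python
# (a convenience assumption; the authors' own data loading may differ).
from mne.datasets.sleep_physionet.age import fetch_data

# Returns, for each subject, paths to the PSG recording and its hypnogram
paths = fetch_data(subjects=[0, 1], recording=[1])
print(paths)
```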
Dataset Splits | Yes | Out of 83 subjects, 8 were left out for testing and the remaining ones were then split into training and validation sets, with respective proportions of 0.8 and 0.2.
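For concreteness, these proportions work out to 60 training, 15 validation and 8 test subjects out of 83. The sketch below reproduces that arithmetic with a subject-wise split; the seed and helper choice are arbitrary assumptions, not the authors' exact split.

```python
# Hedged sketch of a subject-wise 60/15/8 split over 83 subjects.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)           # arbitrary seed, for illustration only
subjects = np.arange(83)

test_subjects = rng.choice(subjects, size=8, replace=False)
remaining = np.setdiff1d(subjects, test_subjects)
train_subjects, valid_subjects = train_test_split(
    remaining, train_size=0.8, test_size=0.2, random_state=0
)
print(len(train_subjects), len(valid_subjects), len(test_subjects))  # 60 15 8
```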
Hardware Specification | Yes | Training was carried out on single Tesla V100 GPUs.
Software Dependencies | Yes | It requires using PyTorch version 1.8, which now supports FFT differentiation.
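The PyTorch 1.8 requirement matters because differentiable frequency-domain augmentations need gradients to flow through the FFT. A small self-contained check of this behaviour (illustrative only, not taken from the paper's code):

```python
# Verify that torch.fft operations are differentiable (assumes PyTorch >= 1.8).
import torch

x = torch.randn(2, 3000, requires_grad=True)   # e.g. a batch of EEG windows
spectrum = torch.fft.rfft(x, dim=-1)
loss = spectrum.abs().mean()
loss.backward()                                 # gradients flow through the FFT
print(x.grad.shape)                             # torch.Size([2, 3000])
```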
Experiment Setup | Yes | The optimizer used to train the model was Adam with a learning rate of 10⁻³, β1 = 0.9 and β2 = 0.999. At most 300 epochs were used for training. Early stopping was implemented with a patience of 30 epochs. For the automatic search experiments, the policy learning rate ξ introduced in (7) was set to 5·10⁻⁴ based on a grid search carried out using the validation set. Concerning the batch size, it was always set to 16, except for CADDA, for which it was doubled to 32.
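A hedged sketch of these hyper-parameters in plain PyTorch is given below; the placeholder modules and the dummy validation loss are assumptions used only to make the snippet self-contained, and the Adam betas follow the values quoted above.

```python
# Sketch of the quoted training configuration (illustrative, not the released code).
import torch
import torch.nn as nn

MAX_EPOCHS = 300
PATIENCE = 30            # early-stopping patience, in epochs
BATCH_SIZE = 16          # doubled to 32 for CADDA
MODEL_LR = 1e-3
POLICY_LR = 5e-4         # policy learning rate xi from Eq. (7)

model = nn.Linear(10, 2)       # placeholder for the sleep-staging network
policy = nn.Linear(10, 10)     # placeholder for the augmentation policy T_alpha

model_opt = torch.optim.Adam(model.parameters(), lr=MODEL_LR, betas=(0.9, 0.999))
policy_opt = torch.optim.Adam(policy.parameters(), lr=POLICY_LR)

# Simple early stopping on the validation loss
best_valid, patience_left = float("inf"), PATIENCE
for epoch in range(MAX_EPOCHS):
    valid_loss = torch.rand(()).item()     # stand-in for a real validation pass
    if valid_loss < best_valid:
        best_valid, patience_left = valid_loss, PATIENCE
    else:
        patience_left -= 1
        if patience_left == 0:
            break
```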