CADDA: Class-wise Automatic Differentiable Data Augmentation for EEG Signals
Authors: Cédric Rommel, Thomas Moreau, Joseph Paillard, Alexandre Gramfort
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In the class-agnostic setting, results show that our new relaxation leads to optimal performance with faster training than competing gradient-based methods, while also outperforming gradient-free methods in the class-wise setting. Finally, in Section 6, we use the EEG sleep staging task in the class-agnostic setting to evaluate our approach against previously proposed gradient-based methods. We used the public dataset MASS Session 3 (O'Reilly et al., 2014). |
| Researcher Affiliation | Academia | Cédric Rommel, Thomas Moreau, Joseph Paillard & Alexandre Gramfort Université Paris-Saclay, Inria, CEA, Palaiseau, 91120, France {firstname.lastname}@inria.fr |
| Pseudocode | Yes | Algorithm 1: (C)ADDA. Input: ξ, ϵ > 0, datasets D_train and D_valid, trainable policy T_α, model θ. Result: policy parameters α. While not converged do: (1) compute the unrolled model: g_θ = L(θ \| T_α(D_train)).backward(θ); θ′ := θ − ξ·g_θ; (2) estimate ∇_α L(θ′ \| D_valid): g_θ′ = L(θ′ \| D_valid).backward(θ′); g_α⁺ = L(θ + ϵ·g_θ′ \| T_α(D_train)).backward(α); g_α⁻ = L(θ − ϵ·g_θ′ \| T_α(D_train)).backward(α); g_α = (g_α⁺ − g_α⁻) / (2ϵ); (3) update the policy parameters: α = α − ξ·g_α; (4) update the model parameters: g_θ = L(θ \| T_α(D_train)).backward(θ); θ = θ − ξ·g_θ; end. (A hedged PyTorch sketch of this update is given after the table.) |
| Open Source Code | Yes | Their implementation in Python is provided in the supplementary material (braindecode-wip folder). |
| Open Datasets | Yes | We used the public dataset MASS Session 3 (O'Reilly et al., 2014). We also used the standard sleep Physionet data (Goldberger et al., 2000). |
| Dataset Splits | Yes | Out of 83 subjects, 8 were left out for testing and the remaining ones were then split into training and validation sets, with respective proportions of 0.8 and 0.2. (A subject-level split sketch follows the table.) |
| Hardware Specification | Yes | Training was carried out on single Tesla V100 GPUs. |
| Software Dependencies | Yes | It requires using PyTorch version 1.8, which supports FFT differentiation. (A short autograd check through torch.fft follows the table.) |
| Experiment Setup | Yes | The optimizer used to train the model above was Adam with a learning rate of 10⁻³, β1 = 0. and β2 = 0.999. At most 300 epochs were used for training. Early stopping was implemented with a patience of 30 epochs. For automatic search experiments, the policy learning rate ξ introduced in (7) was set to 5·10⁻⁴ based on a grid search carried out using the validation set. Concerning the batch size, it was always set to 16, except for CADDA, for which it was doubled to 32. (A training-loop sketch with these settings follows the table.) |
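The quoted Algorithm 1 combines an unrolled model update with a finite-difference approximation of the policy gradient, in the spirit of DARTS-style bilevel optimization. Below is a minimal, hedged PyTorch sketch of one such outer iteration; the names `model`, `policy`, `loss_fn`, `train_batch`, `valid_batch` and the helper `cadda_step` are illustrative assumptions, not the authors' implementation (which is provided in their supplementary braindecode-wip folder).

```python
import copy
import torch


def cadda_step(model, policy, loss_fn, train_batch, valid_batch, xi=1e-3, eps=1e-2):
    """One hypothetical outer iteration of Algorithm 1 ((C)ADDA): update the
    augmentation-policy parameters alpha via an unrolled model and a
    finite-difference gradient, then update the model parameters theta."""
    x_tr, y_tr = train_batch
    x_va, y_va = valid_batch

    # Unrolled model: theta' = theta - xi * grad_theta L(theta | T_alpha(D_train))
    g_theta = torch.autograd.grad(
        loss_fn(model(policy(x_tr)), y_tr), list(model.parameters())
    )
    unrolled = copy.deepcopy(model)
    with torch.no_grad():
        for p, g in zip(unrolled.parameters(), g_theta):
            p.sub_(xi * g)

    # grad_theta' L(theta' | D_valid), reused in the finite difference below
    g_theta_prime = torch.autograd.grad(
        loss_fn(unrolled(x_va), y_va), list(unrolled.parameters())
    )

    def grad_alpha_at(sign):
        # grad_alpha L(theta +/- eps * g_theta' | T_alpha(D_train))
        shifted = copy.deepcopy(model)
        with torch.no_grad():
            for p, g in zip(shifted.parameters(), g_theta_prime):
                p.add_(sign * eps * g)
        loss = loss_fn(shifted(policy(x_tr)), y_tr)
        return torch.autograd.grad(loss, list(policy.parameters()))

    g_plus, g_minus = grad_alpha_at(+1.0), grad_alpha_at(-1.0)

    # Policy update: alpha = alpha - xi * (g_alpha_plus - g_alpha_minus) / (2 * eps)
    with torch.no_grad():
        for p, gp, gm in zip(policy.parameters(), g_plus, g_minus):
            p.sub_(xi * (gp - gm) / (2 * eps))

    # Model update: theta = theta - xi * grad_theta L(theta | T_alpha(D_train))
    g_theta = torch.autograd.grad(
        loss_fn(model(policy(x_tr)), y_tr), list(model.parameters())
    )
    with torch.no_grad():
        for p, g in zip(model.parameters(), g_theta):
            p.sub_(xi * g)
```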
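For the dataset split, a subject-level partition matching the quoted protocol could look as follows; the random seed and the use of scikit-learn's `train_test_split` are assumptions made for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Subject-level split per the quoted protocol: 8 of the 83 subjects are held
# out for testing, the remaining ones are split 80/20 into train/valid.
rng = np.random.RandomState(42)  # seed chosen arbitrarily here
subjects = np.arange(83)
test_subjects = rng.choice(subjects, size=8, replace=False)
remaining = np.setdiff1d(subjects, test_subjects)
train_subjects, valid_subjects = train_test_split(
    remaining, test_size=0.2, random_state=42
)
```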
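The PyTorch ≥ 1.8 requirement relates to autograd support for `torch.fft`, which frequency-domain augmentations rely on. A quick sanity check, with an arbitrary shape and phase rotation chosen only for illustration:

```python
import torch

# Gradients flow through an FFT round-trip in PyTorch >= 1.8.
x = torch.randn(2, 6, 3000, dtype=torch.double, requires_grad=True)  # (batch, channels, time); shape illustrative
spectrum = torch.fft.rfft(x, dim=-1)
phase_shift = torch.exp(1j * torch.tensor(0.1, dtype=torch.double))  # arbitrary phase rotation
out = torch.fft.irfft(spectrum * phase_shift, n=x.shape[-1], dim=-1)
out.sum().backward()
print(x.grad.shape)  # torch.Size([2, 6, 3000])
```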
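The quoted optimizer and early-stopping settings could be wired up as in the sketch below; `model`, `train_loader`, `valid_loader` and `loss_fn` are assumed to exist, and β₁ is shown as 0.9 only as a placeholder, since the extracted value ("β1 = 0.") appears truncated in the quote.

```python
import math
import torch

# Adam with the quoted learning rate; beta1=0.9 is an assumption (see lead-in).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))

best_valid, patience, stale_epochs = math.inf, 30, 0
for epoch in range(300):                       # at most 300 epochs
    model.train()
    for x, y in train_loader:                  # batch size 16 (32 for CADDA)
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        optimizer.step()

    model.eval()
    with torch.no_grad():
        valid_loss = sum(loss_fn(model(x), y).item() for x, y in valid_loader)

    if valid_loss < best_valid:                # early stopping on validation loss
        best_valid, stale_epochs = valid_loss, 0
    else:
        stale_epochs += 1
        if stale_epochs >= patience:           # patience of 30 epochs
            break
```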