Enhancing Neural Training via a Correlated Dynamics Model

Authors: Jonathan Brokman, Roy Betser, Rotem Turjeman, Tom Berkov, Ido Cohen, Guy Gilboa

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments indicate that CMD surpasses the state-of-the-art method for compactly modeled dynamics on image classification. Our modeling can improve training efficiency and lower communication overhead, as shown by our preliminary experiments in the context of federated learning.
Researcher Affiliation | Academia | Technion - Israel Institute of Technology, Ariel University
Pseudocode | Yes | Detailed pseudo-code of our CMD algorithm is described in Algorithms 1 and 2. Our other CMD variants, Online CMD and Embedded CMD, are described in Algorithms 3 and 4. All the algorithms are available in the Appendix.
Open Source Code | Yes | All of our algorithms (Post-hoc, Online, and Embedded CMD) have an implementation provided here, alongside a script that executes them in the setting of the results in Sec. 5.
Open Datasets | Yes | First we consider the CIFAR10 (He et al., 2016a) classification problem, using a simple CNN architecture used in Manojlović et al. (2020); we refer to it as Simple Net (Fig. 10a). ...trained on MNIST (LeCun et al., 1998). ...semantic segmentation task for the PASCAL VOC 2012 dataset (Everingham & Winn, 2012).
Dataset Splits | Yes | Our model was applied on the fine-tuning process of a ViT-b-16 (Dosovitskiy et al., 2020; Gugger et al., 2021), pre-trained on the JFT-300M dataset (Sun et al., 2017), on CIFAR10 with a 15% validation/training split. (A minimal split sketch follows the table.)
Hardware Specification | No | The paper mentions 'using a single GPU' and '125GB RAM memory' but does not specify the exact model of the GPU or CPU, or other detailed hardware specifications.
Software Dependencies | No | The paper mentions 'Pytorch library (Paszke et al., 2019)' but does not provide a specific version number for PyTorch or any other key software dependency. (A version-recording sketch follows the table.)
Experiment Setup | Yes | The parameters used for our experiments in Sec. 5 are available in Appendix D, specifically Table 5 (training parameters) and Table 6 (CMD parameters). ...Table 5: SGD Training Implementation Details. ...Table 6: CMD Implementation Details.
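
On the Dataset Splits row: the paper reports a 15% validation/training split of CIFAR10 but the report does not quote the splitting code. As a minimal sketch only, assuming torchvision's CIFAR10 loader and a seeded random_split (the transform, data root, and seed are our assumptions, not the authors'), such a split might look like:

```python
# Minimal sketch (not from the paper): a 15% validation split of CIFAR10
# in PyTorch. Dataset root, transform, and random seed are assumptions.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

transform = transforms.ToTensor()
train_full = datasets.CIFAR10(root="./data", train=True, download=True,
                              transform=transform)

val_size = int(0.15 * len(train_full))        # 15% held out for validation
train_size = len(train_full) - val_size
generator = torch.Generator().manual_seed(0)  # assumed seed, for reproducibility
train_set, val_set = random_split(train_full, [train_size, val_size],
                                  generator=generator)
print(len(train_set), len(val_set))           # e.g. 42500 7500
```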
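On the Software Dependencies row: since the paper cites PyTorch without pinning a version, a reproducer would have to record the environment themselves. A minimal sketch of logging the versions actually in use (the printed values depend on the local installation; nothing here is from the paper):

```python
# Minimal sketch: record the library versions of the local environment,
# since the paper does not specify them.
import torch
import torchvision

print("torch:", torch.__version__)
print("torchvision:", torchvision.__version__)
print("CUDA available:", torch.cuda.is_available())
```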