Dynamic Neural Regeneration: Enhancing Deep Learning Generalization on Small Datasets

Authors: Vijaya Raghavan Ramkumar, Elahe Arani, Bahram Zonooz

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results demonstrate that our approach outperforms existing methods in accuracy and robustness, highlighting its potential for real-world applications where data collection is challenging. (Section 4: Experimental Setup)
Researcher Affiliation | Collaboration | Vijaya Raghavan T Ramkumar1, Elahe Arani1,2, & Bahram Zonooz1 (1Eindhoven University of Technology; 2Wayve)
Pseudocode | No | The paper describes the methodology using textual explanations and mathematical equations, but does not include a dedicated pseudocode block or algorithm figure.
Open Source Code | Yes | Code is available at https://github.com/NeurAI-Lab/Dynamic-Neural-Regeneration
Open Datasets | Yes | Datasets: We evaluate the proposed method on five datasets: Flower102 (25), CUB-200-2011 (26), MIT67 (27), Stanford Dogs (28), and FGVC-Aircraft (29). Summary statistics for each dataset are given in the Appendix, Table 10.
Dataset Splits | Yes | Table 10: Details of the five classification datasets used.
Dataset | Classes | Train | Validation | Test | Total
CUB-200 (26) | 200 | 5994 | N/A | 5794 | 11788
Flower-102 (25) | 102 | 1020 | 1020 | 6149 | 8189
MIT67 (27) | 67 | 5360 | N/A | 1340 | 6700
Aircraft (29) | 100 | 3334 | 3333 | 3333 | 10000
Stanford-Dogs (28) | 120 | 12000 | N/A | 8580 | 20580
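For the splits that have a public torchvision loader, the Table 10 counts can be sanity-checked directly. The sketch below is not from the authors' repository; it assumes that torchvision's Flowers102 splits coincide with the ones used in the paper.

```python
# Hedged sketch: load Flower-102 via torchvision and verify the split sizes
# against Table 10 (train=1020, val=1020, test=6149). The download location
# is hypothetical.
from torchvision import datasets

root = "./data"
for name in ("train", "val", "test"):
    ds = datasets.Flowers102(root, split=name, download=True)
    print(f"Flower-102 {name}: {len(ds)} images")
# Expected from Table 10: train=1020, val=1020, test=6149
```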
Hardware Specification | Yes | All training and evaluation are done on an NVIDIA RTX 2080 Ti GPU.
Software Dependencies | No | The paper mentions software components such as ResNet architectures, the SGD optimizer, and the SNIP method, but does not provide specific version numbers for these or for other software dependencies such as Python or the deep learning framework (e.g., PyTorch, TensorFlow).
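Because no versions are reported, any re-run should record its own environment. A minimal, hypothetical snippet for doing so is shown below; the PyTorch/torchvision stack is an assumption, since the paper does not name its framework.

```python
# Hypothetical environment capture for a re-run; the framework choice is an
# assumption, as the paper reports no software versions.
import platform, sys
import torch, torchvision

print("Python     :", sys.version.split()[0], platform.platform())
print("PyTorch    :", torch.__version__)
print("torchvision:", torchvision.__version__)
print("CUDA       :", torch.version.cuda, "| cuDNN:", torch.backends.cudnn.version())
print("GPU        :", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "none")
```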
Experiment Setup | Yes | Implementation Details: Since our framework is a direct extension of KE, we follow the same experimental setup. The efficacy of our framework is demonstrated on two widely used architectures: ResNet18 and ResNet50 (30). We randomly initialize the networks and optimize them with stochastic gradient descent (SGD) with momentum 0.9 and weight decay 1e-4. We use cosine learning rate decay with an initial learning rate lr = {0.1, 0.256} depending on the dataset. The networks are trained iteratively for N generations (N=11) with a batch size b=32 for e=200 epochs without early stopping. Standard data augmentation techniques, such as flipping and random cropping, are used. We employ SNIP (23) with network sparsity k to find the critical subset of parameters at the end of each generation. For importance estimation, we use 20% of the whole dataset as a subset (π). For all our experiments, we reinitialize a fixed 20% of the network's parameters globally. All training settings (lr, b, e) are kept constant across generations.
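To make the quoted setup concrete, the following is a hedged sketch of one way the generational schedule could be wired up. It is not the authors' implementation (their code is in the NeurAI-Lab repository linked above); the helper names (snip_saliency, regenerate, train_generations), the data pipeline, and the choice to reset the least-salient weights are assumptions, while the hyperparameters (SGD with momentum 0.9, weight decay 1e-4, cosine decay, b=32, e=200, N=11 generations, a 20% data subset for importance estimation, 20% global re-initialization) are taken from the paper's description.

```python
# Hedged sketch of the generational training loop described in the setup; NOT
# the authors' code. Helper names and the reset criterion are assumptions.
import torch
import torch.nn as nn
from torchvision.models import resnet18

def snip_saliency(model, subset_loader, device):
    """SNIP-style connection sensitivity |w * dL/dw| accumulated on a data subset."""
    criterion = nn.CrossEntropyLoss()
    model.zero_grad()
    for x, y in subset_loader:                      # ~20% of the training data (subset pi)
        criterion(model(x.to(device)), y.to(device)).backward()
    saliency = {n: (p.detach() * p.grad.detach()).abs()
                for n, p in model.named_parameters() if p.grad is not None}
    model.zero_grad()
    return saliency

def regenerate(model, fresh_model, saliency, frac=0.20):
    """Re-initialize the globally least-salient `frac` of parameters from fresh_model."""
    scores = torch.cat([s.flatten() for s in saliency.values()])
    threshold = scores.kthvalue(int(frac * scores.numel())).values
    fresh = dict(fresh_model.named_parameters())
    with torch.no_grad():
        for n, p in model.named_parameters():
            if n in saliency:
                mask = saliency[n] <= threshold     # low importance -> regenerate
                p[mask] = fresh[n].to(p.device)[mask]

def train_generations(model, train_loader, subset_loader, device,
                      generations=11, epochs=200, lr=0.1):
    model.to(device)
    for _ in range(generations):                    # same settings every generation
        opt = torch.optim.SGD(model.parameters(), lr=lr,
                              momentum=0.9, weight_decay=1e-4)
        sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=epochs)
        for _ in range(epochs):
            model.train()
            for x, y in train_loader:               # batch size 32, flip + random crop
                opt.zero_grad()
                nn.functional.cross_entropy(model(x.to(device)),
                                            y.to(device)).backward()
                opt.step()
            sched.step()
        # End of generation: estimate importance on the subset and re-initialize
        # the least-important 20% of parameters from a freshly initialized network.
        saliency = snip_saliency(model, subset_loader, device)
        fresh = resnet18(num_classes=model.fc.out_features).to(device)
        regenerate(model, fresh, saliency, frac=0.20)
```

In a faithful reproduction, the exact saliency criterion and which parameters are reset should be checked against the released code rather than this sketch.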