Imitating Deep Learning Dynamics via Locally Elastic Stochastic Differential Equations

Authors: Jiayao Zhang, Hua Wang, Weijie Su

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We corroborate our theoretical analysis with experiments on a synthesized dataset of geometric shapes and CIFAR-10.
Researcher Affiliation | Academia | Jiayao Zhang, Hua Wang, Weijie J. Su, University of Pennsylvania, {zjiayao,wanghua,suw}@wharton.upenn.edu
Pseudocode | No | The paper does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code | Yes | Code for reproducing our experiments is publicly available at github.com:zjiayao/le_sde.git.
Open Datasets | Yes | We perform experiments on a synthesized dataset called GEOMNIST containing K = 3 types of geometric shapes (RECTANGLE, ELLIPSOID, and TRIANGLE) and on CIFAR-10 ([28], denoted by CIFAR) with K ∈ {2, 3} classes.
Dataset Splits | No | The paper mentions 'validation loss' and 'validation accuracies' but does not specify the exact percentages or absolute counts for training, validation, or test splits.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models or memory specifications used for experiments.
Software Dependencies | No | The paper does not provide specific software dependency versions (e.g., library names with version numbers).
Experiment Setup | Yes | All models are trained for T = 10^5 iterations (for GEOMNIST) or T = 3 × 10^5 iterations (for CIFAR) with a learning rate of 0.005 and a batch size of 1 under the softmax cross-entropy loss.
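The Open Datasets and Experiment Setup rows together pin down most of the training recipe: 10^5 (GEOMNIST) or 3 × 10^5 (CIFAR) iterations, learning rate 0.005, batch size 1, the softmax cross-entropy loss, and CIFAR-10 restricted to K ∈ {2, 3} classes. The sketch below shows one way such a configuration could be reproduced in PyTorch; the use of plain SGD, the ResNet-18 backbone, and all variable names are assumptions for illustration only, not the authors' implementation (see the repository cited in the Open Source Code row for the actual code).

```python
# Minimal sketch of the reported setup; model choice and helper names are
# assumptions, not the authors' le_sde code.
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T
from torch.utils.data import DataLoader, Subset


def train(model, train_set, num_iters=10**5, lr=5e-3):
    """Train for a fixed number of iterations with batch size 1 under the
    softmax cross-entropy loss, as quoted in the Experiment Setup row."""
    loader = DataLoader(train_set, batch_size=1, shuffle=True)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)  # lr = 0.005
    criterion = nn.CrossEntropyLoss()  # softmax cross-entropy

    it = 0
    while it < num_iters:
        for x, y in loader:
            if it >= num_iters:
                break
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
            it += 1
    return model


if __name__ == "__main__":
    K = 3  # the paper's CIFAR experiments use K in {2, 3} classes
    cifar = torchvision.datasets.CIFAR10(
        root="./data", train=True, download=True, transform=T.ToTensor())
    idx = [i for i, y in enumerate(cifar.targets) if y < K]  # keep first K classes
    cifar_k = Subset(cifar, idx)

    model = torchvision.models.resnet18(num_classes=K)  # illustrative backbone
    train(model, cifar_k, num_iters=3 * 10**5, lr=0.005)
```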