Learning Trajectories are Generalization Indicators

Authors: Jingwen Fu, Zhizheng Zhang, Dacheng Yin, Yan Lu, Nanning Zheng

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 4 experiments
Researcher Affiliation | Collaboration | Jingwen Fu¹, Zhizheng Zhang², Dacheng Yin³, Yan Lu², Nanning Zheng¹. ¹National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, National Engineering Research Center for Visual Information and Applications, and Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University; ²Microsoft Research Asia; ³University of Science and Technology of China.
Pseudocode | No | No pseudocode or algorithm block was found.
Open Source Code | No | The paper does not contain any statement about open-sourcing code or provide a link to a code repository.
Open Datasets | Yes | "Unless further specified, we use the default setting of the experiments on CIFAR-10 dataset [14]."
Dataset Splits | No | The paper states "We use 100 samples for training and 1,000 samples for evaluation" for a toy example, and it uses the CIFAR-10 dataset, but it does not explicitly state train/validation/test splits or reference a standard split. (A data-loading sketch follows the table.)
Hardware Specification | No | The paper does not provide hardware details (e.g., GPU/CPU models, memory) for its experiments.
Software Dependencies | No | The paper does not list software dependencies with version numbers (e.g., specific library or solver versions).
Experiment Setup | Yes | Appendix C.2, "Experimental Details", gives the setting for each figure. Figure 1: the learning rate is fixed at 0.05 throughout training, the batch size is 256, and all experiments are trained for 100 epochs. Figure 2: the initial learning rate is 0.05 with a batch size of 1,024, adjusted by a cosine annealing LR schedule. Figure 3: each point is an average of three repeated experiments, and training stops when the training loss falls below 0.2. (A training-loop sketch follows the table.)
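
Since the paper gives no explicit split, here is a minimal data-loading sketch, assuming torchvision's standard CIFAR-10 train/test split (50,000/10,000 images). Only the 100-sample training and 1,000-sample evaluation subset sizes come from the paper's toy example; the path, transform, and use of `Subset` are illustrative assumptions.

```python
# Data-loading sketch: standard torchvision CIFAR-10 split (an assumption;
# the paper does not state one), plus the toy example's 100/1,000 subsets.
import torch
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

transform = transforms.ToTensor()  # the paper does not specify augmentations

train_set = datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
test_set = datasets.CIFAR10(root="./data", train=False, download=True, transform=transform)

# Toy example from the paper: 100 samples for training, 1,000 for evaluation.
toy_train = Subset(train_set, range(100))
toy_eval = Subset(test_set, range(1000))

train_loader = DataLoader(toy_train, batch_size=256, shuffle=True)   # Figure 1's batch size
eval_loader = DataLoader(toy_eval, batch_size=256, shuffle=False)
```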
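And here is a minimal training-loop sketch that merges the per-figure settings quoted above (initial LR 0.05, batch size 1,024, cosine annealing, 100 epochs, stop when the training loss drops below 0.2) into one loop for illustration. The ResNet-18 backbone and plain SGD are assumptions; the appendix states only the hyperparameters.

```python
# Training-loop sketch combining the appendix C.2 hyperparameters.
# ResNet-18 and plain SGD are assumptions, not stated in the paper.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from torchvision.models import resnet18

device = "cuda" if torch.cuda.is_available() else "cpu"
model = resnet18(num_classes=10).to(device)                # hypothetical backbone
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)   # initial LR 0.05 (Figures 1-2)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)  # Figure 2's schedule

train_set = datasets.CIFAR10(root="./data", train=True, download=True,
                             transform=transforms.ToTensor())
train_loader = DataLoader(train_set, batch_size=1024, shuffle=True)  # Figure 2's batch size

for epoch in range(100):                                   # 100-epoch budget (Figure 1)
    running_loss, n_batches = 0.0, 0
    for x, y in train_loader:
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
        n_batches += 1
    scheduler.step()
    if running_loss / n_batches < 0.2:                     # Figure 3's stopping rule
        break
```

Setting `T_max=100` matches the 100-epoch budget, so the cosine schedule completes exactly one annealing cycle over the run.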