Lookaround Optimizer: $k$ steps around, 1 step average

Authors: Jiangtao Zhang, Shunyu Liu, Jie Song, Tongtian Zhu, Zhengqi Xu, Mingli Song

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We theoretically explain the superiority of Lookaround by convergence analysis, and make extensive experiments to evaluate Lookaround on popular benchmarks including CIFAR and ImageNet with both CNNs and ViTs, demonstrating clear superiority over state-of-the-arts.
Researcher Affiliation | Academia | Zhejiang University; Hangzhou City University
Pseudocode | Yes | Algorithm 1 Lookaround Optimizer. (An illustrative sketch of the around/average structure follows this table.)
Open Source Code | Yes | Our code is available at https://github.com/Ardcy/Lookaround.
Open Datasets | Yes | We conduct our experiments on CIFAR10 [26] and CIFAR100 [26] datasets. Both CIFAR10 and CIFAR100 datasets have 60,000 images, 50,000 of which are used for training and 10,000 for validation.
Dataset Splits | Yes | Both CIFAR10 and CIFAR100 datasets have 60,000 images, 50,000 of which are used for training and 10,000 for validation. (The second sketch after this table checks this split.)
Hardware Specification | No | The paper mentions 'advanced computing resources provided by the Supercomputing Center of Hangzhou City University' but does not name any specific hardware, such as GPU or CPU models, used for the experiments.
Software Dependencies | No | The paper mentions software such as the 'PyTorch library' but does not provide version numbers for any software dependencies or libraries used.
Experiment Setup | Yes | The initial learning rate is set to 0.1, and the batch size is set to 128. Additionally, a warm-up phase of 1 epoch is implemented. Subsequently, different learning rate schedulers are used based on the specific dataset. For the CIFAR100 dataset, we utilize the MultiStepLR scheduler. The learning rate is decayed at the 60th, 120th, and 160th epochs, with a decay factor of 0.2. (The third sketch after this table maps this onto PyTorch components.)
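
Since the table confirms both pseudocode (Algorithm 1) and a public repository, a compact PyTorch-style sketch of the around/average structure may help readers orient themselves. This is a minimal illustration under stated assumptions (plain SGD inner steps, parameter-only averaging that ignores batch-norm buffers); the function name `lookaround_round` and its calling convention are hypothetical, not the authors' API, so consult the GitHub repository above for the reference implementation.

```python
import copy
import torch

def lookaround_round(model, loss_fn, batches, augmentations, k, lr):
    """One Lookaround round, sketched from the Algorithm 1 description.

    Around step: each weight copy takes k SGD steps on the same batches,
    seen through its own data augmentation.
    Average step: the copies' parameters are averaged back into `model`.

    All names and arguments here are illustrative assumptions; batch-norm
    buffers are not averaged, which the reference code may treat differently.
    """
    copies = [copy.deepcopy(model) for _ in augmentations]
    opts = [torch.optim.SGD(net.parameters(), lr=lr) for net in copies]
    for _, (x, y) in zip(range(k), batches):       # k 'around' steps
        for net, aug, opt in zip(copies, augmentations, opts):
            opt.zero_grad()
            loss_fn(net(aug(x)), y).backward()
            opt.step()
    with torch.no_grad():                          # 'average' step
        for name, p in model.named_parameters():
            p.copy_(torch.stack(
                [dict(net.named_parameters())[name] for net in copies]
            ).mean(dim=0))
    return model
```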
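The 50,000/10,000 split quoted in the two dataset rows is simply the standard CIFAR split, so it can be checked directly with torchvision. The snippet below is a sketch of that reading, not code from the paper.

```python
import torchvision as tv

# The standard CIFAR-10 split matches the counts quoted above:
# 50,000 training images and 10,000 validation (test) images.
# CIFAR100 behaves identically; swap in tv.datasets.CIFAR100 to verify.
train_set = tv.datasets.CIFAR10(root="./data", train=True, download=True)
val_set = tv.datasets.CIFAR10(root="./data", train=False, download=True)
assert len(train_set) == 50_000 and len(val_set) == 10_000
```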
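The quoted experiment setup maps directly onto standard PyTorch components. In the sketch below, only the initial learning rate 0.1, batch size 128, the 1-epoch warm-up, and the MultiStepLR milestones/decay come from the paper; the linear warm-up shape, the placeholder model, and the absence of momentum or weight decay are assumptions, since the excerpt does not state them.

```python
import torch

model = torch.nn.Linear(3 * 32 * 32, 100)  # placeholder for the real network
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # quoted initial LR

# 1-epoch warm-up (ramp shape is an assumption), then the quoted CIFAR100
# schedule: MultiStepLR decaying by a factor of 0.2 at epochs 60/120/160.
warmup = torch.optim.lr_scheduler.LinearLR(optimizer, start_factor=0.1,
                                           total_iters=1)
decay = torch.optim.lr_scheduler.MultiStepLR(optimizer,
                                             milestones=[60, 120, 160],
                                             gamma=0.2)
scheduler = torch.optim.lr_scheduler.SequentialLR(optimizer,
                                                  schedulers=[warmup, decay],
                                                  milestones=[1])
# The quoted batch size of 128 would be set on the DataLoader, e.g.
# torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True).
```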