Lookaround Optimizer: $k$ steps around, 1 step average
Authors: Jiangtao Zhang, Shunyu Liu, Jie Song, Tongtian Zhu, Zhengqi Xu, Mingli Song
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We theoretically explain the superiority of Lookaround by convergence analysis, and make extensive experiments to evaluate Lookaround on popular benchmarks including CIFAR and ImageNet with both CNNs and ViTs, demonstrating clear superiority over state-of-the-arts. |
| Researcher Affiliation | Academia | Zhejiang University, Hangzhou City University |
| Pseudocode | Yes | Algorithm 1 Lookaround Optimizer. (A hedged sketch of the algorithm follows the table.) |
| Open Source Code | Yes | Our code is available at https://github.com/Ardcy/Lookaround. |
| Open Datasets | Yes | We conduct our experiments on CIFAR10 [26] and CIFAR100 [26] datasets. Both CIFAR10 and CIFAR100 datasets have 60,000 images, 50,000 of which are used for training and 10,000 for validation. |
| Dataset Splits | Yes | Both CIFAR10 and CIFAR100 datasets have 60,000 images, 50,000 of which are used for training and 10,000 for validation. |
| Hardware Specification | No | The paper mentions 'advanced computing resources provided by the Supercomputing Center of Hangzhou City University' but does not specify any particular hardware models like GPUs or CPUs used for the experiments. |
| Software Dependencies | No | The paper mentions software like 'PyTorch library' but does not provide specific version numbers for any software dependencies or libraries used. |
| Experiment Setup | Yes | The initial learning rate is set to 0.1, and the batch size is set to 128. Additionally, a warm-up phase of 1 epoch is implemented. Subsequently, different learning rate schedulers are used based on the specific dataset. For the CIFAR100 dataset, we utilize the MultiStepLR scheduler. The learning rate is decayed at epochs 60, 120, and 160, with a decay factor of 0.2. (See the configuration sketch after the table.) |
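
The Pseudocode row above refers to Algorithm 1 in the paper. For readers without the PDF at hand, here is a minimal PyTorch sketch of the around-and-average loop that the algorithm's name describes: clone the current weights once per data-augmentation policy, train each clone independently for $k$ SGD steps, then replace the shared weights with the element-wise mean of the clones. The function name `lookaround_step`, the `aug_batches` argument, and the fixed SGD settings are illustrative assumptions, not the authors' code; the reference implementation is in the repository linked in the Open Source Code row.

```python
import copy
import torch

def lookaround_step(model, aug_batches, loss_fn, k=3, lr=0.1):
    """One Lookaround iteration: k 'around' steps, then one 'average' step.

    aug_batches: one batch iterator per augmentation policy, each yielding
    (inputs, targets) pairs. All names here are illustrative, not the
    authors' API.
    """
    # Around step: train an independent clone per augmentation for k SGD steps.
    clones = [copy.deepcopy(model) for _ in aug_batches]
    for clone, batches in zip(clones, aug_batches):
        opt = torch.optim.SGD(clone.parameters(), lr=lr)
        for _ in range(k):
            inputs, targets = next(batches)
            opt.zero_grad()
            loss_fn(clone(inputs), targets).backward()
            opt.step()
    # Average step: overwrite the shared weights with the mean of the clones.
    avg_state = {}
    for name, ref in model.state_dict().items():
        stacked = torch.stack([c.state_dict()[name].float() for c in clones])
        avg_state[name] = stacked.mean(dim=0).to(ref.dtype)
    model.load_state_dict(avg_state)
```

With a single augmentation policy this collapses to plain SGD, which matches the paper's framing of the method as "$k$ steps around, 1 step average."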
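
The Experiment Setup row translates directly into a few lines of PyTorch. The sketch below wires up the stated hyperparameters: initial learning rate 0.1, batch size 128, and the CIFAR-100 MultiStepLR decay at epochs 60, 120, and 160 with factor 0.2. The stand-in model, the 200-epoch horizon, and the shape of the 1-epoch warm-up are assumptions not pinned down by the quoted text.

```python
import torch
from torch.optim.lr_scheduler import MultiStepLR

model = torch.nn.Linear(3 * 32 * 32, 100)  # stand-in; the paper trains CNNs/ViTs
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # initial LR 0.1
batch_size = 128  # as stated in the setup row

# CIFAR-100 schedule: decay the LR by a factor of 0.2 at epochs 60, 120, 160.
scheduler = MultiStepLR(optimizer, milestones=[60, 120, 160], gamma=0.2)

for epoch in range(200):  # total epoch count is an assumption
    if epoch == 0:
        pass  # 1-epoch warm-up goes here; its exact form is unspecified
    # ... one training epoch at batch_size 128 ...
    scheduler.step()
```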