Stepping Forward on the Last Mile
Authors: Chen Feng, Jay Zhuo, Parker Zhang, Ramchalam Kinattinkara Ramakrishnan, Zhaocong Yuan, Andrew Zou Li
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we investigate the feasibility of on-device training using fixed-point forward gradients, by conducting comprehensive experiments across a variety of deep learning benchmark tasks in both vision and audio domains. |
| Researcher Affiliation | Collaboration | Chen Feng (Qualcomm AI Research, Qualcomm Canada ULC, chenf@qti.qualcomm.com); Shaojie Zhuo (Qualcomm AI Research, Qualcomm Canada ULC, shaojiez@qti.qualcomm.com); Xiaopeng Zhang (Qualcomm AI Research, Qualcomm Canada ULC, xiaopeng@qti.qualcomm.com); Ramchalam Kinattinkara Ramakrishnan (Qualcomm AI Research, Qualcomm Canada ULC, rkinatti@qti.qualcomm.com); Zhaocong Yuan (Qualcomm AI Research, Qualcomm Canada ULC, zhaocong@qti.qualcomm.com); Andrew Zou Li (University of Toronto, andrewzou.li@mail.utoronto.ca) |
| Pseudocode | Yes | Algorithm 1 QZO-FF: Quantized Zero-order Forward Gradient Learning (quantized, fp16). A hedged sketch of this style of update is given below the table. |
| Open Source Code | No | However, we cannot open source the code. |
| Open Datasets | Yes | Vision Benchmark. Image classification models are compared across 5 commonly used few-shot learning benchmark datasets (Table 1). Training methods are evaluated on 3 network backbones (modified ResNet12 Ye et al. [2020], ResNet18 He et al. [2015], and ViT-tiny Dosovitskiy et al. [2020]), with ProtoNets Snell et al. [2017] as the few-shot classifier (a ProtoNet scoring sketch follows the table). Table 1 (Vision datasets used for few-shot learning): CUB (bird species), 200 classes (140/30/30), 11,788 samples, 84×84; Omniglot (handwritten characters), 1,623 classes (1000/200/423), 32,460 samples, 28×28; Cifar100_fs (color), 100 classes (64/16/20), 60,000 samples, 32×32; miniImageNet (natural images), 100 classes (64/16/20), 60,000 samples, 84×84; tieredImageNet (natural images), 608 classes (351/97/160), 779,165 samples, 84×84. |
| Dataset Splits | Yes | Table 1 (Vision datasets used for few-shot learning), train/val/test class splits: CUB, 200 classes (140/30/30), 11,788 samples, 84×84; Omniglot, 1,623 classes (1000/200/423), 32,460 samples, 28×28; Cifar100_fs, 100 classes (64/16/20), 60,000 samples, 32×32; miniImageNet, 100 classes (64/16/20), 60,000 samples, 84×84; tieredImageNet, 608 classes (351/97/160), 779,165 samples, 84×84. |
| Hardware Specification | Yes | All our experiments are run on a single Nvidia Tesla V100 GPU. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | Table 6 (hyper-parameters used in our few-shot learning experiments for vision tasks; for fair comparison, FF and BP use the same hyper-parameters): n_way = 5; n_shot = 5; ϵ = 1e-3; epochs = 40; optimizer = SGD; learning rate ∈ {1e-3, 1e-4, 1e-5}; val/test tasks = 100/100. Model architectures of ResNet18, modified ResNet12, and ViT-tiny are based on [14], [43], and [39]. Pre-trained models used for zero-shot evaluation can be found at [33], [34], and [38]. Different learning-rate grids are explored, and the best accuracy is reported. |
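
The pseudocode row above names Algorithm 1 (QZO-FF), which learns from forward passes only. The code below is a minimal sketch of a generic zeroth-order forward-gradient update in that spirit, assuming a symmetric two-point (SPSA-style) estimator with a Gaussian perturbation; the quantized arithmetic is omitted, and the perturbation choice and all function and variable names are illustrative assumptions rather than the authors' implementation.

```python
# Sketch of one zeroth-order forward-gradient step: two forward passes,
# no backpropagation. Assumed names (zo_forward_gradient_step, batch keys)
# are hypothetical; quantization from QZO-FF is not reproduced here.
import torch


def zo_forward_gradient_step(model, loss_fn, batch, lr=1e-4, eps=1e-3):
    """Update model weights using a two-point directional-derivative estimate."""
    params = [p for p in model.parameters() if p.requires_grad]
    # One random perturbation direction per parameter tensor.
    v = [torch.randn_like(p) for p in params]

    with torch.no_grad():
        # Loss at theta + eps * v.
        for p, d in zip(params, v):
            p.add_(eps * d)
        loss_plus = loss_fn(model(batch["x"]), batch["y"])

        # Loss at theta - eps * v (step back by 2 * eps * v).
        for p, d in zip(params, v):
            p.sub_(2 * eps * d)
        loss_minus = loss_fn(model(batch["x"]), batch["y"])

        # Restore the original weights.
        for p, d in zip(params, v):
            p.add_(eps * d)

        # Scalar estimate of the directional derivative along v.
        g = (loss_plus - loss_minus) / (2 * eps)

        # SGD-style update along the sampled direction.
        for p, d in zip(params, v):
            p.sub_(lr * g * d)

    return loss_plus.item()
```

Under the same assumptions, a call such as `zo_forward_gradient_step(model, torch.nn.functional.cross_entropy, {"x": images, "y": labels}, lr=1e-4, eps=1e-3)` would use the ϵ value and one of the learning rates from the grid reported in Table 6.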
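
The vision benchmark rows describe 5-way 5-shot episodes with ProtoNets (Snell et al., 2017) as the classifier. The sketch below shows the standard prototypical-network scoring rule (class-mean prototypes, negative squared Euclidean distance as logits) under that reading; the function name, shapes, and episodic wiring are assumptions, not code from the paper.

```python
# Hypothetical ProtoNet scoring for a 5-way 5-shot episode.
import torch

N_WAY, N_SHOT = 5, 5  # episode configuration from Table 6


def protonet_logits(support_emb, support_labels, query_emb):
    """Score query embeddings against class-mean prototypes.

    support_emb:    [N_WAY * N_SHOT, d] embeddings of the support set
    support_labels: [N_WAY * N_SHOT]    integer labels in [0, N_WAY)
    query_emb:      [n_query, d]        embeddings of the query set
    Returns logits of shape [n_query, N_WAY].
    """
    prototypes = torch.stack(
        [support_emb[support_labels == c].mean(dim=0) for c in range(N_WAY)]
    )
    # Negative squared Euclidean distance to each prototype acts as the logit.
    return -torch.cdist(query_emb, prototypes).pow(2)
```

In an episodic loop, `torch.nn.functional.cross_entropy(protonet_logits(...), query_labels)` would serve as the loss optimized by the forward-gradient step sketched above; that wiring is again an assumption about the setup rather than the released implementation.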