Learn More for Food Recognition via Progressive Self-Distillation
Authors: Yaohui Zhu, Linhu Liu, Jiang Tian
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on three datasets demonstrate the effectiveness of our proposed method and its state-of-the-art performance. |
| Researcher Affiliation | Collaboration | 1 School of Artificial Intelligence, Beijing Normal University, Beijing 10875, China 2 AI Lab, Lenovo Research, Beijing, China yaohui.zhu@bnu.edu.cn, {liulh7, tianjiang1}@lenovo.com |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code (no specific repository link, explicit code release statement, or mention of code in supplementary materials). |
| Open Datasets | Yes | We validate our method on three commonly used food datasets. ETHZ Food-101 (Bossard, Guillaumin, and Van Gool 2014) contains 101,000 images with 101 food categories. Vireo Food-172 (Chen and Ngo 2016) contains 110,241 food images from 172 categories. ISIA Food-500 (Min et al. 2020) consists of 399,726 images with 500 categories. |
| Dataset Splits | Yes | On Vireo Food-172 and ISIA Food-500, the model with the highest performance on the validation set is used for testing. Following commonly used splits, 60%, 10%, and 30% of the images in each food category are randomly selected for training, validation, and testing, respectively. (A sketch of this per-category split follows the table.) |
| Hardware Specification | Yes | All experiments are implemented on the PyTorch platform with one Nvidia A100 GPU. |
| Software Dependencies | No | The paper mentions the "PyTorch platform" but does not provide specific version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | The input image size is set to 224×224 in all experiments. We set a percentile η = 5% in M_c as a threshold, ω_l = 1, the ramp-up epochs β = 5 in Eq. 7, and the number of self-distillations m = 2 in all experiments. When employing Swin-B (Liu et al. 2021) as the embedding network, the model is optimized by the AdamW (Kingma and Ba 2014) algorithm with an initial learning rate of 5×10^-5 and a weight decay of 10^-8. The total number of training epochs is 50, and a batch size of 42 and gradient clipping with a max norm of 5 are used. In Eq. 8, α = 2.0. When employing DenseNet-161 (Huang et al. 2017) as the embedding network, the model is optimized using stochastic gradient descent with a momentum of 0.9 and a weight decay of 10^-4. The learning rate is initially set to 10^-3 and divided by 10 after 10 epochs. The total number of training epochs is 30, and the batch size is 42. In Eq. 8, α = 1.0. (A PyTorch sketch of these optimizer settings follows the table.) |
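
The 60/10/30 per-category protocol quoted under "Dataset Splits" can be reproduced with a short stand-alone routine. The sketch below is illustrative rather than the authors' code: the `samples` structure of `(image_path, label)` pairs, the helper name `split_per_category`, and the fixed seed are all assumptions.

```python
import random
from collections import defaultdict

def split_per_category(samples, seed=0):
    """Randomly split (image_path, label) pairs into 60% train,
    10% validation, and 30% test within each food category,
    following the protocol quoted above."""
    rng = random.Random(seed)  # fixed seed is an assumption; the paper does not state one
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append(path)

    train, val, test = [], [], []
    for label, paths in by_label.items():
        rng.shuffle(paths)
        n_train = int(0.6 * len(paths))
        n_val = int(0.1 * len(paths))
        train += [(p, label) for p in paths[:n_train]]
        val += [(p, label) for p in paths[n_train:n_train + n_val]]
        test += [(p, label) for p in paths[n_train + n_val:]]  # remainder, ~30%
    return train, val, test
```

Rounding puts any leftover images in the test split, so the ratios are approximate for categories whose size is not a multiple of ten.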
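
The two training configurations under "Experiment Setup" map directly onto standard PyTorch optimizers. This is a minimal sketch under stated assumptions: the `Linear` modules are placeholders for the actual Swin-B and DenseNet-161 backbones, and a single decay at epoch 10 is one plausible reading of "divided by 10 after 10 epochs".

```python
import torch
import torch.nn as nn

# Placeholder backbones; the paper uses Swin-B and DenseNet-161 embedding networks.
swin_b = nn.Linear(3 * 224 * 224, 500)
densenet161 = nn.Linear(3 * 224 * 224, 500)

# Swin-B branch: AdamW, lr 5e-5, weight decay 1e-8, 50 epochs, batch size 42.
opt_swin = torch.optim.AdamW(swin_b.parameters(), lr=5e-5, weight_decay=1e-8)

# DenseNet-161 branch: SGD, momentum 0.9, weight decay 1e-4, lr 1e-3, 30 epochs.
opt_dense = torch.optim.SGD(densenet161.parameters(), lr=1e-3,
                            momentum=0.9, weight_decay=1e-4)
# "Divided by 10 after 10 epochs": modeled here as a single decay at epoch 10
# (assumption; a decay every 10 epochs via StepLR is the other reading).
sched_dense = torch.optim.lr_scheduler.MultiStepLR(opt_dense,
                                                   milestones=[10], gamma=0.1)

def train_step(model, optimizer, loss):
    """One optimization step with the gradient clipping described for Swin-B."""
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=5)
    optimizer.step()
```

The self-distillation loss terms (Eqs. 7 and 8, with β, α, and m as quoted) are not reconstructed here, since the paper's equations are not reproduced in this report.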